-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hibernate Reactive hangs on insert #16255
Comments
/cc @DavideD, @Sanne, @gavinking |
org.hibernate.reactive.pool.impl.ProxyConnection#withConnection hits the line Because this exception is thrown rather than returning a ComplectionStage that has failed with IllegalStateException everything locks up permanently as the generator CS never completes. |
I will have a look |
I cannot replicate the issue with the latest quickstart in Could you try again and let me know the |
I'll let Davide look, but a suggestion: the error message smells of threading issues again. |
So I had another look and I am not sure if it is that IllegalStateException I mentioned, I saw it hit that when I stepped through in a debugger but when I just set a breakpoint there it does not seem to be hit. It still locks up the the queue just keeps growing indefinitely though. |
Could this be the cause: https://github.com/hibernate/hibernate-reactive/blob/eceb57100021ea3b72f8ee78580b993349358b5f/hibernate-reactive-core/src/main/java/org/hibernate/reactive/id/impl/BlockingIdentifierGenerator.java#L82 While identifiers are being fetched connections from every event loop thread are added to the list. When the CS completes then they all call generate again from the event loop thread that fetched the identifier, which may be different to the actual event loop thread of the connection. I am pretty sure having multiple threads involved will not end well, although I am not sure if the exact failure mode (assuming this is even correct). Also the current design does not seem great when you are defaulting to an increment of 1. |
I had the same problem. The moment more than one query ends in a database insert, the query and the http-request hangs. |
So the question is why, precisely, is this wrong? Perhaps we do the work in an incorrect tx context? |
Wait, @stuartwdouglas are you saying that somewhere we are maintaining an association of threads to vert.x connections? Is this to do with the workarounds @Sanne had to do for the issues with the vert.x connection pool (prior to 4.1)? |
(I don't now remember precisely what it was we decided we had to do as a (temporary) solution for that issue.) |
Every Vert.x connection has an associated thread, which may be different to the other connections threads. In this code when the future is complete it dispatches on the current thread, which will always be a vert.x thread but not necessarily the vert.x thread for the connection that was waiting. Usually this causes problems in the form of IO issues, where the connections actual thread receives an IO event at the same time this other thread is still working and you now have two threads working on stuff that is designed to be single threaded. |
Right. So this is the thing that's supposed to be fixed with the new connection pool in vert.x 4.1. (I had thought it was coming in 4.0 but Sanne corrected me when we chatted last week.) So I'm actually struggling to see how we can even in principle implement the behavior we need on 4.0. |
I guess either dispatch back to the original thread, or have one ID generator per thread, so each thread allocates from its own ID range (which will perform better, but means ID's won't be sequential). |
Well I personally have no idea how to despatch to a specific thread. I guess maybe there is some API for that somewhere. As for your second suggestion: that was how I wrote it originally (twice). But the thing is it means that in highly concurrent systems you waste a lot of hi values, and everyone hates that. So I changed it. |
Excuse me, that's wrong, I didn't have a generator per thread, I just let the retrieval of hi values happen concurrently. Your suggestion is a little better, perhaps, but kinda hard to implement in HR itself, which doesn't really know anything about threads. |
My suggestion is that we can't have users of HR which aren't already running on the vertx thread, so making "the current thread" and the "io thread" the same. I've said this before but others said that such restriction is too strong, but I believe we have to as such issues are otherwise going to manifest in many unpredictable ways. |
But isn't this just a temporary issue until vert.x 4.1? |
'an IO thread' is a kinda useless restriction, it needs to be 'the IO thread' that belongs to the current connection. It also needs to be the same as the HTTP and messaging IO threads. AFAIK I think Vert.x will mostly handle this, so it's only situations like this where it is an issue. In the HTTP stuff I do with Vert.x you can cast to the underlying connection impl to get the event loop, and I thought Vert.x 4 had a more generic Context concept that would help with this, although it is late and I won't be able to look into the details until tomorrow. |
Also it is a pretty serious issue, as even a tiny bit of load break HR and then inserts just stop working, and I am not 100% sure that my analysis is correct because I did not spend a lot of time on it. It is definitely hanging in the ID generator under even tiny amounts of load, but I have not verified that my thread theory is correct, it could be something else. |
I suspect not, but we can see how it behaves when we have 4.1. @DavideD is going to work on more extensive integration tests, this needs to be bullet proof. |
I might be being dense, but i'm still not convinced that this issue is about threads. I don't think a The thing that is especially confusing me is that in principle the reactive session is keeping a hard reference to the connection it's using. We're not relying on the thread to obtain our connection. So that's why I keep asking: isn't this a problem with the Vert.x pre-4.1 approach to connection pooling? I mean who, precisely, is making the assumption that thread=connection here? Quarkus? HR? vert.x? I think it's vert.x to blame, not HR. But i'm not sure. |
I just hit this in the debugger again and I don't think this is about threads as such, although if the model is that you can only operate on the IO thread for a given connection I do think this has the potential to cause problems. I see the following stack trace being the problem when it hits that IllegalStateException:
|
So it his this because the connection is closed, and the closed connection is still awaiting an ID. Because the calling code does not guard again this throwing (rather than returning a failed CompletionStage) the whole system locks up. |
The EM is closed because the RESTEasy Reactive request is considered completed, before the ID has been generated, which seems very odd. |
Ok, the bug is actually ridiculously simple. When the queue is processed 'session' refers to the session that just generated the ID, not the session that is asking for the ID. Because we call result.complete() earlier if this finishes off the request it can close the EM, so you are calling generate with a closed session. |
Haha. OMG. Oooops. I really did stare at that code for a long time without noticing that. How embarrassing! |
I have opened hibernate/hibernate-reactive#707. |
If it can be of any consolation, you are not the only one who looked at that code. |
Not sure if it's the same issue or not, but I had a different situation with the same error. If I have the following:
where
|
That is likely the same error |
Actually your code does not seem to be inserting into the DB, so it is likely a different error. |
I think you might need to wrap your code in a transaction. |
@stuartwdouglas Right, it's purely reading from two different entities. I changed the |
I think this issue my be related to this one: #14709 (comment) |
Shouldn't we close this issue? The fix was released with 2.0.0.Alpha1: #16588 |
Describe the bug
Hibernate reactive hangs when inserting, the sequence generator stops returning numbers and requests just hang.
To Reproduce
Take the hibernate reactive quickstart, and remove the unique constraint for the name field (it will still reproduce even with this, this just simplifies things).
Then run
wrk -s post.lua http://localhost:8080/fruits
where post.lua is:
Everything will hang very quickly, locally I set breapoints in BlockingIdentifierGenerator and I see the following output:
Basically the 3rd time generate() is called and attempts to allocate more numbers the call just never returns.
The text was updated successfully, but these errors were encountered: