"Persistence units" sometimes hangs/is very slow to load in dev ui for trivial app #39930
Comments
/cc @cescoffier (devui), @phillip-kruger (devui) |
It could be that the websocket is not connected yet; check the bottom left corner. If your browser was open while the server was down, the reconnect timeout kicks in, and then it can take a while to reconnect. I am not sure how to fix this. We want to keep the auto-reconnect (that is nice for hot-reload), but maybe we should stop trying after a certain number of attempts or amount of time and just show a pop-up that will retry? w.d.y.t? |
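To make the "stop trying after a certain amount" idea concrete, here is a minimal sketch of a bounded retry policy with exponential backoff. It is written in Java purely for illustration; the Dev UI client runs in the browser, and `BoundedReconnect`, its parameters, and the `tryConnect` callback are hypothetical names, not existing Dev UI code.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: retries a connection attempt a bounded number of times. */
final class BoundedReconnect {
    private final int maxAttempts;
    private final Duration initialDelay;
    private final Duration maxDelay;

    BoundedReconnect(int maxAttempts, Duration initialDelay, Duration maxDelay) {
        this.maxAttempts = maxAttempts;
        this.initialDelay = initialDelay;
        this.maxDelay = maxDelay;
    }

    /** Delay after the given failed attempt (1-based): doubles each time, capped at maxDelay. */
    Duration delayFor(int attempt) {
        long millis = initialDelay.toMillis();
        for (int i = 1; i < attempt && millis < maxDelay.toMillis(); i++) {
            millis *= 2;
        }
        return Duration.ofMillis(Math.min(millis, maxDelay.toMillis()));
    }

    /** Returns true once tryConnect succeeds, false after maxAttempts failures or an interrupt. */
    boolean run(BooleanSupplier tryConnect) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (tryConnect.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayFor(attempt).toMillis());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false; // the caller decides what to do next, e.g. offer a manual retry
    }
}
```

A caller would treat a `false` result as the point where automatic reconnection gives up and a manual retry is offered instead.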
I think it probably is related to a restart in some way; the dev ui is working, but then I add the I'd be cautious about displaying a pop-up, though. This behaviour is in the Should Not Happen category, and if we display a pop-up when it happens, it might just make the problem more obvious, without actually helping much. A pop-up would be extra-bad if it happens as often for others as it seems to happen to me; the dev ui would just be flashing up pop-ups all the time, and looking pretty dysfunctional. Whereas probably some of the time, users might not even notice, or might think it's just them. I'd be more inclined to do what we can to pretend it didn't happen and do a sneaky recovery. :) |
I'll have a look and see if I can recreate it |
I did not manage to recreate the issue you described. If you get it again, could you please check whether there are any errors in the browser's console log? I had a look at the code and everything looks fine there. The loading indicator is basically displayed until the persistence units are loaded from the json-rpc backend (that should be fairly instant). So what I suggested in my previous response is not correct; this seems to be an issue specific to this extension. When I shut down the server and then navigate to the page I can see the loading indicator, but as soon as the server is back up, the page displays, even without refreshing. So to fix this we need to be able to recreate it so that we can look in the console for errors. Please post if you have something or have a way to reliably recreate the issue. Thanks :) |
Understood about no quick fix. Intermittent issues are the worst! I often see the problem when I'm doing a live demo (see "audience-driven testing" in the reproducer steps) ... and in those cases, it's not ideal to pause and divert to gathering diagnostics. :) |
I was actually facing the same issue: depending on the project, sometimes I can't access the persistence unit info. Sometimes the bug appears 100% of the time for a whole programming session, which is very annoying (I depend so much on the DDL produced by this tool). We clearly saw that the "io.quarkus.quarkus-hibernate-orm.getInfo" json-rpc call is never answered by the server. The problem actually comes from ReactiveImprovedExtractionContextImp.getQueryResultSet and its join call, which never returns. The bug also reveals (as a secondary effect) a deadlock on HibernateOrmDevInfo$PersistenceUnit: this class has 3 methods synchronized on the instance, each of which calls an external supplier for the DDL generation. In this class:
we can, at least, replace the 3 getters by:
With this change, the getDDL calls no longer block each other if one of the suppliers is stuck or takes a long time to complete. It seems that the root problem comes from the pooled connections or from the reactive way of acquiring/releasing connections.
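The snippets posted with this comment did not survive in this copy of the thread, so the following is not the commenter's code. It is only a minimal sketch of the general technique described: giving each lazily computed value its own lock instead of synchronizing every getter on the owning instance. `Lazy`, `ScriptHolder`, and the getter names are hypothetical stand-ins.

```java
import java.util.function.Supplier;

/** Hypothetical helper: memoizes one supplier behind its own lock. */
final class Lazy<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private volatile T value;

    Lazy(Supplier<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public T get() {
        T result = value;
        if (result == null) {
            synchronized (this) { // lock belongs to this Lazy only, not to the owning object
                result = value;
                if (result == null) {
                    result = delegate.get(); // a null result is not cached and will be recomputed
                    value = result;
                }
            }
        }
        return result;
    }
}

/** Hypothetical stand-in for a holder exposing three independently generated scripts. */
final class ScriptHolder {
    private final Lazy<String> createDdl;
    private final Lazy<String> dropDdl;
    private final Lazy<String> updateDdl;

    ScriptHolder(Supplier<String> create, Supplier<String> drop, Supplier<String> update) {
        this.createDdl = new Lazy<>(create);
        this.dropDdl = new Lazy<>(drop);
        this.updateDdl = new Lazy<>(update);
    }

    // No method-level "synchronized": a stuck supplier only blocks callers of the same getter.
    String getCreateDDL() { return createDdl.get(); }
    String getDropDDL() { return dropDdl.get(); }
    String getUpdateDDL() { return updateDdl.get(); }
}
```

With this layout, a thread stuck inside the update supplier still holds a lock, but only the one guarding the update script, so the create and drop getters stay responsive.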
If this bug is related to the PgPool connections and not at all to the DevInfo (the queries are sent outside a transaction, and maybe the bug is confined to this logic only), it might explain some behavior that I see randomly in my applications. I hope this analysis will help. Workaround:
|
Thanks for your analysis @ITrium-Salah. I understand that you have potential improvements in mind; they make a lot of sense, so feel free to send a pull request! Do I understand correctly that the root of the issue is still unclear? I doubt the bug is in
Please do, I'll take any source of information in case I can't reproduce the problem. |
Yes, the root cause is still unknown to me; the only thing that makes me suspect a problem with the connection pool is the fact that I can make this bug systematic with For the improvement of devinfo, yes, it seems a shame to me to process these tasks sequentially, even if they are cached afterwards. I also wonder whether other parts of the code use instance or class synchronization. In my experience, I have come across very efficient concurrent code which ended up running sequentially because of a call to a function whose synchronization was too coarse (synchronized on this or on the class). Thread dump with the deadlock:
Thread dump without the deadlock:
|
Thanks for the info, I'll try to find time to have a closer look.
Yes: https://github.com/quarkusio/quarkus/blob/main/CONTRIBUTING.md |
@ITrium-Salah Unless I'm mistaken, your thread dump does not demonstrate a deadlock, but simply a connection starvation... ? Now of course, if generating a single update script involves multiple concurrent connections, in extreme cases it's possible that multiple concurrent calls to That being said, @holly-cummins is telling us this happens with a very simple demo, so I assume there's a single persistence unit, and most likely the connection pool size is the default of 20, which should be more than enough to avoid such a problem... Some notes in case this really is the problem:
Anyway, I had a look, and as expected, I can't reproduce the problem (unless I force a connection starvation) with Holly's reproducer, and as expected, guesswork is not leading anywhere:
If anyone encounters this again, please take a thread dump! Especially if using Hibernate ORM. Hibernate Reactive is nice and all, but thread dumps of reactive applications miss a lot of information :/ @holly-cummins, any talk in front of 400 people planned in the near future? :-) |
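For anyone who would rather capture this from inside the JVM than with an external tool such as `jstack` or `jcmd <pid> Thread.print`, here is a minimal sketch using the standard `java.lang.management` API. The `ThreadDumps` class and its method names are hypothetical; only the `ThreadMXBean` calls are JDK API. It also reports whether the JVM sees a genuine deadlock cycle, which helps tell a deadlock apart from threads that are merely waiting on a resource.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper around the JDK's ThreadMXBean. */
final class ThreadDumps {

    /** Returns a textual dump of all live threads, with lock details where the JVM supports them. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            out.append(info); // note: ThreadInfo.toString() truncates deep stack traces
        }
        return out.toString();
    }

    /** Number of threads the JVM reports as part of a deadlock cycle; 0 if there is none. */
    static int deadlockedThreadCount() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()          // monitors and java.util.concurrent locks
                : mx.findMonitorDeadlockedThreads();  // monitors only
        return ids == null ? 0 : ids.length;
    }
}
```

A count of 0 together with threads parked while waiting for a connection would point towards starvation rather than a deadlock in the strict sense.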
The deadlock is NOT the cause of the bug, just an observation I made while investigating it.
Thread 7 is blocking threads 11 and 16 and, because the bug causes an infinite wait, thread 7 will never release its lock. My remark on the PersistenceUnit code of dev mode was simply to point out a synchronization that is much too broad for what actually needs synchronizing.
Developers think that:
Point n°5 is of course false, and points 1, 2, 3 and 4 are only partially true.
Written this way, we see very clearly the cost of the synchronized keyword on methods, which I personally avoid using, especially in object-oriented programming where it is difficult to know what the parent or child classes do with the "this" object or "Object.class". That said, my proposed modification of HibernateOrmDevInfo$PersistenceUnit was simply meant to remove the excess synchronization, and the bug is not related to it (I will try to find time for the pull request). @yrodiere I don't understand why executing an SQL query would require a new connection if others are available. If no connection is available, shouldn't the thread block until the current requests complete and then resume execution? I don't know how Hibernate is implemented, but it would seem logical to me for it to work like this in order to handle a large number of requests. I have created a project https://github.com/ITrium-Salah/quarkus-orm-issue-39930 NB: My English is bad; I apologize in advance for this long post, but I want to be as clear as I can. |
It happens in situations where Hibernate ORM needs to run SQL outside of the current transaction, and cannot or has no way to suspend the current transaction. Then the only solution is to use a separate transaction. Don't ask me why it's necessary, but it happens -- probably because of quirks specific to some databases or JDBC drivers. See https://github.com/yrodiere/hibernate-reactive/blob/aecefba2ddb08ca5c2d609c9ad2a179f49d7d6c0/hibernate-reactive-core/src/main/java/org/hibernate/reactive/pool/ReactiveConnection.java#L78, https://github.com/hibernate/hibernate-orm/blob/81a3541d2624d546efac71c75d8e0335bc1b7ee3/hibernate-core/src/main/java/org/hibernate/resource/transaction/spi/IsolationDelegate.java#L14. Now I certainly agree that looks like a bug:
I've reported it as a bug: hibernate/hibernate-reactive#1909 But that's limited to Hibernate Reactive, so not the bug @holly-cummins was experiencing. I'll see if I can reproduce the same problem by changing your reproducer to use Hibernate ORM. |
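The starvation scenario discussed in this thread, where work that already holds a pooled connection needs a second one, can be illustrated without any database. The sketch below is only a model: a `Semaphore` stands in for the connection pool, and `ConnectionStarvationDemo` and `starvedWorkers` are hypothetical names. It does not reproduce Hibernate's actual behaviour.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of a pool where each worker needs two connections at once. */
final class ConnectionStarvationDemo {

    /**
     * Each worker takes one permit, waits until every worker holds one, then asks for a
     * second permit. Returns how many workers gave up waiting for the second permit.
     */
    static int starvedWorkers(int poolSize, int workers, long timeoutMillis) {
        if (workers > poolSize) {
            throw new IllegalArgumentException("this sketch assumes workers <= poolSize");
        }
        Semaphore pool = new Semaphore(poolSize);
        CountDownLatch allHoldOne = new CountDownLatch(workers);
        AtomicInteger starved = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            executor.execute(() -> {
                try {
                    pool.acquire(); // connection of the "current transaction"
                    try {
                        allHoldOne.countDown();
                        allHoldOne.await(); // force the worst case: everyone holds one
                        if (pool.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
                            pool.release(); // got the second connection, give it back
                        } else {
                            starved.incrementAndGet();
                        }
                    } finally {
                        pool.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        executor.shutdown();
        try {
            executor.awaitTermination(timeoutMillis + 5_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return starved.get();
    }
}
```

When the pool is exactly as large as the number of workers, at least one worker always gives up; with a single spare permit, the workers take turns and none is starved. Without a timeout on the second acquisition, the exhausted case would simply wait forever.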
I can't. With So, while the problem @ITrium-Salah is experiencing can be explained by hibernate/hibernate-reactive#1909, we still lack a reproducer for Holly's problem with Hibernate ORM, which is different. |
I just experienced this again ... with an audience. I tried it in my hotel room later, and the persistence units view worked fine. |
I think it was in the same session, I visited the dev services view and got a "You do not have any Dev Services running." message. I definitely did have a dev service running (I'd just been showing it off!). I'm not sure if the same connection issue could cause both glitches, or if my tech was just cursed today. |
@holly-cummins if this is due to dev-ui's websocket not connected, this PR should make this better in the future: #43841 |
Fingers crossed. I guess next time it happens I should try a different window or browser, too. There's no way I can do a debug sequence in front of an audience, but discreetly popping across to another window might be do-able. If that sorts it out, it gives us useful diagnostic information. |
Describe the bug
I've noticed that sometimes when walking through a simple Quarkus demo (the same one in https://quarkus.io/guides/getting-started-dev-services), when I try and show how the persistence units can be read off the dev ui, I just get a Loading... page on http://localhost:8080/q/dev-ui/io.quarkus.quarkus-hibernate-orm/persistence-units.
Expected behavior
We expect something like this, of course.
Actual behavior
http://localhost:8080/q/dev-ui/io.quarkus.quarkus-hibernate-orm/persistence-units just shows Loading ...
This is an intermittent issue (sorry!). When I tried to reproduce just now, I couldn't get Loading, but I could get a blank page if I changed my persistence units and then reloaded the persistence units url:
How to Reproduce?
As mentioned above, this is an intermittent issue, so I don't know exactly why it happens. It's about 50% of the time for me, but of course when I was raising this issue and trying to get a screencap, it was about 0%.
Output of uname -a or ver
No response
Output of java -version
No response
Quarkus version or git rev
3.9.2, seen on 3.9.0 and probably earlier
Build tool (ie. output of mvnw --version or gradlew --version)
No response
Additional information
No response