Hibernate Reactive: NamedQuery fails after hot reload #15164

markusdlugi · 2021-02-18T10:17:36Z

Describe the bug
Same issue as already described by @cescoffier in #13355, but I was actually able to reproduce it reliably 😄

After the application is reloaded in dev mode due to a code change, named queries do not work anymore. You don't even need to change something in the queried entity itself, just any change that leads to an entire reload of the application will be sufficient.

Expected behavior
NamedQuery should work just as before hot reload.

Actual behavior
Executing a NamedQuery leads to the following exception:

2021-02-18 10:46:11,037 ERROR [org.jbo.res.rea.ser.cor.ExceptionMapping] (vert.x-eventloop-thread-11) Request failed : java.lang.IllegalArgumentException: Type specified for TypedQuery [com.example.Fruit] is incompatible with query return type [class com.example.Fruit]
	at org.hibernate.internal.AbstractSharedSessionContract.resultClassChecking(AbstractSharedSessionContract.java:863)
	at org.hibernate.reactive.session.impl.ReactiveSessionImpl.createReactiveQuery(ReactiveSessionImpl.java:461)
	at org.hibernate.reactive.session.impl.ReactiveSessionImpl.buildReactiveQueryFromName(ReactiveSessionImpl.java:443)
	at org.hibernate.reactive.session.impl.ReactiveSessionImpl.createReactiveNamedQuery(ReactiveSessionImpl.java:430)
	at org.hibernate.reactive.mutiny.impl.MutinySessionImpl.createNamedQuery(MutinySessionImpl.java:222)
	at io.quarkus.hibernate.reactive.runtime.ReactiveSessionProducer_ProducerMethod_createMutinySession_1321d110ee9e92bda147899150401e0a136779c7_ClientProxy.createNamedQuery(ReactiveSessionProducer_ProducerMethod_createMutinySession_1321d110ee9e92bda147899150401e0a136779c7_ClientProxy.zig:997)
	at com.example.FruitDao.get(FruitDao.java:20)
	at com.example.FruitDao_ClientProxy.get(FruitDao_ClientProxy.zig:217)
	at com.example.FruitResource.get(FruitResource.java:32)
	at com.example.FruitResource_ClientProxy.get(FruitResource_ClientProxy.zig:217)
	at com.example.FruitResource$quarkusrestinvoker$get_2420412d5b210b3632823e74ede45d605c6ff069.invoke(FruitResource$quarkusrestinvoker$get_2420412d5b210b3632823e74ede45d605c6ff069.zig:33)
	at org.jboss.resteasy.reactive.server.handlers.InvocationHandler.handle(InvocationHandler.java:29)
	at org.jboss.resteasy.reactive.server.handlers.InvocationHandler.handle(InvocationHandler.java:7)
	at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:117)
	at org.jboss.resteasy.reactive.server.handlers.RestInitialHandler.beginProcessing(RestInitialHandler.java:47)
	at org.jboss.resteasy.reactive.server.vertx.ResteasyReactiveVertxHandler.handle(ResteasyReactiveVertxHandler.java:17)
	at org.jboss.resteasy.reactive.server.vertx.ResteasyReactiveVertxHandler.handle(ResteasyReactiveVertxHandler.java:7)
	at io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1038)
	at io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:137)
	at io.vertx.ext.web.impl.RoutingContextImpl.next(RoutingContextImpl.java:132)
	at io.quarkus.vertx.http.runtime.StaticResourcesRecorder.lambda$start$1(StaticResourcesRecorder.java:62)
	at io.vertx.ext.web.impl.RouteState.handleContext(RouteState.java:1038)
	at io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:101)
	at io.vertx.ext.web.impl.RoutingContextImpl.next(RoutingContextImpl.java:132)
	at io.vertx.ext.web.handler.impl.StaticHandlerImpl.lambda$sendStatic$1(StaticHandlerImpl.java:206)
	at io.vertx.core.impl.ContextImpl.lambda$null$0(ContextImpl.java:327)
	at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:366)
	at io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:829)

To Reproduce

Link to a small reproducer (preferably a Maven project if the issue is not Gradle-specific).

https://github.com/markusdlugi/hibernate-reactive-hot-swap

Steps to reproduce the behavior:

Start PostgreSQL database using infrastructure/docker-compose.yml
Start application in dev mode using mvn quarkus:dev
Execute GET http://localhost:8080/fruits - should still work with HTTP 200
Make a code change to trigger hot reload, e.g. by removing line 49 in com.example.FruitDao
Execute GET http://localhost:8080/fruits again - application will reload in the background and fail with exception

Configuration

# Add your application.properties here, if applicable.
quarkus.datasource.db-kind=postgresql
quarkus.datasource.username=test
quarkus.datasource.password=test
quarkus.datasource.reactive.url=vertx-reactive:postgresql://localhost:5432/hibernate-reactive-hot-swap
quarkus.hibernate-orm.database.generation=create

Screenshots
(If applicable, add screenshots to help explain your problem.)

Environment (please complete the following information):

Output of uname -a or ver: Microsoft Windows [Version 10.0.17763.1757]
Output of java -version: OpenJDK 64-Bit Server VM Corretto-11.0.10.9.1 (build 11.0.10+9-LTS, mixed mode)
GraalVM version (if different from Java): N/A
Quarkus version or git rev: 1.11.3.Final
Build tool (ie. output of mvnw --version or gradlew --version): Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)

Additional context
For some reason, the issue only appears if both of the following conditions are met:

The application has a dependency to io.quarkus:quarkus-smallrye-health
The folder META-INF/resources exists (can be empty)

If either of these is removed, the issue just disappears. No idea why.

I don't think that these are the only conditions where this issue appears, most likely this can also happen under different circumstances, but those were the conditions where it failed in our application.

The text was updated successfully, but these errors were encountered:

quarkus-bot · 2021-02-18T10:17:38Z

/cc @DavideD, @Sanne, @gavinking

geoand · 2021-02-18T12:59:27Z

~~Can you try this while having set quarkus.dev.instrumentation=false in application.properties or as system property when running mvn quarkus:dev please?~~

I tried your example and I can reproduce the issue when a structural change is made to a class (or some new class is added).
On the other hand, when Quarkus can use instrumentation to update changed classes (i.e. when no structural changes have been made), the problem does not manifest

markusdlugi · 2021-02-18T13:07:51Z

Just tested setting that in both application.properties and as system property. Unfortunately it didn't seem to have any effect, issue still exists.

geoand · 2021-02-18T13:10:07Z

So this does indeed look like a Hibernate (Reactive) issue and not strictly a Quarkus issue

markusdlugi · 2021-02-18T14:16:31Z

I think this might also be related to hibernate/hibernate-reactive#537, what do you think? @Sanne, @gavinking

gsmet · 2021-02-19T23:42:33Z

Yeah I suspect one of the class is still loaded with the previous classloader. That would explain why you need as structural change to trigger it as the new instrumentation mode doesn’t change the class loader.

markusdlugi · 2021-02-24T07:45:03Z

@gsmet, I checked in a heap dump and you are of course right - the problematic entity class is present twice, with each instance of the class being loaded with a different classloader. So seems like something in SmallRye Health leads to the old classloader leaking.

Now from my point of view, the question is: Would it make more sense to

Fix the classloader leak in SR Health, which would fix this particular issue but wouldn't help us if a different component also leads to a classloader leak, or
Should we adjust Hibernate to be able to handle entities even if they are loaded from different classloaders.

gsmet · 2021-03-02T22:41:25Z

@DavideD any chance you could have a look to this one?

DavideD · 2021-03-03T10:01:50Z

I will have a look

…

On Tue, Mar 2, 2021 at 10:41 PM Guillaume Smet ***@***.***> wrote: @DavideD <https://github.com/DavideD> any chance you could have a look to this one? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#15164 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAEIQ5ORHNV5K4SIR2H346DTBVSSLANCNFSM4X2AQFAA> .

Sanne · 2021-03-03T10:13:03Z

Well while it's true that we might be able to workaround things in Hibernate, if SmallRye Health is leaking classloaders that needs to be fixed at higher priority.

DavideD · 2021-03-03T11:20:13Z

When I try to run the reproducer, I have this error:

 java.util.NoSuchElementException: SRCFG00011: Could not expand value IFS+x in property BASH_FUNC__module_raw%%
        at io.smallrye.config.ExpressionConfigSourceInterceptor.lambda$getValue$0(ExpressionConfigSourceInterceptor.java:44)
        at io.smallrye.common.expression.ExpressionNode.emit(ExpressionNode.java:22)
        at io.smallrye.common.expression.CompositeNode.emit(CompositeNode.java:22)

markusdlugi · 2021-03-03T11:27:10Z

@DavideD That's interesting, haven't seen that one myself. Quick search brought up #15269. Maybe updating the reproducer project to 1.12.0.Final as suggested there might help?

DavideD · 2021-03-03T11:35:24Z

I've added the dependency to the a quickstart and I can recreate the exception

DavideD · 2021-03-03T15:45:24Z

I gave a look at this issue and I'm not sure exactly what we can do about it from the Hibernate Reactive side.

The code we are using is similar to Hibernate ORM and the cause of the problem seems to be this variable in EntityType:

private transient Class returnedClass;

In ORM this variable is lazily initialised and during the hot reload, it gets set to null. When reading it the second time for the check, Hibernate ORM needs to re-initialise it and therefore, everything works.

For some reasons, this doesn't happen when using Hibernate Reactive. The variable still reference the old type and therefore keeps track of the old classloader.

I suppose we could read the returnedClass property each time, but it seems something is missing somewhere else.
I have no idea where to look for it.

markusdlugi · 2021-03-04T11:36:09Z

@DavideD Thanks a lot for investigating!

I was under the assumption that SmallRye Health must be responsible for the leak because the exception isn't thrown when you remove the dependency to quarkus-smallrye-health. But from your debugging it sounds more like a Hibernate Reactive issue. Maybe you could verify if the behavior in Hibernate changes if you remove the dependency to SR Health? Or does the EntityType still contain a reference to the old class even when that dependency is not present?

DavideD · 2021-03-04T11:37:49Z

The error only happens if I add the dependency to the classpath.

DavideD · 2021-03-04T11:39:23Z

By the way, Hibernate Reactive is still using the same class EntityType available in ORM

DavideD · 2021-03-04T11:49:12Z

The error only happens if I add the dependency to the classpath.

I think I was wrong about this... let me check again

DavideD · 2021-03-04T11:59:00Z

Argh... it seems it happens even without the dependency.
Maybe someone else can double check this? It's possible to replicate the issue using a quickstart: https://github.com/quarkusio/quarkus-quickstarts/tree/master/hibernate-reactive-quickstart

Changing Fruit between two calls to the rest API.

DavideD · 2021-03-04T13:46:37Z

Still...not sure what Hibernate Reactive should do differently

markusdlugi · 2021-03-09T12:10:41Z

Thanks once again for checking.

I tried the reproducer project again and it seems like you are right, removing the dependency to quarkus-smallrye-health does not change the behavior. That's strange because I was fairly sure it did back when I created the issue, but seems like I was wrong. Sorry about that.

What still does affect the behavior is removing the META-INF/resources folder - once you remove that, the issue cannot be reproduced anymore, NamedQueries are working fine, no exceptions.

I also tried the quickstart you mentioned and can confirm that the issue is also reproducible there. Similarly, if you remove the META-INF/resources folder there, issue also disappears for the quickstart.

I had another look at some heap dumps, and actually noticed that the Fruit entity being present twice is true both in the failing case and in the working case (so without META-INF/resources). So the classloader leak does not seem to be the main root cause of the exception - Quarkus is leaking the classloaders in any case (mainly due to the contextClassLoaders of the Vert.x threads, as it seems). But in one case, Hibernate is able to handle that, while in the other case, for some reason, it isn't. I still haven't found a reason why Hibernate behaves differently in the two cases.

markusdlugi · 2021-03-09T14:31:59Z

I did some more debugging.

I found out that Hibernate will always try to look up the entity class at EntityType.determineAssociatedEntityClass() because in fact the returnedClass attribute which @DavideD mentioned was always null after the hot reload, because a new instance of ManyToOneType (subclass of EntityType) is correctly created during hot reload.

When trying to look up the class, Hibernate will use Thread.currentThread().getContextClassLoader() (implemented in ReflectHelper.classForName()). It just so happens that for some reason, the current thread uses the same contextClassLoader in the good case as is used inside the HQLQueryPlan as return type of the NamedQuery. In the bad case (when META-INF/resources is present), the class loaders differ and everything goes to hell.

I still have no clue why it just magically works when META-INF/resources is not present and how that influences the threads and class loaders, but I don't think it matters because that doesn't seem to be something that we can rely on. I think we have to approach this issue from another angle.

During debugging, I noticed that the Vert.x eventloop threads are not recreated in dev mode when a hot reload (with new classloader) is performed. Therefore, the old threads are still used which still reference the old contextClassLoader. Maybe we should instead ensure that all old threads are stopped and new ones are created. They would then use the new class loader and we most likely wouldn't run into this issue in the first place because we start from a completely clean plate.

TL;DR: Instead of trying to understand what exactly differs in good case and bad case, change Quarkus dev mode to create new eventloop threads when a structural change (with new classloader) is performed.

@geoand, @gsmet, what do you think?

geoand · 2021-03-09T15:55:43Z

TL;DR: Instead of trying to understand what exactly differs in good case and bad case, change Quarkus dev mode to create new eventloop threads when a structural change (with new classloader) is performed.

@geoand, @gsmet, what do you think?

That will likely cause massive problems.
Although I haven't looked at any of the specifics of this case, maybe something like https://github.com/quarkusio/quarkus/blob/master/extensions/vertx-http/runtime/src/main/java/io/quarkus/vertx/http/runtime/VertxHttpRecorder.java#L345
or https://github.com/quarkusio/quarkus/blob/master/extensions/scheduler/runtime/src/main/java/io/quarkus/scheduler/runtime/devconsole/SchedulerDevConsoleRecorder.java#L25
would help?
What those pieces of code do is grab the initial TCCL and reset it when they actually execute the Vert.x handler

markusdlugi · 2021-03-16T08:11:12Z

Okay, so if we can't recreate the threads in dev mode during hot reload, what about at least resetting the TCCL of all Vert.x threads on reload? I don't think the approach you mentioned of temporarily setting the TCCL will help us here, because the TCCL is wrong to begin with after hot reload, since it is still pointing to an old classloader that shouldn't be used anymore.

Can you maybe point me to the part of the code where the new classloader is created during reload? Then maybe I could try and test a possible solution.

geoand · 2021-03-16T08:24:00Z

Okay, so if we can't recreate the threads in dev mode during hot reload, what about at least resetting the TCCL of all Vert.x threads on reload? I don't think the approach you mentioned of temporarily setting the TCCL will help us here, because the TCCL is wrong to begin with after hot reload, since it is still pointing to an old classloader that shouldn't be used anymore.

It's not wrong the first time is it?

markusdlugi · 2021-03-16T09:12:42Z

No, at first application start it's correct of course. But after the reload the TCCL still seems to point to the old classloaders, which leads to these issues with Hibernate (and potentially other issues?).

geoand · 2021-03-16T10:40:55Z

But that is exactly the issue that the code above works around, isn't it?

markusdlugi · 2021-03-16T11:11:05Z

Yes, but I'd argue that in the case of this issue, you can't easily set the TCCL just for that use case and then reset it afterwards as it is done in your examples, because

you would have to intercept the entire application code this way, since the Hibernate call doing the class lookup is inside the normal application context
you would have to intercept all kinds of entrypoints to the application code (REST, Scheduler, Messaging, ...)
you most likely won't be able to get a reference to the correct TCCL at that point in time.

Therefore I think it might be easier to actually change the TCCL of all the threads in dev mode when the application is reloaded.

I'm of course not an expert on the architecture of Quarkus, so that's just what I think would be the most sensible option from an outsider's perspective. I would be happy to be proven wrong 😄 Just trying to move this issue forward.

geoand · 2021-03-16T11:23:34Z

Therefore I think it might be easier to actually change the TCCL of all the threads in dev mode when the application is reloaded.

That's understandable, but the potential impact of such a change is pretty big. I'd like to avoid that.

geoand · 2021-03-16T22:27:10Z

Also, I wanted to clarify that we can't fix the issue in Quarkus itself.
The problem stems from the fact that Vert.x hangs on to original TCCL of the thread and resets it's quite often.
This leads the us having to implement ugly hacks like #15779

markusdlugi · 2021-03-17T09:32:18Z

@geoand Wow, that's funny. You actually just fixed this issue with #15779 😄

I had a look at your changes in that PR and got suspicious because here the behavior was similar, because it only failed if some static resource was present in META-INF/resources, but worked otherwise. And since you explicitly set the TCCL in your PR in case there are static resources, I figured this might actually have an impact on this issue as well. And sure enough, after making those same changes locally and rerunning the reproducer, the NamedQuery now works reliably even after a hot reload! 👍

I guess that wasn't your intention (or maybe it was? 😉), but thanks a lot!

geoand · 2021-03-17T09:36:47Z

I'm glad it worked!

To be honest, I was kinda hoping it would fix this issue as well, but I had no idea :)

markusdlugi added the kind/bug Something isn't working label Feb 18, 2021

quarkus-bot bot added area/hibernate-reactive Hibernate Reactive area/persistence OBSOLETE, DO NOT USE env/windows Impacts Windows machines labels Feb 18, 2021

geoand mentioned this issue Mar 17, 2021

Make sure the proper TCCL isn't lost in dev mode when static resource exist #15779

Merged

geoand closed this as completed in #15779 Mar 17, 2021

quarkus-bot bot added this to the 1.13 - main milestone Mar 17, 2021

markusdlugi mentioned this issue Jul 1, 2021

Hibernate Reactive withSession() breaks named queries & context after hot reload #18301

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hibernate Reactive: NamedQuery fails after hot reload #15164

Hibernate Reactive: NamedQuery fails after hot reload #15164

markusdlugi commented Feb 18, 2021

quarkus-bot bot commented Feb 18, 2021

geoand commented Feb 18, 2021 •

edited

Loading

markusdlugi commented Feb 18, 2021

geoand commented Feb 18, 2021 •

edited

Loading

markusdlugi commented Feb 18, 2021

gsmet commented Feb 19, 2021

markusdlugi commented Feb 24, 2021

gsmet commented Mar 2, 2021

DavideD commented Mar 3, 2021 via email

Sanne commented Mar 3, 2021

DavideD commented Mar 3, 2021 •

edited

Loading

markusdlugi commented Mar 3, 2021

DavideD commented Mar 3, 2021

DavideD commented Mar 3, 2021 •

edited

Loading

markusdlugi commented Mar 4, 2021

DavideD commented Mar 4, 2021

DavideD commented Mar 4, 2021

DavideD commented Mar 4, 2021

DavideD commented Mar 4, 2021

DavideD commented Mar 4, 2021

markusdlugi commented Mar 9, 2021

markusdlugi commented Mar 9, 2021

geoand commented Mar 9, 2021 •

edited

Loading

markusdlugi commented Mar 16, 2021

geoand commented Mar 16, 2021

markusdlugi commented Mar 16, 2021

geoand commented Mar 16, 2021

markusdlugi commented Mar 16, 2021

geoand commented Mar 16, 2021

geoand commented Mar 16, 2021

markusdlugi commented Mar 17, 2021

geoand commented Mar 17, 2021

Hibernate Reactive: NamedQuery fails after hot reload #15164

Hibernate Reactive: NamedQuery fails after hot reload #15164

Comments

markusdlugi commented Feb 18, 2021

quarkus-bot bot commented Feb 18, 2021

geoand commented Feb 18, 2021 • edited Loading

markusdlugi commented Feb 18, 2021

geoand commented Feb 18, 2021 • edited Loading

markusdlugi commented Feb 18, 2021

gsmet commented Feb 19, 2021

markusdlugi commented Feb 24, 2021

gsmet commented Mar 2, 2021

DavideD commented Mar 3, 2021 via email

Sanne commented Mar 3, 2021

DavideD commented Mar 3, 2021 • edited Loading

markusdlugi commented Mar 3, 2021

DavideD commented Mar 3, 2021

DavideD commented Mar 3, 2021 • edited Loading

markusdlugi commented Mar 4, 2021

DavideD commented Mar 4, 2021

DavideD commented Mar 4, 2021

DavideD commented Mar 4, 2021

DavideD commented Mar 4, 2021

DavideD commented Mar 4, 2021

markusdlugi commented Mar 9, 2021

markusdlugi commented Mar 9, 2021

geoand commented Mar 9, 2021 • edited Loading

markusdlugi commented Mar 16, 2021

geoand commented Mar 16, 2021

markusdlugi commented Mar 16, 2021

geoand commented Mar 16, 2021

markusdlugi commented Mar 16, 2021

geoand commented Mar 16, 2021

geoand commented Mar 16, 2021

markusdlugi commented Mar 17, 2021

geoand commented Mar 17, 2021

geoand commented Feb 18, 2021 •

edited

Loading

geoand commented Feb 18, 2021 •

edited

Loading

DavideD commented Mar 3, 2021 •

edited

Loading

DavideD commented Mar 3, 2021 •

edited

Loading

geoand commented Mar 9, 2021 •

edited

Loading