GRPC / Panache Reactive Combination causes issues after a couple requests #14820
Comments
@Sanne this seems related to session/thread discussions, no? Can you route this to someone who can take a look? |
Sure. @FroMage , could you have a look? :) Jokes aside, @0SkillAllLuck : currently you can only use Panache Reactive if either of these are true: a. you're running in a reactive context I apologize for the confusion, we need to work on detecting this and throwing a better error. |
@Sanne I just tried to wrap my call with Panache.withTransaction( ... ). Either I am not using it correctly or it still doesn't work. I updated the reproducer repo with the wrapping, and it still shows the same behaviour. I also thought the Mutiny gRPC services would be reactive, or aren't they? |
It looks like the issue is gone with version 1.12.0 of quarkus. |
excellent, thanks @0SkillAllLuck . Yes this specific issue was suspiciously similar to some others we fixed, I probably should have marked them as duplicates. |
@Sanne I think it is still an issue. I had .runSubscriptionOn(Infrastructure.getDefaultExecutor()) in my gRPC services. When removing them the issue reappeared. Also it seems like if you include smallrye-health it will break after some time as well. |
Also I noticed that sometimes the error message now says: "io.vertx.pgclient.PgException: sorry, too many clients already". Are the clients not closed properly? |
That's worrying, can you provide a reproducer? |
While building the reproducer I noticed that it also happens when using hibernate-reactive without panache so I created another reproducer with only hibernate. Postgres used: Panache Reproducer For both reproducers I used grpcui and just did 4 requests to the grpc service. The 5th request will hang forever and cause the issue. |
I got "sorry, too many clients already" yesterday, and it was caused by me having too many IO threads talking to the DB. I could turn it down using Perhaps that's just what this is about? |
When setting |
Also when using hibernate-panache without the reactive version it works well when wrapping it like this:
|
@FroMage, @Sanne I figured out a way around this issue that potentially helps to find the cause. When using hibernate-reactive the "normal" way (without panache) and using all tricks like withTransaction and runSubscriptionOn it still doesn't work in gRPC. But when injecting a SessionFactory instead of a Session and then manually closing the session after the call it works fine. So I guess that in the context of gRPC the session isn't closed correctly. |
Also when using the non-reactive version of panache and enabling metrics you can see that the sessions stay in the active state forever. |
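The effect of the workaround described above can be illustrated with a small plain-Java sketch. It does not use Hibernate or Quarkus: `FakeSessionFactory` and `FakeSession` are hypothetical stand-ins invented for this example, and the count of open sessions is only a rough analogue of database connections held by the application. It shows why a handler that never closes its session accumulates open sessions with every request, while one that closes the session per call does not.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins; these are NOT Hibernate or Quarkus classes.
class SessionLeakSketch {

    // Hands out sessions and tracks how many are still open.
    static class FakeSessionFactory {
        private final AtomicInteger open = new AtomicInteger();

        FakeSession openSession() {
            open.incrementAndGet();
            return new FakeSession(this);
        }

        int openCount() {
            return open.get();
        }
    }

    static class FakeSession implements AutoCloseable {
        private final FakeSessionFactory owner;
        private boolean closed;

        FakeSession(FakeSessionFactory owner) {
            this.owner = owner;
        }

        String find(long id) {
            if (closed) {
                throw new IllegalStateException("session is closed");
            }
            return "entity-" + id;
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                owner.open.decrementAndGet();
            }
        }
    }

    // Leaky handler: the session is opened per request and never closed.
    static String handleWithoutClose(FakeSessionFactory factory, long id) {
        FakeSession session = factory.openSession();
        return session.find(id);
    }

    // Handler that closes the session once the call is done.
    static String handleWithClose(FakeSessionFactory factory, long id) {
        try (FakeSession session = factory.openSession()) {
            return session.find(id);
        }
    }

    public static void main(String[] args) {
        FakeSessionFactory leaky = new FakeSessionFactory();
        FakeSessionFactory tidy = new FakeSessionFactory();
        for (int i = 0; i < 4; i++) {
            handleWithoutClose(leaky, i);
            handleWithClose(tidy, i);
        }
        System.out.println("open without close: " + leaky.openCount()); // 4
        System.out.println("open with close: " + tidy.openCount());     // 0
    }
}
```

This is a synchronous simplification. In a reactive handler the close has to happen after the asynchronous result has been processed, not right after the call is issued.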
For anyone experiencing the same issue I have found a workaround by appending all grpc calls with:
|
@michalszynkiewicz can you have a look? It may be out of date. |
@cescoffier I'm able to reproduce it on 2.0.0.Alpha1, will take a deeper look |
In my tests, in the hibernate version, it is enough to close the session, e.g.:
@Sanne @cescoffier @FroMage what should close the session? |
@michalszynkiewicz I have tried a similar approach with Panache.getSession().close(), which seems to work fine under light load but caused issues under higher load. I will try to collect the concrete exceptions and errors that were caused. Are Panache.getSession() and JpaOperations.getSession() the same? |
indeed, three terminals each making 4000 calls with grpcurl result in some
And it's still the case with what we have in the |
It seems like the higher the load, the higher the percentage of requests that result in that error. |
@michalszynkiewicz thanks for helping :) I suspect this might be a duplicate of hibernate/hibernate-reactive#707 , could you build Hibernate Reactive from |
@michalszynkiewicz sorry I was wrong, that fix was already integrated in Quarkus main: #16588 - |
Also I had the same issue without the reactive version of hibernate, so it shouldn't be a hibernate-reactive issue. |
I see the session is created with That wouldn't explain why sometimes a closed session is used when we add manual session closing... |
Keep in mind that the Session can't be closed yet after the reactive method is invoked, as it needs to process the IO results when the actual data to process is returned. It should be closed automatically at the end of such event. @mkouba had a fix for something that might be related, but AFAIR he mentioned it wouldn't cover all use cases: |
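The timing constraint described above can be sketched in plain Java. This is an illustration under stated assumptions, not Hibernate Reactive or Mutiny code: `CompletableFuture` stands in for the reactive type, and `FakeSession` is a hypothetical class whose `mapResult` method plays the role of processing IO results that arrive later. Closing right after the asynchronous call is invoked fails once the result arrives, whereas closing in a completion callback does not.

```java
import java.util.concurrent.CompletableFuture;

// Plain-Java illustration; FakeSession is hypothetical, not a Hibernate class.
class CloseAfterCompletionSketch {

    static class FakeSession {
        private volatile boolean closed;

        // Simulates processing a row once the IO result has arrived;
        // this needs the session to still be open.
        String mapResult(String row) {
            if (closed) {
                throw new IllegalStateException("session is closed");
            }
            return row.toUpperCase();
        }

        void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    // Too early: the session is closed as soon as the async call is set up,
    // before the IO result has been processed.
    static CompletableFuture<String> closeTooEarly(FakeSession session,
                                                   CompletableFuture<String> io) {
        CompletableFuture<String> result = io.thenApply(session::mapResult);
        session.close();
        return result;
    }

    // Close only once the result (or a failure) has been produced.
    static CompletableFuture<String> closeOnCompletion(FakeSession session,
                                                       CompletableFuture<String> io) {
        return io.thenApply(session::mapResult)
                 .whenComplete((value, error) -> session.close());
    }

    public static void main(String[] args) {
        CompletableFuture<String> io1 = new CompletableFuture<>();
        CompletableFuture<String> early = closeTooEarly(new FakeSession(), io1);
        io1.complete("row"); // the "database" answers after the close
        System.out.println("closed too early, failed: " + early.isCompletedExceptionally());

        CompletableFuture<String> io2 = new CompletableFuture<>();
        FakeSession session = new FakeSession();
        CompletableFuture<String> late = closeOnCompletion(session, io2);
        io2.complete("row");
        System.out.println("closed on completion, result: " + late.join());
    }
}
```

In the second variant the session stays open until the result has been mapped and is closed on both success and failure, which is the behaviour the automatic clean-up is expected to provide.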
IIRC, the request context clean-up wasn't working properly for this use case. @0SkillAllLuck , if you'd like to test if it works for you, you can use this branch: https://github.com/michalszynkiewicz/quarkus/tree/grpc-request-context-cleanup |
@michalszynkiewicz Maybe that is an error on my side, but now it works flawlessly until I get the first error, for example a null constraint violation. From that point on no request will work; instead that first error is repeated over and over again. I used the following order:
Am I missing something obvious ? |
@0SkillAllLuck I haven't seen this. I do see problems in dev mode: nothing works after a dev mode reload. I'll update the PR when I get to the bottom of it. |
@0SkillAllLuck I tried to reproduce the behavior you're describing with my newest code and I couldn't. Could you check the newest changes in the branch I linked? Thanks a lot for testing the previous version :) If it still doesn't work for you, I'd be grateful for a small reproducer (an update of one of the existing ones?) |
@michalszynkiewicz I updated to the latest changes in your branch. Now the error is gone but changes made inside Panache.withTransaction are not persisted. I need to manually call Panache.getSession().flush() |
@0SkillAllLuck I don't think you need |
@michalszynkiewicz I've removed all |
Great, thanks a lot for checking. I fixed a few other issues in my PR, I hope it will get into the next release of 2.0.0 |
I'm wondering if the fixes for it are not worth backporting to 1.11. What do you guys think? |
Describe the bug
When invoking Panache Entities via a GRPC Service in a reactive way, the first couple invocations will work but after that all that happens is a "java.lang.IllegalStateException: session is currently connecting to database"
Expected behavior
The Panache invocations work until the end of the universe.
Actual behavior
After a couple of invocations only "java.lang.IllegalStateException: session is currently connecting to database" is produced.
HealthChecks will also fail if present in the project.
To Reproduce
Reproducer: 0SkillAllLuck/code-with-quarkus
Environment (please complete the following information):
java -version: openjdk version "11.0.9" 2020-10-20
mvnw --version or gradlew --version: Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)