JDBC connection does not reset #18685
Comments
Are you getting a new connection from the pool each time or are you holding onto one for the life of the app?
Maybe you could try adding
Hi again! @gsmet I tried your fix, but there is no difference.
Also, I see something similar in #15025. I am reproducing the issue in
Hi, many connections are suffering the exact same flow, very often at startup but also when we set max-life-time. This generally goes with corrupted metrics: as soon as it happens, the active count drops below 0. I've been unable to implement a reproducer. I'm in touch with Luis in private. This looks very specific to cloud environments, and certainly to the cloud DB server pattern/product.
I can confirm I am also facing this issue with Quarkus 2.0.3.Final in a Kubernetes environment with HPA scaling pods up/down based on load.
@jonsalvas out of curiosity, do you know the details about the DB setup and any possible extra pooling mechanism?
Any updates on this issue?
From the Reddit thread highlighting the issue: I was able to reproduce the issue, but it doesn't seem very serious to me.
My application configuration doesn't have anything special regarding database connection validation or maximum age. In my opinion, the only scenario where this is an issue is when database connections are frequently interrupted.
@maxandersen
I'm pretty sure there is something really tricky and specific here. I managed to reduce this "pseudo" leak; I'll paste my settings tomorrow.
@apat59 thanks for the info - you mention Luis; Luis who? (there are a handful of those around :)
Also @apat59, what is the latest version of Quarkus you tested this against? 1.13.7 is quite old by now :)
So I tried to reproduce this locally and thus far it works as I expect. When the DB is down you get errors; when the DB returns, the pool eventually recovers and everything works again. See https://github.com/maxandersen/quarkus18685reproduce for a reproducer. This is on Quarkus v2.7.1.Final.
Can you try with agroal/agroal#48? With this patch I see things resolving immediately after the DB is back up.
@apat59 sounds great. If you can try with Stuart's patch, that would be great, but if not, look at what error codes you get and whether any of them start with 08, as listed in https://www.postgresql.org/docs/13/errcodes-appendix.html. With those, at least, you should see faster recovery.
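For anyone checking their own stack traces against that list, here is a minimal sketch of the test being described: does a SQLException (or any SQLException in its cause chain) carry an SQLSTATE in class 08, the connection exception class? The class and method names are made up for illustration and this is not Agroal's actual implementation; only java.sql.SQLException.getSQLState() is a real API.

```java
import java.sql.SQLException;

// Hypothetical helper, not part of Agroal or Quarkus.
public class ConnectionErrorCheck {

    // Returns true if the exception, or any SQLException among its causes,
    // has an SQLSTATE starting with "08" (connection exception class).
    public static boolean isConnectionException(SQLException e) {
        for (Throwable t = e; t != null; t = t.getCause()) {
            if (t instanceof SQLException) {
                String state = ((SQLException) t).getSQLState();
                if (state != null && state.startsWith("08")) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 08006 is connection_failure and 23505 is unique_violation
        // in the PostgreSQL error code appendix.
        SQLException lost = new SQLException("connection failure", "08006");
        SQLException dup = new SQLException("unique violation", "23505");
        System.out.println(isConnectionException(lost));
        System.out.println(isConnectionException(dup));
    }
}
```

Logging the SQLSTATE next to the stack trace when the failure happens should be enough to tell whether the errors fall in this class.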
@stuartwdouglas @maxandersen from what I can understand, the problem is not just about error codes but about how the Amazon Aurora service does the DB failover. I don't know the internals, but it looks like there is a "primary" that can fail, and when it does, Agroal flushes the connections (based on the SQL error code) and new connections are created to a read-only "replica". Agroal expects that the "replica" instance will eventually become the "primary" at some point in the future (once the recovery procedure is complete), and because of that the connections to the replica are not flushed. Aurora seems to keep the "replica" always in read-only mode and bring the "primary" back up, which is not the expected behavior. From looking around in the Amazon product documentation, they seem to acknowledge their failover problem, because they suggest a couple of 'solutions' to deal with it. We can think of ways to handle this failover on our side, but a first step (that can be useful in other scenarios as well) would be to somehow expose the
Yeah, this sounds like a separate problem, and I suggest you open a separate issue for it. Looks like Agroal might want to learn some tricks to deal with Amazon Aurora's special failover logic.
Hi,
Is this still an issue?
Closing for lack of feedback.
Describe the bug
1. ManagedExecutor: Somewhere down the line, I do some DB selects / inserts.
2. Reset the database. Reproduced in the AWS cloud, but also locally, with a Docker Postgres DB container. At the end of this step, the database is up and ready, accepting new connections.
3. Got a stack trace.
Expected behavior
Database connection is restored after DB is up again.
Actual behavior
Infinite failure. It recovers only after the app is restarted.
Output of java -version
openjdk 11.0.11 2021-04-20
Quarkus version or git rev
1.13.7
Build tool (ie. output of mvnw --version or gradlew --version)
maven