[Another] Unable to acquire JDBC connection #9123
Comments
Enable datasource metrics using
and
Then check the |
Ok so my Like this one
Do you think it is this Jackson error that takes my app down? |
The |
I tested :
active count + 2 this is my method :
|
Problem found. OK, so after multiple tests, I found where the problem is:
I don't know why, but when I call Blabla.list("service = ?1", id) it works but doesn't close the statement, and in my metrics the To resolve that, I just injected the entityManager into my service class and made the request with a DTO projection in my query. Detail of my process: maybe the statement is not closed because:
So, if someone can help me with that, or confirm that there is a problem with the extension, that would be nice. I'm available to talk more about this, and maybe help to solve this bug. EDIT: I confirm the problem occurs when I do the transformation of Blabla to BlablaDTO; it's the same whether I put it in BlablaResource or BlablaService. |
I have just bumped into this problem, and I understand it, but shouldn't it be fixed somehow? I was running stress tests on a Quarkus-based application with JMeter, with 300 threads and a loop count of 100, and it started to throw this exception around the halfway point of the test. It was really annoying that it was throwing exceptions, even though the error rate was really small, 0.01-0.02%. The application is a simple CRUD app with some e-commerce features (list items, add items to cart, start an order), and I also used Panache. I'm on 1.9.2.Final. |
@nandorholozsnyak if you can get us a reproducer, that would help. |
Hey @gsmet , I will create one as soon as I can. |
/cc @Sanne FYI, there might be some odd issue lurking here. |
@gsmet @Sanne Hi, I gave an example in my 8 May response. We can talk more about this error. Sincerely, |
yes this looks tricky, I should investigate. I need to finish some releases, but hope to switch to this on Wednesday. |
Unless @barreiro maybe has some time to look into this? Please let me know :) |
Hi, I've witnessed the "acquisition timeout" issue too with an as400 database, using the jt400 driver. Someone else also mentioned this problem on Stack Overflow: https://stackoverflow.com/questions/62307245/quarkus-as400-datasource-doesnt-use-all-available-connections I can provide details if needed. I'd be happy to help. I've been using Quarkus a lot since version 1.4.2 and really enjoy it. Thanks for the great work! |
I just recently had this issue and figured out the problems and cleaned up all cases of connections that were opened during transactions... it is not just forgetting to close connections or prepared statements or regular statements or results/row-sets... it is quite difficult if the code is complex... but end of the day... the "acquisition timeout" issue resolved when all the code was fixed. It would be nice if the Agroal connection pool had a mode that was picky... and one that was less picky... since I had doubts for a bit that I would find all the cases... then I would just want the transactional code to work as it does without connection pooling.... or without a picky connection pooler. It would also be nice to find all the issues upfront... like failing to close connections... but the ones that it failed to find were the cases of re-opening (or creating a new connection) in the middle of a transaction... a copied connection object... something like that... caused this "acquisition timeout" after 5 or 10 or so hits... not right away. I'm only posting to assure any readers... that the issue really is in the code using this connection pool, and once the bugs are found, the pooler works great. |
@typelean thank you so much for the feedback. Some "picky" modes were added in Agroal 1.10 to deal with the issues you describe. These are not yet exposed as configuration in Quarkus at the moment, but are expected to be in a future version. |
@tgallin Hi, were you able to solve this? Can you provide more details? Thanks. |
Hi, unfortunately, I never solved this issue. As @typelean suggested that it was a problem of connections not being properly closed in the code, I had a second look at our code but couldn't find places where I forgot to close the connections. I ended up disabling the pooling. I know this workaround is not great, but it is the only one that has worked for me so far. I haven't tried the latest Quarkus version though.
quarkus.hibernate-orm.dialect=org.hibernate.dialect.DB2400Dialect |
I did spend a lot of time fiddling with my custom JDBC library (JDBC 4.0 with Postgres) and a lot of testing. Now, it has been working reliably in production for months. Currently on Quarkus 1.11.2.Final and 1.11.7.Final, |
@typelean could you share some more details please? To be fair I really do like Agroal being very precise in its demands, as otherwise people seem to have many ways to shoot themselves in the foot without even realizing how critical it is to get this right. But of course details matter, especially if you know what you're doing we could use your feedback to make it better and/or more flexible. |
To be honest, I am not confident of being able to help much... I am sorry... Yes, I have my own jdbc library because The thing I noticed was moving from DBCP was that it was very tolerant of not properly closing connections.... yet still performed connection pooling and was reliable in production. As I moved to Agroal, it got more picky, but in every case I could see that it was correct, and my code was imperfect... I just used isolation and a lot of testing to fix it. But if Agroal had a not so picky mode that was safe enough (like DBCP was), though not as proper, then I could have had less stress with my production problem... when I did not have enough time to figure out the picky fixes. So these are my close statements... that are called all over the place in my use of PreparedStatement, Statement, for queries and updates. public void close() { // rs = ResultSet ps = Prepared Statement st = Statement c = Connection |
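The close() method quoted above is cut off in the thread, so its body is not reproduced here. As a hedged illustration of the same idea (not the commenter's actual code), here is a minimal sketch using try-with-resources, which closes the ResultSet, PreparedStatement and Connection in reverse order of acquisition even when the query throws. With a pooled datasource, closing the connection is what hands it back to the pool. The names CloseSketch, countRows, fake and demo are hypothetical, and the Proxy-based fakes exist only so the example runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class CloseSketch {

    // Runs a query and counts its rows. try-with-resources closes
    // rs, then ps, then c, whether or not an exception is thrown.
    static int countRows(Connection connection, String sql) throws SQLException {
        try (Connection c = connection;
             PreparedStatement ps = c.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            int rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        }
    }

    // Hypothetical test double: records close() calls and returns `child`
    // from prepareStatement()/executeQuery(). Not part of any real driver.
    @SuppressWarnings("unchecked")
    static <T> T fake(Class<T> type, String name, List<String> log, Object child) {
        return (T) Proxy.newProxyInstance(
                CloseSketch.class.getClassLoader(),
                new Class<?>[] {type},
                (proxy, method, args) -> {
                    switch (method.getName()) {
                        case "close":
                            log.add("close " + name);
                            return null;
                        case "next":
                            return false; // empty result set
                        case "prepareStatement":
                        case "executeQuery":
                            return child;
                        default:
                            throw new UnsupportedOperationException(method.getName());
                    }
                });
    }

    // Wires the fakes together and returns the order in which they were closed.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        ResultSet rs = fake(ResultSet.class, "rs", log, null);
        PreparedStatement ps = fake(PreparedStatement.class, "ps", log, rs);
        Connection c = fake(Connection.class, "c", log, ps);
        try {
            countRows(c, "SELECT 1");
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In real code, countRows would receive a connection obtained from the injected datasource; the point of the sketch is only that every acquired resource has exactly one owner that closes it.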
Don't worry :) Just trying to understand your perspective.
Ok you don't like JPA - fair enough, it's not a fit for all use cases. But this doesn't imply you have to use your own JDBC library - what do you mean by that, not a custom JDBC implementation but rather you're suggesting you have your own thin layer above the JDBC datasource pool, right?
Interesting point there; I'll think about it. cc/ @barreiro WDYT about this specifically? I normally would expect people to have solid coverage by integration tests, but indeed while we should fail-fast in staging/dev mode it might be fair to try being more lenient in production mode. |
At this point I am fine with the way Agroal works and I don't need a "lenient" mode (like DBCP) because I have fixed my "thin layer" over JDBC and all is working fine in Dev and Production. I would not want to run a different mode in production if I could get the "picky" mode working... but while I was in a state where I had not resolved all the "picky" issues... I could not ignore production that needed to keep working... So I would have appreciated the "lenient" mode to get by... but I was forced to resolve all the "picky" issues. It was a bit stressful, since there was no way I was going to change all my code... The question was whether I could use Agroal or DBCP to get connection pooling working... and Agroal was working much better in Quarkus. I'm the only one of the 5 developers at our place who likes Quarkus... and so I have to fight a bit of a battle to get the good stuff integrated in production code... I just care about others who may need connection pooling and are not willing to buy into JPA... and don't have their "thin layer" over JDBC quite perfect... DBCP shielded me from it with its "lenient" behavior. |
A few features were added to Agroal and exposed on Quarkus to help developers with the correctness of their applications.
One other thing that would be helpful as well, but I never got a chance to work on it, would be the possibility to manage the pool from the Quarkus developer console (list connections in use and flush them manually if necessary). Going lenient sounds like a good idea, until it's not. I would rather stay away from going down that route. |
I'm getting the same problem. I'm running Quarkus 2.4.0.Final with an AWS RDS Postgres database. This is my configuration:
quarkus.datasource.db-kind = postgresql
I tried this configuration too:
quarkus.datasource.jdbc.max-size = 1000
But it didn't work. @tgallin did this configuration (quarkus.datasource.jdbc.pooling-enabled=false) 'solve' the problem? @barreiro I didn't understand how this configuration (quarkus.datasource.jdbc.transaction-requirement) can fix the problem, or which value I need to set (off, warn, strict). Thanks in advance! |
I have the "acquisition timeout" issue with as400 databases only, using the jt400 driver. I never had any problem with postgresql or sqlserver databases. @marcogutto yes, to work around the issue I have with as400 databases, I use this configuration (quarkus.datasource.jdbc.pooling-enabled=false) and it works. The performance is terrible though but that was to be expected not using a pool of connections. I'd rather use a better alternative but so far I haven't found any. |
Acquiring with a transaction ensures that it will eventually complete and the connection is released. Getting back to the issue, it's hard to tell what the cause of the pool exhaustion is without a reproducer. There are many reports of Agroal working fine with the same or similar settings. Disabling pooling is indeed a horrible workaround in terms of performance. |
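For readers hunting for the code path that holds on to connections, the transaction-requirement setting asked about above can be combined with the pool's leak-detection options. A minimal sketch for application.properties, assuming a single default datasource; the values are illustrative assumptions, not recommendations made in this thread:

```properties
# off, warn, or strict. warn logs when a connection is acquired
# without an active transaction; strict rejects the acquisition.
quarkus.datasource.jdbc.transaction-requirement=warn

# Assumed interval: periodically check for connections that were never returned.
quarkus.datasource.jdbc.leak-detection-interval=1m

# Collect extra troubleshooting information on leaked connections.
quarkus.datasource.jdbc.extended-leak-report=true
```

Starting with warn rather than strict seems the safer first step, since some extensions may legitimately acquire connections outside a transaction (for example during schema validation).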
I am experiencing the same exception when using SmallRye Reactive Messaging as well. My implementation is simple: I have a service with one query retrieving data. The @Transactional annotation on the getLinkId() method should be closing the connection to release it back to the pool. I have also tried setting the LinkService to @ApplicationScoped and using the @Transactional default of REQUIRED. Java version: 17
|
We are observing this issue during database maintenance. After the database server DNS switch, I expect Quarkus to detect all of the old "dead" connections and re-establish them automatically after the first acquisition timeout. Instead, all the Quarkus-based services remain permanently dead, continuously throwing acquisition timeout errors unless I restart them one by one. Building some kind of automated system to restart all services after a DB server upgrade seems like the wrong way to approach it; there should be some resiliency built into the connection pool. (running Quarkus 1.12) |
I have a Quarkus (2.13.1.Final) application with Quartz for scheduled jobs (over multiple pods). I'm using a Postgres-as-a-service DB from Azure and I'm regularly, but randomly getting I have JSON/REST endpoints where we query something from the database (and then return it). Usually everything is fine, until at random intervals the following happens:
Followed by:
Does anyone have some experience with this kind of problem? |
Not sure if this is helpful/useful but all the services we migrated to the reactive (postgres) driver stopped experiencing these issues. Not discounting the possibility that same issues are occurring but being silently swallowed, but if anything it seems to me like the Vert.x driver has a better "dead connection" detection/restarting approach than the Agroal DS. P.S. @rkorver it's possible to increase the acquisition timeout property. You can look into that possibly, and decrease the time that idle connections are kept in the pool. |
Hi, I have the same kind of error with Quarkus 2.13.4.Final and JDK 17. I can give more information if needed.
|
Hi all, The issue in my case was our (Azure) firewall shutting down the connections randomly. One way to circumvent this is to activate a public endpoint on your Azure databases and connect directly through that (it won't actually be accessible from outside your network, don't worry), but I'm sure there are other solutions to this as well. For example, I've heard that an alternative solution is to switch to "flexible server" instead of single server. Another issue is that Agroal keeps the connections alive too long for the database, so I solved that by setting:
quarkus.datasource.jdbc.max-lifetime=10m
Hope this helps |
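The lifetime and timeout settings mentioned in the last few comments could be written together as follows. Only max-lifetime=10m comes from this thread; the other values are assumptions chosen for illustration and would need tuning against the idle timeouts of the database or firewall in question:

```properties
# From the comment above: retire connections before the network or database drops them.
quarkus.datasource.jdbc.max-lifetime=10m

# Assumed value: how long a caller waits for a free connection before failing.
quarkus.datasource.jdbc.acquisition-timeout=10s

# Assumed values: remove idle connections sooner and validate idle ones in the background.
quarkus.datasource.jdbc.idle-removal-interval=2m
quarkus.datasource.jdbc.background-validation-interval=1m
```

A longer acquisition timeout only hides pool exhaustion for a while, so it is probably best treated as a diagnostic aid rather than a fix.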
Thanks @rkorver ! Another possible (brute force) approach that pretty much resolved this for us was setting the Note: increasing datasource connection pool limits is dangerous due to the combination of typically low connection limits that relational databases like postgres have and autoscaling behaviors of K8s-like runtimes. So if you go down that path make sure to do some performance testing to ensure you don't exhaust the DB. |
Same problem here with MSSQL JDBC connector. The worst thing about it is that even though the application won't run anymore, the smallrye-health check reports that everything is fine. Quarkus 2.16.4.Final |
I experience this issue at Azure too after upgrading from Quarkus 2.4.1.Final to 2.16.4.Final. |
Same problem here at Azure with quarkus 2.16.7 and postgres. Unfortunately neither |
@kraeftbraeu maybe I'm mixing things up with the flexible server. For us the issue was the azure firewall shutting down the connections. Activating a public endpoint on the postgres instance and connecting through that allowed us to circumvent the internal firewall between AKS and postgres, or so I understand. |
And the problem still exists with Quarkus 3.2.2.Final and MariaDB. |
The problem exists with Quarkus 2.7.6.Final in Docker Swarm with PostgreSQL. |
Describe the bug
Hi everyone, I followed the guides on Quarkus JDBC, Quarkus Hibernate ORM and Quarkus Hibernate Panache.
My environment:
What's the problem?
I launched the application with one datasource (same problem on Postgres and MariaDB).
I set some timeout configuration (extremely low because of my bug); see my configuration below.
After 1000/2000 requests, I get the error: Unable to acquire JDBC connection (details below in the Screenshots section)
The idle_in_transaction_session_timeout is set to 0 on Postgres.
The statement_timeout is set to 0 on Postgres too.
This behaviour is the same on MariaDB.
Sometimes it happens at the 50th request, sometimes at the 200th, sometimes more, but it happens every time.
Expected behavior
No more Unable to acquire JDBC connection 😭.
Actual behavior
(Describe the actual behavior clearly and concisely.)
To Reproduce
Steps to reproduce the behavior:
Configuration
Screenshots
I can give you the error
Environment (please complete the following information):
uname -a or ver: Linux NOPLACELIKEHOME 4.19.107-1-MANJARO #1 SMP Fri Feb 28 21:14:27 UTC 2020 x86_64 GNU/Linux
java -version:
mvnw --version or gradlew --version:
Additional context
I tried setting the acquisition timeout to 30s or more, but the result is the same, and I don't think it is acceptable for a user to wait more than 30s just because it is difficult to get a JDBC connection.
I tried to follow advice I found via Google, but none of it did the job.
I'm available via this issue, email, phone, or other means to talk more about this.
Thank you in advance for the time you will take to read this,