org.hibernate.QueryTimeoutException: Unable to acquire JDBC Connection #3280
Comments
Scheduler logs:
This goes on forever.
Thread dump:
I managed to reproduce the issue. By submitting a few hundred jobs simultaneously (2 sessions at the same time), the problem comes up after about 20 minutes:

    # Log in and keep the session id (double quotes so the variables expand)
    export SESSIONID=$(curl --request POST \
      --url http://localhost:8080/rest/scheduler/login \
      --header 'content-type: application/x-www-form-urlencoded' \
      --data "username=$PA_USERNAME&password=$PA_PASSWORD")

    # Submit the same job 350 times, one per second
    for i in $(seq 1 350); do
      curl --header "sessionid:$SESSIONID" \
        --form "file=@/home/paraita/job.xml;type=application/xml" \
        http://localhost:8080/rest/scheduler/submit
      echo ""
      sleep 1
    done

You must also log into the scheduler web portal and use it to display the pending jobs, otherwise the problem does not happen. One quick fix would be to remove the parallel stream and use a regular one. (I didn't notice any significant slowdown in doing so.)
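To illustrate why swapping the parallel stream for a regular one can reduce pressure on the connection pool, here is a minimal sketch. It is not the scheduler's code: `StreamConcurrencySketch`, `loadJob` and `peakConcurrency` are hypothetical names, and the sleep only stands in for a lookup that holds a JDBC connection. It measures how many lookups overlap in time for each kind of stream.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class StreamConcurrencySketch {

    // Hypothetical stand-in for a per-job database lookup; not the scheduler's real code.
    static int loadJob(int jobId, AtomicInteger inFlight, AtomicInteger peak) {
        int now = inFlight.incrementAndGet();
        peak.accumulateAndGet(now, Math::max); // remember the highest overlap seen so far
        try {
            Thread.sleep(2); // simulate time spent holding a JDBC connection
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            inFlight.decrementAndGet();
        }
        return jobId;
    }

    // Returns the highest number of lookups that were in progress at the same time.
    static int peakConcurrency(boolean parallel, int jobCount) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Integer> ids = IntStream.rangeClosed(1, jobCount)
                .boxed()
                .collect(Collectors.toList());
        Stream<Integer> stream = parallel ? ids.parallelStream() : ids.stream();
        stream.forEach(id -> loadJob(id, inFlight, peak));
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("sequential peak: " + peakConcurrency(false, 50));
        System.out.println("parallel peak:   " + peakConcurrency(true, 50));
    }
}
```

With a regular stream the lookups run one after another on the calling thread, so at most one connection would be needed at a time. With a parallel stream the peak depends on the size of the common fork-join pool and on scheduling, so several connections may be requested at once.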
I found the root cause. The Scheduler relies on Hibernate for persistence (with its own context). The scheduling-api also relies on Hibernate but uses a separate Hibernate context. The probability of the same job being accessed at the same time in both contexts is very small, but it tends to increase depending on 2 factors:
According to the Hibernate documentation, it is possible to have multiple EntityManagers, but they shouldn't map the same entities; the common practice is to separate those contexts into different persistence units. I believe using a regular stream inside the lambda will largely mitigate the exception.
Reduced the severity as it did not reappear.
This serious issue freezes the scheduler: all database operations seem to hang forever.
Attached are the resulting thread dump and some scheduler logs.