fix(database): limit the number of open database connections (#437)

Merged

Conversation

mdonadoni (Member)

Up until now, to avoid keeping idle connections to the database open,
the connection pool was disposed of after every request. However,
connections that are checked out at the moment of the pool disposal are
kept open, and they do not count towards the maximum number of
connections set by `SQLALCHEMY_POOL_SIZE` and `SQLALCHEMY_MAX_OVERFLOW`.

This resulted in many connections being open at the same time,
saturating the capacity of the database.

Instead of destroying the pool at each request, connections are closed
each time they are put back in the pool. The end result is the same, as
no idle connection is kept open for long periods of time. At the same
time, the pool enforces that the number of open connections is limited,
so that the database is not overwhelmed.
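
For illustration, the same idea can be sketched with a plain SQLAlchemy engine and the pool's `checkin` event; the connection URL, pool limits, and listener name below are assumptions made for this example, not necessarily the exact code of this PR:

```python
from sqlalchemy import create_engine, event

# Hypothetical connection URL and pool limits, standing in for the
# configured SQLALCHEMY_POOL_SIZE and SQLALCHEMY_MAX_OVERFLOW values.
engine = create_engine(
    "postgresql+psycopg2://user:***@db/reana",
    pool_size=5,
    max_overflow=10,
)

@event.listens_for(engine, "checkin")
def _close_connection_on_checkin(dbapi_connection, connection_record):
    # Invalidate the connection as soon as it is returned to the pool:
    # the underlying DBAPI connection is closed, so nothing sits idle,
    # while the pool still caps how many connections can be checked out
    # at once (pool_size + max_overflow).
    connection_record.invalidate()
```

Unlike disposing of the whole pool, this keeps every connection accounted for by the pool, so the total number of open database connections stays bounded.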


codecov bot commented Feb 22, 2024

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison: base (b9f8364) at 46.02% coverage compared to head (980f749) at 45.99%.

Additional details and impacted files


@@            Coverage Diff             @@
##           master     #437      +/-   ##
==========================================
- Coverage   46.02%   45.99%   -0.04%     
==========================================
  Files          17       17              
  Lines        1234     1235       +1     
==========================================
  Hits          568      568              
- Misses        666      667       +1     
| Files | Coverage Δ |
| --- | --- |
| reana_job_controller/job_manager.py | 77.58% <100.00%> (-0.75%) ⬇️ |
| reana_job_controller/factory.py | 93.33% <80.00%> (-2.97%) ⬇️ |

@mdonadoni mdonadoni force-pushed the fix-too-many-database-connections branch from 8d97f70 to 980f749 on February 22, 2024 11:49

@giuseppe-steduto giuseppe-steduto (Member) left a comment

LGTM! I agree with the need for a cleaner and more scalable solution like pgBouncer, but it can come later.

Alternatively, I was also thinking about declaring the pool as a `NullPool` or using other SQLAlchemy options like `pool_recycle`, but I guess that would also badly impact the connections made by r-server and the other components, so this is fine with me for the moment!

@mdonadoni mdonadoni (Member, Author) commented Feb 23, 2024

> Alternatively, I was also thinking about declaring the pool as a `NullPool` or using other SQLAlchemy options like `pool_recycle`

Regarding `NullPool`, we can't really use it, as it does not limit the number of open connections (i.e. it does not support `pool_size`).

Regarding `pool_recycle`, connections are recycled only when checking them out of the pool, so they are kept open indefinitely if they stay in the pool and are not being used. See Setting Pool Recycle: "any DBAPI connection that has been open for more than one hour will be invalidated and replaced, upon next checkout. Note that the invalidation only occurs during checkout."

So unfortunately neither of them is a good option for us!
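
For concreteness, here is a sketch of the two alternatives discussed above and why they fall short; the connection URL and values are purely illustrative:

```python
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# Alternative 1: NullPool opens a fresh connection per checkout and closes
# it on checkin, but it takes no pool_size/max_overflow, so it cannot cap
# how many connections are open at the same time.
engine_nullpool = create_engine(
    "postgresql+psycopg2://user:***@db/reana",  # hypothetical DSN
    poolclass=NullPool,
)

# Alternative 2: pool_recycle replaces a stale connection only at the next
# checkout; a connection sitting unused in the pool stays open until then.
engine_recycle = create_engine(
    "postgresql+psycopg2://user:***@db/reana",  # hypothetical DSN
    pool_size=5,
    pool_recycle=3600,  # one hour, as in the documentation example
)
```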

@mdonadoni mdonadoni merged commit 980f749 into reanahub:master Feb 28, 2024
14 checks passed
@mdonadoni mdonadoni deleted the fix-too-many-database-connections branch February 28, 2024 13:18