
Why isn't a connection pool being used? #163

Closed
whatamithinking opened this issue Feb 6, 2022 · 1 comment
Description

Apologies if I am missing the obvious.

Other similar libraries (aiomysql, aioodbc, aiopg, etc.) all have connection pooling built in. They all seem to take a similar approach, wrapping a library that works with multiple threads but not with async tasks.

Since this library seems to be widely used (as opposed to aioodbc, which can also reach SQLite but sees far less use), I am a little confused as to why a connection pool is not being used to increase concurrency.

Details

  • OS: Windows 10
  • Python version: 3.9
  • aiosqlite version: 0.17.0
  • Can you repro on 'main' branch?
  • Can you repro in a clean virtualenv?
amyreese (Member) commented Feb 7, 2022

Generally speaking, the other DB libraries pool connections because a) making a network connection takes significant time (TCP handshake, authentication, etc.), so having connections already open and waiting is faster; and b) network bandwidth is limited, so capping the number of simultaneous connections helps ensure better network utilization.

However, since sqlite by definition is a local database, either on-disk or in memory, the benefits from connection pooling are minimal: there is no network delay to opening a file on disk or allocating memory, and disk bandwidth is much higher (and better managed by the kernel). "Connections" are lightweight compared to mysql or postgres, and can generally be opened or closed at will. The biggest cost to a new connection will likely be spinning up another thread.

I added a simple perf test to measure speed of creating connections, and on my M1 Mac Mini with Python 3.10.2, I was able to open ~4700 connections per second from a file, or ~5300/s for in-memory DBs:

(.venv) user@mordin ~/workspace/aiosqlite main  » python -m unittest aiosqlite.tests.perf.PerfTest -k test_connection
Running perf tests for at least 2.0s each...
..
Perf Test                 Iterations  Duration         Rate
connection_file                 9454      2.0s     4726.7/s
connection_memory              10504      2.0s     5251.8/s

----------------------------------------------------------------------
Ran 2 tests in 4.005s

When it comes to concurrency, you can simply create more connections with aiosqlite, without needing a connection pool. Pooling connections could potentially still help in that regard if you expect to be making a very large number of concurrent requests and want to limit the number of background threads, though you will likely end up with contention on the limited number of connections in the pool instead.
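As a rough sketch of that "just open more connections" approach, the following stdlib-only example uses sqlite3 plus asyncio.to_thread to approximate what aiosqlite does internally (aiosqlite runs each sqlite3 connection in its own worker thread); the function names and query here are purely illustrative:

```python
import asyncio
import sqlite3

def query_sync(db_path: str, n: int) -> int:
    # Each task opens its own short-lived connection; no pool involved.
    conn = sqlite3.connect(db_path)
    try:
        (value,) = conn.execute("SELECT ? + 1", (n,)).fetchone()
    finally:
        conn.close()
    return value

async def main() -> list:
    # Ten concurrent queries, each on a fresh connection in its own thread,
    # mirroring how separate aiosqlite connections run independently.
    return await asyncio.gather(
        *(asyncio.to_thread(query_sync, ":memory:", i) for i in range(10))
    )

if __name__ == "__main__":
    print(asyncio.run(main()))
```

With real aiosqlite the shape is the same, except each task would await its own connection rather than dispatching to a thread by hand.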

My general suggestion for most use cases with aiosqlite is to just keep the code simple, and open a new connection using a context manager anywhere in the code you need to talk to the db. This also helps keep transactions isolated, and prevents a wide class of bugs around managing queries on shared connection threads.
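A synchronous stdlib analogue of that per-use pattern is sketched below; note that sqlite3.Connection's own context manager only scopes transactions, not closing, hence contextlib.closing here. With aiosqlite the same shape uses "async with" instead (the table and function names are illustrative, not from this issue):

```python
import sqlite3
from contextlib import closing

def add_user(db_path: str, name: str) -> None:
    # Fresh connection for this unit of work; closed when the block exits.
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # transaction scope: commit on success, rollback on error
            conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT)")
            conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

def count_users(db_path: str) -> int:
    with closing(sqlite3.connect(db_path)) as conn:
        (count,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()
        return count
```

Keeping each connection scoped to one unit of work is what isolates transactions and avoids sharing a connection (and its worker thread) across unrelated queries.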
