Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resources (psql db connection) from worker threads not closed (gc.collect() works around it for asyncio backend] #373

Closed
blueyed opened this issue Sep 30, 2021 · 6 comments

Comments

@blueyed
Copy link

blueyed commented Sep 30, 2021

I am seeing an issue where database connections in tests with FastAPI/Starlette are still open at the end of the test session, resulting in an error when the test database is trying to get dropped ("There is 1 other session using the database.").

This can be fixed when using the asyncio backend with the following patch:

diff --git i/src/anyio/from_thread.py w/src/anyio/from_thread.py
index d845f99..e83641d 100644
--- i/src/anyio/from_thread.py
+++ w/src/anyio/from_thread.py
@@ -404,3 +404,5 @@ async def run_portal() -> None:
             portal.call(portal.stop, False)

         run_future.result()
+        import gc; gc.collect()

However, with the trio backend this does not help there still.

A test project is available at https://github.com/ming-tung/experiment-django-with-fastapi.

It is not as minimal as it could be, since probably this does not need Django nor a database really, since the issue appears to be about resource "leaks" from the ThreadPoolExecutor in general.

To reproduce it with the example project, which requires a PostgreSQL DB:

% git clone https://github.com/ming-tung/experiment-django-with-fastapi
% cd experiment-django-with-fastapi
% python -m venv .venv
% poetry config --local virtualenvs.in-project true
% source .venv/bin/activate
% export DATABASE_URL=psql://username:password@localhost:5432/experiment-django-with-fastapi
% pytest

This will have a delay in the end, and show two warnings, which are caused by the failure to delete the DB - the important part being "Error when trying to teardown test databases" (fixed/improved in Django via django/django#14918)

Upgrading starlette still shows the issue:

% pip install -e ~v/starlette
…
Successfully installed anyio-3.3.2 sniffio-1.2.0 starlette-0.16.0

When installing anyio editable and applying the patch to trigger the garbage collector it works, but still fails when patching the test project to use trio:

% pip install trio
…
Successfully installed async-generator-1.10 outcome-1.1.0 sortedcontainers-2.4.0 trio-0.19.0
diff --git i/poll/tests/test_api.py w/poll/tests/test_api.py
index 62e7369..bba3783 100644
--- i/poll/tests/test_api.py
+++ w/poll/tests/test_api.py
@@ -4,7 +4,7 @@
 from mysite.asgi import fastapp
 from poll.models import Book
 
-client = TestClient(fastapp)
+client = TestClient(fastapp, backend="trio")
 
 
 @pytest.fixture

For reference: Starlette integrated anyio in 0.15.0 (encode/starlette#1157 (encode/starlette@42592d6)).

Before integration of anyio Starlette was already affected by this, but I've only investigated how to "fix" it with anyio being integrated.

I can see that adding an explicit "gc.collect()" is fishy, and probably not a good fix in general, but wanted to create an issue already, and would be happy to investigate further, preferably after getting some pointers.

refs: pytest-dev/pytest-django#429 (long-standing issue with Django DB connections and threads)
Python 3.9.7

@agronholm
Copy link
Owner

Relying on the garbage collector for this seems indeed dubious. What's actually supposed to close these connections?

@blueyed
Copy link
Author

blueyed commented Sep 30, 2021

Apparently they are meant to be closed at the end of the program implicitly, which fails when threads come into play.
For the main thread it is fine to have it still, since there it can be handled (closed/re-used) when dropping the test DB.

The difference between asyncio and trio backends appears to be that trio is using daemon threads, where joining them is not possible, and they are meant to be closed at the end of the program. Using non-daemon threads, and gc'ing manually helped there also.

I believe the workaround of explicitly closing the connections manually before tearing down the test databases is the only way to solve this properly/probably.
With the asyncio backend you could use the following via Django's connection_created signal:

    thread = threading.current_thread()
    if thread.name == "MainThread":
        return
    root_task = thread.root_task
    root_task.add_done_callback(lambda task: connection._close())

Otherwise a query using pg_terminate_backend can be used, as mentioned on https://www.stavros.io/posts/fastapi-with-django/ also.

Closing the issue here, since it is not really a problem with anyio, but it could maybe only provide a consistent API to register cleanup functions from within threads, but this would still require to invoke/trigger them manually somehow likely, and might not really work with different types of threads (daemon/non-daemon).

Thanks for looking into this! :)

@blueyed blueyed closed this as completed Sep 30, 2021
@blueyed
Copy link
Author

blueyed commented Sep 30, 2021

Re-opening: actually I think it would be beneficial in general to trigger the garbage collector automatically, which might happen later eventually anyway, resulting in "flaky behavior". This would help libraries like Starlette/FastAPI and its users then to not getting confused about it, and/or having to add code for this.
For asyncio this is trivial (see the patch), and for trio it could be done probably also.

@blueyed blueyed reopened this Sep 30, 2021
@smurfix
Copy link
Collaborator

smurfix commented Oct 1, 2021

Well, the Python-ish way to handle a connection that needs closing, or a file that needs closing, or … you get my drift, is to use a context manager. Or in this case an async context manager.

This is because of "Structured Concurrency" which is pretty much the cornerstone of Trio's, and thus anyio's, design.

IMHO it is a very bad idea to add support for "registering cleanup functions" to anyio. There already is a way to register a cleanup function: you use a context manager. Two actually if you count try: … finally ….

@agronholm
Copy link
Owner

What he said. Involving the garbage collector outside of tests is a bad, bad idea and it would be trying to fix the problem from the wrong end.

@blueyed
Copy link
Author

blueyed commented Oct 1, 2021

Ok, given that it is only about tests to start with (Starlette's TestClient) it should get solved there then maybe.
But from the top of my head I could not see how/where a context manager could be used there: the connections get created at the start of tests (on demand), but not within the control of the test runner.
Anyway, closing then. Feel free to follow up with more info, of course.
Thanks!

@blueyed blueyed closed this as completed Oct 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants