Fix compatibility with third-party sync-only middleware #19

Archmonger · 2024-08-28T02:14:08Z

Description

Fix GCP Middleware fails to run with ServeStatic #18
Disable Django sync middleware capability to avoid issues with Django's usage of asgiref.AsyncToSync

Checklist

Please update this checklist as you complete each item:

Tests have been developed for bug fixes or new functionality.
The changelog has been updated, if necessary.
Documentation has been updated, if necessary.
GitHub Issues closed by this PR have been linked.

_{By submitting this pull request I agree that all contributions comply with this project's open source license(s).}

Archmonger · 2024-08-28T02:15:19Z

@brianglass can you see if your GCP middleware works with this branch?

pip install git+https://github.com/Archmonger/ServeStatic@fix-gcp-middleware

brianglass · 2024-08-28T15:59:39Z

@Archmonger, yes, that works.

brianglass · 2024-08-28T18:11:05Z

@Archmonger, I deployed this to production and it is working well. I can confirm that the warnings have gone away.

Archmonger · 2024-08-28T18:27:59Z

Fairly frustrating since this proves how incredibly brittle async support is within Django.

I'll work on getting this merged soon.

brianglass · 2024-08-28T18:49:54Z

Yes, it's pretty difficult to get your stack clear of synchronous middleware. I should probably look for a way to get rid of the Google middleware since it causes a context switch. it's frustrating to go through all the effort of building everything async only to need something like this in the end.

brianglass · 2024-08-28T20:02:41Z

For future reference to anyone reading this conversation, I created a wrapper around the Google logging middleware to make it async compatible. The middleware doesn't actually do anything other than assign the request object to a thread local.

from asgiref.sync import iscoroutinefunction
from django.utils.decorators import sync_and_async_middleware
from google.cloud.logging_v2.handlers.middleware import RequestMiddleware

@sync_and_async_middleware
def google_logging_middleware(get_response):
    if iscoroutinefunction(get_response):
        async def middleware(request):
            response = await get_response(request)
            google_middleware = RequestMiddleware(lambda request: response)
            return google_middleware(request)
    else:
        middleware = RequestMiddleware(get_response)

    return middleware

Archmonger · 2024-08-29T06:30:12Z

@brianglass I just pushed a different attempt at fixing this issue. Let me know if it works in your environment.

brianglass · 2024-08-29T16:18:09Z

@Archmonger, that again produces the following error:

web-1  | ERROR Internal Server Error: /readings/gregorian/2024/8/23/
web-1  | Traceback (most recent call last):
web-1  |   File "/usr/local/lib/python3.12/site-packages/django/core/handlers/exception.py", line 55, in inner
web-1  |     response = get_response(request)
web-1  |                ^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.12/site-packages/google/cloud/logging_v2/handlers/middleware/request.py", line 48, in middleware
web-1  |     return get_response(request)
web-1  |            ^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.12/site-packages/asgiref/sync.py", line 187, in __call__
web-1  |     raise RuntimeError(
web-1  | RuntimeError: You cannot use AsyncToSync in the same thread as an async event loop - just await the async function directly.

If I switch to using the wrapped middleware I posted above, the error goes away.

This reverts commit 1964aeb.

Archmonger · 2024-08-29T20:49:48Z

Okay - I've reverted back to the solution that initially worked. Unfortunately Django is extremely sensitive to mixed sync/async contexts, so unless some changes happen to asgiref/django our middleware will now be purely async.

The downside of this change is that Django WSGI will now need to convert our middleware from async to sync.

On the bright side, if we can keep things purely async it makes maintainability a lot easier!

If this becomes a problem for WSGI users, I will need to revert this PR and kludge together my own variant of asgiref.async_to_sync that always runs in a temporary thread. Would be a similar solution to the servestatic.middleware.AsyncToSyncIterator utility function I developed.
EDIT: I tested this and it was not a solution unfortunately.

brianglass · 2024-08-30T14:21:08Z

Now that I've made such a todo about this, I realize that the Google logging middleware may not even be safe for use in async applications. Since it is using a thread-local to share the request object with the logger and async Django might be processing multiple requests concurrently in a single thread, it doesn't make sense to use this. I'll be looking for an alternative or simply going back to vanilla logging.

Archmonger · 2024-08-30T18:12:26Z

Django does not reuse threads. It is one thread per request.

brianglass · 2024-08-30T19:57:03Z

According to the Django async Docs, "The main benefits are the ability to service hundreds of connections without using Python threads. This allows you to use slow streaming, long-polling, and other exciting response types." I take that to mean that multiple requests can be handled in a single thread. So instead of using a thread local, one should use contextvars to manage state. These docs say, "Context managers that have state should use Context Variables instead of threading.local() to prevent their state from bleeding to other code unexpectedly, when used in concurrent code."

Am I missing something?

Archmonger · 2024-08-30T20:47:38Z

The explanation is a bit of a sore point for me, since it drives home how async is still half-baked in Django.

Upon the release of Django 3.1, the asgiref converter functions (sync_to _async and async_to_sync) ran within a thread pool. I believe it was something like CPU_CORES * 2 amount of threads by default. Unfortunately, the Django team quickly realized that this was causing silent corruption of the ORM. The Django ORM is lazy, so it starts a DB connection and only finishes execution when when a value is needed, which was problematic for some of the DB backends. They did not want to rewrite any aspects of the ORM since that might impact sync performance, especially given that there was a decision made that Django would prioritize sync at all costs. So instead, they forced all of those converter functions to run on the main thread. This effectively meant that async code was run "purely sync" except with even worse performance due to stack complexity.

Enough people complained how cripplingly bad Django's async performance was, so around the release of Django ~4.0, they opted for a "middle ground" solution, where Django would create a temporary thread for each HTTP request and use that as a safe execution environment when needed whenever conversion is needed within the Django stack.

There's still always an async performance penalty under the hood since Django utilizes this converter function all over their source code, but at least it's not absolutely crippling now.

However, to this day all ORM, cache, and template rendering is still performed directly on the main thread via async_to_sync(thread_sensitive=True), which stalls the ASGI queue and forces a context-switch performance hit.

Things are more performant if you step outside the Django stack (such as the way channels sits on-top of Django's HTTP stack). This unfortunately means you only get the most benefits of ASGI if you avoid entering Django entirely.

I haven't had time to benchmark ServeStatic, but I suspect this also means that using servestatic.middleware.ServeStaticMiddleware is likely significantly less performant than wrapping the entire Django application in servestatic.ServeStaticASGI.

brianglass · 2024-09-02T17:56:28Z

Thanks for that detailed explanation @Archmonger.

Fix middleware that uses AsyncToSync

54e054b

Remove unneeded serve method

f0cc91f

Archmonger added 2 commits August 28, 2024 22:34

Add changelog

24a44e1

Attempt using async_to_sync

1964aeb

Archmonger changed the title ~~Fix middleware that uses AsyncToSync~~ Fix compatibility with third-party sync-only middleware Aug 29, 2024

misc fixes

5b8d843

Archmonger added 2 commits August 29, 2024 13:28

Revert "Attempt using async_to_sync"

6671738

This reverts commit 1964aeb.

minor cleanup

c7509e0

Archmonger added 2 commits August 29, 2024 13:59

comment and changelog updates

0afb1fc

fix tests

ae2992a

Archmonger marked this pull request as ready for review August 29, 2024 22:31

Add tests

41adbaa

Archmonger merged commit a486fe9 into main Aug 29, 2024
15 checks passed

Archmonger deleted the fix-gcp-middleware branch August 29, 2024 22:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix compatibility with third-party sync-only middleware #19

Fix compatibility with third-party sync-only middleware #19

Archmonger commented Aug 28, 2024 •

edited

Loading

Archmonger commented Aug 28, 2024

brianglass commented Aug 28, 2024

brianglass commented Aug 28, 2024

Archmonger commented Aug 28, 2024

brianglass commented Aug 28, 2024

brianglass commented Aug 28, 2024 •

edited by Archmonger

Loading

Archmonger commented Aug 29, 2024

brianglass commented Aug 29, 2024 •

edited

Loading

Archmonger commented Aug 29, 2024 •

edited

Loading

brianglass commented Aug 30, 2024

Archmonger commented Aug 30, 2024

brianglass commented Aug 30, 2024

Archmonger commented Aug 30, 2024 •

edited

Loading

brianglass commented Sep 2, 2024

Fix compatibility with third-party sync-only middleware #19

Fix compatibility with third-party sync-only middleware #19

Conversation

Archmonger commented Aug 28, 2024 • edited Loading

Description

Checklist

Archmonger commented Aug 28, 2024

brianglass commented Aug 28, 2024

brianglass commented Aug 28, 2024

Archmonger commented Aug 28, 2024

brianglass commented Aug 28, 2024

brianglass commented Aug 28, 2024 • edited by Archmonger Loading

Archmonger commented Aug 29, 2024

brianglass commented Aug 29, 2024 • edited Loading

Archmonger commented Aug 29, 2024 • edited Loading

brianglass commented Aug 30, 2024

Archmonger commented Aug 30, 2024

brianglass commented Aug 30, 2024

Archmonger commented Aug 30, 2024 • edited Loading

brianglass commented Sep 2, 2024

Archmonger commented Aug 28, 2024 •

edited

Loading

brianglass commented Aug 28, 2024 •

edited by Archmonger

Loading

brianglass commented Aug 29, 2024 •

edited

Loading

Archmonger commented Aug 29, 2024 •

edited

Loading

Archmonger commented Aug 30, 2024 •

edited

Loading