Prevent garbage collection of main lifespan task #972
It would help to have a reproducible example that doesn't use aioredis and doesn't require editing the standard library; "a bit hacky" is quite an understatement here.
A very simple example that creates tasks the "bad" way, like the one below, does not show this error, you can even add a

async def index(request: Request):
    return JSONResponse({"hello": "e"})

routes = [
    Route("/", index),
    WebSocketRoute("/ws", Echo)
]

async def counter(msg, delay):
    i = 0
    while True:
        print(msg, i)
        i += 1
        await asyncio.sleep(delay)

async def on_startup():
    logger.debug('startup lifespan')
    asyncio.create_task(counter('count', 1))

async def on_shutdown():
    logger.debug('shutdown lifespan')

app = Starlette(routes=routes, on_startup=[on_startup], on_shutdown=[on_shutdown])

So we need to understand why it is garbage collected in the first place, imho. A quick Google search reveals some interesting links. Another Redis issue looks very similar: jonathanslenders/asyncio-redis#56. https://bugs.python.org/issue21163 is also interesting and does not suggest anything related to hard references.
Great questions! I went ahead and did a little more digging and have some more solid examples. So, a simple example that fails with garbage collection is an app that uses

import asyncio
import gc

async def request_google():
    reader, writer = await asyncio.open_connection('google.com', 80)
    writer.write(b'GET / HTTP/2\n\n')
    await writer.drain()
    response = await reader.read()
    return response

def app(scope):
    async def asgi(receive, send):
        google_response = await request_google()
        await send({"type": "http.response.start", "status": 200, "headers": [[b"content-type", b"text/plain"]]})
        await send({"type": "http.response.body", "body": google_response})
    return asgi

If we run this with the garbage collection modification, we see that it immediately errors with:
You may notice that if we replace the
So, it sounds like Note, we can replace
If, however, we would like to create an example where this happens without modifying the standard library code, we simply need to create an example where our application waits on a web request (

import asyncio
import gc

async def request_google():
    reader, writer = await asyncio.open_connection('google.com', 80)
    writer.write(b'GET / HTTP/2\n\n')
    await writer.drain()
    response = await reader.read()
    return response

async def do_gc():
    for i in range(10):
        await request_google()
        gc.collect()

def app(scope):
    async def asgi(receive, send):
        google_responses = b''
        task = asyncio.create_task(do_gc())
        for i in range(10):
            google_responses += await request_google()
        await task
        await send({"type": "http.response.start", "status": 200, "headers": [[b"content-type", b"text/plain"]]})
        await send({"type": "http.response.body", "body": google_responses})
    return asgi

Here, even without the standard library modification in place, we consistently see:
Interestingly, perhaps related to when Python performs garbage collection, we can also see this in the starlette example when we replace
In all of these cases, we don't see any instances of these errors with the modification proposed in this PR. Anyway, let me know if this all makes sense. I'd be happy to provide more details.
OK @MatthewScholefield, your example was definitely very helpful, thanks.

import asyncio
import gc

async def request_google(x):
    reader, writer = await asyncio.open_connection('google.com', 80)
    writer.write(b'GET / HTTP/2\n\n')
    await writer.drain()
    response = await reader.read()
    print(x)
    return response

async def do_gc():
    for i in range(10):
        await request_google(f"do_gc: {i}")
        gc.collect()

async def startup():
    task = asyncio.create_task(do_gc())
    for i in range(10):
        await request_google(f"startup: {i}")
    await task

async def app(scope, receive, send):
    if scope['type'] == 'lifespan':
        message = await receive()
        if message['type'] == 'lifespan.startup':
            await startup()
            await send({'type': 'lifespan.startup.complete'})
    elif scope['type'] == 'http':
        await send({"type": "http.response.start", "status": 200, "headers": [[b"content-type", b"text/plain"]]})
        await send({"type": "http.response.body", "body": b"1"})

traceback:
uvicorn/lifespan/on.py
@@ -22,12 +22,14 @@ def __init__(self, config: Config) -> None:
         self.startup_failed = False
         self.shutdown_failed = False
         self.should_exit = False
+        self.main_task = None
This won't be used anywhere, let's remove that.
uvicorn/lifespan/on.py

     async def startup(self) -> None:
         self.logger.info("Waiting for application startup.")

         loop = asyncio.get_event_loop()
-        loop.create_task(self.main())
+        task = loop.create_task(self.main())
+        self.main_task = task  # Keep a hard reference to prevent garbage collection
-        self.main_task = task  # Keep a hard reference to prevent garbage collection
+        main_lifespan_task = loop.create_task(self.main())  # noqa: F841 # Keep a hard reference to prevent garbage collection, see https://github.com/encode/uvicorn/pull/972
I think it is enough just to create the variable. Let's also add a comment pointing to the PR, as it's definitely non-trivial.
Heads up, I realized recently that it isn't enough to just create the variable. If the app waits on any I/O immediately after sending the startup complete message, the task will still get collected by the GC. Probably not something many apps do, though, so relatively low impact.
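To illustrate the point in the comment above, here is a minimal sketch of one way to keep the reference alive beyond the scope of `startup()`: store the task on the owning object rather than in a local variable. `LifespanSketch` and its attributes are hypothetical names chosen for illustration. This is not uvicorn's actual class, and it is not necessarily what was merged in this PR.

```python
import asyncio
from typing import Optional, Tuple


class LifespanSketch:
    """Hypothetical stand-in for a lifespan handler (not uvicorn's class)."""

    def __init__(self) -> None:
        self.main_task: Optional[asyncio.Task] = None
        self.startup_event = asyncio.Event()
        self.shutdown_event = asyncio.Event()

    async def main(self) -> None:
        self.startup_event.set()
        # After startup completes, the task stays parked here until shutdown.
        await self.shutdown_event.wait()

    async def startup(self) -> None:
        loop = asyncio.get_running_loop()
        # Attribute, not a local: the reference outlives this method call.
        self.main_task = loop.create_task(self.main())
        await self.startup_event.wait()

    async def shutdown(self) -> None:
        self.shutdown_event.set()
        await self.main_task


async def run() -> Tuple[bool, bool]:
    # Created inside the running loop so the events bind to that loop.
    lifespan = LifespanSketch()
    await lifespan.startup()
    alive_after_startup = not lifespan.main_task.done()
    await lifespan.shutdown()
    return alive_after_startup, lifespan.main_task.done()


alive_after_startup, finished = asyncio.run(run())
```

A local variable only lives as long as the function that created it, whereas an attribute lives as long as the object does, which is why the sketch uses the latter.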
Hmm, intuitively I would have expected that the task would continue running outside of
Anyways, I've made the suggested changes and the linting errors should be fixed. Let me know if you'd like me to do anything else.
- Matthew
lgtm, thanks for this.
This maintains a reference to the created task in the lifespan to fix some subtle bugs that arise from the entire app being garbage collected because no reference to the running coroutine is kept.
TL;DR:
asyncio.create_task(foo()) = bad
task = asyncio.create_task(foo()) = good
This bug can be illustrated with a relatively simple app using aioredis (I could probably find an example without aioredis, but this example should suffice):
When running this with uvicorn example:app it seems like everything works (the app starts up correctly), but if we force garbage collection on a specific line within the event loop, we consistently encounter the following error:
While a bit hacky, to debug we can force this garbage collection in the event loop as follows:
1. Edit /usr/lib/python3.*/asyncio/base_events.py and, within def _run_once, near the bottom, immediately within the for i in range(ntodo): loop, add import gc; gc.collect().
2. Select the asyncio event loop so that uvicorn uses this modified code by running with: uvicorn example:app --loop asyncio
After this change, every execution of uvicorn should result in the error shown above.
If we apply the changes from this PR, we can see that this no longer errors.
Note, this was initially discovered and reported within aioredis. However, I've since realized that the error lay in the uvicorn code that ran above it.