-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix "event loop is already running" bug on Linux #450
Conversation
The E2E snapshot failure does not appear related. New jobs are showing up as "Stopped" instead of "In progress" on first render. I am able to reproduce this on my Linux machine. Since this is a new and legitimate issue, it will be best to address this in a separate PR. I've filed an issue to track this: #451 |
please update playwright snapshots |
0157beb
to
9f5dc56
Compare
Great, now the snapshot updating workflow doesn't work because of this: actions/runner-images#8531 🤦 I'll update the branch and try again. |
please update playwright snapshots |
Oh right, workflows always are run as defined on |
Great, now my fork isn't letting me create a new PR to allow me to run the update snapshots workflow manually because this one already exists. I must have rolled a 1d20 this morning on my luck stat. 🤦 OK, fine, I'll update the workflow in a separate PR and remove the commit from this branch. |
6434736
to
0b84cb3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job on fixing this 🚀 .
Kicking CI. Not sure why the workflows aren't running after the snapshot update. |
Description
This PR fixes the dreaded "event loop is already running" bug. I've determined the root cause of this bug and will explain why these changes fix it.
Context
First, I would like to describe the context:
multiprocessing.Process
.nbconvert.preprocessors:ExecutePreprocessor
, which are actually implemented by wrapping the async methods in a function namedrun_sync()
, provided byjupyter_core.utils
run_sync()
, as you might expect, takes a coroutine and returns a synchronous function that "does the same thing". For more specifics, it's best to refer to the code directly.First cause
When
nbconvert
callsrun_sync()
, it sometimes throws an exception with the message "The event loop is already running".Root cause
multiprocessing
starts a process by spawning a new one from scratch.multiprocessing
starts new processes by forking the current process.asyncio.get_event_loop()
fails in forked child processes, as it returns the event loop of the parent process instead of the child process in which it was called.This means that on Linux,
asyncio.get_event_loop()
in a new process actually returns the event loop of the server process, which of course is always running (since it's running the server). This is why the exception is thrown. 😁The fix
To fix this, we need to indicate to
multiprocessing
(MP) to only start processes viaspawn
instead offork
. Thankfully this can be done via MP contexts, as shown in this PR.I also see that we are running the Downloader in a new process via MP. However, I did not implement the same changes there for 3 reasons:
asyncio.get_event_loop()
in the implementation for the same bug to appear.spawn
instead offork
does have a significant performance drawback, so unless there is reason to otherwise, we should stick with MP's default behavior.Next steps
I filed a new issue to add unit test coverage: #453