Fix "event loop is already running" bug on Linux #450

dlqqq · 2023-10-30T21:25:28Z

Description

Fixes RuntimeError: "This event loop is already running" with nbclient==0.8.0 #435
See also All jobs failing with message "This event loop is already running" #330

This PR fixes the dreaded "event loop is already running" bug. I've determined the root cause of this bug and will explain why these changes fix it.

Context

First, I would like to describe the context:

The executor is started as a separate process via the class multiprocessing.Process.
The executor calls synchronous methods on the class nbconvert.preprocessors:ExecutePreprocessor, which are actually implemented by wrapping the async methods in a function named run_sync(), provided by jupyter_core.utils.
- run_sync(), as you might expect, takes a coroutine and returns a synchronous function that "does the same thing". For more specifics, it's best to refer to the code directly.
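
To make the idea concrete, here is a minimal sketch of what a wrapper like this does. `run_sync_sketch` and `async_add` are hypothetical names used only for illustration. This is not the jupyter_core implementation, which handles more cases and may manage its event loop differently.

```python
import asyncio
import functools


def run_sync_sketch(coro_fn):
    """Return a blocking function that runs `coro_fn` to completion.

    Simplified stand-in for illustration only; not the real run_sync().
    """

    @functools.wraps(coro_fn)
    def wrapper(*args, **kwargs):
        # Assumption: use a fresh loop per call to keep the sketch simple.
        loop = asyncio.new_event_loop()
        try:
            return loop.run_until_complete(coro_fn(*args, **kwargs))
        finally:
            loop.close()

    return wrapper


async def async_add(a, b):
    await asyncio.sleep(0)  # yield to the loop once, like real async work
    return a + b


add = run_sync_sketch(async_add)
result = add(2, 3)
print(result)
```

The important detail is that the wrapper has to drive an event loop itself, so it only works if the loop it picks is not already running.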

First cause

When nbconvert calls run_sync(), it sometimes throws an exception with the message "This event loop is already running".
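
The same error can be reproduced with plain asyncio, independent of nbconvert. The sketch below asks a loop that is already running to block on another coroutine; `noop` and `call_blocking_api_from_running_loop` are hypothetical names for illustration.

```python
import asyncio


async def noop():
    return None


async def call_blocking_api_from_running_loop():
    # Inside a coroutine, the current loop is by definition already running.
    loop = asyncio.get_running_loop()
    coro = noop()
    try:
        # Asking a running loop to block on more work is not allowed.
        loop.run_until_complete(coro)
    except RuntimeError as exc:
        return str(exc)
    finally:
        # The coroutine was never scheduled; close it to avoid a warning.
        coro.close()
    return None


message = asyncio.run(call_blocking_api_from_running_loop())
print(message)
```

The captured message is "This event loop is already running", which is the same text reported in the linked issues.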

Root cause

On Windows and macOS, multiprocessing starts a process by spawning a new one from scratch by default.
However, on Linux, multiprocessing starts new processes by forking the current process by default.
asyncio.get_event_loop() fails in forked child processes, as it returns the event loop of the parent process instead of the child process in which it was called.
- This may be fixed in Python 3.12, but I do not have the time to verify. See: GH-66285: fix forking in asyncio python/cpython#99769

This means that on Linux, asyncio.get_event_loop() in a new process actually returns the event loop of the server process, which of course is always running (since it's running the server). This is why the exception is thrown. 😁
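
To check which start method a given machine uses, the standard library exposes it directly. This is a small diagnostic sketch; the output depends on the platform and Python version.

```python
import multiprocessing

# The start method multiprocessing uses when none is requested explicitly.
default_method = multiprocessing.get_start_method()

# Every start method supported on this platform.
available_methods = multiprocessing.get_all_start_methods()

print(f"default start method: {default_method}")
print(f"available start methods: {available_methods}")
```

If the default is reported as fork, child processes start out as a copy of the parent's memory, including any asyncio state the parent had set up.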

The fix

To fix this, we need to tell multiprocessing (MP) to start processes via spawn instead of fork. Thankfully this can be done via MP contexts, as shown in this PR.
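
For readers unfamiliar with MP contexts, the general pattern looks roughly like the sketch below. `make_spawned_process` is a hypothetical helper, and `print` stands in for the real executor entry point; the actual change is in this PR's diff.

```python
import multiprocessing

# A context pins the start method for processes created through it,
# without changing the global default for the rest of the application.
spawn_context = multiprocessing.get_context("spawn")


def make_spawned_process(target, *args):
    """Create (but do not start) a process that will be spawned, not forked."""
    return spawn_context.Process(target=target, args=args)


if __name__ == "__main__":
    # With spawn, the child imports the main module again, so process
    # creation belongs behind this guard.
    process = make_spawned_process(print, "hello from a spawned process")
    process.start()
    process.join()
    print(f"child exit code: {process.exitcode}")
```

Because the context is local, other uses of multiprocessing (such as the Downloader) keep the platform default.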

I also see that we are running the Downloader in a new process via MP. However, I did not implement the same changes there for 3 reasons:

- The default Downloader (provided by the default JobFilesManager) works on Linux without any further changes.
- Since the Downloader's interface only has synchronous methods, an implementer would have to go out of their way to call asyncio.get_event_loop() in the implementation for the same bug to appear.
- Using spawn instead of fork does have a significant performance drawback, so unless there is a reason to do otherwise, we should stick with MP's default behavior.

Next steps

I filed a new issue to add unit test coverage: #453

dlqqq · 2023-10-30T21:43:54Z

The E2E snapshot failure does not appear related. New jobs are showing up as "Stopped" instead of "In progress" on first render. I am able to reproduce this on my Linux machine.

Since this is a new and legitimate issue, it will be best to address this in a separate PR. I've filed an issue to track this: #451

dlqqq · 2023-10-30T21:47:29Z

please update playwright snapshots

dlqqq · 2023-10-30T21:55:56Z

Great, now the snapshot updating workflow doesn't work because of this: actions/runner-images#8531

🤦 I'll update the branch and try again.

dlqqq · 2023-10-30T21:57:46Z

please update playwright snapshots

dlqqq · 2023-10-30T21:59:51Z

Oh right, workflows always are run as defined on main rather than as defined on the current branch. So I have to do this manually from my fork. 🤦

dlqqq · 2023-10-30T22:13:20Z

Great, now my fork isn't letting me create a new PR to allow me to run the update snapshots workflow manually because this one already exists. I must have rolled a 1d20 this morning on my luck stat. 🤦

OK, fine, I'll update the workflow in a separate PR and remove the commit from this branch.

3coins

Great job on fixing this 🚀 .

dlqqq · 2023-10-30T22:37:31Z

Kicking CI. Not sure why the workflows aren't running after the snapshot update.

dlqqq added the bug label Oct 30, 2023

dlqqq requested a review from 3coins October 30, 2023 21:25

dlqqq force-pushed the fix-event-loop-bug branch from 0157beb to 9f5dc56 October 30, 2023 21:47

dlqqq mentioned this pull request Oct 30, 2023

Migrate from hub to gh in workflows #452

Merged

fix event loop is already running bug on Linux

0b84cb3

dlqqq force-pushed the fix-event-loop-bug branch from 6434736 to 0b84cb3 October 30, 2023 22:20

3coins approved these changes Oct 30, 2023

Update Playwright Snapshots

9db1f45

dlqqq closed this Oct 30, 2023

dlqqq reopened this Oct 30, 2023

dlqqq mentioned this pull request Oct 30, 2023

[1.x] Fix "event loop is already running" bug on Linux #454

Merged

dlqqq added and removed the enhancement label Oct 30, 2023

dlqqq merged commit 867d2ea into jupyter-server:main Oct 30, 2023
7 of 8 checks passed

dlqqq deleted the fix-event-loop-bug branch October 30, 2023 22:50

dlqqq mentioned this pull request Oct 30, 2023

RuntimeError: "This event loop is already running" with nbclient==0.8.0 #435

Closed

dlqqq mentioned this pull request Apr 23, 2024

Auto-download files from the staging directory to output #500

Draft
