
Slower atexit methods do not run to completion #1422

Open
parantapa opened this issue May 8, 2024 · 6 comments
Description

Cleanup functions registered with atexit that take more than a few seconds do not run to completion when the kernel is restarted.

Reproduce

  1. Create a new IPython notebook in Jupyter Lab.
  2. Create and execute a new cell with the following code:

     import atexit
     import time
     from pathlib import Path

     def write_hello():
         time.sleep(5)
         Path("hello.txt").write_text("hello")

     atexit.register(write_hello)

  3. Click on "Restart kernel".
  4. Observe that the file "hello.txt" is not created.

Expected behavior

The file "hello.txt" should have been created after the kernel restarted.

Context

  • Operating System and version: ArchLinux
  • Browser and version: Firefox Developer version 125.0b6
  • JupyterLab version: 4.1.6
  • Jupyter server version: 2.14.0

Additional info

JupyterLab shows a TimeoutError on the console:

[E 2024-05-02 10:50:31.388 ServerApp] Uncaught exception GET /api/kernels/28d27336-01eb-4fed-9b24-e5b4a1ae8ef4/channels?session_id=3fa3bd68-c341-4029-b71b-92c68bdc3ba6 (127.0.0.1)
    HTTPServerRequest(protocol='http', host='localhost:8888', method='GET', uri='/api/kernels/28d27336-01eb-4fed-9b24-e5b4a1ae8ef4/channels?session_id=3fa3bd68-c341-4029-b71b-92c68bdc3ba6', version='HTTP/1.1', remote_ip='127.0.0.1')
    Traceback (most recent call last):
      File "/home/parantapa/miniconda3/envs/notebook_env/lib/python3.11/site-packages/tornado/web.py", line 1790, in _execute
        result = await result
                 ^^^^^^^^^^^^
      File "/home/parantapa/miniconda3/envs/notebook_env/lib/python3.11/site-packages/jupyter_server/services/kernels/websocket.py", line 65, in get
        await self.pre_get()
      File "/home/parantapa/miniconda3/envs/notebook_env/lib/python3.11/site-packages/jupyter_server/services/kernels/websocket.py", line 59, in pre_get
        await self.connection.prepare()
      File "/home/parantapa/miniconda3/envs/notebook_env/lib/python3.11/site-packages/jupyter_server/services/kernels/connection/channels.py", line 318, in prepare
        raise TimeoutError(msg)
    TimeoutError: Kernel never reached an 'alive' state.

This error is gone when using a larger kernel_info_timeout.
However, the atexit method still does not run to completion.

The above code works as expected when run from IPython directly in a shell.
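The difference between the two environments can be demonstrated outside Jupyter entirely: atexit handlers run on a clean interpreter exit, but are bypassed when the process is killed with a hard signal. The sketch below (my own illustrative code, assuming a POSIX platform; the file and function names are made up) shows a child process whose atexit handler writes a marker file, compared under a clean exit and under SIGKILL:

```python
# Demonstrate that atexit handlers run on clean exit but not on SIGKILL.
# Assumption: POSIX platform (signal.SIGKILL is unavailable on Windows).
import os
import signal
import subprocess
import sys
import tempfile
import textwrap
import time

CHILD = textwrap.dedent("""
    import atexit, sys, time
    atexit.register(lambda: open(sys.argv[1], "w").write("cleaned up"))
    if sys.argv[2] == "clean":
        sys.exit(0)          # normal exit: atexit handlers run
    time.sleep(60)           # otherwise, wait around to be killed
""")

def child_writes_marker(mode):
    """Run the child in `mode` ("clean" or "kill"); report if the marker file exists."""
    with tempfile.TemporaryDirectory() as d:
        marker = os.path.join(d, "hello.txt")
        proc = subprocess.Popen([sys.executable, "-c", CHILD, marker, mode])
        if mode == "kill":
            time.sleep(1)                     # give the child time to register its handler
            proc.send_signal(signal.SIGKILL)  # hard kill: atexit is bypassed
        proc.wait()
        return os.path.exists(marker)

print(child_writes_marker("clean"))  # True: the handler ran on clean exit
print(child_writes_marker("kill"))   # False: SIGKILL skipped the handler
```

If the kernel process is force-killed after a timeout rather than allowed to exit cleanly, it would behave like the "kill" case above.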

I have been trying to find the spot in JupyterLab, Jupyter Server, and IPython that is responsible for killing the kernel, but since I am not familiar enough with the codebases, I haven't figured it out.

I had posted this issue on JupyterLab jupyterlab/jupyterlab#16276 (comment)
and was told this might be a better place to report the issue.

@parantapa parantapa added the bug label May 8, 2024

welcome bot commented May 8, 2024

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉


@bluescarni

I think I am seeing this issue as well.

I have written a library that generates large amounts of binary data, which is written to temporary files on disk and then memory-mapped. In order to guarantee that these files are properly removed at shutdown, I have registered atexit functions to perform the cleanup. These functions can take a few seconds to run because of the latency of memory-unmapping the big files and removing them.

Occasionally, some files are not removed on shutdown, and they persist on disk. I have observed this behaviour only when using jupyterlab, never when using just the Python interpreter to run my code/scripts.

The fact that some files are deleted and some are not strongly suggests to me that the cleanup functions are being interrupted by a signal while they are running.

@minrk
Contributor

minrk commented Nov 22, 2024

This is likely KernelManager.shutdown_wait_time, which defaults to 5 seconds (which is probably too short for a default).

You can set it in ~/.jupyter/jupyter_server_config.py:

# give kernels 30 seconds to shutdown cleanly before we terminate them
c.KernelManager.shutdown_wait_time = 30
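For a one-off run, the same traitlets option can, as far as I know, also be passed on the command line (worth verifying against your jupyter_server version, since recognized class names can vary between releases):

```shell
# One-off equivalent of the config-file setting above
jupyter lab --KernelManager.shutdown_wait_time=30.0
```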

@bluescarni

> This is likely KernelManager.shutdown_wait_time, which defaults to 5 seconds (which is probably too short for a default).
>
> You can set in ~/.jupyter/jupyter_server_config.py:
>
>     # give kernels 30 seconds to shutdown cleanly before we terminate them
>     c.KernelManager.shutdown_wait_time = 30

Hi @minrk and thanks for the reply!

This will probably fix the issue, but I was wondering - wouldn't it be better (if possible) to ask the user whether or not to forcibly stop the kernel rather than just silently killing it?

I am thinking of something like the UI in window managers when an application fails to close immediately after the X button is pressed.
