Add JEP for sub-shells #91

Merged
merged 7 commits into from
Sep 9, 2024
Changes from 2 commits
144 changes: 144 additions & 0 deletions kernel-subshells/kernel-subshells.md
@@ -0,0 +1,144 @@
---
title: Jupyter kernel sub-shells
authors: David Brochart (@davidbrochart), Sylvain Corlay (@SylvainCorlay), Johan Mabille (@JohanMabille)
issue-number: XX
pr-number: XX
date-started: 2022-12-15
---

# Summary

This JEP introduces kernel sub-shells to allow for concurrent shell requests. This is made possible
by defining new control channel messages, as well as a new shell ID field in shell messages.

# Motivation

Users have been asking for ways to interact with a kernel while it is busy executing CPU-bound code,
for the following reasons:
- inspect the kernel's state to check the progress or debug a long-running computation (e.g.
through a variable explorer).
- visualize intermediary results before the final result is computed.
- request [completion](https://jupyter-client.readthedocs.io/en/stable/messaging.html#completion) or
[introspection](https://jupyter-client.readthedocs.io/en/stable/messaging.html#introspection).
- process
[Comm messages](https://jupyter-client.readthedocs.io/en/stable/messaging.html#custom-messages)
immediately (e.g. for widgets).

Unfortunately, it is currently not possible to do so because the kernel cannot process other
[shell requests](https://jupyter-client.readthedocs.io/en/stable/messaging.html#messages-on-the-shell-router-dealer-channel)
until it is idle. The goal of this JEP is to offer a way to process shell requests concurrently.

# Proposed Enhancement

The [kernel protocol](https://jupyter-client.readthedocs.io/en/stable/messaging.html) only allows
for one
[shell channel](https://jupyter-client.readthedocs.io/en/stable/messaging.html#messages-on-the-shell-router-dealer-channel)
where execution requests are queued. Accepting other shells would allow users to connect to a kernel
and submit execution requests that would be processed in parallel.

We propose to allow the creation of optional "sub-shells", in addition to the current "main shell".
This will be made possible by adding new message types to the
[control channel](https://jupyter-client.readthedocs.io/en/stable/messaging.html#messages-on-the-control-router-dealer-channel)
for:
- creating a sub-shell,
- deleting a sub-shell,
- listing existing sub-shells.

A sub-shell should be identified with a shell ID, either provided by the client in the sub-shell

I would say the kernel always generates the shell id and we don't support the client providing an id. Once you have clients providing ids, then it's always a guessing game if there is contention between clients, or you have clients generate UUIDs, at which point you might as well have the kernel generate a truly unique id.

Member Author

I was thinking that if in the future we allow per-cell sub-shells (through e.g. cell metadata), it could open up possibilities such that a cell creates a sub-shell, and other cells run in this sub-shell, so they would need the shell ID. We could build complex asynchronous systems like that.
akernel can do something similar but programmatically: `__task__()` returns a handle to the previous cell task, so the next cell can do whatever it wants with it (await it, etc.).

Member

If the client specifies a subshell id, it will need to wait until it is confirmed in the reply to be sure it has reserved that name. In that case, why not just get the subshell id from the reply message, and be guaranteed it didn't fail because of a name conflict? What does having the client give the subshell id do for us?

Member Author

I thought that it allowed us to reuse it later, at least in the case of a self-contained notebook where we know there is no shell ID conflict.

Member

@jasongrout Jan 10, 2023

A notebook might be opened with two copies, in which case each copy would want to start up a subshell with the same name? For example, either a notebook in real-time collaboration, or a notebook opened twice side by side in JLab?

Or perhaps if you try to create a subshell with an existing id, it just acknowledges that the subshell is already created, with no error? Multiple clients might send computations to the same subshell?

What if we treat it like we do kernel sessions right now, with a user-supplied name as a key? In other words, a client subshell creation request optionally gives a name (not an id). If a subshell with that name already exists, its id is returned. If it doesn't exist, a new subshell with that name is created and returned. And if a name is not given by the client, an unnamed subshell is created and returned. Thoughts? This gives you the ability to share subshells between clients addressable with some client-supplied string, but gives me always unique ids created by the resource manager.

Member Author

> Or perhaps if you try to create a subshell with an existing id, it just acknowledges that the subshell is already created, with no error?

I like that. It seems that there is no distinction between a sub-shell name and a sub-shell ID in this case.

> What if we treat it like we do kernel sessions right now, with a user-supplied name as a key?

In that case there seems to be an unnecessary mapping between sub-shell name and sub-shell ID, or am I missing something?

Member

There is a technical difference. Since client names and shell ids live in different namespaces, the kernel does not have to check a generated shell id for conflicts with names already given, and a client can request a shell that no one else will ever request (i.e., a shell specific to itself, guaranteed not to collide with any other requested name).

For example, suppose the shell ids are generated by starting at 0 and incrementing for each new subshell. If the client asks for shell name 5, and the client name and shell id namespaces are conflated, the client won't know if it's getting some random subshell someone else created (i.e., shell id 5), or if it's getting a shell with specific meaning "5" (i.e., client name 5). Likewise, any time a new shell id is generated, the kernel would have to check to see if someone else had already claimed that number.

I think it's a much better design to keep the client-specific names in a separate namespace from the unique shell ids. With this design, any client asking for a shell named "autocomplete" gets the same autocomplete shell shared with every other client requesting the same subshell name. However, if you want to get your own subshell separate from any other client, you just request a subshell without a name.
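
To make the two-namespace idea in this thread concrete, here is a minimal sketch. It assumes the kernel generates IDs from a counter (as in the "starting at 0" example above) and treats client names purely as lookup keys. The class `NamedSubshells` and its `create` method are hypothetical names used only for illustration; nothing here is part of the proposal text.

```py
import itertools


class NamedSubshells:
    """Hypothetical registry: kernel-generated IDs, optional client names."""

    def __init__(self):
        self._next_id = itertools.count()  # assumption: IDs are 0, 1, 2, ...
        self._id_by_name = {}              # client namespace: name -> shell ID

    def create(self, name=None):
        """Return the shell ID for `name`, creating a sub-shell if needed."""
        # A known name returns the sub-shell already registered under it.
        if name is not None and name in self._id_by_name:
            return self._id_by_name[name]
        # The ID always comes from the kernel, never from the client.
        shell_id = str(next(self._next_id))
        if name is not None:
            self._id_by_name[name] = shell_id
        return shell_id


shells = NamedSubshells()
first = shells.create("autocomplete")   # creates a shared, named sub-shell
second = shells.create("autocomplete")  # any client asking again gets the same one
private = shells.create()               # unnamed: private to the requester
named_one = shells.create("1")          # the name "1" cannot collide with ID "1"
```

Because names and IDs are never compared with each other, the request for the name `"1"` gets a fresh sub-shell rather than the unnamed one whose ID happens to be `"1"`.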

creation request, or given by the kernel in the sub-shell creation reply. The shell ID of the
targeted sub-shell must then be sent along with any shell message. This allows any other client
(console, notebook, etc.) to use this sub-shell. If no shell ID is sent, the message targets the
main shell. Sub-shells are thus multiplexed on the shell channel through the shell ID, and it is the
responsibility of the kernel to route the messages to the target sub-shell according to the shell
ID.
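
The routing rule above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the text does not say where the shell ID lives in a message, so the sketch assumes a `shell_id` key in the message header. The names `route_shell_message`, `MAIN_SHELL`, and `shell_queues` are hypothetical.

```py
import queue

# Hypothetical key for the main shell, which has no shell ID.
MAIN_SHELL = None


def route_shell_message(msg, shell_queues):
    """Put a shell message on the queue of the shell it targets.

    Assumption: the shell ID travels in the header under 'shell_id'.
    A message without a shell ID targets the main shell.
    """
    shell_id = msg.get("header", {}).get("shell_id", MAIN_SHELL)
    if shell_id not in shell_queues:
        # How an unknown shell ID is handled is not specified;
        # raising is a choice made for this sketch only.
        raise KeyError(f"unknown sub-shell: {shell_id!r}")
    shell_queues[shell_id].put(msg)
    return shell_id


shell_queues = {MAIN_SHELL: queue.Queue(), "subshell-a": queue.Queue()}

# No shell ID: the request goes to the main shell.
route_shell_message({"header": {"msg_type": "execute_request"}}, shell_queues)

# With a shell ID: the request goes to that sub-shell.
route_shell_message(
    {"header": {"msg_type": "complete_request", "shell_id": "subshell-a"}},
    shell_queues,
)
```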

Essentially, a client connecting through a sub-shell should see no difference from a connection
through the main shell, and it does not need to be aware that it is using a sub-shell. However, a
front-end should provide some visual indication that the execution mode offered by the sub-shell
is used at the user's own risk. In particular, because sub-shells may be implemented with threads,
it is the responsibility of users not to corrupt the kernel state with non-thread-safe code.

# New control channel messages

## Create sub-shell

Message type: `create_subshell_request`:

```py
content = {
    # Optional, the ID of the sub-shell if specified by the client.
    'shell_id': str
}
```

Message type: `create_subshell_reply`:

```py
content = {
    # 'ok' if the request succeeded or 'error', with error information as in all other replies.
    'status': 'ok',

    # The ID of the sub-shell, same as in the request if specified by the client, given by the
    # kernel otherwise.
    'shell_id': str
Member

Do you envision sub-shells having any properties or inputs? Or are they all by definition identical for a given kernel (at least to start)?

Member Author

The only setting I can think of, since the difference between sub-shells and the main shell is that they run concurrently, would be to specify which concurrency "backend" should be used: a thread or a process or asynchronous programming.
But I think it would lead to too much complexity and threading will always be used anyway.

}
```
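
As an illustration, a client might build the request as sketched below. The envelope (`header`, `parent_header`, `metadata`, `content`) follows the general Jupyter message format. The helper `make_control_message`, the user name, and the protocol version string are assumptions for this example; in practice a client library would build the envelope.

```py
import uuid
from datetime import datetime, timezone


def make_control_message(msg_type, content, session_id, username="user"):
    """Build a dict shaped like a Jupyter protocol message (hypothetical helper)."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session_id,
            "username": username,
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": msg_type,
            "version": "5.3",  # assumption: placeholder protocol version
        },
        "parent_header": {},
        "metadata": {},
        "content": content,
    }


session_id = uuid.uuid4().hex

# Let the kernel choose the shell ID: 'shell_id' is optional, so send no content.
request = make_control_message("create_subshell_request", {}, session_id)

# Or propose an ID, which the request content above allows.
named_request = make_control_message(
    "create_subshell_request", {"shell_id": "my-subshell"}, session_id
)
```

In both cases the client should read `shell_id` from the `create_subshell_reply` content before using the sub-shell.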

## Delete sub-shell

Message type: `delete_subshell_request`:

```py
content = {
    # The ID of the sub-shell.
    'shell_id': str
}
```

Message type: `delete_subshell_reply`:

```py
content = {
    # 'ok' if the request succeeded or 'error', with error information as in all other replies.
    'status': 'ok',
}
```

## List sub-shells

Message type: `list_subshell_request`: no content.

Message type: `list_subshell_reply`:

```py
content = {
    # A list of sub-shell IDs.
    'shell_id': [str]
}
```
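
The three control messages could be served by kernel-side bookkeeping along the lines of the sketch below. It is only one possible reading: the class `SubshellRegistry`, the use of UUIDs for kernel-generated IDs, the error names, and the decision to answer a duplicate or unknown ID with an error reply are all assumptions that the text above does not settle.

```py
import uuid


def _error(ename, evalue):
    """Error reply content in the usual Jupyter shape (names are hypothetical)."""
    return {"status": "error", "ename": ename, "evalue": evalue, "traceback": []}


class SubshellRegistry:
    """Hypothetical bookkeeping of the sub-shell IDs known to a kernel."""

    def __init__(self):
        self._shell_ids = set()

    def handle(self, msg_type, content):
        """Return the reply content for one control request."""
        if msg_type == "create_subshell_request":
            # Use the client's ID if given, otherwise generate one.
            shell_id = content.get("shell_id") or uuid.uuid4().hex
            if shell_id in self._shell_ids:
                return _error("SubshellExists", shell_id)  # assumption
            self._shell_ids.add(shell_id)
            return {"status": "ok", "shell_id": shell_id}
        if msg_type == "delete_subshell_request":
            shell_id = content["shell_id"]
            if shell_id not in self._shell_ids:
                return _error("SubshellNotFound", shell_id)  # assumption
            self._shell_ids.remove(shell_id)
            return {"status": "ok"}
        if msg_type == "list_subshell_request":
            return {"shell_id": sorted(self._shell_ids)}
        raise ValueError(f"not a sub-shell control message: {msg_type}")


registry = SubshellRegistry()
created = registry.handle("create_subshell_request", {"shell_id": "a"})
listed = registry.handle("list_subshell_request", {})
deleted = registry.handle("delete_subshell_request", {"shell_id": "a"})
```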

# Behavior

## Kernels not supporting sub-shells

The following requests should be ignored: `create_subshell_request`, `delete_subshell_request` and
`list_subshell_request`. A `shell_id` passed in any shell message should be ignored. This ensures
that existing kernels don't need any change to be compatible with the kernel protocol changes
required by this JEP.

This means that all shell messages are processed in the main shell, i.e. sequentially.

Since sub-shells are basically a "no-op", the behavior around
[kernel restart](https://jupyter-client.readthedocs.io/en/stable/messaging.html#kernel-shutdown) and
[kernel interrupt](https://jupyter-client.readthedocs.io/en/stable/messaging.html#kernel-interrupt)
is unchanged.

## Kernels supporting sub-shells

A sub-shell request may be processed concurrently with other shells. Within a sub-shell, requests
are processed sequentially.
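
A minimal sketch of these semantics, assuming a thread-based implementation with one worker thread and one queue per shell. `ShellWorker` and `handler` are hypothetical names, and the "wait" request merely stands in for a long-running computation on the main shell.

```py
import queue
import threading


class ShellWorker:
    """One thread draining one queue, so requests in a shell run in order."""

    def __init__(self, name, handler):
        self.name = name
        self._handler = handler
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, request):
        self._requests.put(request)

    def stop(self):
        self._requests.put(None)  # sentinel: no more requests
        self._thread.join()

    def _run(self):
        while True:
            request = self._requests.get()
            if request is None:
                break
            self._handler(self.name, request)


done = threading.Event()
log = []  # records requests in completion order
log_lock = threading.Lock()


def handler(shell_name, request):
    if request == "wait":
        done.wait(timeout=5)  # blocks this shell only
    with log_lock:
        log.append((shell_name, request))
    if request == "signal":
        done.set()


main = ShellWorker("main", handler)
sub = ShellWorker("subshell-a", handler)
main.submit("wait")    # keeps the main shell busy
main.submit("after")   # queued behind "wait" on the main shell
sub.submit("signal")   # processed while the main shell is still busy
main.stop()
sub.stop()
```

The sub-shell request completes first even though the main shell is blocked, while `"after"` still runs only once `"wait"` has finished.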

A [kernel restart](https://jupyter-client.readthedocs.io/en/stable/messaging.html#kernel-shutdown)
should delete all sub-shells. A
[kernel interrupt](https://jupyter-client.readthedocs.io/en/stable/messaging.html#kernel-interrupt)
should interrupt the main shell and all sub-shells.

I was thinking that perhaps an interrupt request could give a subshell id to interrupt only that subshell. However, if we want to be backwards compatible, we have to interrupt all shells: if all subshell requests are processed in the main shell, then interrupting the kernel will currently interrupt all shells.

Member Author

True. Maybe we could also say that a kernel should do its best at interrupting only the requested sub-shell, but that it may interrupt all shells?

Member

> True. Maybe we could also say that a kernel should do its best at interrupting only the requested sub-shell, but that it may interrupt all shells?

That sounds too unpredictable to me. I think if we want subshell-specific interrupt, we need another message so we can be backwards compatible and predictable.