Add JEP for sub-shells #91

Merged on Sep 9, 2024 (7 commits)
Changes from 1 commit
111 changes: 111 additions & 0 deletions kernel-subshells/kernel-subshells.md
@@ -0,0 +1,111 @@
---
title: Jupyter kernel sub-shells
authors: David Brochart (@davidbrochart), Sylvain Corlay (@SylvainCorlay), Johan Mabille (@JohanMabille)
issue-number: XX
pr-number: XX
date-started: 2022-12-15
---

# Summary

This JEP introduces kernel sub-shells to allow for concurrent code execution. This is made possible
by defining new control channel messages, as well as a new shell ID field in shell channel messages.

# Motivation

Users have been asking for ways to interact with a kernel while it is busy executing CPU-bound code,
for the following reasons:
- inspect the kernel's state to check the progress or debug a long-running computation.
- visualize some intermediary result before the final result is computed.

Unfortunately, it is currently not possible to do so because the kernel cannot process other
[execution requests](https://jupyter-client.readthedocs.io/en/stable/messaging.html#execute) until
it is idle. The goal of this JEP is to offer a way to run code concurrently.

# Proposed Enhancement

The [kernel protocol](https://jupyter-client.readthedocs.io/en/stable/messaging.html) only allows
for one
[shell channel](https://jupyter-client.readthedocs.io/en/stable/messaging.html#messages-on-the-shell-router-dealer-channel)
where execution requests are queued. Accepting other shells would allow users to connect to a kernel
and submit execution requests that would be processed in parallel.

We propose to allow the creation of optional "sub-shells", in addition to the current "main shell".
This will be made possible by adding new message types to the
[control channel](https://jupyter-client.readthedocs.io/en/stable/messaging.html#messages-on-the-control-router-dealer-channel)
for:
- creating a sub-shell,
- deleting a sub-shell,
- listing existing sub-shells.

A sub-shell should be advertised to the client with a shell ID, which must be sent along with
further messages on the shell channel in order to target a sub-shell. This allows any other client
(console, notebook, etc.) to use this sub-shell. If no shell ID is sent, the message targets the
main shell. Sub-shells are thus multiplexed on the shell channel through the shell ID, and it is the
responsibility of the kernel to route the messages to the target sub-shell according to the shell
ID.
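
The multiplexing described above can be sketched as follows. This is only an illustration, not part of the proposal: it assumes the shell ID travels in a `shell_id` key of the message header (this revision of the JEP does not say where the field lives), and `MAIN_SHELL`, `target_shell` and `route` are hypothetical names.

```python
from collections import defaultdict

# Hypothetical sentinel: messages that carry no shell ID target the main shell.
MAIN_SHELL = None


def target_shell(msg):
    """Return the shell ID carried by a message, or MAIN_SHELL if there is none."""
    # Assumption: the shell ID is stored in the header under the key "shell_id".
    return msg.get("header", {}).get("shell_id", MAIN_SHELL)


def route(messages):
    """Group shell-channel messages by target shell, preserving arrival order."""
    queues = defaultdict(list)
    for msg in messages:
        queues[target_shell(msg)].append(msg)
    return dict(queues)


incoming = [
    {"header": {"msg_id": "a", "msg_type": "execute_request"}},
    {"header": {"msg_id": "b", "msg_type": "execute_request", "shell_id": "sub-1"}},
    {"header": {"msg_id": "c", "msg_type": "execute_request"}},
]
queues = route(incoming)
```

Here messages `a` and `c` end up queued for the main shell and `b` for the sub-shell `sub-1`; a real kernel would hand each queue to its own execution context.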

Essentially, a client connecting through a sub-shell should see no difference from a connection
through the main shell, and it does not need to be aware of it. However, a front-end should provide
some visual indication that the execution mode offered by sub-shells is used at the user's own
risk. In particular, because sub-shells may be implemented with threads, it is the responsibility
of users not to corrupt the kernel state with non-thread-safe instructions.

# New control channel messages

## Create sub-shell

Message type: `create_subshell_request`: no content.
Member Author

Maybe the user should be able to provide a shell ID in the content, in case they want to reuse it later, instead of getting the shell ID in the reply.
If the provided shell ID already exists, an error would be sent in the reply.

Member

I think in general, the entity creating the resource should get to decide the id. Otherwise it turns into a guessing game picking an unused id.

So +1 to the kernel answering back with the subshell id.


Message type: `create_subshell_reply`:

```py
content = {
    # 'ok' if the request succeeded or 'error', with error information as in all other replies.
    'status': 'ok',

    # The ID of the sub-shell.
    'shell_id': str
}
```

Member

Do you envision sub-shells having any properties or inputs? Or are they all by definition identical for a given kernel (at least to start)?

Member Author

The only setting I can think of, since the difference between sub-shells and the main shell is that they run concurrently, would be to specify which concurrency "backend" should be used: a thread, a process, or asynchronous programming.
But I think it would lead to too much complexity and threading will always be used anyway.

## Delete sub-shell

Message type: `delete_subshell_request`:

```py
content = {
    # The ID of the sub-shell.
    'shell_id': str
}
```

Message type: `delete_subshell_reply`:

```py
content = {
    # 'ok' if the request succeeded or 'error', with error information as in all other replies.
    'status': 'ok',
}
```

## List sub-shells

Message type: `list_subshell_request`: no content.

Message type: `list_subshell_reply`:

```py
content = {
    # A list of sub-shell IDs.
    'shell_id': [str]
}
```
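
As an illustration of how a client might build the three requests above, here is a minimal sketch using plain dictionaries in the general Jupyter message layout (`header`, `parent_header`, `metadata`, `content`). The helper `make_control_message`, the session and user names, the protocol version string and the ID `sub-1` are all placeholders, not part of the proposal.

```python
import uuid
from datetime import datetime, timezone


def make_control_message(msg_type, content=None, session="example-session"):
    """Build a message dict following the general Jupyter message layout."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session,
            "username": "example-user",
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": msg_type,
            # Placeholder: the JEP does not state which protocol version adds sub-shells.
            "version": "5.3",
        },
        "parent_header": {},
        "metadata": {},
        "content": content or {},
    }


# Requests without content, as specified above.
create = make_control_message("create_subshell_request")
listing = make_control_message("list_subshell_request")
# "sub-1" stands in for an ID the kernel would return in create_subshell_reply.
delete = make_control_message("delete_subshell_request", {"shell_id": "sub-1"})
```

These dictionaries would then be serialized and sent on the control channel by whatever client library is in use; that step is omitted here.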

# Points of discussion

The question of sub-shell ownership and life cycle is open, in particular:
- Is a sub-shell responsible for deleting itself, or can a shell delete other sub-shells?
Member

I think it's not yet specified exactly where the shell id is passed in the shell messages. I imagine a shell_id field in the header (both request to route and response to identify) should suffice. I don't think it should be added to content.

Member Author

I agree that the shell ID should be passed in the header of request messages.
Should it be copied in the header of the reply or is it enough for it to be present in the parent header (since it's included in the reply)?

Member

+1 to header, especially since many shell message types may use subshells, such as comm messages and autocomplete or inspect requests.

I think it's fine for it to be in the parent_header of the reply.

Member

> I think it's fine for it to be in the parent_header of the reply.

Actually, given that a frontend will likely have to route busy/idle messages, output messages, etc. based on their shell id, does it make sense to have that shell id in the header of any messages coming from the subshell on the iopub channel, not just shell reply messages?

Member Author

It should already be in the parent header, which is also included in the messages coming from the subshell on the iopub channel, right?

- Can a sub-shell create other sub-shells?
- Do sub-shells have the same rights as the main shell? For instance, should they be allowed to
shut down or restart the kernel?
Member

@jasongrout jasongrout Jan 7, 2023

Also, semantics of subshells for kernels that have not implemented them, i.e., using a subshell does not guarantee computations will run concurrently. Really, the only guarantee is that a specific subshell's computations will be run in order (and for a kernel not implementing subshells, this is done by just serializing all shell messages on to the main shell thread).

Also, what happens if you specify a subshell that does not exist on a kernel that supports subshells?

Is there any change to busy/idle message semantics?

Member

What are the semantics of subshells around interrupt and restart messages, or more generally kernel restarts?

Member Author

Thanks @jasongrout for the questions, I have opinions but we should discuss these points.

> Also, semantics of subshells for kernels that have not implemented them

Maybe the create_subshell_reply should return an error, and the shell_id field of any message should be ignored?

> Also, what happens if you specify a subshell that does not exist on a kernel that supports subshells?

Again, I think the reply should be an error.

> Is there any change to busy/idle message semantics?

I'm tempted to say that this is a front-end issue. It has all the information about which shell is busy/idle, and it should decide how to translate that information to the user: an OR of all the busy signals, or only select the main shell busy signal.

> What are the semantics of subshells around interrupt and restart messages, or more generally kernel restarts?

This almost seems to be orthogonal, as it's about the control channel, which this JEP doesn't change.

Member

> This almost seems to be orthogonal, as it's about the control channel, which this JEP doesn't change.

I meant more: what is the lifecycle of subshells around kernel restarts. I think it makes sense for subshells to be terminated, but I also think that should be specified.

If I interrupt the kernel, does it interrupt all shells or just the main shell, or can I selectively interrupt a subshell?

Member Author

I also think that restarting a kernel should terminate subshells, and that interrupting the kernel should interrupt all shells, otherwise it could quickly become a mess. And yes, we should specify it 👍

Member

(Starting a new thread for separate conversations)

> > Also, semantics of subshells for kernels that have not implemented them
>
> Maybe the create_subshell_reply should return an error, and the shell_id field of any message should be ignored?
>
> > Also, what happens if you specify a subshell that does not exist on a kernel that supports subshells?
>
> Again, I think the reply should be an error.

Currently, in ipykernel, do you know what happens if you send an unrecognized control message? Does it reply with an error, or does it ignore the message?

I think if a shell message is sent with an unknown subshell id, a reasonable response is to ignore the subshell id and schedule the computation on the main shell. That would be backwards compatible with kernels that do not recognize subshell id info, and still guarantees that subshell requests are executed in order.

In other words, I think it is reasonable that giving subshell info does not guarantee concurrency with other messages. It only guarantees that subshell messages will be processed in order, and it is an optimization/implementation detail that messages on one subshell can be processed concurrently with another subshell's messages.
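
A minimal sketch of the fallback suggested in this comment, for illustration only: an unknown or missing shell ID resolves to the main shell, so a kernel with no sub-shell support (an empty set of known sub-shells) simply serializes everything on the main shell. The function name `resolve_shell` and the use of `None` for the main shell are assumptions, not something the thread agreed on.

```python
def resolve_shell(shell_id, known_subshells):
    """Pick the shell that handles a message.

    Unknown or missing IDs fall back to the main shell (represented as None),
    which is also what a kernel without sub-shell support effectively does.
    """
    return shell_id if shell_id in known_subshells else None


# Kernel that supports sub-shells and has created "sub-1".
with_support = {"sub-1"}
# Kernel that does not implement sub-shells at all.
without_support = set()

targets = [
    resolve_shell("sub-1", with_support),     # existing sub-shell
    resolve_shell("sub-9", with_support),     # unknown ID, falls back to main shell
    resolve_shell("sub-1", without_support),  # no support, falls back to main shell
]
```

Because every message with a given shell ID always resolves to the same shell, in-order processing per sub-shell is preserved whether or not the kernel runs sub-shells concurrently.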

Member Author

> Currently, in ipykernel, do you know what happens if you send an unrecognized control message? Does it reply with an error, or does it ignore the message?

ipykernel ignores the message, but I don't think the kernel protocol specifies this behavior.

> I think if a shell message is sent with an unknown subshell id, a reasonable response is to ignore the subshell id and schedule the computation on the main shell.

I agree that a kernel that doesn't support subshells should ignore shell_ids, and process everything in the main shell, but should a kernel that supports subshells process messages with an unknown shell_id in the main shell, or reply with an error?

> In other words, I think it is reasonable that giving subshell info does not guarantee concurrency with other messages. It only guarantees that subshell messages will be processed in order, and it is an optimization/implementation detail that messages on one subshell can be processed concurrently with another subshell's messages.

Agreed. Also, this JEP doesn't specify the concurrency backend. It will most likely use threads, but we can imagine that a kernel uses e.g. asyncio. In this case, concurrency would work only if cells are "collaborative", i.e. they await (a bit like in akernel).

Member

> ipykernel ignores the message, but I don't think the kernel protocol specifies this behavior.

Since the kernel_info reply carries with it the kernel protocol the kernel speaks, the client will also know what messages the kernel understands.

Member Author

Right. Then I'm wondering if we should even try to be backwards-compatible, since a client knowing that the kernel doesn't support sub-shells should not use them. Maybe we should remove this section?

Member

I think we should try to be backwards compatible. It will be a long hard process to get many shells to support this subshell id (see what @krassowski mentioned in #91 (comment)). It will be much simpler from the client perspective if you can write one codebase that works with both the current protocol and the new protocol with reasonable degradations for the current protocol.

Member

Though I do wonder if instead of incrementing the kernel protocol number, we should instead have the idea of kernel extensions, e.g., a kernel can advertise which messages it supports, and doesn't have to support other messages in previous kernel protocol versions. For example, I can see a kernel wanting to support subshells before it supports debugging, but there would be no way for a kernel to tell that to a client.

@minrk - what do you think about introducing a field in the kernel info reply for a kernel to advertise which messages it supports?

Member

> what do you think about introducing a field in the kernel info reply for a kernel to advertise which messages it supports?

I think if we're starting to increasingly implement additional features in the protocol that are optional for kernels, this makes sense. If we're doing that, is there a more general 'feature' list that can't be represented purely as support for a given message type (e.g. a subset of the capabilities of a given message type)?

I think that should probably be a separate proposal, though.

> I don't think the kernel protocol specifies this behavior.

I think we should probably define a response pattern for kernels to use for unrecognized/unsupported messages.

Member

> I think that should probably be a separate proposal, though.

+1

Member

> what do you think about introducing a field in the kernel info reply for a kernel to advertise which messages it supports?

We actually already do it for the debugger (a field indicates whether the kernel supports the debugger). I agree that generalizing it as a list would be a nice improvement. Working on a JEP.
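
To make the idea concrete, here is a rough sketch of how a client could check for a feature once such a list exists. It is purely hypothetical: the field name `supported_features` and the feature string `"subshells"` are invented for illustration, and the existing debugger flag is assumed to be a boolean named `debugger` in the kernel info reply content.

```python
def supports(kernel_info_content, feature):
    """Return True if the kernel info reply content advertises the given feature."""
    # Hypothetical generalized list of optional features.
    features = kernel_info_content.get("supported_features", [])
    if feature in features:
        return True
    # Fall back to the existing per-feature boolean for the debugger
    # (assumed here to be the "debugger" key).
    return feature == "debugger" and bool(kernel_info_content.get("debugger", False))


old_style = {"protocol_version": "5.3", "debugger": True}
new_style = {"protocol_version": "5.3", "supported_features": ["debugger", "subshells"]}

checks = (
    supports(old_style, "debugger"),
    supports(old_style, "subshells"),
    supports(new_style, "subshells"),
)
```

A client written this way could avoid sending sub-shell requests to kernels that do not advertise the feature, without depending on the protocol version number alone.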