Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async jupyter client #10

Merged
merged 30 commits into from
Feb 26, 2020
Merged

Async jupyter client #10

merged 30 commits into from
Feb 26, 2020

Conversation

davidbrochart
Copy link
Member

This is a follow-up of #6. It has to be used with jupyter/jupyter_client#506.

Copy link
Contributor

@MSeal MSeal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together! I need to do some more thorough testing, but there are a few things we need to figure out in the meanwhile.

A simple one is that #12 (and other PRs touching execute_cell) will conflict with this PR since it touches most public methods so it'll probably become stale as smaller PRs merge in. If we start something experimental and want more incremental PRs it might make sense for us to work out of a branch.

A larger concern that I'm not sure how to completely bridge is that we need to support python 3.5+ (probably can cut to 3.6+ later in the year). I haven't looked into what the best approach is to support async across such versions when one needs to support synchronous calls back in pre-async python verisons. Perhaps we could build a wheel that has different files imported depending on the python version, though that might be tricky. Another option is to release two modules, nbclient and nbclient-async independently. Open to suggestions on this topic.

loop = get_loop()
return loop.run_until_complete(self.async_run_cell(cell, cell_index, store_history))

async def poll_output_msg(self, parent_msg_id, cell, cell_index, timeout=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will need to do some more manual testing to ensure we correctly timeout, pass execution context, and handle errors cleanly. Some of those patterns are tested in the synchronous path with some assumptions that might not hold for the async version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to explain a bit more how I implemented the asynchronous functionality, there are now two tasks that are run in parallel:

  • _poll_for_reply: asynchronously awaits a reply on the shell channel (possibly with timeout).
  • poll_output_msg: asynchronously awaits a message on the IOPub channel for cell execution. We first await as long as no reply is received on the shell channel, and when a reply is received on the shell channel we cancel this task and launch it again with a timeout.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that helps!

@@ -270,6 +295,55 @@ def test_many_parallel_notebooks(capfd):
assert captured.err == ""


def test_async_parallel_notebooks(capfd, tmpdir):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Glad these tests are here to validate parallel execution

tox.ini Outdated
deps = .[dev]
commands = pytest -vv --maxfail=2 --cov=nbclient -W always {posargs}
commands =
/bin/bash -c 'python -m pip install -e git+https://[email protected]/davidbrochart/jupyter_client.git@async_client#egg=jupyter_client'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can put that in the requirements file to test as a git path without needing to toss it in the commands. That also makes the package install cleaner for local development and indicates the branch is dependent on an unreleased code dependency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to put it in the requirements file but never managed to make it work, that's why it ended up here (I had an error about pip not being able to parse it).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all good, there's a way to make it happy but we can instead just focus on getting async support merged and released in jupyter_client as a pre-req for this PR to be released.

@davidbrochart
Copy link
Member Author

Thanks for reviewing @MSeal. There are a couple of issues related to compatibility of async with older Python versions:

  • in setup_kernel (which is now async because it awaits start_new_kernel_client which awaits wait_for_ready which awaits get_msg...) there is a yield which leads to a SyntaxError: 'yield' inside async function in Python 3.5. Also in Python 3.5 and 3.6 asynccontextmanager doesn't exist in contextlib.
  • in the tests we now mock messages asynchronously, but AsyncMock only exists in unittest.mock from Python 3.8.

That being said, I think we should work on fixing these issues and not have a nbclient-async separate module.

@SylvainCorlay
Copy link
Member

Another option is to release two modules, nbclient and nbclient-async independently.

That seems a bit overkill to me!

@davidbrochart
Copy link
Member Author

Trigger CI.

@davidbrochart davidbrochart reopened this Feb 4, 2020
@davidbrochart
Copy link
Member Author

@MSeal I managed to get Python compatibility from 3.5 to 3.8, but it looks like the CI dies randomly for test_many_parallel_notebooks. I'm not sure if it's because of low CI resources or if it's a real issue.

@davidbrochart
Copy link
Member Author

Trigger CI.

@davidbrochart davidbrochart reopened this Feb 4, 2020
@MSeal
Copy link
Contributor

MSeal commented Feb 4, 2020

@MSeal I managed to get Python compatibility from 3.5 to 3.8, but it looks like the CI dies randomly for test_many_parallel_notebooks. I'm not sure if it's because of low CI resources or if it's a real issue.

I've seen python 3.5 fail randomly in nbconvert and now in nbclient where the kernel dies unexpectantly due to port conflicts. I believe that's related to ipython no longer releasing 3.5 versions (ipython==7.9.0 vs ipython==7.12.0) and there being some outstanding bugs in ipython 7.9. I think the parallel notebook tests should probably scope to 3.6+ given the ipython issues. However the

E       AssertionError: assert '[IPKernelApp] WARNING | Parent appears to have exited, shutting down.\n' == ''
E         - [IPKernelApp] WARNING | Parent appears to have exited, shutting down.

is new to me -- that looks like something different is happening.

)
exec_reply = await self._poll_for_reply(parent_msg_id, cell, exec_timeout)
if not task_poll_output_msg.done():
task_poll_output_msg.cancel()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never knew about cancel, can this mean a message can get lost?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, because messages are queued anyway in zmq. And we create a new task polling for messages just after that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But can it cancel, when it's off the zmq queue, say, it's in the json parser. How does it cancel like Ctrl-c.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can't do anything between two await, so it depends on the implementation. For instance, in zmq.asyncio.Socket.recv_multipart we might await zmq.asyncio.Socket.recv several times and so cancelling in the middle of it can lead to losing a message, you're right.

Copy link
Member Author

@davidbrochart davidbrochart Feb 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way that I tried out was to leave the first poll_output_msg task running (with no timeout) and launch another one with a timeout when _poll_for_reply is done (and if the first poll_output_msg task is not done), and then asyncio.wait(both_tasks, return_when=asyncio.FIRST_COMPLETED), but it doesn't handle exceptions well (there is a return_when=asyncio.FIRST_EXCEPTION for that).
I can't think of a better solution right now.

@SylvainCorlay
Copy link
Member

due to port conflicts

@MSeal there is an accurate description of that issue in jupyter/jupyter_client#487 by @JohanMabille.

Basically the whole handshake model of the kernel protocol is an antipattern. The client should not tell the kernel which ports to use. It should only open one socket for the kernel to communicate back the available ports for itself...

@davidbrochart
Copy link
Member Author

@maartenbreddels I reworked the polling on the shell and iopub channels. Now the polling on iopub channel task is run only once (with no timeout), and when we have an exec reply we wait_for that task with a timeout, which will cancel the task after the timeout. That way there is no task rescheduling and no risk to loose a message.

@maartenbreddels
Copy link
Contributor

maartenbreddels commented Feb 8, 2020 via email

@davidbrochart
Copy link
Member Author

@MSeal jupyter_client now has an async API (jupyter/jupyter_client#506 has been merged). How would you like to proceed? I guess we should make a release of jupyter_client before merging this PR.

Copy link
Contributor

@MSeal MSeal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this all in place and updated!

@MSeal MSeal merged commit d4e09db into jupyter:master Feb 26, 2020
@meeseeksmachine
Copy link

This pull request has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/welcome-new-jupyter-client-nbclient-maintainer-david-brochart/3467/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants