Handle warnings in tests #751
Thanks for spending the necessary time on this @blink1073 - it's non-trivial and much appreciated! Just had a couple of comments.

Should we open an issue to move the remaining tests to pytest, to rid ourselves of `TestCase`? Perhaps someone with some bandwidth could take that on as a good intro?
jupyter_client/client.py
Outdated
```python
if self._created_context:
    self.context.destroy()
```
Should `self._created_context` be unset here?
Fixed
```python
if self.has_kernel:
    await ensure_async(self.interrupt_kernel())
```
Nice - making this call to `interrupt_kernel()` conditional on start status. Good catch.
👍🏼
jupyter_client/tests/conftest.py
Outdated
```python
if resource is not None:
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    DEFAULT_SOFT = 4096
    if hard >= DEFAULT_SOFT:
        soft = DEFAULT_SOFT

    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    hard = old_hard
    if old_soft < soft:
        if hard < soft:
            hard = soft
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```
I find this a little difficult to understand. I think the double calls to `getrlimit` are part of the confusion. Perhaps a comment as to the goal would be helpful. It appears to ensure a minimal soft limit of `DEFAULT_SOFT` if the current hard limit is at least that much.

It seems like the `hard < soft` condition will never be true, because `old_soft < soft` can only be true if `soft` was increased to `DEFAULT_SOFT`, which implies `hard >= soft`. That leads me to wonder whether some kind of update to `hard` is missing.
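For example, a commented single-pass version might read more clearly (a sketch of the apparent intent, not the actual fix):

```python
# Goal: ensure the soft NOFILE limit is at least DEFAULT_SOFT, but only
# when the current hard limit already allows it. (Handling of
# resource.RLIM_INFINITY is omitted in this sketch.)
if resource is not None:
    DEFAULT_SOFT = 4096
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft < DEFAULT_SOFT <= hard:
        resource.setrlimit(resource.RLIMIT_NOFILE, (DEFAULT_SOFT, hard))
```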
Cleaned up
It looks like this breaks the server tests, so I'll have to do both PRs at once.

Okay, this part is done, stopping for today.
jupyter_client/manager.py
Outdated
```python
try:
    fut = Future()
except RuntimeError:
    # No event loop running, use concurrent future
    fut = CFuture()
```
If this is inside an `async def`, can this except branch ever occur? There should always be a running loop from inside a coroutine.

The previous strategy of setting `_ready` outside a coroutine might not have had a running loop, but now that you've moved it into the coroutine (good, I think), there should always be a running loop.

Returning two types of Future depending on context would be tricky because they aren't interchangeable (a CFuture is not awaitable without calling `asyncio.wrap_future(cfuture)`).
An important difference of this strategy, though: the attribute will not be set until an arbitrarily later point in time due to asyncness. The previous strategy ensured `._ready` was defined immediately before any waiting, whereas putting it in the coroutine means the attribute will not be defined immediately.

Two new things to consider (may well both be covered):

- previous waits for `self._ready` may now happen when `_ready` is not yet set
- can `_ready` be set and then re-set? If so, waits for `_ready` may get a 'stale' state before the new `_ready` future is attached. This can be avoided with a non-async `delattr` prior to the async bit, as in the sketch below.
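A minimal sketch of that `delattr` idea (the method names here are hypothetical):

```python
# Sketch only: clear the stale future synchronously, then let the
# coroutine attach a fresh one.
def start(self, **kwargs):
    if hasattr(self, "_ready"):
        del self._ready  # waiters now fail fast instead of seeing stale state
    return self._async_start(**kwargs)  # the coroutine sets self._ready itself
```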
Originally `_ready` was only defined in the constructor, which is why we had the fallback option. It seems like these futures are becoming problematic in general. Maybe what we really want is a state and a signal when the state changes, like we have on the JavaScript side.
Alternatively, if we want to define `ready` and `shutdown_ready` as "the most recent" starting and shutting-down futures, we could also add an `is_ready` and `is_shutdown_ready` for when there are no pending of either. We would use this trick that tornado used to work around the new deprecation warnings:

```python
loop = asyncio.get_event_loop_policy().get_event_loop()  # this always returns the same loop
future = asyncio.Future(loop=loop)
```

We could then make the `.ready` and `.shutdown_ready` futures only available when using the `AsyncKernelManager`.
Haha oh my, nevermind, I'm digesting https://bugs.python.org/issue39529
I think part of the problem is that `jupyter_client` is a library, not an application, in the same way that `tornado` is. It looks like we also can't rely on `asyncio.run` in the future, since it relies on `get_event_loop` and `set_event_loop`. Here's a sketch of what we need to do:

- The base classes will be asynchronous and call the private async versions of functions directly as needed. We will only use futures and tasks from within async methods.
- The synchronous classes will work by wrapping the asynchronous public methods using a decorator that runs a new private asyncio loop, as `return asyncio.new_event_loop().run_until_complete(async_method)` (see the sketch after this list).
- We remove the `.ready` future and add new methods called `async def wait_for_pending_start()` and `async def wait_for_pending_shutdown()` to handle pending kernels.

With this new structure, the synchronous managers could also support pending kernels, breaking the operation into two synchronous steps.
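A rough sketch of such a decorator, with hypothetical names, just to make the idea concrete:

```python
# Sketch: run an async method to completion on a fresh private loop.
import asyncio
import functools


def run_sync(async_method):
    @functools.wraps(async_method)
    def wrapped(self, *args, **kwargs):
        loop = asyncio.new_event_loop()
        try:
            return loop.run_until_complete(async_method(self, *args, **kwargs))
        finally:
            loop.close()
    return wrapped


# usage (hypothetical): start_kernel = run_sync(AsyncKernelManager.start_kernel)
```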
We can handle the ready future in this PR and defer the other parts to a separate refactor PR.
btw, concurrent.futures are generally more compatible and flexible than asyncio futures - they are threadsafe, reusable across loops, etc. If you always use a cf.Future, you can await it in a coroutine with `asyncio.wrap_future(cf_future)`. That may be the simplest solution here. It's the inconsistency that I saw as a problem (`self._ready` may or may not be awaitable). Something like:

```python
# earlier, _not_ in a coroutine
self._ready = cf.Future()


async def wait_for_ready(self):
    if self._ready is None:
        raise RuntimeError("haven't started yet!")
    # wrap cf.Future self._ready to make it awaitable
    await asyncio.wrap_future(self._ready)
```

ought to be more reliable.
To always use new event loops (or `asyncio.run`) for each sync method call, one important consideration is that all your methods must completely resolve by the end of the coroutine (i.e. not hold references to the current loop for later calls, which means requiring pyzmq 22.2, among other things).

A possibly simpler alternative is to maintain your own event loop reference. `asyncio.get_event_loop()` is deprecated because it did too many things, but that doesn't mean nobody should hold onto persistent loop references; it just means asyncio doesn't want to track a global not-running loop. If each instance had a `self._loop = asyncio.new_event_loop()` and always ran the sync wrappers in that loop (`self._loop.run_until_complete(coro())`), it should work fine. That avoids the multiple-loop problem for sync uses because it's still a single persistent loop. Each instance may happen to use a different loop, but that should not be a problem in the sync case.
Sweet, thanks for the insights Min. That sounds like a good approach.
More thoughts on ready future handling:

- Remove the public ready attribute(s), but add public `is_start_pending` and `is_shutdown_pending` flags.
- Internally, store current ready futures and pending flags for start and shutdown.
- In `start()`, check for `_start_pending` and error if true. Set `_start_pending` and set a new `_start_ready` future. Clear `_start_pending` when ready.
- In `shutdown()`, check for `_shutdown_pending` and bail if true. Set `_shutdown_pending` and set a new `_shutdown_ready` future. Clear `_shutdown_pending` when ready.
- In `wait_for_start_ready()` and `wait_for_shutdown_ready()`, store the current ready future and wait for it. If the new current ready future is different from the stored one, recurse (see the sketch below).
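A sketch of that last bullet, with the attribute names proposed above and assuming the ready futures are `cf.Future` instances:

```python
import asyncio


# Sketch: wait on a snapshot of the current ready future; if a new
# start began while we were waiting, recurse onto the new future.
async def wait_for_start_ready(self):
    ready = self._start_ready
    await asyncio.wrap_future(ready)
    if self._start_ready is not ready:
        await self.wait_for_start_ready()
```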
Rather than introduce more flags and more confusion, I propose we actually use a simple state machine, as @echarles suggested. Currently the server adds an `execution_state` to the manager object based on kernel status messages. I think eventually we should move that logic here, but it doesn't need to be in this PR. The states used here could mirror the ones used there.
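For illustration, a minimal version of such a state machine with a change signal might look like this (the states and API are illustrative, not jupyter_client's):

```python
# Illustrative sketch: a kernel state enum plus a signal fired on
# every state change, mirroring the JavaScript-side pattern.
from enum import Enum


class KernelState(Enum):
    STARTING = "starting"
    IDLE = "idle"
    BUSY = "busy"
    TERMINATING = "terminating"
    DEAD = "dead"


class ExecutionStateMachine:
    def __init__(self):
        self._state = KernelState.STARTING
        self._listeners = []

    @property
    def state(self):
        return self._state

    def on_change(self, callback):
        self._listeners.append(callback)

    def transition(self, new_state):
        old, self._state = self._state, new_state
        for cb in self._listeners:
            cb(old, new_state)  # notify observers of every change
```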
Maybe the first step would be to draw and share that state machine; it would be highly beneficial to jupyter_client, but also to e.g. the evolution of the jupyterlab services package. Having the drawing in SVG format would allow everyone to use their favorite editor (e.g. I use Inkscape).
I have not looked at this PR in detail, but it seems to me that supporting a sync API is becoming problematic. At the time we switched jupyter_client to async, I had the feeling that keeping a sync API was kind of mandatory, although it involved using nest-asyncio, which is not without problems. But today, could jupyter_client be async-only? That would simplify the library a lot.
Good call @echarles. We could add a CI job that creates the diagram and compares it to the one in the documentation, and fails and asks you to rebuild if it has changed.
@davidbrochart, I think there are still valid use cases for having a synchronous manager. What we could say is that only the async class is meant to be overridable, and offer a function that wraps the async class to become synchronous. We would have a section along these lines:
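(A sketch only; `make_sync` and the wrapping approach are hypothetical, extending the decorator idea sketched earlier, just to illustrate the shape of that section:)

```python
# Sketch: derive a synchronous class from the async one by running each
# public coroutine method defined on the class on a private loop.
import asyncio
import inspect


def make_sync(async_cls):
    def wrap(coro_method):
        def wrapper(self, *args, **kwargs):
            loop = asyncio.new_event_loop()
            try:
                return loop.run_until_complete(coro_method(self, *args, **kwargs))
            finally:
                loop.close()
        return wrapper

    attrs = {
        name: wrap(attr)
        for name, attr in vars(async_cls).items()
        if inspect.iscoroutinefunction(attr) and not name.startswith("_")
    }
    return type(async_cls.__name__.replace("Async", ""), (async_cls,), attrs)


# e.g. KernelManager = make_sync(AsyncKernelManager)
```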
And the public functions would be async in the base class, but made synchronous in the sync class.
Code-to-doc, that sounds exciting.

That sounds exciting. BTW I have been looking recently at xstate in the frontend area. Looks like state machine architecture is trendy atm.
+1 Although async + sync is harder to maintain.
I think keeping sync is important. It's also possible to avoid nest_asyncio if you run 'asyncio-for-sync' in a dedicated IO thread.
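A sketch of that dedicated-IO-thread idea (names are illustrative):

```python
# Sketch: one background thread owns a persistent event loop; sync
# wrappers submit coroutines to it and block on the result, without
# needing nest_asyncio even if a loop is already running here.
import asyncio
import threading


class IOLoopThread:
    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def run_sync(self, coro):
        return asyncio.run_coroutine_threadsafe(coro, self.loop).result()
```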
I opened #755 to track this.
Done so far:

- `LocalProvisioner.wait` to avoid unclosed fids
- `stop_channels` to avoid unclosed zmq context
- `kernel.shutdown_ready` in `KernelManager.shutdown`

So far the only warnings we can't easily work around are due to pytest-dev/pytest-asyncio#77. We'd have to rewrite all of the tests not to use `TestCase`.
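For reference, such a rewrite would look roughly like this (the `async_km` fixture is hypothetical):

```python
# Illustrative only: an async test as a plain pytest-asyncio function
# instead of a TestCase method.
import pytest


@pytest.mark.asyncio
async def test_start_and_shutdown(async_km):
    await async_km.start_kernel()
    assert async_km.has_kernel
    await async_km.shutdown_kernel()
```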