Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stop hook for extensions #526

Merged
merged 7 commits into from
Jul 8, 2021
Merged

Conversation

oliver-sanders
Copy link
Contributor

closes #241

Call a stop_extension method for each extension app on server shutdown if present.

Not sure how to go about testing this, any examples for the other cleanup_* methods?

@codecov-commenter
Copy link

codecov-commenter commented May 25, 2021

Codecov Report

Merging #526 (0eeeff8) into master (2595716) will increase coverage by 0.11%.
The diff coverage is 86.36%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #526      +/-   ##
==========================================
+ Coverage   76.32%   76.43%   +0.11%     
==========================================
  Files         110      110              
  Lines        9819     9879      +60     
  Branches     1067     1079      +12     
==========================================
+ Hits         7494     7551      +57     
+ Misses       1944     1943       -1     
- Partials      381      385       +4     
Impacted Files Coverage Δ
jupyter_server/extension/application.py 71.77% <0.00%> (-0.92%) ⬇️
jupyter_server/serverapp.py 65.35% <77.77%> (+0.39%) ⬆️
jupyter_server/tests/extension/test_app.py 98.24% <93.75%> (-1.76%) ⬇️
jupyter_server/extension/manager.py 89.23% <100.00%> (+0.52%) ⬆️
jupyter_server/pytest_plugin.py 90.76% <100.00%> (ø)
jupyter_server/utils.py 64.41% <100.00%> (+1.25%) ⬆️
jupyter_server/tests/test_utils.py 100.00% <0.00%> (ø)
jupyter_server/services/kernels/kernelmanager.py 80.79% <0.00%> (+0.72%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2595716...0eeeff8. Read the comment docs.

@mwakaba2
Copy link
Member

Yeah I couldn't find an example of tests for cleanup methods. The _cleanup method is getting called after every jp_serverapp test.
@Zsailer or @kevin-bates Would a unit test for the function cleanup_extensions make sense here?

@kevin-bates
Copy link
Member

Hi @mwakaba2 - although I feel a bit hypocritical saying this given the "unit test" state of kernels and terminals, I think we should minimally include a test that calls stop_all_extensions. It looks like the unit tests for extensions are already fairly rich and extending the MockExtensionApp to include the new method makes sense. I'm not sure we need to rig a test that calls _cleanup_extensions() since that gets entangled with the basic plumbing of the server lifecycle.

Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @oliver-sanders - thank you for this pull request!

This topic has come up recently, but more in the context of an event mechanism (see #466). This PR appears to target subclasses of ExtensionApp and I wonder if it could be generally extended to include server extensions. Perhaps it could model the load functionality to call a stop-related method for those cases. (Note: this was the first time I've poked around the extensions subsystem, so please take what I say with a grain of salt.)

Should the call to stop_all_extensions() reside within the IOLoop lifecycle so the extensions can use the same message loop as when they were running or continue to invoke methods in a similar manner? I realized this when I was about to suggest we make stop_extension() async so we could then gather up the futures across the extensions and await them in parallel - then realized there's no loop at that point. I think not having an IOLoop may be problematic - or least something that might hinder what/how extension authors want to handle stop requests.

We might take a look at having ShutdownHandler call into serverapp and marry that call (could be _cleanup_extensions()) with _signal_stop() so that extensions could be stopped on the IOLoop prior to its shutdown (perhaps).

jupyter_server/serverapp.py Outdated Show resolved Hide resolved
docs/source/developers/extensions.rst Outdated Show resolved Hide resolved
@oliver-sanders
Copy link
Contributor Author

oliver-sanders commented Jun 11, 2021

Hi, thanks for looking in.

Note: this was the first time I've poked around the extensions subsystem, so please take what I say with a grain of salt.

Ditto here.

I wonder if it could be generally extended to include server extensions

Have had a look at doing this, I think it would require a different interface, something like def add_shutdown_handler. I think it's ok to say "if you want this functionality move to an extension app" as in order to have something to shut down the extension needs state which is a bit messy in the absence of an application?

Should the call to stop_all_extensions() reside within the IOLoop lifecycle

Good idea, the cleanup_kernels and cleanup_terminals methods use the run_sync interface which may create a new event loop. It would be good to do this in the same loop, however, I'm guessing it's that way for reason?

I think we should minimally include a test that calls stop_all_extensions

Tested that stop_extension is called by ServerApp.cleanup_extensions.

@oliver-sanders oliver-sanders force-pushed the 241 branch 2 times, most recently from d49040f to 7637dac Compare June 11, 2021 15:43
@kevin-bates
Copy link
Member

I think it's ok to say "if you want this functionality move to an extension app" as in order to have something to shut down the extension needs state which is a bit messy in the absence of an application?

That seems reasonable.

the cleanup_kernels and cleanup_terminals methods use the run_sync interface which may create a new event loop. It would be good to do this in the same loop, however, I'm guessing it's that way for reason?

Yeah, they are async functions that must be called from non-async functions in this case, thus the wrapper.

I spent some time with this today, and believe this might be difficult to accomplish. I think we'd need to call _cleanup() (which is missing a caller in the latest changes) from the signal handler. However, my brief testing did confirm that the current IOLoop at that time is still self.io_loop, but I wasn't able to get the methods to behave correctly. (kept getting cannot reuse already awaited coroutine - and I'm not sure why.)

We could try playing with this more as I think it would be good to allow the extensions to stop from within the same ioloop, but that might be pushing things when coupled with the sig handler dynamic. (Note the first handler creates a thread to allow the user to answer y/n, but I think the second SIGINT is still handled on the main thread. We'll need to see if answering 'y' (vs. 2 ctrl-c) changes things as well.

I see the comment regarding moving the cleanup_extensions prior to cleanup_kernels was resolved, yet the method still occurs after kernels have been shut down, was that intentional?

@oliver-sanders
Copy link
Contributor Author

Sorry, got my commits in a twist, I've fiddled the way the _cleanup gets called, I think this handles the IOLoop issues.

I've tested:

  • ctrl+c & respond "y" when prompted
  • ctrl+c & ctrl+c
  • Calling serverapp.stop() from a handler
test example
from jupyter_server.base.handlers import JupyterHandler
from jupyter_server.extension.application import ExtensionApp
from jupyter_server.extension.handler import ExtensionHandlerMixin
        
import tornado
import tornado.gen
        
        
class MyExtensionHandler(ExtensionHandlerMixin, JupyterHandler):
        
    @tornado.web.authenticated
    def get(self):
        self.write('goodbye world')
        self.serverapp.stop()
        
        
class MyExtensionApp(ExtensionApp):
        
    # -------------- Required traits --------------
    name = "myextension"
    default_url = "/myextension"
    load_other_extensions = True
    file_url_prefix = "/render"
        
    def initialize_settings(self):
        ... 
        
    def initialize_handlers(self):
        self.handlers.extend([
            ('/myextension', MyExtensionHandler)
        ])  
        
    def initialize_templates(self):
        ... 
        
    async def stop_extension(self):
        print('#1')
        await tornado.gen.sleep(1)
        print('#2')
        
        
def _jupyter_server_extension_points():
    return [
        {   
            "module": "myext",
            "app": MyExtensionApp
        }   
    ]         

@kevin-bates
Copy link
Member

Hi Oliver. I took this for a spin just now and it looks like both the kernel and terminal shutdowns are acting differently with these changes. The symptoms are that those shutdown operations are not taking place.

Here's the output using the released server with 1 kernel and 1 terminal active during the two SIGINTS:

^C[C 2021-06-29 16:39:45.830 ServerApp] received signal 2, stopping
[I 2021-06-29 16:39:45.838 ServerApp] Shutting down 1 kernel
[D 2021-06-29 16:39:45.840 ServerApp] Clearing buffer for 8a7abf65-bf04-4d73-b120-b9cf6aa70100
[I 2021-06-29 16:39:45.840 ServerApp] Kernel shutdown: 8a7abf65-bf04-4d73-b120-b9cf6aa70100
[D 2021-06-29 16:39:45.866 ServerApp] 304 GET /static/favicons/favicon-busy-1.ico (::1) 1.86ms
[D 2021-06-29 16:39:45.871 ServerApp] 304 GET /static/favicons/favicon.ico (::1) 2.74ms
[I 2021-06-29 16:39:46.157 ServerApp] Shutting down 1 terminal
[I 2021-06-29 16:39:46.166 ServerApp] EOF on FD 23; stopping reading
[I 2021-06-29 16:39:46.266 ServerApp] Terminal 1 closed
Websocket closed
[D 2021-06-29 16:39:46.268 ServerApp] <LspStdIoReader(parent=<LanguageServerSession(language_server=pyls, argv=['/opt/anaconda3/envs/jupyter-server-dev/bin/python', '-m', 'pyls'])>)> closed
[D 2021-06-29 16:39:46.268 ServerApp] <LspStdIoWriter(parent=<LanguageServerSession(language_server=pyls, argv=['/opt/anaconda3/envs/jupyter-server-dev/bin/python', '-m', 'pyls'])>)> closed

while running the same scenario with this branch yields:

^C[C 2021-06-29 16:37:34.628 ServerApp] received signal 2, stopping
[I 2021-06-29 16:37:34.632 ServerApp] Shutting down 8 extensions
[I 2021-06-29 16:37:34.633 ServerApp] Shutting down 1 kernel
[I 2021-06-29 16:37:34.633 ServerApp] Shutting down 1 terminal
[D 2021-06-29 16:37:34.634 ServerApp] <LspStdIoReader(parent=<LanguageServerSession(language_server=pyls, argv=['/opt/anaconda3/envs/jupyter-server-dev/bin/python', '-m', 'pyls'])>)> closed
[D 2021-06-29 16:37:34.635 ServerApp] <LspStdIoWriter(parent=<LanguageServerSession(language_server=pyls, argv=['/opt/anaconda3/envs/jupyter-server-dev/bin/python', '-m', 'pyls'])>)> closed
<my-prompt>$ [IPKernelApp] WARNING | Parent appears to have exited, shutting down.

I'm not too worried about the LSP messages, but it would be good to ensure the appropriate kernel/terminal shutdown sequences are taking place. Could you please confirm you see similar behavior when there are active kernels at shutdown? Thanks.

@oliver-sanders
Copy link
Contributor Author

oliver-sanders commented Jun 30, 2021

Could you please confirm you see similar behavior when there are active kernels at shutdown? Thanks.

Well spotted, thanks for reviewing.

Yes I've managed to reproduce this using Jupyter Lab. It would appear that the run_sync logic doesn't like being called from async functions.

Since cleanup_terminals and cleanup_kernels are only called from _cleanup I've made them async too. This means these cleanup methods will also be run using the same event loop whereas before they were being run in an asyncio event loop.

This feels nicer, however, I'm not sure whether there will be downstream consequences.

Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work Oliver. This last set of changes appear to resolve the kernel/terminal shutdown issues - thank you!

@kevin-bates kevin-bates self-requested a review June 30, 2021 18:15
@kevin-bates
Copy link
Member

Shoot - didn't catch the test failures. I see what you mean by the run_sync as it looks like the test issues stem from run_sync(app._cleanup()).

@oliver-sanders
Copy link
Contributor Author

Dammit, should be fixed now.

This came down to the disparity between AsyncMappingKernelManager (used in real life?) and MappingKernelManager (used in tests?).

I've added a run_sync interface that calls async functions with the provided io loop and used that for the cleanup of kernels, terminals & extensions.

Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the great work here Oliver!

@kevin-bates
Copy link
Member

@oliver-sanders - We discussed the merge of this PR in the server meeting today and decided to go ahead and create a minor release to carry this change late next week. Since there could be folks that both derive from ServerApp and override the existing kernel and terminal cleanup methods (unlikely, but possible) we felt that a week's time would allow those folks to either prepare for this change or request this be done via a major release since, technically speaking, this could be construed as an API change.

This will be a useful addition for extensions developers - thanks again!

@blink1073 blink1073 added this to the 1.10 milestone Jul 2, 2021
Copy link
Member

@Zsailer Zsailer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oliver-sanders, this is great! Thank you for taking the time to work on this.

Overall, I think this approach looks good. I left some small comments around implementation details in line.

First, I think we should just add stop_extension as a method to ExtensionApp and have it pass (i.e. do nothing) by default.

Second, I don't think we need an attribute in the ExtensionManager to store ExtensionApps. I think we can just loop through the ExtensionPoints and ask “are you an ExtensionApp?” Otherwise, we open the possibility for that attribute to go stale over time.

@oliver-sanders
Copy link
Contributor Author

@Zsailer, sorted, deviated a little to keep the extension_app property a mapping of extension names rather than extension point names as extension names are more user-facing (so make more sense to the person reading the log files).

In anticipation of downstream developers needing to debug shutdown logic I added some debug logging so we can see which extensions have started/finished shutting down.

@Zsailer
Copy link
Member

Zsailer commented Jul 8, 2021

Thanks, @oliver-sanders. This looks good to me. Merging away!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stop / unload / cleanup hook for a server extension
6 participants