Cleanup kernel for dashboards run within Jupyter Notebook #117

parente · 2015-11-07T02:33:02Z

Deploying completely standalone dashboards can be overweight for situations in which the notebook author wants to quickly share a series of "micro-dashboards" for a couple of trusted users (e.g., teammates) to play with. It would be nice to hand users a URL to a notebook within Jupyter and have it appear as a dashboard. We already do this somewhat with the ?dashboard query arg which hides everything except the Jupyter menubar. A few more actions are required to get a functioning dashboard for every user that hits the notebook.

Hide all chrome, including the menubar.
Spawn a completely new kernel to stay independent from other users. (Is this already supported in the notebook?)
Run all cells. (This has security implications.)

The text was updated successfully, but these errors were encountered:

parente · 2015-11-07T02:55:38Z

Another approach would be to piggyback on the "Deploy as > Local dashboard" which was originally intended for notebook authors to see what their dashboards would look like when deployed. They address all three bullets in the issue description. However, they "leak" kernels. A visit to the dashboard spawns a kernel but nothing ever cleans the kernel up when the user navigates away.

Perhaps cleanup is fixable in the dashboard extension? Periodically check which kernels have connected websocket clients and cleanup those that have none?

dalogsdon · 2015-11-09T15:28:44Z

Cleanup is an interesting issue. @jhpedemonte and I discussed this a little bit as we were investigating the Thebe alternative in #105. For forever-running processes, such as Spark Streaming, when do we clean up? If the main usage of the dashboard "preview" is to see the layout with less focus on actual content, then navigation away from the page could be ok.

For the deployed case, we discussed manually shutting down standalone dashboard kernels as you can do in the Notebook already. That could be an alternative in this case too.

parente · 2015-11-09T15:47:08Z

Cleanup has been something we've punted to the container level with tmpnb. When the user has not had interaction with a kernel for a while, the container hosting it goes away.

I think forever-running processes are actually misplaced within the same kernel / container as the dashboard. I know we're doing that in our demos, but I want to believe we only do it for educational purposes. If we wanted to deploy a dashboard that we wanted to scale even to a dozen users, I wouldn't want each one doing its own streaming. Instead, I'd expect the stream to be in its own process with an API and the dashboard notebooks just to sample it.

Today, if I want to build that standalone API, I've got to step outside of the notebook environment. But I think this where kernel gateway and notebooks that becomes micro services can get interesting.

Notebook Update and rename LICENSE to LICENSE.md #1: all the Spark stream stuff, gets converted to a microservice that either exposes an API and/or dumps to a traditional view DB (like in lambda arch)
Notebook Setup Travis CI #2: all the UI stuff, gets converted to a dashboard that uses the API / DB from notebook Update and rename LICENSE to LICENSE.md #1

jhpedemonte · 2015-11-09T20:38:18Z

That makes sense. Our (potential) future work then would be to make those steps as easy as possible for a dev.

dalogsdon · 2015-11-11T21:43:22Z

Course of action based on discussion above:

Expose "Deploy as > Local dashboard" page as some URL
Start new kernel
Run all code
Kill kernel when page or websocket connection is lost

parente · 2015-11-11T22:12:26Z

Expose "Deploy as > Local dashboard" page as some URL

Not sure we need this one. If user has ability to log into the Jupyter server, they can just hit the URL of the locally deployed dashboard. I'm OK with the notebook author still controlling when that app is deployed.

Kill kernel when page or websocket connection is lost

Immediately when lost is too quick. WS disconnects happen all the time. tmpnb works on a reaping cycle after a period of inactivty.

dalogsdon · 2015-11-12T16:39:04Z

So then this issue is just to clean up kernels spawned by locally deployed dashboards?

parente · 2015-11-12T16:55:42Z

Yes. Either:

automatically, when there are no active websockets associated with a kernel for some configurable period of time
manually, via the Running tab in the Jupyter notebook UI

I believe the reason the kernel doesn't appear in the UI for the second approach is because there is no associated session resource. Thebe only requests the kernel, not a session to go with it (and rightly so because there is no associated notebook with the kernel).

jhpedemonte · 2015-11-13T16:59:25Z

Regarding the Running tab in notebook UI: how do you feel about adding a 'Dashboards' section there, to go with 'Notebook' and 'Terminals'? We might be able to do it as part of our extension.

jhpedemonte · 2015-11-13T17:09:44Z

The refresh in the Running tab isn't extensible at all, not even by hacking. We could just add our own 'Dashboards' section and manage it ourselves. Add our own timer and event handler on the refresh button. But we may need to add an /api/dashboards server extension point, to query the running Dashboard kernels.

parente · 2015-11-13T17:40:20Z

Definitely don't want to add new UI to the notebook frontpage (also called dashboard just to be confusing) given the roadmap of where things are headed with phosphor. That just introduces more things that need to be ported.

I was more wondering if, once we're based on jupyter-js-services, if we could make an /api/session request and pass along the filename of the dashboard HTML file plus a little UUID to allow for multiple instances, instead of a proper notebook to get the associated kernel tracked by the UI automatically.

When deploying a dashboard, use the Notebook REST API to create a new Notebook Session. This allows the dashboard kernel to show in the list of currently running notebooks. (ref jupyter#117) :chicken: (c) Copyright IBM Corp. 2015

dalogsdon · 2015-11-16T23:03:16Z

To begin, we create a new notebook session so that the dashboard kernel shows in the Running Notebooks list.

@parente We have a fallback to use kernel API directly in case the server does not have a Sessions API. We tested against tmpnb.org, but it has the sessions API since it spawns a notebook server. Is there any scenario that would only have the kernel API?

parente · 2015-11-17T01:51:13Z

Nice!

Kernel gateway doesn't have a session API exposed, only kernels, because it's meant to be headless. I would claim notebook within tmpnb.org is also meant to be headless since tmpnb does the cleanup at the container level. How hard would it be to only create the session when not in tmpnb mode (or whatever the equivalent setting is in the new dashboard branch)?

Also, since the goal here would be to have multiple kernels spawned for a dashboard for independent users, can you spawn multiple sessions for the same dashboard file with the same name? By tacking on a ID?

jhpedemonte · 2015-11-17T15:14:43Z

How hard would it be to only create the session when not in tmpnb mode

Not hard. There's already a tmpnb_mode boolean we can use.

can you spawn multiple sessions for the same dashboard file with the same name? By tacking on a ID?

We can. One issue, though, is that the Notebook UI makes a link out of the path passed to the session API. We take advantage of that and use the full dashboard path so clicking on the link in the Running tab will correctly open the dashboard. If we add an ID, it would have to be compatible with this feature, for now. Maybe something like /files/local_dashboards/foobar/index.html?kernelid=1?

jhpedemonte · 2015-11-17T15:19:52Z

Seems to work:

parente · 2015-11-17T15:44:58Z

Works for me, but maybe a diff arg name would be better because it's not really the kernel ID. dashboardid? instanceid?

jhpedemonte · 2015-11-17T15:45:29Z

Arg, nevermind. It seems that clicking on those links opens up a URL that looks like this: .../index.html%3Fkernelid%3D1. Notebook is not properly unescaping.

parente · 2015-11-17T15:46:59Z

Can you have repeat entries? If so, then forget the ID. I assumed (probably improperly) that if you had repeats of the same name, only one would show. If they all show but have the same name, that's fine. They're all indistinguishable at a "who's using it?" level anyway.

dalogsdon · 2015-11-17T16:15:11Z

The issue with the URL encoding is that they split the URL on / and encode the components, but they fail to handle the ? and &, so the query gets encoded.

Do we care about the actual links in this case since we only did this for shutdown purposes? Currently they will open that deployed dashboard and start a new kernel. ~~Since duplicates are allowed in the list, this makes the only purpose of the kernelid portion to link to an existing kernel, which we don't care about.~~

parente · 2015-11-17T16:47:00Z

Ah, right. Every click starts a new kernel, doesn't return you to the existing one. So, yea, in non-tmpnb mode (or whatever the new equiv is), start a session, just dupe the bare filename, every click starts a new one in notebook UI, don't bother with an ID, and the prime use we tell everyone is for the "shutdown" button.

Hacky, but that's the point. We're turning existing notebook server into a local dashboard host. That's a fine option ahead of full-on deployment at scale.

dalogsdon · 2015-11-17T17:03:41Z

Sorry, misread "Can you have repeat entries?" as "You can have repeat entries". Looks like duplicates are not allowed in the list. If you attempt to POST api/sessions with a duplicate path, it returns the existing session.

We will need to use the id. We could open an issue on notebook to encode query params properly in an attempt to get the id working. Since we cannot extend the "dashboard" (tree) page, it could be tricky to hack around the issue. We could have the links not work and only use the Shutdown button for the time being and wait for the issue to be fixed, or we could try to come up with another solution.

dalogsdon · 2015-11-17T17:07:46Z

Also, if we do get the links working, we could make use of the generated IDs to connect to the correct kernel if desired. Currently we store the path in local storage, so only one kernel would work on one machine (browser).

parente · 2015-11-17T17:08:49Z

We could open an issue on notebook to encode query params properly in an attempt to get the id working.

Might be a hard sell since the main use case is notebooks and they don't have query params. But a PR with the simple fix might find its way in, in the interest of future proofing.

We could have the links not work and only use the Shutdown button for the time being and wait for the issue to be fixed.

I think this is fine for now. Especially since this PR isn't going anywhere until the thebe replacement work lands. Rather move onto other enhancements / issues.

If in tmpnb mode, create a new kernel rather than using session API. ref jupyter#117 (c) Copyright IBM Corp. 2015

When deploying a dashboard, use the Notebook REST API to create a new Notebook Session. This allows the dashboard kernel to show in the list of currently running notebooks. (ref jupyter#117) :chicken: (c) Copyright IBM Corp. 2015

If in tmpnb mode, create a new kernel rather than using session API. ref jupyter#117 (c) Copyright IBM Corp. 2015

jhpedemonte · 2015-11-17T22:35:51Z

Now creating a unique session for each user (stored in localStorage).

parente · 2015-11-19T16:24:33Z

Unassigning until unblocked in #105.

parente · 2015-12-08T03:24:46Z

This work is now part of jupyter/dashboards_bundlers#3. The associated PR #127 can remain open here until we no longer need it.

parente added the enhancement label Nov 7, 2015

parente added the jhpedemonte/dalogsdon label Nov 16, 2015

parente mentioned this issue Nov 16, 2015

Report mode #55

Closed

parente changed the title ~~Additional options for immediate dashboard view~~ Cleanup kernel for dashboards run within Jupyter Notebook Nov 16, 2015

parente assigned dalogsdon Nov 16, 2015

parente removed the jhpedemonte/dalogsdon label Nov 16, 2015

parente added the dependency label Nov 17, 2015

dalogsdon added a commit to jhpedemonte/dashboards that referenced this issue Nov 17, 2015

Create a unique notebook session per user.

e633716

If in tmpnb mode, create a new kernel rather than using session API. ref jupyter#117 (c) Copyright IBM Corp. 2015

dalogsdon added a commit to jhpedemonte/dashboards that referenced this issue Nov 17, 2015

Create a unique notebook session per user.

0b05e9e

If in tmpnb mode, create a new kernel rather than using session API. ref jupyter#117 (c) Copyright IBM Corp. 2015

parente unassigned dalogsdon Nov 19, 2015

parente modified the milestone: 0.3.0 Nov 19, 2015

This was referenced Dec 2, 2015

Make deployed dashboards secure, scalable #13

Closed

Replace Thebe in Deploy as -> Local dashboard implementation with jupyter-js-* packages jupyter/dashboards_bundlers#3

Closed

parente removed this from the 0.3.0 milestone Dec 8, 2015

parente closed this as completed Dec 8, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cleanup kernel for dashboards run within Jupyter Notebook #117

Cleanup kernel for dashboards run within Jupyter Notebook #117

parente commented Nov 7, 2015

parente commented Nov 7, 2015

dalogsdon commented Nov 9, 2015

parente commented Nov 9, 2015

jhpedemonte commented Nov 9, 2015

dalogsdon commented Nov 11, 2015

parente commented Nov 11, 2015

dalogsdon commented Nov 12, 2015

parente commented Nov 12, 2015

jhpedemonte commented Nov 13, 2015

jhpedemonte commented Nov 13, 2015

parente commented Nov 13, 2015

dalogsdon commented Nov 16, 2015

parente commented Nov 17, 2015

jhpedemonte commented Nov 17, 2015

jhpedemonte commented Nov 17, 2015

parente commented Nov 17, 2015

jhpedemonte commented Nov 17, 2015

parente commented Nov 17, 2015

dalogsdon commented Nov 17, 2015

parente commented Nov 17, 2015

dalogsdon commented Nov 17, 2015

dalogsdon commented Nov 17, 2015

parente commented Nov 17, 2015

jhpedemonte commented Nov 17, 2015

parente commented Nov 19, 2015

parente commented Dec 8, 2015

Cleanup kernel for dashboards run within Jupyter Notebook #117

Cleanup kernel for dashboards run within Jupyter Notebook #117

Comments

parente commented Nov 7, 2015

parente commented Nov 7, 2015

dalogsdon commented Nov 9, 2015

parente commented Nov 9, 2015

jhpedemonte commented Nov 9, 2015

dalogsdon commented Nov 11, 2015

parente commented Nov 11, 2015

dalogsdon commented Nov 12, 2015

parente commented Nov 12, 2015

jhpedemonte commented Nov 13, 2015

jhpedemonte commented Nov 13, 2015

parente commented Nov 13, 2015

dalogsdon commented Nov 16, 2015

parente commented Nov 17, 2015

jhpedemonte commented Nov 17, 2015

jhpedemonte commented Nov 17, 2015

parente commented Nov 17, 2015

jhpedemonte commented Nov 17, 2015

parente commented Nov 17, 2015

dalogsdon commented Nov 17, 2015

parente commented Nov 17, 2015

dalogsdon commented Nov 17, 2015

dalogsdon commented Nov 17, 2015

parente commented Nov 17, 2015

jhpedemonte commented Nov 17, 2015

parente commented Nov 19, 2015

parente commented Dec 8, 2015