-
Notifications
You must be signed in to change notification settings - Fork 210
Cleanup kernel for dashboards run within Jupyter Notebook #117
Comments
Another approach would be to piggyback on the "Deploy as > Local dashboard" which was originally intended for notebook authors to see what their dashboards would look like when deployed. They address all three bullets in the issue description. However, they "leak" kernels. A visit to the dashboard spawns a kernel but nothing ever cleans the kernel up when the user navigates away. Perhaps cleanup is fixable in the dashboard extension? Periodically check which kernels have connected websocket clients and cleanup those that have none? |
Cleanup is an interesting issue. @jhpedemonte and I discussed this a little bit as we were investigating the Thebe alternative in #105. For forever-running processes, such as Spark Streaming, when do we clean up? If the main usage of the dashboard "preview" is to see the layout with less focus on actual content, then navigation away from the page could be ok. For the deployed case, we discussed manually shutting down standalone dashboard kernels as you can do in the Notebook already. That could be an alternative in this case too. |
Cleanup has been something we've punted to the container level with tmpnb. When the user has not had interaction with a kernel for a while, the container hosting it goes away. I think forever-running processes are actually misplaced within the same kernel / container as the dashboard. I know we're doing that in our demos, but I want to believe we only do it for educational purposes. If we wanted to deploy a dashboard that we wanted to scale even to a dozen users, I wouldn't want each one doing its own streaming. Instead, I'd expect the stream to be in its own process with an API and the dashboard notebooks just to sample it. Today, if I want to build that standalone API, I've got to step outside of the notebook environment. But I think this where kernel gateway and notebooks that becomes micro services can get interesting.
|
That makes sense. Our (potential) future work then would be to make those steps as easy as possible for a dev. |
Course of action based on discussion above:
|
Not sure we need this one. If user has ability to log into the Jupyter server, they can just hit the URL of the locally deployed dashboard. I'm OK with the notebook author still controlling when that app is deployed.
Immediately when lost is too quick. WS disconnects happen all the time. tmpnb works on a reaping cycle after a period of inactivty. |
So then this issue is just to clean up kernels spawned by locally deployed dashboards? |
Yes. Either:
I believe the reason the kernel doesn't appear in the UI for the second approach is because there is no associated session resource. Thebe only requests the kernel, not a session to go with it (and rightly so because there is no associated notebook with the kernel). |
Regarding the Running tab in notebook UI: how do you feel about adding a 'Dashboards' section there, to go with 'Notebook' and 'Terminals'? We might be able to do it as part of our extension. |
The refresh in the Running tab isn't extensible at all, not even by hacking. We could just add our own 'Dashboards' section and manage it ourselves. Add our own timer and event handler on the refresh button. But we may need to add an |
Definitely don't want to add new UI to the notebook frontpage (also called dashboard just to be confusing) given the roadmap of where things are headed with phosphor. That just introduces more things that need to be ported. I was more wondering if, once we're based on jupyter-js-services, if we could make an /api/session request and pass along the filename of the dashboard HTML file plus a little UUID to allow for multiple instances, instead of a proper notebook to get the associated kernel tracked by the UI automatically. |
When deploying a dashboard, use the Notebook REST API to create a new Notebook Session. This allows the dashboard kernel to show in the list of currently running notebooks. (ref jupyter#117) :chicken: (c) Copyright IBM Corp. 2015
To begin, we create a new notebook session so that the dashboard kernel shows in the Running Notebooks list. @parente We have a fallback to use kernel API directly in case the server does not have a Sessions API. We tested against tmpnb.org, but it has the sessions API since it spawns a notebook server. Is there any scenario that would only have the kernel API? |
Nice! Kernel gateway doesn't have a session API exposed, only kernels, because it's meant to be headless. I would claim notebook within tmpnb.org is also meant to be headless since tmpnb does the cleanup at the container level. How hard would it be to only create the session when not in tmpnb mode (or whatever the equivalent setting is in the new dashboard branch)? Also, since the goal here would be to have multiple kernels spawned for a dashboard for independent users, can you spawn multiple sessions for the same dashboard file with the same name? By tacking on a ID? |
Not hard. There's already a
We can. One issue, though, is that the Notebook UI makes a link out of the path passed to the session API. We take advantage of that and use the full dashboard path so clicking on the link in the Running tab will correctly open the dashboard. If we add an ID, it would have to be compatible with this feature, for now. Maybe something like |
Works for me, but maybe a diff arg name would be better because it's not really the kernel ID. dashboardid? instanceid? |
Arg, nevermind. It seems that clicking on those links opens up a URL that looks like this: |
Can you have repeat entries? If so, then forget the ID. I assumed (probably improperly) that if you had repeats of the same name, only one would show. If they all show but have the same name, that's fine. They're all indistinguishable at a "who's using it?" level anyway. |
The issue with the URL encoding is that they split the URL on Do we care about the actual links in this case since we only did this for shutdown purposes? Currently they will open that deployed dashboard and start a new kernel. |
Ah, right. Every click starts a new kernel, doesn't return you to the existing one. So, yea, in non-tmpnb mode (or whatever the new equiv is), start a session, just dupe the bare filename, every click starts a new one in notebook UI, don't bother with an ID, and the prime use we tell everyone is for the "shutdown" button. Hacky, but that's the point. We're turning existing notebook server into a local dashboard host. That's a fine option ahead of full-on deployment at scale. |
Sorry, misread "Can you have repeat entries?" as "You can have repeat entries". Looks like duplicates are not allowed in the list. If you attempt to POST We will need to use the id. We could open an issue on notebook to encode query params properly in an attempt to get the id working. Since we cannot extend the "dashboard" (tree) page, it could be tricky to hack around the issue. We could have the links not work and only use the Shutdown button for the time being and wait for the issue to be fixed, or we could try to come up with another solution. |
Also, if we do get the links working, we could make use of the generated IDs to connect to the correct kernel if desired. Currently we store the path in local storage, so only one kernel would work on one machine (browser). |
Might be a hard sell since the main use case is notebooks and they don't have query params. But a PR with the simple fix might find its way in, in the interest of future proofing.
I think this is fine for now. Especially since this PR isn't going anywhere until the thebe replacement work lands. Rather move onto other enhancements / issues. |
If in tmpnb mode, create a new kernel rather than using session API. ref jupyter#117 (c) Copyright IBM Corp. 2015
When deploying a dashboard, use the Notebook REST API to create a new Notebook Session. This allows the dashboard kernel to show in the list of currently running notebooks. (ref jupyter#117) :chicken: (c) Copyright IBM Corp. 2015
If in tmpnb mode, create a new kernel rather than using session API. ref jupyter#117 (c) Copyright IBM Corp. 2015
Now creating a unique session for each user (stored in localStorage). |
Unassigning until unblocked in #105. |
This work is now part of jupyter/dashboards_bundlers#3. The associated PR #127 can remain open here until we no longer need it. |
Deploying completely standalone dashboards can be overweight for situations in which the notebook author wants to quickly share a series of "micro-dashboards" for a couple of trusted users (e.g., teammates) to play with. It would be nice to hand users a URL to a notebook within Jupyter and have it appear as a dashboard. We already do this somewhat with the
?dashboard
query arg which hides everything except the Jupyter menubar. A few more actions are required to get a functioning dashboard for every user that hits the notebook.The text was updated successfully, but these errors were encountered: