Track things needed to use JupyterHub for tmpnb #255

yuvipanda · 2016-12-30T08:32:50Z

This ticket tracks the features required in JupyterHub and friends before we can replicate all the features of tmpnb using a particular configuration of JupyterHub.

- Login without requiring any authentication - TmpAuthenticator
- Super quick (<3s?) response time between hitting URL and seeing notebook interface
- Culler that keeps user list clean of dead users / inactive servers
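As a rough illustration of the culler item, the selection step boils down to comparing each user's last activity against an idle threshold. The sketch below is not JupyterHub's culler; the function name `find_idle_users`, the usernames, and the one-hour threshold are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_idle_users(last_activity, now, max_idle):
    """Return the names of users whose last activity is older than max_idle."""
    return sorted(
        name for name, seen in last_activity.items() if now - seen > max_idle
    )


# Hypothetical snapshot of last-activity timestamps for temporary users.
now = datetime(2016, 12, 30, 9, 0, tzinfo=timezone.utc)
last_activity = {
    "tmp-a": now - timedelta(minutes=5),
    "tmp-b": now - timedelta(hours=2),
    "tmp-c": now - timedelta(minutes=59),
}

idle = find_idle_users(last_activity, now, max_idle=timedelta(hours=1))
print(idle)  # ['tmp-b']
```

A real culler would then stop the servers for these users and delete the temporary accounts; that part depends on the hub's API and is left out here.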

yuvipanda · 2016-12-30T08:37:18Z

https://github.com/yuvipanda/jupyterhub-tmpauthenticator will take care of the authenticator.

Now, we need to test if there are any ways we can hit our performance deadline easily without pooling. If we can't, then we need to implement pooling! The best way to do that would be as a mixin that can be used with any spawner. Modifications to jupyter-singleuser will also be required to support pooling.

yuvipanda · 2016-12-30T08:41:39Z

@rgbkrk I'm specifically interested in finding out what you think about the 'if we can get notebooks to start in under x seconds (for some value of x), we do not need pooling' line of thinking. How low would 'x' have to be?

rgbkrk · 2016-12-30T14:34:20Z

> I'm specifically interested in finding out what you think about the 'if we can get notebooks to start in under x seconds (for some value of x), we do not need pooling' line of thinking. How low would 'x' have to be?

500ms

It's not starting a single server that's the problem - it's the stampeding herd. Even the current ~3s to launch a single server isn't too bad -- it adds up once more users are piling in. Once you have 10 users show up around the same time, it tends to be 30s for the tenth user because of how Docker operates. Couple that with the typical spikes in load we had on the nature demo and now have on try.jupyter.org and you're talking several minute waits for something that should be a fast demo.
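The arithmetic behind the "30s for the tenth user" figure can be sketched with a deliberately simple model. It assumes all users arrive at the same moment and that launches run strictly one after another at ~3s each; that serialization is a simplification for illustration, not a measured description of Docker.

```python
def wait_times(n_users, launch_seconds):
    """Seconds each user waits when launches run strictly one after another."""
    return [launch_seconds * (position + 1) for position in range(n_users)]


waits = wait_times(n_users=10, launch_seconds=3.0)
print(waits[0], waits[-1])  # 3.0 30.0
```

Under this model the wait grows linearly with queue position, which is why a burst of arrivals hurts far more than the single-server launch time suggests.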

yuvipanda · 2016-12-30T17:19:48Z

@rgbkrk Ok, so it's ultimately 'X percentile start time under Ys, for up to Z concurrent servers per second' - so it is a measure of latency and throughput rather than just latency. X could be 90-95, Y could be 1-3s, Z could be anywhere from a hundred to a few thousand. Would those numbers and definitions be acceptable for you? If not, which numbers would be, or is the definition not what you had in mind?
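One way to make that definition checkable is to compute a percentile over measured start times and compare it with Y. The sketch below uses the nearest-rank percentile, which is one of several common definitions; the sample start times are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (pct in 1..100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_target(start_times, pct, max_seconds):
    """True if the pct-th percentile start time is within max_seconds."""
    return percentile(start_times, pct) <= max_seconds


# Hypothetical start times (seconds) gathered at one fixed arrival rate Z.
start_times = [0.8, 1.1, 0.9, 1.4, 2.7, 1.0, 1.2, 0.7, 1.3, 5.0]

print(percentile(start_times, 90))        # 2.7
print(meets_target(start_times, 90, 3.0)) # True
print(meets_target(start_times, 95, 3.0)) # False
```

The check would be repeated at each arrival rate of interest, since the same deployment can pass at a low Z and fail at a high one.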

rgbkrk · 2016-12-30T18:38:02Z

Yeah this is a great definition, thanks @yuvipanda

minrk · 2017-01-02T12:34:30Z

> Modifications to jupyter-singleuser will also be required to support pooling.

More precisely, modifications to the container entrypoint would be needed, not necessarily jupyterhub-singleuser. This could be modifications to jupyterhub-singleuser, or it could be a different entrypoint that is in a pre-flight stage until removed from the pool and assigned to a specific user, at which point it launches jupyterhub-singleuser.

In general, the pool entrypoint will:

1. allocate resources
2. perform common setup

and Spawner.start will:

1. perform user-specific setup (e.g. setting uid, API token, mounting user volumes, etc.)
2. finish launching single-user server

Only some cases will be able to get all the way to launching the notebook server in the preflight stage, where no user-specific action (e.g. uid, starting in a not-yet-mounted working dir) is needed. tmpnb is one such case, though.
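The split described above, between what the pool does ahead of time and what happens at assignment, can be modelled in a few lines of plain Python. This is a toy sketch, not a JupyterHub Spawner: `PooledServer`, `ServerPool`, and the recorded step names are hypothetical, and no containers are started.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PooledServer:
    server_id: int
    user: Optional[str] = None
    api_token: Optional[str] = None
    steps: List[str] = field(default_factory=list)


class ServerPool:
    """Keeps `size` servers in a pre-flight state, ready to be assigned."""

    def __init__(self, size):
        self._size = size
        self._next_id = 0
        self._idle = deque()
        self._refill()

    def _preflight(self):
        # Pool stage: everything that does not depend on the user.
        server = PooledServer(self._next_id)
        self._next_id += 1
        server.steps += ["allocate resources", "common setup"]
        return server

    def _refill(self):
        while len(self._idle) < self._size:
            self._idle.append(self._preflight())

    def start(self, user, api_token):
        # Assignment stage: fall back to a cold start if the pool is empty.
        server = self._idle.popleft() if self._idle else self._preflight()
        server.user = user
        server.api_token = api_token
        server.steps += ["user-specific setup", "launch single-user server"]
        self._refill()
        return server

    @property
    def idle_count(self):
        return len(self._idle)


pool = ServerPool(size=2)
server = pool.start("tmp-user-1", api_token="hypothetical-token")
print(server.server_id, server.user, pool.idle_count)  # 0 tmp-user-1 2
```

In a tmpnb-style deployment the "user-specific setup" step is nearly empty, which is what lets the pre-flight stage go all the way to a running notebook server.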

It's not clear to me how a Mixin approach would work, since there is so much that would be specific to any given Spawner implementation. What are you envisioning there?

yuvipanda · 2017-01-03T08:46:10Z

I'm going to first make sure I can measure the three variables I want, and then attempt to do this *without* pooling. If it doesn't work, then I'll look to pooling and see what I can do with pure mixins (ideally!), if not then subclasses for a few spawners. I haven't gotten that far yet - I think step 1 for me is to establish a way to reliably measure these things.


rgbkrk · 2017-01-03T16:55:57Z

At the very least, it's not too bad if the first pooling spawner was named TmpnbPoolingSpawner.

yuvipanda · 2017-03-06T17:56:36Z

I have deployed a tmpnb-style JupyterHub (URL private, ask me for it!), and it starts up pretty quickly even with no pooling. @rgbkrk approves so far. Next step is to throw 100 simulated users at it and track start times.
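A load test of that shape could be structured roughly as below. This is only a harness sketch: `fake_spawn` is a hypothetical stand-in that sleeps for a random interval instead of contacting a hub, so the numbers it prints say nothing about a real deployment.

```python
import asyncio
import random
import time


async def fake_spawn(user_id):
    """Stand-in for requesting a server; sleeps instead of calling a real hub."""
    await asyncio.sleep(random.uniform(0.01, 0.05))


async def timed_spawn(user_id):
    """Return how long one simulated user waited for their server."""
    started = time.perf_counter()
    await fake_spawn(user_id)
    return time.perf_counter() - started


async def run_load_test(n_users):
    """Start all simulated users at once and collect their start times."""
    return await asyncio.gather(*(timed_spawn(i) for i in range(n_users)))


durations = asyncio.run(run_load_test(100))
print(f"{len(durations)} spawns, slowest {max(durations):.3f}s")
```

Replacing `fake_spawn` with a coroutine that hits the hub URL and waits for the notebook interface would turn this into an actual measurement.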

willingc · 2017-03-06T18:51:27Z

@yuvipanda Nice!

akhmerov · 2017-05-20T10:36:18Z

Perhaps a relevant aspect is allowing JupyterHub to limit the number of concurrent servers. Requesting infinitely many servers is, after all, the easiest way to DoS any tmpnb setup.
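The idea of a cap can be sketched independently of any hub internals: refuse new servers once a fixed number are active. The class and exception names below are hypothetical and do not correspond to JupyterHub code.

```python
class ServerLimitReached(Exception):
    """Raised when a new server is requested while the cap is already reached."""


class ServerRegistry:
    def __init__(self, max_servers):
        self.max_servers = max_servers
        self.active = set()

    def start(self, user):
        if user in self.active:
            return  # this user already has a server; nothing to do
        if len(self.active) >= self.max_servers:
            raise ServerLimitReached(f"limit of {self.max_servers} servers reached")
        self.active.add(user)

    def stop(self, user):
        self.active.discard(user)


registry = ServerRegistry(max_servers=2)
registry.start("tmp-a")
registry.start("tmp-b")
try:
    registry.start("tmp-c")
    rejected = False
except ServerLimitReached:
    rejected = True
print(rejected)  # True
```

Combined with a culler that frees slots held by idle users, a cap like this bounds resource use while still letting new visitors in as old servers are reclaimed.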

yuvipanda changed the title ~~Move to using JupyterHub~~ Track things needed to use JupyterHub for tmpnb Dec 30, 2016

izahn mentioned this issue Jan 3, 2017

token authentication #256

Closed

pauleve mentioned this issue Feb 28, 2018

tmpnb: switch to JupyterHub colomoto/colomoto-docker#32

Closed
