Launch user sessions in multiple clusters from a single hub #7
Comments
The filesystem issue is key and probably not easy to solve.
Unfortunately, cross-DC NFS is not really viable for reliability, performance, and security reasons :( I think step 1 would likely just involve a per-cluster home directory. We could augment it with a shared directory that is synced across all the clouds, via either FUSE or something like https://rclone.org/. I've made a release of the spawner already at https://github.com/yuvipanda/jupyterhub-multicluster-kubespawner, and am waiting for cloud credits to land before I can do a deployment.
2i2c team sprint meeting notes:
Update: pinning this one for a bit
@yuvipanda and I just had a conversation about this work, and we agreed that it'd be best to prioritize some other development efforts before we complete this one, especially since the LEAP hub needed to be deployed quickly, so we just did it "the old fashioned way". We're going to focus on these two pieces
And will revisit this one at a later date.
Description of problem and opportunity to address it
Problem description
When communities have datasets or resources that are spread across multiple cloud locations (across data centers, cloud providers, etc.), they currently must deploy one JupyterHub per location to provide access to the cloud resources in each one.
This creates a few problems:
Proposed solution
We should make it possible for a single hub to launch interactive sessions in multiple cloud locations, not only in the location where the hub is running.
This would allow communities to use a single hub as a "launch pad" for other kinds of infrastructure that are out there. It would reduce the complexity of running multiple hubs at once, and is potentially a way for communities to divide up their interactive sessions across billing accounts.
Implementation guide and constraints
Tech implementation
One likely candidate to make this possible is to define a new JupyterHub Spawner that knows how to talk to other Kubernetes clusters, along with some kind of process that can live on those clusters and "listen" for requests to launch interactive sessions. Then the spawner would request a session on a remote cluster, and direct the person there.
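As a rough illustration of the flow described above, here is a minimal Python sketch of the "request a session on a remote cluster, then direct the person there" step. It does not use the JupyterHub Spawner API or any Kubernetes client. Every name in it (`Cluster`, `LaunchRequest`, `MultiClusterLauncher`, the `/sessions` path, and the example URLs) is a hypothetical placeholder, not something defined by this issue or by an existing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """A remote cluster the hub may launch sessions on (hypothetical model)."""
    name: str
    listener_url: str  # assumed: where the per-cluster "listener" process accepts requests
    public_url: str    # assumed: base URL users are directed to once a session exists


@dataclass(frozen=True)
class LaunchRequest:
    """What the hub-side spawner would send, and where it would redirect the user."""
    cluster: str
    username: str
    endpoint: str
    redirect_url: str


class MultiClusterLauncher:
    """Maps a user's chosen cluster name to a launch request (sketch only)."""

    def __init__(self, clusters):
        self._clusters = {c.name: c for c in clusters}

    def build_request(self, username, cluster_name):
        # Refuse cluster names the hub has not been configured with.
        if cluster_name not in self._clusters:
            raise ValueError(f"unknown cluster: {cluster_name!r}")
        cluster = self._clusters[cluster_name]
        return LaunchRequest(
            cluster=cluster.name,
            username=username,
            # The "/sessions" path is an assumption made for illustration.
            endpoint=f"{cluster.listener_url}/sessions",
            redirect_url=f"{cluster.public_url}/user/{username}/",
        )
```

A real spawner would additionally need to authenticate to each cluster, wait for the session to become ready, and handle the per-cluster storage question raised in the comments; none of that is modelled here.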
Considerations
Driving test cases
@rabernat has a need for a few hubs that are similar flavors of a Pangeo hub. These are attached to a few different pots of money. Rather than providing one hub per test case, we could use this as an opportunity to prototype the multi-cluster launcher described here.
Updates and ongoing work