Rearchitecture #186
I plan to work on an ideal kubernetes implementation first, following the abstraction laid out above, then add hadoop support, then job queue support, redefining abstractions as needed (working from most -> least common deployment backend). Design changes will almost certainly occur as I work through issues, but I think we can make the kubernetes deployment look more "kubernetes native" without sacrificing our other backends or splitting the codebase.

cc @yuvipanda for thoughts on metacontroller/CRDs (apologies for the poorly spec'd design notes above, I'm still working out my thoughts).
This looks amazing, and is something I wanna see in the JupyterHub world too! I'm curious about the choice of JWT. They are awesome with short expiry, but present serious issues if they have longer expiry times. Unlike cookies with backing sessions, there is no way to revoke these if compromised without revoking all tokens. So no 'logout' is possible. http://cryto.net/~joepie91/blog/2016/06/13/stop-using-jwt-for-sessions/ has some more information on the problems with replacing cookies with JWT. But I don't actually know how clients pass this info to dask-gateway... At a cursory glance the rest seems great!
Glad to hear it!
I went with JWTs instead of cookies, because I wanted something we could trust that wouldn't require us to keep a user database in our application. Since the username is in the JWT, we only need to validate the JWT instead of doing a DB lookup. The issues you bring up are valid, but aren't as big of a deal for dask-gateway as they would be for JupyterHub. Authentication for the gateway is designed for no-human-in-the-loop authentication (e.g. no web form, only things like tokens, kerberos, etc...), so forcing a relogin for all users will only consume resources (extra calls to the authentication backend) rather than user time (since login is automated). I think this is ok for applications we care about.
Thank you for the explanation! The 'no human in the loop' definitely makes things different. To make sure I understand this correctly, I want to write out the example of how this would work with JupyterHub. To get a JWT you would:
Is this about right? If you wanna revoke any token, you would need to revoke all tokens. Where would this happen? The way I can think of is to rotate the signing key, but then you would need to rotate the key across everything that is validating JWTs, where it becomes a key distribution problem. It makes deployment a little more complex, at which point it is a tradeoff from having a database. This can also be mitigated by having

This makes me wonder - can the

I might also be talking from a fundamental ignorance of how dask-gateway auth works. If so, let me know and I'll happily trawl through the code instead :)
Yeah, that looks correct.
This is an interesting idea. I see the jwt lifetime as set relatively short (15 min by default perhaps, but configurable). Talking through the tradeoffs:
I think I like the JWT approach slightly more since the caching in this case is global across all instances of the server, which I find easier to reason about. Thoughts?
I'm currently hacking away on this; progress so far is good. One complication that's come up though is implementing our current user resource limits model (https://gateway.dask.org/resource-limits.html) on kubernetes. Currently we provide configurable limits on:
Requests for new clusters/workers that exceed these limits will fail. This is easy to do in a system where all state is kept on a single server (or when using a database), but much harder when trying to keep all state in kubernetes objects. For non-kubernetes backends we could keep or tweak the existing model as needed, but for kubernetes we'll definitely need to change things. A few options I see here:
Options 1 and 2 support user-level (or group-level) resource limits natively, but impose restrictions on namespace usage.

Right now I'm leaning towards Option 2 (or 1), but option 3 also isn't bad.

cc @yuvipanda, @mmccarty, @jacobtomlinson for thoughts.
Great news Jim! We are in favor of option 2. That also gives us the option to fall back to option 1 if needed.
Option 2 also sounds reasonable to me.
Creating a new namespace per user can get complicated and unwieldy on clusters that aren't solely dedicated to dask-gateway / Jupyter use. I like option 2, assuming you aren't creating namespaces there. You can also use ResourceQuota without namespaces - see https://kubernetes.io/docs/concepts/policy/resource-quotas/#resource-quota-per-priorityclass.
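For illustration, checking quota usage from the gateway with the kubernetes Python client might look roughly like this (the quota and namespace names are made up, not part of any existing deployment):

```python
# Hypothetical sketch: reading ResourceQuota usage before approving a new
# cluster/worker request. The quota and namespace names are illustrative.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

quota = core.read_namespaced_resource_quota(
    name="dask-gateway-quota", namespace="dask-gateway"
)
print("hard limits:", quota.status.hard)
print("currently used:", quota.status.used)
```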
@droctothorpe could you expand a little on the drawbacks? I think using traefik to reverse proxy the pod services doesn't have much to do with the cluster's service mesh. In this instance it sounds like traefik is effectively being used at the application level.
Having dug a little deeper into the Traefik documentation, @jacobtomlinson, my concerns were unwarranted. Deleting the comment. Thanks for the appropriate pushback.
I was looking for a possible way to set up authentication with JWTs and found this issue, but no other mention of JWT-based authentication. Is this work ongoing somewhere?
After a few months working on this, a few past design decisions are starting to cause some issues.
At the same time, the external facing API seems fairly solid and useful. I've been trying to find a way to rearchitect to allow (but not mandate) running multiple services, without a (mandatory) database to synchronize with, and I think I've reached a decent design.
Design Requirements
Below I present a design outline of how I hope to achieve these goals for the various backends.
Authentication
As with JupyterHub, we have an `Authenticator` class that configures authentication for the gateway server. If a request comes in without a cookie/token, the authenticator is used to authenticate the user, then a cookie is added to the response so the authenticator won't be called again. The decrypted cookie is a uuid that maps back to a user row in our backing database.

I propose keeping the authenticator class much the same, but using a JWT instead of a uuid cookie to store the user information. If a request contains a JWT, it's validated, and (if valid) information like user name and groups can be extracted from the token. This removes the requirement for a shared database to map cookies to user names - subsequent requests will already contain this information.
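For concreteness, a minimal sketch of issuing and validating such a token with the PyJWT library (the claim names, lifetime, and key handling here are illustrative assumptions, not the actual dask-gateway implementation):

```python
# Hypothetical sketch: stateless user tokens with PyJWT.
# Claim names, expiry, and key management are assumptions for illustration.
import datetime

import jwt  # PyJWT

SIGNING_KEY = "replace-with-a-real-secret"


def issue_token(username, groups, lifetime_minutes=15):
    """Encode user identity into a signed, short-lived JWT."""
    now = datetime.datetime.now(datetime.timezone.utc)
    claims = {
        "sub": username,
        "groups": groups,
        "iat": now,
        "exp": now + datetime.timedelta(minutes=lifetime_minutes),
    }
    return jwt.encode(claims, SIGNING_KEY, algorithm="HS256")


def validate_token(token):
    """Check the signature and expiry; return the embedded user info."""
    claims = jwt.decode(token, SIGNING_KEY, algorithms=["HS256"])
    return claims["sub"], claims["groups"]
```

Subsequent requests only need the signature check above - no shared database lookup - at the cost that a compromised token can only be invalidated by rotating the signing key before it expires.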
Cluster IDs
Currently we mint a new `uuid` for every created cluster. This is nice as all backends have the same cluster naming scheme, but it means we need to rely on a database to map uuids to cluster backend information (e.g. pod name, etc...).

To remove the need for a database, I propose encoding backend information in the cluster id. This means that each backend will have a different-looking id, but we can parse the cluster id to reference it in the backing resource manager instead of using a database to map between our id and the backend's.
For the following backends, things might look like:
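As a rough illustration of the idea (the id format below is an assumption for a kubernetes-style backend, not the actual scheme), encoding and parsing might look something like:

```python
# Hypothetical sketch of a backend-specific cluster id. The
# "namespace.name" format is an illustrative assumption only.

def make_kubernetes_id(namespace, name):
    # Everything needed to find the backing objects lives in the id itself.
    return f"{namespace}.{name}"


def parse_kubernetes_id(cluster_id):
    # Reverse the encoding; no database lookup is required.
    namespace, _, name = cluster_id.partition(".")
    return namespace, name


# Usage: go straight from a cluster id to the backing resources.
ns, name = parse_kubernetes_id("dask-gateway.abc123")
```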
Cluster Managers
To support multiple dask-gateway servers, I found it helpful to split our backends into two categories:
- Useful database in the resource manager
- No useful database in the resource manager
The former category could support all our needed functionality without any need for synchronization outside of requests to the resource manager. The latter would require additional infrastructure on our end if we wanted to run multiple instances of the gateway server.
Walking through my proposed ideal implementations for each backend:
Kubernetes
The proposed design for running dask-gateway on kubernetes in an ideal (IMO) deployment is as follows:
`dask-gateway-server` pods, or could be their own pods - right now I'm leaning towards the former for simplicity.

Listing clusters, adding new clusters, removing old clusters, etc... can all be done with single kubernetes api requests, removing a lot of the need for synchronization on our end. It also means admins can use kubectl without worrying about messing with a deployment's internal state.
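As a sketch of what those single api requests could look like from the gateway's side (the CRD group, version, plural, and label names below are made up for illustration, not the actual resources):

```python
# Hypothetical sketch: querying cluster state directly from kubernetes
# instead of a local database. The CRD coordinates (group, version,
# plural) and label names are illustrative assumptions.
from kubernetes import client, config

config.load_incluster_config()  # or config.load_kube_config() outside a pod
crds = client.CustomObjectsApi()

# List every dask cluster owned by a given user with a single API call.
clusters = crds.list_cluster_custom_object(
    group="gateway.dask.org",
    version="v1alpha1",
    plural="daskclusters",
    label_selector="gateway.dask.org/user=alice",
)
for item in clusters["items"]:
    print(item["metadata"]["namespace"], item["metadata"]["name"])
```

Since the authoritative state lives in the API server, the same information is visible to admins through kubectl.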
Hadoop
The Hadoop resource manager could also be used to track all the application state we care about, but I'm not sure if querying it will be performant enough for us. We likely want to support an optional (or maybe mandatory) database here. Some benchmarking will need to be done before implementation.
In this case, an installation would contain:
HPC job queue systems
HPC job queue systems will require an external DB to synchronize multiple dask-gateway servers when running in HA mode. With some small tweaks, this will mostly look like the existing implementation. I plan to rely on Postgres for this, making use of its `SKIP LOCKED` feature to implement a work queue for synchronizing spawners, and asyncpg for its fast postgres integration.
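For reference, a minimal sketch of the kind of `SKIP LOCKED` work queue described here, built on asyncpg (the table and column names are illustrative assumptions, not the real schema):

```python
# Hypothetical sketch of a postgres-backed work queue using SKIP LOCKED.
# Table and column names are illustrative assumptions, not the real schema.
import asyncio

import asyncpg


async def claim_next_task(pool):
    """Atomically claim one pending task; concurrent workers skip locked rows."""
    async with pool.acquire() as conn:
        async with conn.transaction():
            row = await conn.fetchrow(
                """
                SELECT id, payload
                FROM tasks
                WHERE state = 'pending'
                ORDER BY id
                LIMIT 1
                FOR UPDATE SKIP LOCKED
                """
            )
            if row is None:
                return None
            await conn.execute(
                "UPDATE tasks SET state = 'running' WHERE id = $1", row["id"]
            )
            return row


async def main():
    pool = await asyncpg.create_pool("postgresql://localhost/dask_gateway")
    print(await claim_next_task(pool))


asyncio.run(main())
```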
Cluster backend base class
Currently we define a base `ClusterManager` class that each backend has to implement. Each dask cluster gets its own `ClusterManager` instance, which manages starting/stopping/scaling the cluster.

With the plan described above we'll need to move the backend-specific abstractions up higher in the stack. I propose the following initial definition (will likely change as I start working on things):
Things like `kubernetes` and maybe `hadoop` would implement the `ClusterBackend` class. We'd also provide an implementation that uses a database to manage the multi-user/cluster state and abstracts on a single-cluster class (probably looking the same as our existing `ClusterManager` base class).