Add authentication to HTTP API #6431

jacobtomlinson · 2022-05-24T11:11:54Z

Proposal for closing #6407. Feedback and suggestions very welcome.

This PR adds authentication to the HTTP API via a decorator that can be reused on any methods we want to put behind it.

To use API routes that modify cluster state the user must set the API key in the config at distributed.scheduler.http.api-key and then that same API key must be set in the HTTP Authorization header of the request curl -H "Authorization: Bearer {api-key}" http://localhost:8787/api/v1/retire_workers.

Auth can be disabled by setting distributed.scheduler.http.api-key to False which may be useful in deployments where auth is handled elsewhere.

But by default, the api-key is set to "" which will always result in a 403 Forbidden. So I've also re-enabled the HTTP API by default (reverting #6420) as the new authentication default is also secure.

I've added a number of tests to ensure this auth handling is robust, but suggestions on improving testing further are welcome.

Closes #6407

github-actions · 2022-05-24T12:46:30Z

Unit Test Results

      15 files ±  0       15 suites ±0 6h 37m 32s ⏱️ + 3m 48s
  2 811 tests +  2   2 729 ✔️ +1   80 💤 +1 2 ❌ ±0
20 839 runs +13 19 905 ✔️ +9 930 💤 +2 4 ❌ +2

For more details on these failures, see this check.

Results for commit 37eabda. ± Comparison against base commit 60f0886.

♻️ This comment has been updated with latest results.

ntabris · 2022-05-24T16:18:51Z

But by default, the api-key is set to "" which will always result in a 403 Forbidden

This affects dashboard as well?

jacobtomlinson · 2022-05-24T16:43:35Z

@ntabris no just the new HTTP API that was added in #6270, and only the routes that affect the scheduler, read-only routes will not require auth.

jacobtomlinson · 2022-05-26T09:07:10Z

@fjetter @gjoseph92 would you have some time to review this?

pentschev · 2022-05-26T15:57:15Z

rerun tests

Note: rerunning gpuCI tests since those errors should be fixed by #6434

gjoseph92 · 2022-05-26T16:51:05Z

I wonder if the list of which routes do vs don't require auth should be in the config? (Note this is basically what we had in #6408, with a better name). Seems like something security-conscious users would like to be able to set themselves.

jacobtomlinson · 2022-05-27T13:57:46Z

Thanks @gjoseph92. I think giving folks the ability to individually select which routes are authenticated is going too far in terms of configurability, folks will be easily able to shoot themselves in the foot by missing one out. This feels like unnecessary complexity.

My preference would be to leave read-only routes public and read-write routed with auth. Then if folks want the read-only routes to also have authentication then it is up to them to add another layer, in the same way we don't authenticate the dashboard today and folks commonly add something like Kong in front.

I could maybe be persuaded that we could add a simple config option for "use auth for all API routes", "use auth for read-write routes only" and "disable auth". But again this still feels like unnecessary complexity. This also opens us up to creep in terms of all the other routes on the HTTP server.

gjoseph92 · 2022-05-27T18:05:00Z

Sure. This seems fine as an incremental step.

I have a little hesitancy about adding auth in general—it seems like it should be out of scope for dask, and as you mentioned adding a proxy in front seems like better practice anyway—but I suppose basic auth like this is somewhat common practice (prometheus does it, for example).

I guess I'm less into the read vs read-write distinction; I don't feel like that's a security guarantee we actually make or test against strongly enough to be sure the "read" routes are actually safe. An all-or-nothing approach would probably be more intuitive to me—if you're in a security environment where you'd need auth on the API, you probably should have it on everything. (And most likely, you should actually not have dask's auth on anything, and instead have a proxy in front with (m)TLS termination, connection limiting, auth, and whatever else you want.)

But that's a harder incremental change because:

we want the API to be secure
we want the API to be on by default (is this true? if not, that makes everything way easier.)
we don't want accessing the dashboard to suddenly require auth by default

So what you have here probably makes sense as an incremental step, because it lets use the API more securely, and iterate on it, without needing to update dask-cloudprovier, dask-k8s, dask-gateway, Coiled, and every other deployment to suddenly handle authenticating the dashboard.

But I think long-term, there shouldn't be a split between read and read-write routes. Auth should be all or nothing, and any deployment system exposing things to the internet should handle the "all" case. I think this is the ticket we should probably be discussing and have a plan to implement. I'd be a bit disappointed if things just stayed in this split case forever; if we don't think we'll ever have full auth, then I'm not sure we should actually add it incrementally just for the API.

jacobtomlinson · 2022-05-30T12:33:22Z

I have a little hesitancy about adding auth in general—it seems like it should be out of scope for dask, and as you mentioned adding a proxy in front seems like better practice anyway

I'm a bit confused by this comment.

The Dask RPC has always been unauthenticated by default, with the option to enable mTLS to authenticate it. So auth is very much in-scope for Dask.

When I implemented the HTTP API I followed the same principle of having an unauthenticated interface by default, although I hadn't yet implemented any auth so we have disabled the API by default. This was a sensible precaution.

This PR adds that authentication and also makes it default secure, which is inconsistent with the RPC but I think a reasonable choice. One of the main worries raised in #6407 was dask-cloudprovider exposing clusters to the internet in the most basic deployment scenarios, to make that less risky the mTLS is enabled by default but we needed this PR to do the same for the HTTP API.

Ideally I'd like to rethink the whole approach of exposing to the internet in those scenarios though.

I guess I'm less into the read vs read-write distinction; I don't feel like that's a security guarantee we actually make or test against strongly enough to be sure the "read" routes are actually safe. An all-or-nothing approach would probably be more intuitive to me

Routes like the /health endpoint might cause us a headache here as they must be available without authentication.

You also mention "we don't want accessing the dashboard to suddenly require auth by default" which I totally agree with. But the dashboard is a read-only endpoint that returns HTML, the read-only routes here in this API return the same information as JSON. So those goals are in conflict.

I would be more than happy to move more API routes under authentication, but some things like health and the dashboard must remain outside of it.

Again this is another reason why I want to rethink the exposing to the internet defaults in dask-cloudprovider as we currently make the dashboard available to anyone.

we want the API to be on by default (is this true? if not, that makes everything way easier.)

Ideally, I would like it to be on by default just as the RPC is.

Alternatively, I would like it to be easily enabled via a boolean config option. Currently to enable it you must configure a full list of HTTP routes and include the HTTP API in this list. This requires looking up the defaults and keeping that list up to date which can be done in dask-kubernetes but is cumbersome.

This is extra unpleasant for users who want to enable it manually via their .yaml files.

we want the API to be secure

I would nuance this to say we want the API to be able to be secure, just as the RPC can have TLS enabled.

I'm struggling a little to reason here about the HTTP API given the current security posture of the RPC. Being consistent and being default-secure are conflicting goals.

I do think that either making the HTTP API secure by default or disabled by default is a good practice. Maybe we should consider doing the same with the RPC?

I'd be a bit disappointed if things just stayed in this split case forever; if we don't think we'll ever have full auth, then I'm not sure we should actually add it incrementally just for the API.

Again I'm very conflicted here. What do you mean by full auth? Auth on every HTTP endpoint? Auth on the RPC? A common auth on both? Kerberos and other third-party auth plugins like dask-gateway has?

Generally I think the approach here is more than adequate for simple use cases, and I'm happy to move more read-only routes under the auth if there are genuine security concerns about them. Then for advanced use cases folks can turn it off and put an auth proxy in front.

jacobtomlinson added 4 commits May 24, 2022 11:57

Add authentication to HTTP API

a81f3de

Revert prettier formatting

f31dfca

Fix code language

85f38ab

Reinstate API by default but disallow all auth by default

a723ef2

jacobtomlinson mentioned this pull request May 24, 2022

Don't expose insecure HTTP API #6407

Open

2 tasks

jacobtomlinson added 2 commits May 24, 2022 14:36

Default API key to empty string which also always gives 403

f8fef3b

Flip if statement to simplify things

37eabda

jacobtomlinson mentioned this pull request Jul 25, 2022

Add Jupyter Server to Dask Scheduler #6737

Merged

jacobtomlinson requested a review from fjetter as a code owner January 23, 2024 10:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add authentication to HTTP API #6431

Add authentication to HTTP API #6431

jacobtomlinson commented May 24, 2022 •

edited

Loading

github-actions bot commented May 24, 2022 •

edited

Loading

ntabris commented May 24, 2022

jacobtomlinson commented May 24, 2022 •

edited

Loading

jacobtomlinson commented May 26, 2022

pentschev commented May 26, 2022

gjoseph92 commented May 26, 2022

jacobtomlinson commented May 27, 2022 •

edited

Loading

gjoseph92 commented May 27, 2022

jacobtomlinson commented May 30, 2022

Add authentication to HTTP API #6431

Are you sure you want to change the base?

Add authentication to HTTP API #6431

Conversation

jacobtomlinson commented May 24, 2022 • edited Loading

github-actions bot commented May 24, 2022 • edited Loading

Unit Test Results

ntabris commented May 24, 2022

jacobtomlinson commented May 24, 2022 • edited Loading

jacobtomlinson commented May 26, 2022

pentschev commented May 26, 2022

gjoseph92 commented May 26, 2022

jacobtomlinson commented May 27, 2022 • edited Loading

gjoseph92 commented May 27, 2022

jacobtomlinson commented May 30, 2022

jacobtomlinson commented May 24, 2022 •

edited

Loading

github-actions bot commented May 24, 2022 •

edited

Loading

jacobtomlinson commented May 24, 2022 •

edited

Loading

jacobtomlinson commented May 27, 2022 •

edited

Loading