Carbonplan: Provide service account for dask workers and notebook pods #492

Closed
1 of 5 tasks
Tracked by #291
jhamman opened this issue Jun 29, 2021 · 3 comments · Fixed by #618
Labels
Enhancement An improvement to something or creating something new.

Comments


jhamman commented Jun 29, 2021

Summary

We would like to be able to write to S3 from Dask workers. Ideally, this would use a service account that is shared between notebook and worker pods.

User Stories

  • As a Dask Gateway user, I want to read from and write to S3 without having to manually pass credentials around
  • As a hub administrator, I want to make sure people don't have to manually pass security credentials to workers

Acceptance criteria

  • The following pseudo code should work:
import xarray as xr
from dask_gateway import GatewayCluster

cluster = GatewayCluster()
cluster.scale(4)
client = cluster.get_client()

ds = xr.open_dataset('s3://path-to-zarr-store', engine='zarr')  # currently works only for public data
ds.to_zarr('s3://path-to-another-store')  # currently fails because workers don't have write credentials
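
To see why the write fails on the workers, it can help to check which credential-related environment variables each pod actually has. The following is a minimal sketch, not part of the original issue: the helper name `credential_env_status` is hypothetical, and the variable list is an assumption covering the static-key variables plus the two that IAM Roles for Service Accounts injects into pods. It reports only whether each variable is set, never its value.

```python
import os


def credential_env_status(env=None):
    """Report which AWS credential variables are set, without exposing values.

    `env` defaults to the current process environment; passing a dict makes
    the helper easy to test. (Hypothetical helper, for illustration only.)
    """
    # Assumed list: static keys, plus the variables injected when a pod runs
    # under a service account bound to an IAM role.
    names = (
        "AWS_ACCESS_KEY_ID",
        "AWS_SECRET_ACCESS_KEY",
        "AWS_ROLE_ARN",
        "AWS_WEB_IDENTITY_TOKEN_FILE",
    )
    env = os.environ if env is None else env
    return {name: bool(env.get(name)) for name in names}


# Status for the current process (for example, the notebook pod).
print(credential_env_status())
```

With the `client` from the snippet above, `client.run(credential_env_status)` runs the same check on every worker and returns one result per worker address, so the notebook and worker pods can be compared side by side.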

Important information

Tasks to complete

I think this is the basic workflow but I've only done this on GCP.

@jhamman jhamman added the Enhancement An improvement to something or creating something new. label Jun 29, 2021
@choldgraf choldgraf mentioned this issue Jun 29, 2021
7 tasks
@damianavila
Contributor

@jhamman
Author

jhamman commented Jul 13, 2021

Hi folks! Wondering if there is anything specific we can do to help move this one forward? For most of our use cases, we've been able to work around this, but it is starting to hold up a few workflows.

@yuvipanda yuvipanda changed the title [AWS / CarbonPlan] Service account for dask workers [AWS / CarbonPlan] Service account for dask workers and notebook pods Aug 10, 2021
@yuvipanda yuvipanda changed the title [AWS / CarbonPlan] Service account for dask workers and notebook pods Provide service account for dask workers and notebook pods Aug 10, 2021
@yuvipanda yuvipanda changed the title Provide service account for dask workers and notebook pods Carbonplan: Provide service account for dask workers and notebook pods Aug 10, 2021
@choldgraf choldgraf assigned choldgraf and yuvipanda and unassigned choldgraf Aug 17, 2021
@yuvipanda
Member

@jhamman try this out now?

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this issue Aug 19, 2021
eksctl
[supports](https://eksctl.io/usage/iamserviceaccounts/#usage-with-config-files)
creating Kubernetes service accounts bound with
[IRSA](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html).
We create one with S3 access, and bind it to our notebook and dask
pods. This should give them full S3 access.

Remove separate eksctl cluster jsonnet object, since it was
not doing anything useful.

Stolen from 2i2c-org#436

Fixes 2i2c-org#492
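
For readers unfamiliar with the config-file approach the commit message describes, here is a rough sketch of what an eksctl cluster config with an S3-enabled service account might look like, generated from Python as JSON (which eksctl can read, since JSON is valid YAML). This is not the configuration from the referenced commit: the function name, cluster name, region, service account name, and namespace are all hypothetical placeholders, and attaching the AWS-managed `AmazonS3FullAccess` policy is an assumption based on the "full S3 access" wording.

```python
import json


def eksctl_config_with_s3_service_account(cluster_name, region, sa_name, namespace):
    """Build an eksctl ClusterConfig dict with one IRSA-bound service account.

    All argument values are supplied by the caller; nothing here is taken
    from the actual 2i2c deployment.
    """
    return {
        "apiVersion": "eksctl.io/v1alpha5",
        "kind": "ClusterConfig",
        "metadata": {"name": cluster_name, "region": region},
        "iam": {
            # IRSA relies on the cluster having an OIDC provider.
            "withOIDC": True,
            "serviceAccounts": [
                {
                    "metadata": {"name": sa_name, "namespace": namespace},
                    # Assumed policy: the AWS-managed full S3 access policy.
                    "attachPolicyARNs": [
                        "arn:aws:iam::aws:policy/AmazonS3FullAccess"
                    ],
                }
            ],
        },
    }


# Hypothetical placeholder values, for illustration only.
config = eksctl_config_with_s3_service_account(
    "example-cluster", "us-west-2", "user-sa", "example-hub"
)
print(json.dumps(config, indent=2))
```

The notebook and Dask worker pods would then need to run under the resulting service account (in Kubernetes terms, by setting `serviceAccountName` in their pod specs) for the role to take effect; how that is wired into the hub and Dask Gateway configuration is not shown in this thread.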