-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dask-gateway compatibility #87
Conversation
This is some great work, @bolliger32! And it warms my heart to see PRs with unit tests in them. You mention "The get_*_cluster commands should be the exact same for almost all use cases." Are there any gotchas our use cases that we might need to watch for with this PR? One potential change on quick look, you have a Regardless, thanks for putting this together! |
Haha thanks! As for the
One other concern with dask-gateway of note - you do still end up passing credentials over the cluster which I would imagine has some potential security issues? I don't really know of a better way to do this... and we were already doing this via the GCSFUSE_TOKENS env var. One concern with our current dask-gateway deployment is that if someone displays the HTML repr of a cluster options object in the notebook, it has a text field that contains their credentials. I don't think this information ever makes it into the json of the notebook so wouldn't get saved in a repo accidentally, but it does seem a little strange. I don't see a great way to pass credentials that would avoid this - and users would never interact with one of those options objects unless they are bypassing the rhg_compute_tools framework to use dask_gateway directly. So I don't think it's a big issue. But just something I wanted to flag in case you thought it was a bigger issue. As for the TODO...I was originally hoping to spin up an actual kubernetes cluster inside the CI (using k3s) and then deploy dask gateway on the kuberentes cluster (rather than using a local backend for dask-gateway). This would allow for some more thorough testing that options were getting passed correctly (some options don't actually have an effect in the local cluster (like changing the assigned worker memory size for some reason, which confuses me...). But that seems like a fair amount of work and potentially something to do for a future PR. In fact, I'll file an issue for that now :) (#88 ) |
And then I think it's worth a v0.3.0 tag when this is merged in. |
test gh actions
@brews would love your post-holiday take on the GH actions I've set up to migrate from travis, as I'm still figuring these out. I've got two workflows. One for every push and pull request, which builds and runs tests. And then one for every release, which just builds and pushes to pypi. I'm definitely learning both GH actions and python package publishing as I'm going, so let me know if things don't look right. I think I've tested out the release workflow though and it seems to successfully push to pypi. Right now theres a v1.0.0a2 release up there which was uploaded when I created a release. I did not include any testing in the release workflow b/c I assume that would only be implemented following a merge into master, so all of the tests would already have been run. But we could change this if you think better to include tests there too. Just two files to look at really - the two under |
Merged in order to be able to Mark current images as stable. The GH actions still could use a look over. We can update those via a follow up PR if needed |
flake8 rhg_compute_tools tests docs
Reworked some of the
kubernetes
module to allow for the use of dask-gateway when it is available (currently only set up to run with a coastal image on adrastea). There will be some related PR's to the helm-chart and docker_images repos.Tried to keep the user-facing api as similar as possible. The
get_*_cluster
commands should be the exact same for almost all use cases.Testing is somewhat limited b/c not all of the options you can pass to a cluster object with a kubernetes backend are available for a local backend (which is what is used on travis for the tests). But I think it's a start.
I think there was a black update that changed the autoformatting behavior a bit. I ran black on everything and it changed a few other files as well (sorry about that). I had thought everything had had black already run on it, but I think the version change may have introduced a change in behavior.
EDIT 12/10/20: Added "tag" argument to supersede #80