
Add support for additional cloud providers #88

Closed
aculich opened this issue Jun 14, 2017 · 81 comments

Comments

@aculich
Collaborator

aculich commented Jun 14, 2017

If you're interested in support for this software on AWS, Jetstream, or other cloud providers, please let us know here... or even better, send us a Pull Request with your contributions to getting the code working on your desired cloud provider!

We so far have heard interest in supporting Jetstream using the OpenStack Magnum API, as well as using kubeadm.

We also have heard interest in supporting AWS. Here are some links provided to us by our AWS reps:

https://kubernetes.io/docs/getting-started-guides/aws/
https://aws.amazon.com/quickstart/architecture/heptio-kubernetes/

@willingc
Collaborator

Thanks @aculich. For those who wish to help by submitting a PR, please confine vendor- or cloud-provider-specific changes to their own section within the https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/master/doc/source/create-k8s-cluster.rst file. We would like to keep the rest of the documentation vendor-agnostic. Thanks. Please let us know if you have questions.

@willingc willingc changed the title supporting additional cloud providers Add support for additional cloud providers Jun 20, 2017
@aculich
Collaborator Author

aculich commented Aug 7, 2017

@choldgraf and I tested Heptio based on pointers from our AWS rep. @yuvipanda mentioned kops as a direction the open-source community is moving in; however, it relies on having a DNS name already registered for its discovery process, which gets in the way of quick testing against a bare IP address.

Note that we also had to fall back to permissive RBAC permissions (which is not desirable in the long term) with our Heptio install: https://kubernetes.io/docs/admin/authorization/rbac/#permissive-rbac-permissions
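For reference, the permissive-RBAC workaround on that page boils down to a single cluster role binding (a sketch of the documented command; it makes the cluster wide open, so it is only appropriate for throwaway test clusters):

```shell
# From the Kubernetes RBAC docs: grant cluster-admin to the admin and
# kubelet users and to *all* service accounts. This effectively disables
# RBAC protection -- use only on test clusters.
kubectl create clusterrolebinding permissive-binding \
  --clusterrole=cluster-admin \
  --user=admin \
  --user=kubelet \
  --group=system:serviceaccounts
```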

There is more to do.... and we'll ask for input from folks at the UCCSC AWS User Group meeting today.

@willingc
Collaborator

willingc commented Aug 7, 2017

Nice to see work happening with Heptio, @aculich and folks.

@rdodev, do you know who would be a good contact if we have additional questions? ☀️

@rdodev

rdodev commented Aug 7, 2017

Hey @willingc happy to help and can be point person with any questions or issues relating to our AWS quickstart.

@willingc
Collaborator

willingc commented Aug 7, 2017

Thanks @rdodev. Good stuff happening at Heptio 😄

@choldgraf
Member

FWIW I really need to get something like this working on AWS within a week or so...otherwise we'll need to switch to something else for the bootcamp in early September. @aculich do you have time to give it another go with me this week?

@choldgraf
Member

@rdodev would you have a chance to do a live-chat with @aculich and I as we try to get k8s running on AWS? I'm helping teach a bootcamp to a buncha neuroscientists in early September and was hoping to run a k8s-based jupyterhub on AWS!

@willingc
Collaborator

@choldgraf I got the heptio tutorial https://aws.amazon.com/quickstart/architecture/heptio-kubernetes/ up and running the other day with no issues. I haven't had time to try with JupyterHub but kubectl and helm were working. Heptio's friday podcasts on YouTube are really good too. The first one basically walks you through the tutorial install.

@choldgraf
Member

choldgraf commented Aug 10, 2017 via email

@willingc
Collaborator

FYI. I used the new VM option FWIW.

@rdodev

rdodev commented Aug 10, 2017

Great to see things are working as expected, @willingc. One thing worth highlighting is that the AWS QS clusters are not "production-grade" and are only meant for testing/staging. Would be glad to help productionize (sic) your environment if and when you folks are ready.

@choldgraf
Member

I've got things running up to the point of the helm install. I followed the heptio guide and got my kubernetes machines running. Helm + kubectl are also installed. Here's the error that I'm getting:

helm install jupyterhub/jupyterhub --version=v0.4 --name=kube --namespace=kube -f config.yaml

    Error: User "system:serviceaccount:kube-system:default" cannot get namespaces in the namespace "kube". (get namespaces kube)
helm version
    Client: &version.Version{SemVer:"v2.5.1", GitCommit:"7cf31e8d9a026287041bae077b09165be247ae66", GitTreeState:"clean"}
    Server: &version.Version{SemVer:"v2.5.1", GitCommit:"7cf31e8d9a026287041bae077b09165be247ae66", GitTreeState:"clean"}

Any ideas?
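A common fix for that RBAC error with Helm v2, short of the fully permissive binding, is to give Tiller its own service account with the needed role (a sketch; the cluster-admin grant is still broad, and the account/binding names here are illustrative):

```shell
# Create a service account for Tiller, bind it to cluster-admin, then
# point Tiller at that account. Names are illustrative.
kubectl --namespace kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller-binding \
  --clusterrole=cluster-admin \
  --serviceaccount=kube-system:tiller
helm init --service-account tiller --upgrade
```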

@yuvipanda
Collaborator

yuvipanda commented Aug 11, 2017 via email

@choldgraf
Member

choldgraf commented Aug 11, 2017

Oh you mean from that PR that I created and have already forgotten that I created? whoops ;-)

that fixes the namespace error...now helm is hanging on install:

helm install jupyterhub/jupyterhub --version=v0.4 --name=kube --namespace=kube -f config.yaml --debug
   [debug] Created tunnel using local port: '61697'

   [debug] SERVER: "localhost:61697"

   [debug] Original chart version: "v0.4"
   [debug] Fetched jupyterhub/jupyterhub to /home/choldgraf/.helm/cache/archive/jupyterhub-v0.4.0+fb6fc47.tgz

   [debug] CHART PATH: /home/choldgraf/.helm/cache/archive/jupyterhub-v0.4.0+fb6fc47.tgz

been stuck on that last one for like 10 minutes, ended with:

Error: timed out waiting for the condition
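When an install hangs like that and then times out, the pod list and recent events usually say why (a sketch, assuming the release went into the `kube` namespace as above):

```shell
# List the pods Helm created and check for ones stuck in Pending or
# CrashLoopBackOff; recent events often name the blocker (unbound PVC,
# failed scheduling, image pull error, RBAC denial, ...).
kubectl --namespace=kube get pods
kubectl --namespace=kube get events --sort-by='.lastTimestamp'
kubectl --namespace=kube describe pod hub-abc123   # hypothetical pod name
```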

@choldgraf
Member

UPDATE: I got it working by running this command: https://kubernetes.io/docs/admin/authorization/rbac/#permissive-rbac-permissions

which @yuvipanda mentions makes the cluster insecure. I think there's a better solution coming soon but just putting this here for reference

@choldgraf
Member

OK I think I am close. Got jupyterhub deployed and everything with one snag:

It's not generating a public-facing IP address:

kubectl --namespace=kube get svc
   NAME           CLUSTER-IP      EXTERNAL-IP        PORT(S)        AGE
   hub            10.109.128.19   <none>             8081/TCP       3m
   proxy-api      10.96.110.230   <none>             8001/TCP       3m
   proxy-public   10.100.36.195   a72d589697ecd...   80:31656/TCP   3m

I'd assume that EXTERNAL-IP would have a proper IP address. I wonder if this is something about how my AWS instance is set up? Do I need to configure something special to allow public access?
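On AWS, a LoadBalancer service gets an ELB DNS hostname rather than a bare IP, and `kubectl get svc` truncates it in the table view; the full name can be pulled out with jsonpath (a sketch):

```shell
# Print the ELB hostname AWS assigned to the public proxy service;
# the table view truncates it (a72d589697ecd...).
kubectl --namespace=kube get svc proxy-public \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
```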

@yuvipanda
Collaborator

yuvipanda commented Aug 11, 2017 via email

@choldgraf
Member

boosh! a72d589697ecd11e7b8e202ffae2b2ec-945672095.us-west-2.elb.amazonaws.com

@choldgraf
Member

getting PersistentVolumeClaim is not bound errors...I think there's a fix for that in the guide IIRC

@yuvipanda
Collaborator

yuvipanda commented Aug 11, 2017 via email

@choldgraf
Member

apiVersion: v1
items: []
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
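That empty list suggests nothing is provisioning volumes for the claims. On a hand-rolled AWS cluster of that era, the usual missing piece was a default StorageClass backed by EBS; a sketch of creating one (the `gp2` name is illustrative):

```shell
# Define an EBS-backed StorageClass and mark it as the default, so
# PersistentVolumeClaims without an explicit class get gp2 volumes.
kubectl apply -f - <<EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
EOF
```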

@choldgraf
Member

So @yuvipanda and I chatted and it seems like this could be an issue for AWS. We need users to be able to have their own disks and it looks like this isn't something that comes by default.

@willingc when you got this up and running did you figure out a way to allow for people to have disks in their jupyterhub instance? @rdodev any thoughts on how one might enable this w/ the current setup?

@rdodev

rdodev commented Aug 12, 2017

@choldgraf I guess I'm not fully abreast of what the use-case architecture is for JupyterHub. Is it similar to tmpnb.org? If you have literature or diagrams, that would be greatly helpful.

@choldgraf
Member

hmmm, well there's lotsa docs describing JupyterHub and the tools it utilizes here:

https://zero-to-jupyterhub.readthedocs.io/en/latest/

As an example, a common use-case is a classroom setting. A teacher puts together a Docker image that contains all the requirements/dependencies/code/data etc needed for the class, and that image is served to students via JupyterHub. When students log in, kubernetes spins up a pod for them and attaches it to a persistent disk that contains the student's files (so that they can modify their notebooks and those changes will persist in time). It sounds like we're having trouble with the persistent-disk-attaching part.
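With dynamic provisioning in place, the per-user disks described above are requested through the chart's config.yaml; a sketch of the relevant keys (these names are from memory of the chart's options and may differ in v0.4 — verify against the chart's values.yaml before relying on them):

```shell
# Append per-user storage settings to the Helm config used above.
# Key names are an assumption for this chart era -- check the
# chart's values.yaml first.
cat >> config.yaml <<EOF
singleuser:
  storage:
    type: dynamic
    capacity: 10Gi
EOF
```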

@rdodev

rdodev commented Aug 12, 2017

@choldgraf great, thanks for the info. Let me look into it.

@rdodev

rdodev commented Aug 12, 2017

@choldgraf are the manifest files you've used in the master branch of the repo?

@choldgraf
Member

which repo? at this point I'm not actually working from any repo. just following the instructions post-kubernetes-install from here: https://zero-to-jupyterhub.readthedocs.io/en/latest/

(also just FYI I think that @yuvipanda will be of more help than I here, he's a lot better at debugging kube stuff)

@choldgraf
Member

pinging you @rdodev in case you're only paying attention to parts of this thread in which you're mentioned ;-)

@rdodev

rdodev commented Aug 14, 2017

@choldgraf no, never seen consistent failures w/ any type of instance. Those types of errors are usually on AWS' side.

@choldgraf
Member

ok, I'll give it a shot again...

@choldgraf
Member

hmmm...I got the same failure to create + rollback. @aculich have you experienced any issues like this on AWS before?

@rdodev

rdodev commented Aug 14, 2017

Strange. Are you trying to launch into an existing VPC? What's the exact errors you're seeing?

@choldgraf
Member

nope - I'm creating a new one (the button on the left in the guide). It was hard to pin down a specific error message, but it seemed like a subset of the machines being requested didn't succeed (like 3 out of 7) so the whole thing failed and rolled back...

One theory is that this is related to some kind of limit on my AWS account...not sure how to test that out though. This works fine for all the tN machines

@rdodev

rdodev commented Aug 14, 2017

@choldgraf
Member

hmm - we were requesting r3.large, which isn't listed on that page, so not sure what kind of limits it has. :-/

@rdodev

rdodev commented Aug 14, 2017

@choldgraf "All Other Instance Types | 20" — this is the total per region, so any other instances deployed in a different AZ of the same region will count against the quota.
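The regional instance cap can be checked from the CLI (a sketch, assuming AWS credentials for the region in question are configured):

```shell
# max-instances is the account-wide On-Demand instance cap for the
# region the CLI is pointed at (us-west-2 here as an example).
aws ec2 describe-account-attributes \
  --attribute-names max-instances \
  --region us-west-2
```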

@choldgraf
Member

Gotcha - yeah we were only requesting 7 so I guess this isn't the issue...hmmm, I can try and ask someone in a different part of the country to deploy w/ heptio and the same computational config

@rdodev

rdodev commented Aug 14, 2017

Let me give it a try :D

@choldgraf
Member

:-)

@rdodev

rdodev commented Aug 14, 2017

Spinning up a cluster with 7 x r3.larges as we speak. Will update when done (or error).

@rdodev

rdodev commented Aug 14, 2017

@choldgraf
[screenshots: stack parameters and successful create result]

Region: Oregon (us-west-2)

@choldgraf
Member

damnit!

@choldgraf
Member

I mean.....that's great! :-)

hmmm, OK I can give it another shot with us-west-2b. This makes me wonder if it is something with my account...

@rdodev

rdodev commented Aug 14, 2017

If your account is a child/sub-account, it's possible other users under the same umbrella account have VMs running in that region that are invisible to you (thus counting against the quota).

@choldgraf
Member

well either way, that's good news - let me send these instructions to another guy we're working with at UW and see if he can get the machines set up...I'm trying to do this so that we can use AWS + JupyterHub for a training camp in early September...so really it just needs to work for him :-)

@rdodev

rdodev commented Aug 14, 2017

@choldgraf so it worked, I presume? Please ping me if need be. Though I'm on Eastern time so probably won't check until tomorrow morning.

@choldgraf
Member

I still haven't got it working with r3 but it's working with the two machines... I'll let you know if my colleague can get it working. Thanks so much for your help! I'll report back w an update but either way I owe ya a 🍺 or two!

@choldgraf
Member

hey @rdodev - I wonder if you're still around for a quick question!

First off - the AWS deployments are working quite well, I think...thanks so much for the great guide/template and all the help!

A question: somebody is asking how to rescale their AWS cluster after deploying (specifically the "1-20" nodes). I looked through the guide but couldn't find a clear way to do this. Do you have any intuition for how to do this?

@choldgraf
Member

ping @arokem since he's interested in this

@rdodev

rdodev commented Aug 28, 2017

@choldgraf looking into this. Give me 1/2 hour or so to test solution.

@rdodev

rdodev commented Aug 28, 2017

The most graceful way is:

  1. Log in to the AWS console and go to CloudFormation.
  2. Find the stack you want to scale out (its name ends in a 12-character uppercase alphanumeric string; both stacks share the same prefix).
  3. Select that stack, then from the Actions menu click Update Stack.
  4. Click Next.
  5. In Parameters, change the value of Node Capacity to the desired value.
  6. Click Next twice.
  7. Confirm the change and click Update.

//cc @choldgraf
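The same scale-out can presumably be scripted with the CloudFormation CLI; the stack and parameter names below are assumptions (the quickstart's node-count parameter should be confirmed with `describe-stacks` first):

```shell
# Inspect the stack's current parameters, then bump only the node-count
# parameter, keeping everything else as-is. Stack and parameter names
# here are assumptions -- verify them against the describe-stacks output.
aws cloudformation describe-stacks --stack-name my-k8s-stack-ABC123XYZ456
aws cloudformation update-stack \
  --stack-name my-k8s-stack-ABC123XYZ456 \
  --use-previous-template \
  --capabilities CAPABILITY_IAM \
  --parameters ParameterKey=K8sNodeCapacity,ParameterValue=10
# Note: other template parameters may need explicit
# ParameterKey=...,UsePreviousValue=true entries so they keep their values.
```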

@arokem
Contributor

arokem commented Aug 28, 2017

Thanks! I will give this a try later today. I assume that other parameters can also be changed? For example, instance type, etc.?

@rdodev

rdodev commented Aug 28, 2017

@arokem it is possible, but that's a bit more complicated since changing instance type will nuke existing nodes and any data or workloads therein will be lost.

@choldgraf
Member

Hey all - as we now have more mature docs for a number of providers, I'm going to close this. If people would like to re-open, please feel free to do so! Though I think it'll be more useful to have issues for specific cloud providers we don't yet support, rather than one catch-all (especially since this one is quite long already!)

6 participants