New Hub: OceanHackWeek 2021 #549
Comments
@choldgraf please let us know when/how we can test it. (Folks are getting anxious to pre-test their tutorials on the hub.) |
Sounds good - will try and deploy the hub tomorrow. (We are all on a European time zone currently) |
(also just to clarify, the target start date listed was the 28th, do you need the hub earlier than this?) |
If we can get it on the 27th, tomorrow, it would be nice so we can make the instructors test their notebooks against it. The 28th would be tight but it works too. |
Not quite ready to close this yet! We need confirmation from @ocefpaf that all seems well :-) @ocefpaf see the hub URL above (https://ohw.pilot.2i2c.cloud/) and confirm you can log in etc! |
Awesome! I was able to log in (super fast) and I'll play with it ASAP. I'll probably return with tons of questions. I'll try to read the docs first ;-p |
Edit: Sorry, read the docs and doing it now. |
I hope the lack of extra questions means you figured out how to do stuff as an admin, and not that things have gone down in flaming glory 😬🔥 also I added @GeorgianaElena on this one to track who is working on this hub deploy! |
Yes. Surprisingly easy to manage so far. Great work! I'll experiment with adding packages today.
Good to know. Thanks @GeorgianaElena! |
@ocefpaf is this hub now ready to go from your end? we'd like to close out this issue if all looks OK |
Yes. Please close it. I cannot because @abkfenris created it. Any comments/feedback @abkfenris ? |
We have some late breaking issues with Dask, but that may be a package we need in our image. |
You can experimentally change the image deployed to your hub at https://ohw.pilot.2i2c.cloud/services/configurator/. After building and pushing your image, try the new image tag there? Some preliminary docs at https://pilot.2i2c.org/en/latest/admin/howto/configurator.html |
Ya, we've been playing with adjusting the image in configurator as we get requests for new packages. I think we were missing It would be sweet if there was a webhook endpoint for the configurator we could use to adjust the image, or if we could do gitops-ish things against https://github.com/2i2c-org/pilot-hubs/blob/cc71cbd47bf79c90e96a86d2983bfaed51ba3703/config/hubs/2i2c.cluster.yaml#L108-L110 |
After
from dask_gateway import GatewayCluster
cluster = GatewayCluster()
cluster.scale(4)
it can take about 5 min to scale up since we basically I've done some work trying to slim down the image (it's 5.5 GB now), but it's mainly the variety of conda packages that our tutorials or dask users may need. The other way to speed things up would be to have images closer to the hub. From poking around the repo, that looks like it zone Does 2i2c have a Google Artifact/Container Registry that we could push images to? I'm also asking whether we have access to a Google Cloud project that we could use to run one ourselves. |
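To put a number on the scale-up delay described above, a small timing helper can be useful. This is a minimal sketch, not something from the thread: the function name `time_scale_up` and its `timeout` default are made up for illustration. It assumes only that the cluster object has a `scale(n)` method and the client has a `wait_for_workers(n, timeout=...)` method, which is the shape of `dask_gateway.GatewayCluster` and `distributed.Client`.

```python
import time


def time_scale_up(cluster, client, n_workers, timeout=600):
    """Request n_workers and return the seconds elapsed until all have joined.

    Hypothetical helper: `cluster` needs a `scale(n)` method and `client`
    a `wait_for_workers(n, timeout=...)` method.
    """
    start = time.monotonic()
    cluster.scale(n_workers)  # send the scale request
    client.wait_for_workers(n_workers, timeout=timeout)  # block until workers connect
    return time.monotonic() - start
```

On the hub this would presumably be called with `cluster = GatewayCluster()` and `client = cluster.get_client()`, then `time_scale_up(cluster, client, 4)`. Running it before and after a change to the image or node setup gives a rough comparison.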
@GeorgianaElena I'm getting a dead kernel when I try to load the dataset in the last line of this notebook: https://nbviewer.jupyter.org/gist/ocefpaf/d9253a4dcd74ee651bf55598044d9cf1 Everything works OK in a fresh pull of our image locally. |
@ocefpaf I'm guessing that's because you don't have enough RAM. Do you have a sense of how much RAM your notebooks might need? I think the default is pretty small (1G) and that might be it? I'm bumping it up to 4G for the duration of the workshop - turn your server on / off and give it a shot? |
4G sounds reasonable. I'm testing it and I'll get back to you. |
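One way to confirm which memory limit a running server actually has is to read it from inside the notebook. The sketch below is an assumption-laden illustration, not the hub's documented method: it assumes a Linux container that exposes its limit through the standard cgroup files, and the helper name `memory_limit_bytes` is hypothetical.

```python
from pathlib import Path

# cgroup v2 and v1 locations of the container memory limit (Linux only).
CGROUP_LIMIT_FILES = (
    "/sys/fs/cgroup/memory.max",
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",
)


def memory_limit_bytes(candidates=CGROUP_LIMIT_FILES):
    """Return the container memory limit in bytes, or None if none is found."""
    for name in candidates:
        path = Path(name)
        if not path.is_file():
            continue
        raw = path.read_text().strip()
        if raw.isdigit():
            return int(raw)
        # cgroup v2 writes the literal "max" when no limit is set
        return None
    return None


limit = memory_limit_bytes()
print("no limit found" if limit is None else f"{limit / 2**30:.1f} GiB")
```

If the printed value is well below what the dataset needs once loaded, a dead kernel on load is consistent with the container being killed for exceeding its limit.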
We want spinups of dask and notebook nodes to be much faster. Ref 2i2c-org#549
Unfortunately I won't be able to set up the node placeholders until later today. The quotas and stuff are set up tho, and I tested that we can scale up to at least 50 nodes |
A slow startup on the first day will help drive home the point that folks should log in early. I think getting crazy with Dask doesn't happen until the visualization session tomorrow, but we haven't structured our schedule around which exact packages/resources are getting used by what tutorial. |
Folks, we are hitting an odd issue. There is a data source, very common for oceanographic data, named OPeNDAP. It works locally on the same docker image, exactly the same packages, but it fails in the jupyterhub. The steps to reproduce are:
from netCDF4 import Dataset
url = "http://goosbrasil.org:8080/pirata/B19s34w.nc"  # any OPeNDAP URL will fail with an odd curl error.
nc = Dataset(url)
Any advice on how we can even debug this? |
Hmm, if I try to use |
If I try from our OPeNDAP server (which I have never actually used in anger before), that works for me:
ds = xr.open_dataset("http://www.neracoos.org/opendap/A0143/A0143.met.realtime.nc")
ds
nc = Dataset("http://www.neracoos.org/opendap/A0143/A0143.met.realtime.nc")
nc |
@abkfenris it's possible that port 8080 outbound is turned off, let me investigate |
@abkfenris @ocefpaf there was an outbound port restriction. I opened port 8080 and 22 (#576), and this seems to work now. |
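For this kind of failure, a quick way to separate a network restriction from a library problem is to try a plain TCP connection to the host and port from inside the hub. The following is a minimal sketch using only the Python standard library; the helper name `can_connect` is hypothetical and was not used in the thread.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False
```

For example, comparing `can_connect("goosbrasil.org", 8080)` with `can_connect("www.neracoos.org", 80)` from a hub terminal and from a local machine would show whether only the non-standard port is affected. A `False` result cannot by itself distinguish a blocked port from a server that is down, so the local comparison matters.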
ok so I've set up node placeholders (PR coming soon) to have 2 spare notebook nodes and 3 spare dask worker nodes, with the images pre-pulled. Can you test out dask spinup time now? |
Thanks so much Yuvi! |
@ocefpaf yw! How was the dask-gateway spinup time? |
I did not test it myself but the projects will start today and folks will report how it goes. I'll be sure to get back to you as soon as we know. PS: Quick question. What is the best practice to allow folks to create conda environments in the hub? Giving them permission to write at |
This is actually my preferred method - repo2docker does this too. Putting it in $HOME is probably just going to be super slow thanks to NFS. If their container goes wonky, they can simply restart the server. It won't persist past restarts though :( |
Good to know that not all my ideas are bad 😄 (I tried and it worked. Thanks!) BTW, we have two OPeNDAP servers in our demo that use port 8080. One worked, the other one (
@ocefpaf I can't access |
If we can skip the port 8080 one, I'd like to leave that be until after the workshop is over. That sound ok? |
Don't worry about it. It is a problematic server anyway. I'll try to re-write the example. (Although the whole point of that example is to show a bad data/metadata out there. And guess what? Now I have another point to make with it 😄) |
When a user creates an environment, I believe it just touches some metadata in |
Indeed. I believe it reads and updates the url.txt file in there. There is probably a way to make conda read that from another directory :-/ |
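One possible way to point conda at another directory is through its `pkgs_dirs` and `envs_dirs` settings, which can also be supplied as the environment variables `CONDA_PKGS_DIRS` and `CONDA_ENVS_PATH`. The sketch below is an untested suggestion for this hub, not something confirmed in the thread; the helper name and the scratch path in the usage note are hypothetical.

```python
import os


def conda_env_with_local_dirs(scratch_dir, base_env=None):
    """Return a copy of the environment that points conda at scratch_dir.

    CONDA_PKGS_DIRS and CONDA_ENVS_PATH are the environment-variable forms
    of conda's `pkgs_dirs` and `envs_dirs` settings.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CONDA_PKGS_DIRS"] = os.path.join(scratch_dir, "pkgs")  # package cache
    env["CONDA_ENVS_PATH"] = os.path.join(scratch_dir, "envs")  # new environments
    return env
```

The returned mapping could be passed as `env=` to `subprocess.run(["conda", "create", "-n", "demo", "python"], ...)`, with `scratch_dir` set to a local, non-NFS location such as `/tmp/conda` (an assumed path). As with writing into the image's own directories, anything created there would not persist past a server restart.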
so how did the hackathon go? |
I'm actually going to close this, as the hub itself is set up! I opened #595 to debrief and learn about how the hub went. |
Hub Description
OceanHackWeek (OHW) is a 4-day collaborative learning experience aimed at exploring, creating and promoting effective computation and analysis workflows for large and complex oceanographic data. It includes tutorials, data exploration, software development, collaborative projects and community networking.
We will be using the hub to teach tutorials and develop projects with both in-person (EST) and worldwide participants.
Community Representative
@ocefpaf
Important dates
Target start date
2021-07-28
Preferred Cloud Provider
No preference (default)
Do you have your own billing account?
Hub Authentication Type
GitHub Authentication (e.g., @MyGitHubHandle)
Hub logo
No response
Hub logo URL
No response
Hub image service
hub.docker.com
Hub image
uwhackweeks/oceanhackweek:28d1c7b
Extra features you'd like to enable
Hub Engineer information
The Hub Engineer should fill in the metadata below when it is available. The Community Representative shouldn't worry about this section, but may be asked to provide help answering some questions.
Deployment information
Hub ID:
ohw
Hub Cluster:
pilot
Hub URL:
ohw.pilot.2i2c.cloud
Hub Template:
daskhub
Actions to deploy