Migrate UToronto hub to our pilot-hubs repository #638
Comments
We've had another incident on this hub that reminded us again that we have an imbalance of team access to this infrastructure. I think we should try to make this move ASAP or this will keep happening. I've added some to-dos to the top, and we can prioritize this on the backlog. I've also reached out to the U. Toronto people to ask if there's a time window with reduced activity when we could try to make the switch. |
I've heard back from U. Toronto. They said that Canadian Thanksgiving is on October 11th (a Monday), so if we wanted to, we could plan a migration at that time. Otherwise we can wait until later. They had a few questions:
Questions
So two questions for you all:
|
We had a recent request from them about shared directories that are natively supported in the new infrastructure and missing from the current deployment. |
I've updated this issue with some new migration steps, as well as some things that we could do sooner than later since they likely won't be as disruptive. |
Update: preferred date for migration
Just FYI, it sounds like the preferred date for migration (if we assume there will be down time) would be between December 13th and 15th. |
This is pretty close to the "winter break" I presume a lot of @2i2c-org/tech-team members will take. We should keep that in mind at the time to execute on this one, IMHO. |
@damianavila similar for UofT. We want to get the update sorted before most of our staff take their winter break as well. Hope the timing lines up. |
I think that if we can take some baby steps in this direction before doing a full migration (like setting up a repo2docker image repository), it might also reduce the uncertainty and effort around the hub migration itself. |
I've gone through the old repo's config and figured out a more concrete set of actions. I've updated the issue with a task list: tasks that can be done before Dec 13, and then the migration itself during that window. |
@GeorgianaElena is going to give deploying the Azure infra a shot in this sprint |
I've done this for the UToronto repo, splitting https://github.com/2i2c-org/utoronto-image/ off https://github.com/utoronto-2i2c/jupyterhub-deploy/ Ref 2i2c-org#638
The migration happened this morning 🎉 There are still a few boxes that need to be checked, and I also added some additional ones in the |
Amazing - thanks for the update @GeorgianaElena and bravo on all of the progress we've made on migration already! Do you think we are on track to have all of these done before next Wednesday? Trying to figure out if our original sprint estimate is reasonable. I also opened up 2i2c-org/team-compass#331 to track some blog post / docs that I'd like to work on that will pull from the stuff we did here as well, so I may ask y'all for some help giving me guidance there! |
This is so great @choldgraf! Happy to help with anything I can.
My response comes a bit late, sorry about that. But I believe the only things remaining to do now are these two items:
I was thinking that maybe the |
Question: are we still running the old cluster VMs?
We got some feedback from the U. Toronto folks that they saw a noticeable increase in cloud costs after the migration. I looked at Grafana but don't see anything out of the ordinary there. To that end, a few questions:
|
I think there are two possible causes for the cost increase:
(2) is hopefully not a huge component, and we can get by with just finishing up (1) |
@yuvipanda, are you aware of anything important that we'd need to save before destroying the cluster and NFS VM? |
@GeorgianaElena nah let's bust it |
Thanks @yuvipanda! I will hit |
I believe that the old cluster has now been deleted, and I've updated the Toronto team with this information. If so, can we consider this project complete??? 🚀 |
Yes! I just deleted the last remaining bit of the original Azure resource group! So let's close this 🎉 🚀 |
congrats @GeorgianaElena :-) |
Indeed, huge congratulations @GeorgianaElena! |
Problem statement
We currently deploy the University of Toronto hub via a dedicated GitHub organization and repository (https://github.com/utoronto-2i2c/jupyterhub-deploy/), using hubploy for deployment. This pre-dates our pilot-hubs/ infrastructure, which is why it is special-cased.
The problem is that, because it is on a dedicated repository / deployment infrastructure, it:
Solution
We should move this infrastructure to be deployed from pilot-hubs/, the same as any other Azure deployment (see #288 for one example). This will involve a few steps, described below.
Step 1: Set up new hub infra and prepare
We now have Azure terraform support in this repo (#800) and can set up a new hub infra with that, to run in parallel with the current setup. The primary infra difference will be the use of Azure Files for home directories rather than the hand-spun NFS VM we have there right now. I think this is a big positive change.
Step 1 is all the things we can do that don't involve any downtime, and can be done early on.
When these tasks are completed, I think we should have a staging hub that can be used by UToronto, but runs out of this cluster.
Step 2: Migrate production hub
This should be done between Dec 13-15.
TADA! New hub!
Step 3: Clean up any loose ends
jupyterhub.log ➡️ gzip it, and leave it in the new hub db dir. We can do usage analysis on this later, just gotta save it.
how to migrate a hub
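For the jupyterhub.log item above, here is a rough sketch of what the "gzip it and keep it" step could look like in Python. The function name `archive_log` and the paths in the usage note are made up for illustration; the real log and db dir locations depend on how the old and new hubs are set up.

```python
import gzip
import shutil
from pathlib import Path


def archive_log(log_path, dest_dir):
    """Gzip log_path into dest_dir and return the path of the archive.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    log_path = Path(log_path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive_path = dest_dir / (log_path.name + ".gz")
    # Stream the copy so a large log never has to fit in memory.
    with open(log_path, "rb") as src, gzip.open(archive_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return archive_path
```

Something like `archive_log("jupyterhub.log", "/path/to/new-hub-db-dir")` (placeholder paths) would leave a `jupyterhub.log.gz` next to the new hub's database. The original file is not deleted, so it can be checked against the archive before the old VM goes away.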
Appetite
2 sprints
The final hub transition should be completed on Dec 13th or so, which means that we should do all of the prep work needed before then. It is hard to know exactly how much work this will entail until we dig into the current U. Toronto hub setup and see what is different from our infrastructure/ setup, so we'll give 2 sprints to this process.
Information
Issues board for this project