-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the base & minimal notebook containers not amd specific (e.g. support building for arm64) #1368
Conversation
Let's run the CI first to make sure it works. |
Ok, so there is a problem, please, fix it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I do approve this PR, it's a first step in making jupyter images work under arm64 and I think the direction is right.
There are several things I would like to ask though:
-
Does it break something for current users (I mean, simple
docker pull
orFROM jupyter/some_image
) works as it was, right? I mean, it's ok to ask people building our images to havedockerx
. But I don't want to ask all the users to installdockerx
and hope it's still gonna work. -
Let's update docs or add the information to the docs, that we have some support of arm64-based images.
-
Does the tagging work well with 2 arches?
-
Let's not forget about build times - we need to be careful here because we will be building for the not-host platform.
-
Manifest creation needs to be fixed - I think if tags stay the same between images (they will stay the same if we use the same versions of python packages), then it makes sense to only change
Build manifest
sections here https://github.com/jupyter/docker-stacks/wiki because they are definitely gonna be different. -
Please, fix my comments and I would like this PR to have at least 2 approvals. So, @consideRatio @romainx @parente tell us what you think.
Makefile
Outdated
@@ -29,7 +30,7 @@ help: | |||
|
|||
build/%: DARGS?= | |||
build/%: ## build the latest image for a stack | |||
docker build $(DARGS) --rm --force-rm -t $(OWNER)/$(notdir $@):latest ./$(notdir $@) | |||
docker buildx build $(DARGS) --rm --force-rm -t $(OWNER)/$(notdir $@):latest ./$(notdir $@) --push --platform linux/arm64,linux/amd64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does --push
mean here?
I hope it doesn't actually push to the Docker Hub?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently if you use docker buildx
to build for multiple architectures there's no way of accessing any of the images without pushing them somewhere, but it has the big advantage of creating a manifest for the tag: this means anyone running docker pull image:tag
will automatically pull a matching architecture. You could for example push it to a local registry, which is what we do in JupyterHub:
The alternative is to build each architecture one at a time, and create the docker manifest separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! So, I think this PR will also have to add a local registry to make it work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So provided the user sets the "owner" variable to an account that they own it should work without needing to setup a local registry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we want to be able to build images without pushing them to the real docker hub.
That's how it's working in the CI right now.
We only push to the hub, after the merge to master (you can see it in github docker workflow).
So, to be able to do that, I think we will have to setup a local registry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@manics how do you manage local docker image build?
You have to set up the localhost registry manually, right?
If I'm right, we will have to ask our developers, if they want to run the build locally, to run the registry first (it's quite easy I suppose).
That's fine by me, but I would like to know all the implications of this change before merging it.
And I think it's much better than having to force everyone to push to the hub, that's why I'm asking for changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, let me look into what setting up a local registry would be like (so far I've just always pushed to my own dockerhub account) :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example given by @manics is really useful.
They're dealing with the same problem as we.
In case anyone missed it there's a discussion in #1019 about ARM64 support |
vanilla docker pull "does the right thing" and pulls the matching arch.
Makes sense, I assume I should update the wiki would be done after we're all on board with the PR?
Docker tagging does if that's what we're referencing?
True, one option would be to let users specify which archs they want to build for instead of just forcing cross-build arm64 & amd64, what do you think?
I'm a little fuzzy on this part can you elaborate? Sorry relatively new to the project, thanks for your help :)
|
Makefile
Outdated
@@ -29,7 +30,7 @@ help: | |||
|
|||
build/%: DARGS?= | |||
build/%: ## build the latest image for a stack | |||
docker build $(DARGS) --rm --force-rm -t $(OWNER)/$(notdir $@):latest ./$(notdir $@) | |||
docker buildx build $(DARGS) --rm --force-rm -t $(OWNER)/$(notdir $@):latest ./$(notdir $@) --push --platform linux/arm64,linux/amd64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we want to be able to build images without pushing them to the real docker hub.
That's how it's working in the CI right now.
We only push to the hub, after the merge to master (you can see it in github docker workflow).
So, to be able to do that, I think we will have to setup a local registry.
docs/contributing/features.md
Outdated
OWNER={yourdockerhubusername} | ||
make build/somestack-notebook |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work, because make will still use OWNER?=jupyter
line instead of a variable you set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah I'll need to update the owner refs (per https://stackoverflow.com/questions/24263291/define-a-makefile-variable-using-a-env-variable-or-a-default-value ) as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably won't need that after dealing with local registry as suggested above.
@holdenk regarding #1368 (comment). Can you update that comment so that it is clear what you say in response to what @mathbunnyru sais? |
Thanks, sorry I forgot about the github formatting, updated the comment. |
I think we were not actually telling anything about the arch of our images.
Yes, I meant docker tagging.
Actually, for now, let's first try to make it work on GitHub - for that you will probably have to change our docker github workflow. Then we'll see how much it takes to build the images.
https://github.com/jupyter/docker-stacks/tree/master/tagging |
The most important thing, for now, is to make it work in GitHub actions (creating a registry like in the jupyterhub example should work and dockerx issues I suppose). |
Makes sense, I'll give it a shot with getting it working in GHA :) |
@holdenk please, tell me, if you need a hand with this PR. |
Thanks @mathbunnyru I hurt my hand on Friday so I've been running a little slower than planned. I'm going to make this default to local registry as we discussed. |
Actually, now that I look at the "build docker image" CI step which naturally doesn't have mixed arch support, I think I'll switch this is use buildx with single image mode by default and then for "publishing" people can switch to registry. |
Good luck with the recovery process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, tell me when it's ready to review, I will convert it to a draft for now.
@mathbunnyru I think it should be ready for review, it works for me locally and it looks like it works in CI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then I will review the current state.
This PR seems to add arm support, but it actually doesn't add it.
What I mean by adding arm support:
- images are built both for x86_64 and arm
- it's possible both locally and in the GitHub CI
- images are pushed to the Docker hub
- wiki build manifest is updated
- docs are updated to represent that we support 2 arches
I mean, it's great, that with this small change we can build arm images.
But until we make it work ass I told I can't even tell if your approach is gonna work or not.
Thanks for the reminder about the documentation. I've updated it. The language isn't quite what you suggested since I don't have permission to push to the docker hub, and I don't want to get people confused trying to use the images before they are pushed. So it can push if you follow the instructions in the For "wiki build manifest is updated," I'd be happy to do that, but I don't know what you are asking for to be updated in the build manifest wiki. For "images are pushed to the Docker hub," I don't have permission to push images to your docker hub account, but I'm hoping showing that they can be pushed to my account as above does the trick. Building ARM images inside x86 containers on GithubActions is probably doable, but I'm not the person for that task. Similarly, it looks like it's possible to get ARM GitHub action runners, but I don't work with Azure cloud, so I'm not the person for that task either. Also, both these tasks sound painful and do not spark joy for me. I want to think this PR is useful in and of itself, e.g., people can use it to build cross-platform images, and the existing single arch still works as is, but if you want to wait for someone better at qemu magic with GitHub actions, I totally get it. |
It sounds like you might not be interested in this PR @mathbunnyru and I totally get that. I'm happy to maintain this in my own fork although of course, I'd rather upstream it. |
After the merge to master in this repo, the commit is built once again and it's pushed to the DockerHub in a fully automatic way. And, to make it easy for our end users to use these images it would be best to push it and not force them to clone this repo and build images ourselves. Also, if we update some image (for example, base image) and it stops working for arm, we won't know about it. Hope you will understand my point of view.
For each image, we create a build manifest here: https://github.com/jupyter/docker-stacks/wiki
Yes, I think we have to build these images on GithubActions using runners / or probably qemu (as other people do).
That's a bit sad to hear, given the time I spent to review these files and thinking about what to do best :( |
@parente @consideRatio @romainx how do you feel about this change? |
This is exactly what |
Yes, thanks, I've taken a look the last time you gave this link and I was hoping to see something similar in this PR as well. |
Yeah I also do not want to have separate x86 and ARM dockerfiles. I was thinking of keeping just one set of files but having it only cross-build the docker containers where the porting was done as a way of allowing incremental development because I'm not interested in doing all of the things, just some of the things. That being said, I would love to hand this over to someone else to continue forward who has more experience with the CI & R & binder 👍 Happy to leave the PR open or closed whatever's best. |
…NER so that buildx can build arm64 and amd64 cleanup
This reverts commit a8d9c46.
… and add a comment about how to do cross-platform builds.
… add ARG OWNER to the Dockerfile's so people can more easily push to their own, add docker buildx & ARM64 support to the CI
So I've redone it a little bit to hopefully be more amenable to incremental development. The github actions change is mostly copy-pasta though and I need someone to sign-off on that running to see if it does what y'all want. I know there was the hope that porting all of the containers could be easily done in one go, but there are package conflicts in the base R container & the scipy container on ARM that need to be resolved. If anyone is looking for (unofficial) cross-built arm64/amd64 containers you can get them from https://hub.docker.com/repository/docker/holdenk/minimal-notebook :) |
.github/workflows/docker.yml
Outdated
@@ -67,6 +88,3 @@ jobs: | |||
with: | |||
username: ${{secrets.DOCKERHUB_USERNAME}} | |||
password: ${{secrets.DOCKERHUB_TOKEN}} | |||
- name: Push Images to DockerHub | |||
if: github.ref == 'refs/heads/master' || github.ref == 'refs/heads/main' | |||
run: make -C main push-all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm I have a hard time reviewing this PR. I think we all have struggled a bit on finding a scope for a PR to work towards publishing arm64 images. Perhaps we can narrow this PR to just update the base-notebook Dockerfile to be buildable on x86 and arm64?
If this PR is narrowed to the changes in the base-notebook Dockerfile, I'd be happy to approve this PR quickly, but for changes to the CI system I'm far more hesitant. I consider the change above for example - I think we would end up no longer doing the previous CI sequence of build->test->push which should be unchanged no matter if we build / publish a new kind of architecture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OMG I'd love to backout the CI changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about for the makefile I'll switch build & build-all back to their original behaviour, but leave in build-cross for folks who want to build cross-compiled images?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha ;D If you are okay with it @holdenk, I'd be happy to strip this PR for relevant changes in another PR focused on CI stuff - making it come to use in the end. I have a local copy of this PR checked out on my computer atm.
For this PR though, I think it makes sense to not let the scope be broader than just making the base-notebook/Dockerfile be buildable on an arm64 architecture. It could be a single change, a single file.
Btw I've not read up on the purpose of OWNER arg or if its in scope for arm64/x86 stuff, but if it isn't I suggest excluding that as well at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah so the point of the OWNER arg is to allow people to build to their own dockerhub account. It's already in the MAKEFILE it just only affects the tagging not the FROM (so e.g. if you wanted to build your own base-notebook & minimal notebook on top of it that isn't supported without the owners ARG), but I'm happy to back it out too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying that! Let's go minimalistic and remove it from this PR as well, and let's get this PR approved and merged!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @holdenk! This LGTM!
@holdenk @mathbunnyru I suggest we squash this PRs commits on merge, as 16 commits going back and forth can make git history a bit hard to follow when using |
Yeah I think a squash merge makes sense here 👍 |
Ok, let's squash merge this. Hope someone will finish this work soon :) @holdenk thanks for your efforts, great job 👍 |
…pport building for arm64) (jupyter#1368) * Dynamically select the right miniforge arch and unpin the ROOT_CONTAINER so that buildx can build arm64 and amd64 cleanup * Remove the commented out hard set of arch * Address code review comments from mathbunnyru * Add setting the owner to respective dockerhbub username * Revert "Add setting the owner to respective dockerhbub username" This reverts commit a8d9c46. * Fix up the dockerfile, make the default buildx compatible with the CI and add a comment about how to do cross-platform builds. * Update the docs * Refactor the Makefile to support cross-building images incrementally, add ARG OWNER to the Dockerfile's so people can more easily push to their own, add docker buildx & ARM64 support to the CI * Simplify build-test-all rule * Match patch version * Run prettier on docker.yml * Declare and export seperately per docker lint * Skip CI changes * Revert the makefile changes * Update the Arch comment to match * back out unrelated changes
I run a small Kubernetes cluster with a mix of arm64 & x86_64 hosts so I can use Jupyter I changed the Dockerfile to support being cross-built. (for additional context see https://scalingpythonml.com/2020/12/12/deploying-jupyter-lab-notebook-for-dask-on-arm-on-k8s.html )
Even without the cross-build, this is still useful for anyone who wants to build the Jupyter container on any other non-x86_64 platform.
Right now this currently has two lists of containers: those that support cross-building & those who's Dockerfiles need to be updated to support cross-building.
make build-all
"does the right thing" in that it cross-builds containers which support cross-building and builds single arch containers for those that are not yet ported.