Support ARM architecture (multi-arch images) #1019

Closed
6 of 8 tasks
jiwidi opened this issue Feb 19, 2020 · 30 comments · Fixed by #1446
Labels
tag:Community Stack We recommend this contribution become a community stack. tag:Documentation Related to user, developer, and maintainer documentation type:Arm Issue specific to arm architecture type:Enhancement A proposed enhancement to the docker images

Comments

@jiwidi

jiwidi commented Feb 19, 2020

Hi!

I was recently working with some Docker images for JupyterLab to run on Raspberry Pis, and I was wondering whether an image like that could be worth including in this project, or whether it is of value to the Jupyter team. I would be very happy to work towards a contribution for this, but first I want to check whether there is value in it.

Thanks!


Update by Erik

#1399 has been merged - we have published a few amd64 and arm64 compatible images!

Remaining work

Update by Ayaz

scipy-notebook is waiting for conda-forge/bottleneck-feedstock#35
scipy-notebook and r-notebook are ready now to be built #1444
I will wait for the PR and update the tree above.

Update: scipy-notebook and r-notebook are now available for arm!

@parente
Member

parente commented Feb 22, 2020

@jiwidi Thanks for your interest in extending the Jupyter ecosystem. #899 has some discussion and movement toward supporting rpi images as community-maintained docker stacks. Perhaps you'd be interested in contributing to that effort?

@jiwidi
Author

jiwidi commented Feb 22, 2020

> @jiwidi Thanks for your interest in extending the Jupyter ecosystem. #899 has some discussion and movement toward supporting rpi images as community-maintained docker stacks. Perhaps you'd be interested in contributing to that effort?

I think it's a good idea for us to join forces. I left him a comment, so let's see how it goes :)

Thanks for the referral!

@AvverbioPronome

@jiwidi don't these images run there?

What's the issue? pip takes too long compiling? (add piwheels)

@jiwidi
Author

jiwidi commented Mar 25, 2020

> @jiwidi don't these images run there?
>
> What's the issue? pip takes too long compiling? (add piwheels)

What do you mean? If you are asking whether the normal images can run on a Raspberry Pi: no, they can't. Because of the ARM architecture of the rpis, some special tricks or workarounds are needed to compile nodejs and other libraries.

@AvverbioPronome

I've started looking into it and the first obvious problem is that base-notebook uses miniconda x86_64 explicitly. https://github.com/jupyter/docker-stacks/blob/master/base-notebook/Dockerfile#L79

The latest miniconda armv7 build is dated 2015-08-24 11:01:14, which is no good. https://repo.continuum.io/miniconda/

So, yeah, making this image "platform independent" means rebuilding on ubuntu packages or pip (pip is fragile, very fragile).

@step21

step21 commented Jun 9, 2020

You can check my work here https://hub.docker.com/repository/docker/step21/jupyter-minimal-notebook (base and minimal so far, soon also scipy-notebook once scikit-image gets merged)
It has both aarch64 and armv7l images, but unless someone gets armv7l packages into conda-forge, I do not think it makes sense to maintain armv7l images.

@parente parente added tag:Community Stack We recommend this contribution become a community stack. tag:Documentation Related to user, developer, and maintainer documentation type:Enhancement A proposed enhancement to the docker images labels Jun 13, 2020
@sakuraiyuta

I'm interested in this topic too.
aarch64 docker images built from my repository are here.

My Docker Hub repositories also contain all the images needed by Zero to JupyterHub (JupyterHub for Kubernetes, Z2JH).
Check out the ones prefixed with jupyterhub-k8s-.

I'm running Z2JH on Kubernetes on a cluster of four Raspberry Pi 4 (8GB) boards with Ubuntu 20.04 64-bit (arm64/aarch64) installed. It seems to work well for now.

aarch64 images can be built on GitHub Actions (but very slowly).
My GitHub repository includes a workflow for building aarch64 images:
https://github.com/sakuraiyuta/docker-stacks/blob/fix/aarch64/.github/workflows/docker-aarch64.yml

I hope the maintainer team will release aarch64 images officially.
If you are interested, I can create a pull request.

See also:
https://discourse.jupyter.org/t/ztjh-on-a-raspberry-pi-k8s-cluster/3043

@step21

step21 commented Mar 16, 2021

Cool! So far I had them built on docker hub after testing locally.

@romainx
Collaborator

romainx commented Apr 17, 2021

Hello, if you don't mind I will rename this issue to something more general like "Support ARM architecture".
Please also note that I made an attempt in #1202 to build multi-arch images. It works; however, the build time is a pain point in this case.

@step21

step21 commented Apr 17, 2021

Sure, please do. Cool that you did that. What did you use for building that the build time was such an issue? I just used dockerhub and that seemed to work relatively quickly. And how big is the difference?

@romainx
Collaborator

romainx commented Apr 26, 2021

@step21 in fact, here we test the images before pushing them. In the case of multi-arch images, we built an image for each architecture and tested them separately. Everything is in the PR, but it takes ~3x more time to build the minimal-notebook for the 3 architectures because nothing was done in parallel.

@romainx romainx changed the title Possible docker image for raspberry pis Support ARM architecture (multi-arch images) Apr 26, 2021
@step21

step21 commented May 1, 2021

How about splitting it? Would be happy to help, but right now I cannot work on it.

@consideRatio
Collaborator

consideRatio commented May 2, 2021

In the JupyterHub organization on GitHub, we have now published (thanks @manics!!!) arm64/aarch64 compatible images alongside the amd64/x86_64 for multiple repositories.

I would like to help this repo do the same, but at the same time I find it crucial that future maintenance is sustainable. I created the enhancement proposal below with that in mind.

Enhancement proposal

  1. Principle - to maintain only a single Dockerfile per image (done by Removal of outdated ppc64le arch Dockerfile #1290)
  2. Action - to remove the dedicated and outdated ppc64le Dockerfiles and .patch files (done by Removal of outdated ppc64le arch Dockerfile #1290)
  3. Implementation detail - maintain a list of the images' compatible architectures
  4. Action - update the Makefile to be able, in an opt-in fashion, to build the same Dockerfile for all compatible architectures (like in: Z2JH, JupyterHub, ConfigurableHTTPProxy)
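
To make points 3 and 4 concrete, here is a minimal sketch of how a per-image list of compatible architectures could drive an opt-in multi-platform build. The `COMPATIBLE_PLATFORMS` table, the `buildx_command` helper, the `jupyter` owner default and the `latest` tag are all hypothetical names chosen for illustration, and the table entries are not the project's actual compatibility status. Only the `docker buildx build --platform ... --tag ...` invocation itself is real CLI syntax.

```python
import shlex

# Hypothetical compatibility list: image name -> platforms it can be built for.
# The entries are illustrative only, not the project's real status.
COMPATIBLE_PLATFORMS = {
    "base-notebook": ["linux/amd64", "linux/arm64"],
    "minimal-notebook": ["linux/amd64", "linux/arm64"],
    "tensorflow-notebook": ["linux/amd64"],
}


def buildx_command(image, owner="jupyter", multi_arch=False):
    """Return the docker buildx argv for one image.

    Multi-arch builds are opt-in: by default only linux/amd64 is built,
    so the existing single-platform workflow is unchanged.
    """
    platforms = COMPATIBLE_PLATFORMS[image] if multi_arch else ["linux/amd64"]
    return [
        "docker", "buildx", "build",
        "--platform", ",".join(platforms),
        "--tag", f"{owner}/{image}:latest",
        f"./{image}",  # one Dockerfile per image, shared by all platforms
    ]


if __name__ == "__main__":
    for name in COMPATIBLE_PLATFORMS:
        print(shlex.join(buildx_command(name, multi_arch=True)))
```

Keeping the list as plain data means that adding an architecture to an image is a one-entry change, which is the kind of low-maintenance setup the proposal is after.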

Current arm64/aarch64 compatibility status

I went ahead and tried building all Dockerfiles (no patches applied etc) with --platform linux/arm64 using docker buildx instead of docker and created this arm64 compatibility list.

  • base-notebook
    • --build-arg mambaforge_arch=aarch64 --build-arg mambaforge_checksum=... also passed
    • @... suffix in FROM ubuntu statement removed
  • minimal-notebook
  • r-notebook
    • conda can't resolve/install r packages - a fix to this may be to use --channel conda-forge based on insights shared by @skumagai
  • scipy-notebook
  • tensorflow-notebook
    • conda can't install the pinned version of tensorflow
  • datascience-notebook
    • hardcoded x86_64 installation of julia
    • conda can't resolve/install r packages (rpy2=3.4 is causing issues I think) - a fix to this may be to use --channel conda-forge based on insights shared by @skumagai
  • pyspark-notebook
  • all-spark-notebook

@sakuraiyuta

@consideRatio r-* packages exist in the conda-forge repository.
It would be better if the project policy could allow using another (non-default) repository.

See also:
https://github.com/sakuraiyuta/docker-stacks/blob/fix/aarch64/r-notebook/Dockerfile.aarch64

@consideRatio
Collaborator

@sakuraiyuta can you clarify if I understood you correctly:

You meant that the conda-forge conda channel includes both amd64 and arm64 compatible versions of r-* packages, while the default conda channel only includes amd64 compatible versions of r-* packages. And, due to this, we should consider switching to the conda-forge channel by default instead of the default conda channel?

@sakuraiyuta

sakuraiyuta commented May 3, 2021

@consideRatio Sorry, I didn't read your comments carefully, so I simply wrote a solution for resolving the r-* packages.
After reading your reply, I understand that you tried building without applying any patches to the Dockerfile.

This project already seems to support a patch process for other architectures.
So, in my forked repository, I created Dockerfile.aarch64.patch and patched for aarch64.

As you said, some python packages for aarch64/arm64 are not found in the default conda repositories.
This means some notebook images need to switch to another channel to support aarch64, or wait until the default channel supports it.
Maybe the simplest way is to add the -c argument to the default conda/mamba install commands.
In other words, you understood my comment completely.
(Sorry, my English is not very clear.)

But, strictly speaking, this changes what the project serves, supports, and tests.
I think we need to consider whether this approach is really appropriate.

@romainx
Collaborator

romainx commented May 3, 2021

@sakuraiyuta and @consideRatio since PR #1189 we have switched to miniforge instead of miniconda, so the default channel is conda-forge. All the packages are therefore already installed from the conda-forge channel.

@consideRatio
Collaborator

@romainx @mathbunnyru do you think #1019 (comment) would be something I could work towards?

@mathbunnyru
Member

mathbunnyru commented May 3, 2021

@consideRatio if you don't mind, I would like to take a few days to give this a thought, because I want to be more familiar with the process of building for other archs in GitHub/Docker environment.

@manics
Contributor

manics commented May 3, 2021

Docker buildx uses qemu to build images for non native architectures.

There are a few issues when building multi-arch images simultaneously under the same tag that may have an impact here. This is a good introduction to the overall process https://www.docker.com/blog/multi-arch-build-and-images-the-simple-way/

docker buildx can build and push multiple architectures at the same time, and takes care of creating a manifest, but it can't load the built multi-arch image into the local docker host for testing. This means docker run <built-image> won't work unless you've pushed to a registry so it can be pulled straight back. This kind of makes sense, since you can't load an aarch64 image into an amd64 Docker host, but it makes testing a bit more of a pain. The alternative would be to build one architecture at a time, then create your manifest manually with docker manifest, or rerun buildx and rely on the docker layer cache.
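
As a rough sketch of the "one architecture at a time, then a manual manifest" alternative, the sequence of commands could be generated like this. The helper names, the `-<arch>` tag suffix and the `example/minimal-notebook:latest` image name are assumptions made for illustration; the CLI subcommands used (`docker buildx build --platform/--load/--tag`, `docker push`, `docker manifest create`, `docker manifest push`) are existing Docker commands. Note that `docker manifest create` expects the per-architecture images to already be in a registry, which is why each one is pushed first.

```python
def per_arch_tag(image, arch):
    """Hypothetical naming scheme: append the architecture to the tag."""
    return f"{image}-{arch}"


def build_steps(image, archs, context="."):
    """Return the commands to build, push and combine one image per architecture."""
    steps = []
    for arch in archs:
        tag = per_arch_tag(image, arch)
        # --load puts the single-platform result into the local docker host,
        # so it can be exercised with `docker run` before anything is pushed.
        steps.append([
            "docker", "buildx", "build",
            "--platform", f"linux/{arch}",
            "--load", "--tag", tag, context,
        ])
        steps.append(["docker", "push", tag])
    # Combine the pushed per-arch images under the shared tag.
    arch_tags = [per_arch_tag(image, arch) for arch in archs]
    steps.append(["docker", "manifest", "create", image, *arch_tags])
    steps.append(["docker", "manifest", "push", image])
    return steps


if __name__ == "__main__":
    for step in build_steps("example/minimal-notebook:latest", ["amd64", "arm64"]):
        print(" ".join(step))
```

A test step would slot in between each build and push; the sketch leaves it out because how the images get tested is exactly the open question in this thread.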

The final issue is testing: docker buildx builds for multiple architectures, but as far as I know there's no way to run an image for a non-native platform. In JupyterHub and Z2JH we manually tested the aarch64 images before and after the PRs were merged, but they're not automatically tested; we assume that if there are no build errors and the amd64 image successfully runs then it's probably fine.

@romainx
Collaborator

romainx commented May 4, 2021

Hello @manics, I ran into the same issue in my PR #1202. I had to build the image for each architecture one by one, with nothing done in parallel, so it took a while. In fact, it was the main blocker. However, I was able to run the unit tests on each architecture variant successfully thanks to qemu. I even had to modify the tests because pandoc was not available for every architecture.

@pytest.mark.skip_arch(["arm64", "ppc64le"])
def test_pandoc(container):
    """Pandoc shall be able to convert MD to HTML."""
    LOGGER.info(container.image_name)
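
`skip_arch` is a custom marker rather than a pytest built-in, and its implementation is not shown in this thread. The following is only a sketch of the decision logic such a marker could rely on. The `MACHINE_TO_ARCH` table and the `current_arch` and `should_skip` helpers are hypothetical, and the sketch assumes the architecture under test can be taken from `platform.machine()`; in a setup where tests run on the host against a container, that value would have to come from the image instead.

```python
import platform

# Map `platform.machine()` values to the architecture names used in the marker.
MACHINE_TO_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
    "ppc64le": "ppc64le",
}


def current_arch(machine=None):
    """Normalize a machine string (default: this host's) to a docker-style name."""
    machine = (machine or platform.machine()).lower()
    return MACHINE_TO_ARCH.get(machine, machine)


def should_skip(skip_archs, machine=None):
    """Return True when the test must be skipped on the given architecture."""
    return current_arch(machine) in skip_archs


# In a conftest.py this could be wired up roughly as follows, using pytest's
# real `get_closest_marker` and `pytest.skip` APIs:
#
#   def pytest_runtest_setup(item):
#       marker = item.get_closest_marker("skip_arch")
#       if marker and should_skip(marker.args[0]):
#           pytest.skip("not supported on this architecture")
```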

@consideRatio I can confirm that the only modification made to the base-notebook Dockerfile was to remove the SHA pin on the upstream ubuntu image. Everything then can be managed through ARGS -- they have been defined on purpose.

@mathbunnyru
Member

> I would like to help this repo do the same, but at the same time, I find it crucial that future maintenance is sustainable. I created the enhancement proposal below with that in mind.

We would be glad of any help with the new archs, so your help would be great.

I will comment on each element of your proposal

> 1. Principle - to maintain only a single Dockerfile per image

💯 agree

> 2. Action - to remove the dedicated and outdated ppc64le Dockerfiles and .patch files

💯 agree

> 3. Implementation detail - maintain a list of the images' compatible architectures

💯 agree

> 4. Action - update the Makefile to be able, in an opt-in fashion, to build the same Dockerfile for all compatible architectures (like in: Z2JH, JupyterHub, ConfigurableHTTPProxy)

This is where sustainable maintenance comes in.
I don't like how the Makefile looked after #1202.
At the same time, I respect @romainx's work because it's an important thing for the community.
Removing the patches will definitely make the Makefile look better.
But adding a lot of multi-arch steps is not really good for future maintenance.

As far as I understand, we could for example have our self-hosted runners for arm.
If someone knows the pros/cons of qemu vs self-hosted runners, that would be great.

@consideRatio
Collaborator

consideRatio commented May 6, 2021

@mathbunnyru thanks for taking the time to deliberate on my proposal!

I think the proposal I made was quite coarse and needs some additional exploration on the implementation - but it is very central to me that it is sustainable to maintain, as you emphasize.

@romainx @mathbunnyru perhaps I could start working on a separate PR scoped just about point 1 and 2?


Regarding sustainably maintainable support for multiple architectures, I think I need to do some practical exploration and read up on past work before I have a clear opinion on what a good approach is. But here are some thoughts at this point.

  1. While working towards supporting more architectures, I think we should think of it as a bonus rather than a requirement along the way.
  2. If we can support running the full build/test suite locally it would be good, but I think running against non-amd64 architectures could be an optional stretch goal. It feels important not to let the local experience of building and testing amd64 images be disrupted.
  3. I'm considering reducing the Dockerfiles' ARGs to just a single one: arch, and instead of embedding a single default checksum for amd64/x86_64, we embed one checksum per supported arch.
  4. I'm thinking that we want to be able to use docker buildx build --platform linux/amd64,linux/arm64 ... just like we use docker build ..., and that the only customization to the Makefile is one to make it capable of using the multi-platform build in an opt-in manner, perhaps controlled via an ARCH environment variable.
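
Thoughts 3 and 4 could be sketched as follows: one checksum per supported arch, and an optional `ARCH` environment variable that opts in to extra platforms. Everything named here is hypothetical (`INSTALLER_CHECKSUMS`, `PLATFORMS`, `build_settings`, and the comma-separated `ARCH` format), and the checksum values are deliberately placeholders rather than real digests.

```python
import os

# Placeholder values only: real checksums would come from the installer's
# release page and are intentionally not filled in here.
INSTALLER_CHECKSUMS = {
    "x86_64": "<sha256 of the x86_64 installer>",
    "aarch64": "<sha256 of the aarch64 installer>",
}

# Hypothetical mapping from the ARCH variable to docker platform strings.
PLATFORMS = {"x86_64": "linux/amd64", "aarch64": "linux/arm64"}


def build_settings(environ=os.environ):
    """Resolve the platforms and checksums from an optional ARCH variable.

    ARCH may hold a comma-separated list. When it is unset, only x86_64 is
    built, so the default local workflow stays unchanged.
    """
    archs = [a.strip() for a in environ.get("ARCH", "x86_64").split(",")]
    unknown = [a for a in archs if a not in INSTALLER_CHECKSUMS]
    if unknown:
        raise ValueError(f"unsupported arch(es): {unknown}")
    return {
        "platform": ",".join(PLATFORMS[a] for a in archs),
        "checksums": {a: INSTALLER_CHECKSUMS[a] for a in archs},
    }


if __name__ == "__main__":
    print(build_settings({"ARCH": "x86_64,aarch64"})["platform"])
```

Failing early on an unknown arch keeps a typo in `ARCH` from silently producing an image with the wrong installer.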

@mathbunnyru
Member

@consideRatio I don't want to put any pressure on you, but how is it going? :)
Now I'm more interested in these arm images, because I recently switched to an M1 Mac, so development is not as easy as it was; with arm images, it would probably be easy again :)

@mathbunnyru
Member

I set up a VM on Amazon to get much faster builds.
I found the cause of the scipy-notebook issue and created an upstream issue.

@consideRatio
Collaborator

consideRatio commented Jul 19, 2021

Did you create an arm64 VM on Amazon and use the normal make build commands - and it worked successfully as an arm64 build, and it also gave you a lot better performance than using the emulator strategy? 🎉

@mathbunnyru can you provide a link to that upstream issue?

@mathbunnyru
Member

> @mathbunnyru can you provide a link to that issue?

It's in the top message of this issue.

@mathbunnyru
Member

Everything except datascience and tensorflow images builds fine in CI, so I think we can close this issue.

Datascience notebook builds fine on a real arm machine or vm, but not in qemu.
When we switch to arm workers it will be easy to support this image.

There is no official wheel for arm tensorflow, so for now I don't think there is a strong need for this image. If there is an official wheel one day, then adding support for arm would be a one-line change in the Makefile.

@florianbaer

Hi @mathbunnyru
According to actions/runner#805 - ARM Runners are now pre-released. But it looks as if they are only available for self-hosted runners? What's the status of providing the DataScience Notebook with the arm tag?

@mathbunnyru
Member

> Hi @mathbunnyru
> According to actions/runner#805 - ARM Runners are now pre-released. But it looks as if they are only available for self-hosted runners? What's the status of providing the DataScience Notebook with the arm tag?

Hi @florianbaer

  1. First of all, for now, we don't have any self-hosted runners here and use buildx + QEMU to build arm images.
  2. datascience-notebook actually builds fine under native aarch64 (I built it on my M1 Mac), but fails under qemu emulated environment.
  3. I think the issue you mentioned doesn't help us. I mean, we don't specifically need macos aarch64, simple linux aarch64 would work for us and it's been available for a while, if I'm right. But still, we will probably have to have self-hosted runners.

Also, it might be worth checking again whether datascience-notebook still fails under QEMU, because QEMU has recently released version 7.0.

10 participants