Build multi-arch image with amd64 + arm64 for M1 Macs #35

Closed
tsibley opened this issue Nov 16, 2021 · 14 comments · Fixed by #50
Labels
enhancement New feature or request os: macOS Mac support

Comments

@tsibley
Member

tsibley commented Nov 16, 2021

Images are currently linux/amd64, which means running on M1 Macs requires Intel emulation to work on the arm64 architecture. Emulation is documented to be poorly supported, though @victorlin has successfully run zika-tutorial using the amd64 image on an M1 Mac (but has not tried anything larger yet).

We can use docker buildx instead of docker build to produce multi-arch images (linux/amd64 + linux/arm64) so that emulation isn't required.
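A minimal sketch of what such a buildx invocation could look like. The image name is a placeholder rather than this project's real tag, and `--push` is an assumption, since multi-platform results are usually sent straight to a registry. The script only prints the command unless `RUN=1` is set.

```shell
#!/bin/sh
# Placeholder image name (an assumption); substitute the real repository and tag.
IMAGE="example/base:latest"
# The two platforms named in this issue.
PLATFORMS="linux/amd64,linux/arm64"

# Print the buildx command for the current IMAGE and PLATFORMS.
buildx_cmd() {
    echo "docker buildx build --platform $PLATFORMS --tag $IMAGE --push ."
}

if [ "${RUN:-0}" = "1" ]; then
    # Create and select a builder instance, then build both platforms.
    docker buildx create --use
    docker buildx build --platform "$PLATFORMS" --tag "$IMAGE" --push .
else
    # Dry run: just show what would be executed.
    buildx_cmd
fi
```

Running it with `RUN=1` requires Docker with the buildx plugin and push access to the target registry.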

Reference material:

@tsibley tsibley added enhancement New feature or request os: macOS Mac support labels Nov 16, 2021
@victorlin
Member

Benchmark with a larger workflow (n=262 sequences):

So, M1 mac users would benefit greatly from having this.

@fanninpm

Would this require upgrading the Python version in the image? IIRC, Python 3.9.1 is the first Python version to support M1 Macs natively.

For reference:

FROM python:3.7-slim-buster AS builder

FROM python:3.7-slim-buster

@tsibley
Member Author

tsibley commented Mar 15, 2022

@fanninpm I believe so, yes. Thanks for pointing that out. All our Python code should be compat with 3.9, but of course there will probably be some knock-on effects like new warnings, missing wheels for deps, etc. Hopefully those are minimal given that 3.9 is not the newest kid on the block anymore.

@victorlin victorlin moved this from New to Prioritized in Nextstrain planning (archived) Mar 16, 2022
victorlin added a commit to nextstrain/docs.nextstrain.org that referenced this issue Mar 18, 2022
victorlin added a commit to nextstrain/docs.nextstrain.org that referenced this issue Mar 21, 2022
@victorlin
Member

I'll look at this after getting some other things off my plate, unless anyone else wants to start on it.

@victorlin
Member

Stumbled upon this relevant Reddit thread which suggests that even with a linux/arm64 image, virtualization is still used and performance still might not be comparable to the "native" Docker experience on Linux, or even the WSL experience on Windows.

@tsibley
Member Author

tsibley commented Jul 8, 2022

Yeah, it's always been the case on macOS that you pay for the cost of OS virt. The expectation (at least mine) with a linux/arm64 image is that instead of paying the increased OS virt + arch virt costs on M1 systems, we'd be back to paying just the OS virt (as ever). I don't know if that expectation is correct, but I'd hope it'd be possible for the VM to run on the native M1 architecture and host linux/arm64 images!
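One way to check that expectation is to compare the host's architecture with what a container reports. The sketch below maps `uname -m` output to a Docker platform string. The `platform_for` helper and the `IMAGE` variable in the trailing comment are illustrative names, not part of any existing tooling.

```shell
#!/bin/sh
# Map a machine name (as printed by `uname -m`) to a Docker platform string.
platform_for() {
    case "$1" in
        x86_64|amd64)  echo "linux/amd64" ;;
        aarch64|arm64) echo "linux/arm64" ;;
        *)             echo "unknown" ;;
    esac
}

# macOS on Apple silicon reports arm64 in a native shell; Linux on the same
# architecture reports aarch64. Both map to linux/arm64.
HOST_PLATFORM=$(platform_for "$(uname -m)")
echo "native platform: $HOST_PLATFORM"

# To see what an image actually runs as (IMAGE is a placeholder):
#   docker run --rm --platform "$HOST_PLATFORM" "$IMAGE" uname -m
```

If the container prints `aarch64` on an M1 host, the image is running without architecture emulation, leaving only the usual OS virtualization cost.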

@sacundim

It's worth noting that in addition to M1/M2 Macs, another interesting hardware platform that could use these ARM images is AWS Graviton. In fact it could well be one of the best platforms for automating the build, testing and publication of these images!

@tsibley
Member Author

tsibley commented Jul 15, 2022

@sacundim Graviton is interesting, though for CI here we'll likely stick to using GitHub Actions runners, even when producing linux/arm64 images. While Actions only has x86_64 hosts, I believe BuildKit can use qemu to emulate arm64. Would like to avoid the overhead of setting up and managing our own (teeny tiny) build farm. 🙃
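A rough sketch of how a CI step might confirm that arm64 emulation is available before building. `has_platform` is a hypothetical helper, and the sample list only imitates the "Platforms:" line printed by `docker buildx inspect`; it was not captured from a real runner.

```shell
#!/bin/sh
# Return success if the platform in $2 appears in the comma-separated list $1.
has_platform() {
    case ",$(echo "$1" | tr -d ' ')," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Sample value in the style of the "Platforms:" line of `docker buildx inspect`.
SAMPLE="linux/amd64, linux/386, linux/arm64"

if has_platform "$SAMPLE" "linux/arm64"; then
    echo "arm64 builds available"
else
    # On an x86_64 host, QEMU handlers can be registered with:
    #   docker run --privileged --rm tonistiigi/binfmt --install arm64
    echo "arm64 emulation not registered"
fi
```

On GitHub Actions, the `docker/setup-qemu-action` step is a common way to do the registration instead of calling the binfmt image directly.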

That said, Graviton might be interesting for our production compute jobs on AWS Batch, but we'd have to do some serious benchmarking/tuning to compare!

@corneliusroemer
Member

@victorlin It'd be good to benchmark docker vs native for an Intel mac, too, to see how much worse it is with apple silicon vs intel in the case of macOS.

What's the workflow you ran? I could do the benchmark on my 2018 Intel MBP.

@victorlin
Member

@corneliusroemer in #35 (comment) I ran the ncov example data tutorial workflow. The reference data changes daily, but sample size should be similar enough for a rough comparison.

@sacundim

sacundim commented Aug 3, 2022

@tsibley What I mean is that I played around a bit with using docker buildx to build multi-arch images with CPU emulation, and (a) I found it painfully slow even for building very simple images, and (b) this is subjective, but CPU emulation just gives me the heebie-jeebies: I get a headache just from wondering how mind-bending the troubleshooting could turn out if I did run into a glitch.

Whereas I also played with the following AWS CodeBuild tutorial that builds the two platforms on native hardware and found it straightforward enough. Notably, there are no servers to set up and maintain:

Just my brief experience, of course. Another factor is that GitHub Actions is free...

@tsibley
Member Author

tsibley commented Aug 3, 2022

@sacundim Ah! Well, I guess we'll see how painfully slow it is and evaluate from there… I'm loath to add another service like CodeBuild to the mix here, but it's good to know that's an option.

@corneliusroemer
Member

I just ran our current hmpxv-1 on my new M1 Pro, once with conda (Intel emulation) and once with docker.

Docker: 2hr
Conda: 10min

Running the same workflow on apple silicon docker takes only ~16min.

So I get the same factor of ~10 slowdown as @victorlin above - this time on a real production workflow. Definitely not usable as is with docker.
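For reference, the ratios implied by the timings above can be worked out as follows. This only does arithmetic on the figures as reported (2 hr taken as 120 min, and ~16 min taken as 16); `slowdown` is an illustrative helper.

```shell
#!/bin/sh
# Ratio of two durations in minutes, to one decimal place.
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

DOCKER_MIN=120   # "Docker: 2hr"
CONDA_MIN=10     # "Conda: 10min"
NATIVE_MIN=16    # "apple silicon docker ... ~16min"

echo "Docker vs Conda: $(slowdown "$DOCKER_MIN" "$CONDA_MIN")x"
echo "Docker vs apple silicon docker: $(slowdown "$DOCKER_MIN" "$NATIVE_MIN")x"
```

That works out to 12.0x against Conda and 7.5x against the Apple silicon Docker run, which brackets the "~10" quoted above.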

@victorlin victorlin moved this from Prioritized to In Progress in Nextstrain planning (archived) Sep 28, 2022
Repository owner moved this from In Progress to Done in Nextstrain planning (archived) Dec 16, 2022
@victorlin
Member

Docker Desktop 4.25.0, released on 2023-10-26, includes emulation improvements:

Rosetta is now Generally Available for all users on macOS 13 or later. It provides faster emulation of Intel-based images on Apple Silicon. To use Rosetta, see Settings. Rosetta is enabled by default on macOS 14.1 and later.

With Rosetta, I suspect performance of the linux/amd64 image on Apple silicon would be comparable to the Rosetta emulation used by the current Conda/ambient environment setups. If this were available earlier, maybe we wouldn't have prioritized #50 which added some complexity to the image build process. I don't think it's worth reverting that change though - the complexity has been manageable and there should still be some performance benefits when using a linux/arm64 image on Apple silicon.


5 participants