Propose an alternative process for docker image generation. #40
@ruffsl I'd also be very interested in your thoughts here since you're quite the docker whisperer in my esteem.
I completely support deprecating this repository now that docker actually supports the different architectures. This was mostly a workaround so that we could do the single architecture.
I don't think that we have to inject QEMU. Here it is running w/o that, and I can build and run aarch64 executables inside the environment.
$ docker run --platform=linux/aarch64 -ti ubuntu:focal bash
root@64bf983dcffe:/# apt-get update -qqq && apt-get install -qqqy gcc file
# CLIPPED
root@64bf983dcffe:/# cat << EOF > /tmp/helloworld.c
> #include <stdio.h>
>
> int main()
> {
> printf("Hello World!");
> return 0;
> }
> EOF
root@64bf983dcffe:/#
root@64bf983dcffe:/# gcc /tmp/helloworld.c
root@64bf983dcffe:/# ./a.out
Hello World!root@64bf983dcffe:/#
root@64bf983dcffe:/# file a.out
a.out: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=5535c724f2dcd20f98c6dcbe6bf5a94cab810960, for GNU/Linux 3.7.0, not stripped
I am also able to run the rclcpp examples w/o mounting in qemu-arm-static.
The official ROS images can be run as is by just declaring the platform. I ran into some issues but qemu is clearly being invoked appropriately.
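The interactive session above can be condensed into a one-shot check. This is only a minimal sketch: `container_arch` is a hypothetical helper name, not something from this repository or from ros_buildfarm, and it assumes `docker` is on the PATH and the host already has qemu-user-static registered.

```shell
# Hypothetical helper (an assumption for illustration): report the machine
# architecture seen from inside a container started for the given platform.
# Usage: container_arch <platform> <image>
container_arch() {
    # --rm removes the container once uname exits;
    # --platform selects which variant of a multi-platform image to run.
    docker run --rm --platform "$1" "$2" uname -m
}
```

On an amd64 host where emulation is working, `container_arch linux/arm64 ubuntu:focal` should report the ARM architecture rather than `x86_64`.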
Sure thing, I'd be happy to help!
You are partly correct, as the old practice was to copy a qemu-arm-static binary into the container before running any binaries. See this post by @computermouth (Ben Young) for more context: … Following the links referenced in computermouth's post leads to an upstream issue in Debian (since fixed) that previously prevented the Docker container runtime from seamlessly using the system-installed/registered qemu binfmt.
In fact, I recall this topic bubbling over onto ROS Discourse a few years ago as well: … With that resolved, using any modern Debian/Ubuntu distro, it's now relatively simple to use tools like buildx to set multiple target platforms for building: https://docs.docker.com/engine/reference/commandline/buildx_build/#platform As an example using GitHub Actions, check out the multi-platform documentation for …
Note that the setup-qemu-action must be invoked for the host VM prior to invoking … Looks like the action installs the QEMU static binaries a little unconventionally via a privileged Docker container using the … Or they can also be installed on the host OS simply via a package manager: …
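Whichever way the binaries get installed, it can help to confirm that the host actually has a handler registered before relying on it. The following is a rough sketch under assumptions: the helper name `qemu_handler_enabled` and the `BINFMT_DIR` override are made up for illustration, and the `qemu-<arch>` entry naming is what the Debian/Ubuntu qemu-user-static package uses, so other installation methods may name entries differently.

```shell
# On Debian/Ubuntu hosts the binaries are typically installed with
# (not run here):
#   sudo apt-get install qemu-user-static

# Directory where the kernel exposes registered binfmt_misc handlers.
# The override is an assumption that makes the check easy to try in isolation.
BINFMT_DIR="${BINFMT_DIR:-/proc/sys/fs/binfmt_misc}"

# Hypothetical helper: succeeds when a handler named qemu-<arch> exists
# and its first line reads "enabled".
# Usage: qemu_handler_enabled aarch64
qemu_handler_enabled() {
    entry="$BINFMT_DIR/qemu-$1"
    [ -r "$entry" ] && [ "$(head -n 1 "$entry")" = "enabled" ]
}
```

For example, `qemu_handler_enabled aarch64 && echo "arm64 emulation available"` could gate a job before it tries to start an arm64 container.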
I agree. There are a lot of repos named after every arch that clutter up the osrf DockerHub org, with many stale images to boot. So a bit of a drag for security updates and whatnot. I'm not sure what else you all need to install/pre-bake on top of the debian/ubuntu base images for the ROS Buildfarm CI jobs, but I'm sure it could be generalized into a single Dockerfile template and a single multi-arch docker registry repo.
We use the base images from upstream directly on amd64. In theory any work that we do in these images should already be replicated in ros_buildfarm dockerfile generation since it has to support amd64 anyway.
Thanks for tracing that and linking it up for us! The context gives me much more confidence. The ROS build farm agents run Ubuntu 20.04 and have qemu-user-static installed. I tested this behavior (no explicit mounting of qemu-aarch64-static into the container) on one of our agents directly and it worked. So it sounds like just passing …
I think that what this all means is that, to satisfy the immediate need for multi-arch containers for Jammy and Bullseye, we can republish the official images under …
…tatic. Since the host's installation of qemu-user-static is entirely sufficient for Ubuntu 20.04, and the current build farm deployment requires 20.04, there is no longer a need to sideload the qemu-user-static binaries.
cc3aa80 to fdce975
I've updated the instructions to forgo adding the qemu-user-static binary since that is not needed at all on current 20.04 build farms. I've also used these instructions to push … Future work on the ros_buildfarm scripts can bring "native" docker multiplatform support and we can retire the use of these images entirely. I can also start working on a scripted version of this to run so that we can trigger it whenever upstream images are pushed to stay in sync.
In the past we've built custom docker images to support running armhf, arm64, and i386 containers using the scripts in this repository. In order to support running these images on amd64 hosts, part of that image creation process has also involved bundling an amd64 qemu-user-static binary in the target image to enable the transparent running of non-native executables when paired with a host that supports binfmt-misc and has qemu-user-static binaries properly registered.
In the years since that process was put in place, Docker has devised its own multi-platform scheme using image manifests, and the platforms that we've traditionally been creating cross-platform images for now provide their own. Additionally, the official ROS build farm has transitioned to relying on native ARM build hosts rather than using qemu on AMD64 hosts, and i386 is no longer supported by any ROS distribution. However, I think we should try to maintain support for fully amd64 and qemu based builds as long as it is feasible to do so.
Empirically, it appears that the official images are either doing something in order to allow the host's qemu-user-static to pass through or are bundling a copy that is otherwise not discoverable in the image. Since I can't explain this behavior yet, I'm hesitant to rely on it.
The ros_buildfarm scripts and templates also assume that custom osrf images should be used for non-amd64 platforms. We can alter that assumption by changing one or two snippets but that would break anyone still relying on qemu for running builds.
As a long-term solution, I think we should update all of our buildfarm docker invocations to specify the target platform. However, that requires many more modifications than the update of a single template and will require much more scrutiny.
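To make the idea of platform-aware invocations concrete, here is a small sketch of how build farm architecture names could be translated into values for docker's `--platform` option. The function name `docker_platform_for` is hypothetical, and the mapping itself is an assumption based on the platform strings commonly used for Debian and Ubuntu images (in particular `linux/arm/v7` for armhf); it is not taken from ros_buildfarm.

```shell
# Hypothetical helper (an assumption for illustration): translate a
# Debian-style architecture name into a docker --platform value.
# Usage: docker_platform_for arm64
docker_platform_for() {
    case "$1" in
        amd64) echo "linux/amd64" ;;
        arm64) echo "linux/arm64" ;;
        armhf) echo "linux/arm/v7" ;;   # assumed mapping for 32-bit hard-float ARM
        i386)  echo "linux/386" ;;
        *)
            # Unknown names are reported on stderr and signalled via exit status.
            echo "unsupported architecture: $1" >&2
            return 1
            ;;
    esac
}
```

A template could then build its invocation with something like `docker run --platform "$(docker_platform_for arm64)" ...` instead of selecting a per-architecture image name.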
It's possible, although somewhat convoluted, for us to re-publish the official docker images together with an injected qemu-user-static binary, using the same naming convention that we've historically used for these multi-arch images created with debootstrap. Doing so dramatically reduces the difference between our images and the upstream ones, since we would, in effect, just be distributing them under a different name with a qemu-user-static binary in an overlay. It also reduces the image size, as the official images are significantly more compact. Since they're already on the docker registry, we also don't have to re-upload them ourselves.
Since this is still a proposal and the technique for getting this done is quite brittle, I have not yet fully automated the process. But I've outlined the steps to be performed so that, either directly or gradually, we can automate more and more or find better ways of accomplishing the same thing.
My ultimate goal would be to retire this repository entirely and rely on docker's built-in multi-arch handling together with changes to ros_buildfarm to run docker in a platform-aware fashion and overlay the qemu-user-static binary from the host when it is required. So ultimately a docker run on the buildfarm would look something like the below when running an arm64 target container on amd64.
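The original example is not reproduced here, so what follows is only a rough sketch of the shape such an invocation might take, not the command from the proposal. The helper name `buildfarm_run_cmd` is hypothetical, and the host path `/usr/bin/qemu-<arch>-static` is an assumption based on where the Debian/Ubuntu qemu-user-static package installs its binaries. The helper prints the command instead of running it so that it can be inspected first.

```shell
# Hypothetical helper (an assumption for illustration): print a docker run
# command that selects the target platform and bind-mounts the host's
# qemu-user-static binary read-only into the container.
# Usage: buildfarm_run_cmd <platform> <qemu-arch> <image> [command...]
buildfarm_run_cmd() {
    platform="$1"
    qemu="/usr/bin/qemu-$2-static"   # assumed install path on the host
    image="$3"
    shift 3
    printf '%s\n' "docker run --rm --platform $platform -v $qemu:$qemu:ro $image $*"
}
```

For an arm64 target on an amd64 host this would be called as `buildfarm_run_cmd linux/arm64 aarch64 ubuntu:focal uname -m`. On hosts where the registered qemu-user-static is already sufficient, the `-v` overlay could be dropped and only `--platform` kept.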