
Update/Add binfmt and qemu #2095

Closed
3 of 7 tasks
igagis opened this issue Nov 18, 2020 · 27 comments

Comments


igagis commented Nov 18, 2020

Tool information

  • Tool name: qemu and binfmt
  • Tool license: GPL-2, GPL-3
  • Add or update? Add
  • Desired version: whatever is latest
  • Approximate size:
  • If this is an add request:
    • Brief description of tool: support for running code compiled for another CPU architecture on an x86_64 host.
    • URL for tool's homepage: http://qemu.org, http://binfmt-support.nongnu.org/
    • Provide a basic test case to validate the tool's functionality: launch a hello-world app compiled for ARM on x86_64.
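The proposed test case could be sketched roughly as follows (the package and cross-toolchain names are the standard Ubuntu ones; linking statically avoids needing ARM shared libraries on the host — treat this as an illustrative sketch, not part of the original request):

```shell
sudo apt install --assume-yes gcc-arm-linux-gnueabihf qemu-user-static binfmt-support

# A trivial hello-world program:
cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { puts("hello from ARM"); return 0; }
EOF

# Cross-compile it for 32-bit ARM, statically linked:
arm-linux-gnueabihf-gcc -static -o hello hello.c

# binfmt_misc transparently dispatches the ARM binary to qemu-arm-static:
./hello
```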

Area for Triage:
C/C++

Question, Bug, or Feature?:
Feature

Virtual environments affected

  • Ubuntu 16.04
  • Ubuntu 18.04
  • Ubuntu 20.04
  • macOS 10.15
  • macOS 11.0
  • Windows Server 2016 R2
  • Windows Server 2019

Can this tool be installed during the build?
sudo apt install --assume-yes binfmt-support qemu-user-static

Tool installation time in runtime
5-10 seconds

Are you willing to submit a PR?
possibly

Why is this needed?
This will allow running jobs inside Docker images built for the ARM architecture, i.e. when all binaries inside the image are ARM binaries.


umarcor commented Nov 20, 2020

Refs:

maxim-lobanov (Contributor) commented

@umarcor thank you for providing useful links.

@igagis, thank you for your proposal.
Usually, we don't pre-install tools that can be installed very quickly at runtime.
For qemu, there is a community action to install it at runtime: https://github.com/docker/setup-qemu-action
Considering that this tool is pretty specific, we would rather not bake it into the image.

Thank you for understanding.


igagis commented Nov 20, 2020

@maxim-lobanov I understand that you want to keep the image as minimal as possible, but consider the following arguments:

  1. if I use the approach of installing qus as a step, then I'm not able to use GitHub Actions' jobs.<job-id>.container feature
  2. it complicates usage of a build matrix: when I want the same build steps to run on different images which also differ by CPU architecture, I'd have to use ifs for some steps (like the qus installation step)
  3. when actually running the build procedure, I'm not able to split it into separate steps (like installing dependencies, building, testing), because I'd have to use uses: docker://<build-image> (see point 1), which runs each step in a separate container, so I'd have to do everything as one long single command
  4. GitHub does not provide public runners for any architecture other than amd64, so it is crucial to compensate for that with preinstalled qemu-user-static and binfmt-support. Considering this, I would not call the tool "pretty specific"; it is pretty common for people to want to run builds on different architectures

maxim-lobanov (Contributor) commented

@igagis thank you for providing additional information!
I would like to reopen it for now and we will revisit this decision next week.

maxim-lobanov reopened this Nov 20, 2020

umarcor commented Nov 20, 2020

@igagis, the important point to understand is that installing binfmt-support and qemu-user-static is just one of the alternatives for configuring QEMU. GitHub Actions runners already include all the (kernel) features and permissions for using QEMU, without those packages.

Precisely, dbhi/qus uses exactly the same binaries (from Debian repos) and a customised script (from QEMU). Therefore, the dbhi/qus Action is an optimised subset. In fact, the main advantage of dbhi/qus is that it allows registering the interpreter for a single target architecture, instead of installing and registering all of them. Furthermore, since statically built interpreters are portable, download is limited to 2MB per target architecture.

The purpose of dbhi/qus is to showcase over a dozen different approaches to using QEMU static: https://dbhi.github.io/qus/#tests. As you can see, a quarter of the examples do use qemu-user-static, some other use qemu-user, others none of them. In all cases, you can use existing Actions or a 5 line script.
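For reference, the single-architecture, persistent setup amounts to a one-liner using the aptman/qus image from dbhi/qus (`-s` selects the static variant and `-p` requests persistent registration, per the dbhi/qus usage shown later in this thread; the trailing architecture name restricts registration to one target — treat the exact arguments as a sketch):

```shell
# Register only the aarch64 interpreter, persistently (requires the Docker daemon):
docker run --rm --privileged aptman/qus -s -- -p aarch64

# Foreign containers should then run transparently, e.g.:
docker run --rm arm64v8/ubuntu uname -m
```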

  1. if I use the approach with installing qus as a step, then I'm not able to use github actions's jobs.<job-id>.container feature

This is not exclusive to qus, but to any solution which is not "registering the interpreters by default".

On the one hand, it might be easily fixed if jobs.<job-id>.container supported --privileged.

On the other hand, registering the interpreters by default might prevent usage of the persistent option, which loads the interpreters into memory. This is of utmost importance for running foreign containers (arm32v7/*, arm64v8/*, s390x/*, etc.) non-intrusively.

  2. it complicates usage of build matrix, when I want same build steps to be run on different images which also differ by CPU architectures, I'd have to use if's for some steps (like installing the qus step)

Any non-trivial workflow will require ifs for some steps. It might be for supporting different OSes, different architectures, etc.

  3. when actually running the build procedure, I'm not able to split it into separate steps (like installing dependencies, building, testing), because I'd have to use uses: docker://<build-image> (see point n.1) which will run each step in a separate container, so I'd have to do all the stuff as a long long single command

I would say that you can start a container at the beginning and then docker exec each step, similar to how you would use "service containers". Nonetheless, I suggest using a custom test script, and ::group::|::endgroup:: for keeping the log browsable. It's unfortunate that these groups cannot be nested nor timed yet.

  4. github does not provide public runners for any other architecture than amd64, so it is crucial to compensate that with preinstalled qemu-user-static and binfmt-support, considering this, I would not call the tool "pretty specific", it is pretty common that people want to run builds on different architectures

Although I agree that the number of people wanting to build tools for non-amd64 targets is increasing, IMHO it is still negligible in CI. Don't get me wrong: I think it is very important, and that's why I maintain dbhi/qus (and use it in dbhi/docker). However, I would say it is not crucial at all. Moreover, there is a 5-10x overhead when using QEMU. That's why the vast majority of builds for non-amd64 architectures are produced through cross-compilation. This is not exclusive to GHA: almost all Linux distributions build packages for all supported architectures on amd64.

Overall, I think that points 1 and 3 should be addressed. That would provide a better user experience not only for this use case, but for many others involving non-trivial and non-web-focused usage of containers.


igagis commented Nov 20, 2020

@umarcor I've just had a thought: is it possible to run dbhi/qus as a service (i.e. via jobs.<job-id>.services)? As I understand it, services are started before the container indicated in jobs.<job-id>.container is launched. That way dbhi/qus would configure qemu before the workflow execution container is started. Would it work that way?


umarcor commented Nov 20, 2020

@igagis, that's an interesting idea! I guess the only issue might be --privileged. If a service can be started with privileges, it should work, and it would be a cleaner solution than what I had envisioned for 1. However, if privileged is not supported, we are in the same situation as with 1.


igagis commented Nov 20, 2020

@umarcor according to https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-syntax-for-github-actions#jobsjob_idservicesservice_idoptions it is possible to pass options to docker create, and --privileged seems to be one of them.

So, what should I try? Would something like the following be the right thing?

# docker run --rm --privileged aptman/qus -s -- -p

jobs:
  linux:
    services:
      qemu:
        image: aptman/qus
        options: --rm --privileged
    container: my/image:arm
    steps:
      ....

I took the docker run command line from your example, but I'm not sure what these -s -- -p mean and where I should pass them in the service description. Any idea?


umarcor commented Nov 20, 2020

@igagis, "service" is normally used as a synonym of "daemon container". Therefore, we need to take two constraints into account:

  • -s -- -p are positional arguments, not options. With container steps, the input is named args. GitHub might not support specifying these for services, since they might expect all "daemon" containers to have a default command. In that case, you/we would need to customise aptman/qus to accept envvars as an alternative to CLI args.

  • GitHub might expect the service containers to remain alive for the whole execution of the workflow. That is not the case for our use case, because aptman/qus runs in a few seconds and then terminates. So, we might need to cheat by keeping aptman/qus alive.

I will do some tests and I'll let you know.


igagis commented Nov 21, 2020

Also, docker run's health check (--health-cmd etc.) will be needed to make sure GitHub waits until the qus setup is complete before starting the execution of the job steps.

See example for redis service: https://docs.github.com/en/free-pro-team@latest/actions/guides/creating-redis-service-containers#configuring-the-container-job
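Adapting the redis pattern from those docs, a hypothetical service entry might look like the following (the health command probing for a registered interpreter under /proc/sys/fs/binfmt_misc is my assumption, not a tested configuration):

```yaml
jobs:
  linux:
    services:
      qemu:
        image: aptman/qus   # hypothetical: would need a long-lived entrypoint to stay "healthy"
        options: >-
          --privileged
          --health-cmd "test -e /proc/sys/fs/binfmt_misc/qemu-aarch64"
          --health-interval 2s
          --health-timeout 5s
          --health-retries 10
    container: my/image:arm
```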


umarcor commented Nov 21, 2020

@igagis, have a look at https://github.com/dbhi/qus/actions/runs/375520958 and dbhi/qus@56c0107.

  • As said, the service container needs to remain alive for the whole execution of the workflow; otherwise, the workflow won't start. See https://github.com/dbhi/qus/runs/1434087443?check_suite_focus=true#step:2:78.
  • Args seem not to be supported. In this case, I just hardcoded the general case that registers the interpreters for all available platforms. If moved to main, envvars should be supported.
  • I built and pushed aptman/qus:service in that same workflow. Should this work, I would move it to main.
  • As you see:
    • Using aptman/qus:service as a service does work.
    • Executing Docker steps/actions using foreign architectures does work.
  • Unfortunately, it does not work with container jobs. That is probably because the job container is started before the services, and the entrypoint is set to tail: https://github.com/dbhi/qus/runs/1434502532?check_suite_focus=true#step:2:32. That tail is for the foreign architecture (because it is inside the container), but the service has not been executed yet. Therefore, it fails.

Hence, using a service might be feasible if the startup order was changed from:

  1. Starting job container
  2. Starting service container
  3. Waiting for all services to be ready

to:

  1. Starting service container
  2. Waiting for all services to be ready
  3. Starting job container


umarcor commented Nov 21, 2020

Also, docker run's health check (--health-cmd etc.) will be needed to make sure GitHub waits until the qus setup is complete before starting the execution of the job steps.

The execution of qus takes a few seconds. It takes longer for GitHub to move from 'Starting service container' to 'Waiting for all services to be ready' than it takes for qus to execute. That's why https://github.com/dbhi/qus/runs/1434087443?check_suite_focus=true#step:2:78 failed. Therefore, although a corner case might need a health cmd, I don't think that's the main problem. Starting containers with tail before the services are ready is a design decision of the GitHub runner developers/maintainers.


igagis commented Nov 21, 2020

Did you contact the runner developers about the possibility of changing that container/services startup order? You mentioned that it is done deliberately by some design decision, right? Or should we submit a feature request asking them to change that order?

To me it makes no sense to start job container before services are ready, but maybe I'm missing something...


umarcor commented Nov 21, 2020

Did you contact runner developers about possibility of changing that container/services startup order?

I did not. In fact, I had not investigated this until I did the tests above.

You mentioned that it is done deliberately by some design decision, right? Or should we submit a feature request to them to change that order?

I say it's a design decision because someone did need to decide the order. Submitting a feature request sounds sensible.

To me it makes no sense to start job container before services are ready, but maybe I'm missing something...

It seems that it's just because it's the first one added to the array: https://github.com/actions/runner/blob/c18c8746db0b7662a13da5596412c05c1ffb07dd/src/Runner.Worker/ContainerOperationProvider.cs#L143-L147. So, all the containers are started one after the other, and the main job container is treated as the first service.


igagis commented Nov 21, 2020

Thanks! I'll submit the feature request then.


igagis commented Nov 28, 2020

It looks like the best solution would still be to just pre-install the qemu tools into the ubuntu-latest image.
I investigated how to manually install qemu without workarounds and ended up submitting these issues against the GitHub Actions runner:

actions/runner#816
actions/runner#820
actions/runner#822
actions/runner#821
actions/runner#826
actions/runner#831

Why do I need all that functionality? Well, I'm in the process of setting up a universal workflow for all my repositories, and after I switch all of my repos to GitHub Actions CI, I want to minimize the modifications I will have to make to each repo in the future: reverting workarounds when this or that issue is resolved, dropping in self-hosted runners to be used alongside the public ones (yes, I considered using self-hosted runners only for ARM builds), and so on.

I'm not sure that all of those listed issues will be resolved in the near future.

On the other hand, just pre-installing qemu tools would resolve all of my problems.

maxim-lobanov (Contributor) commented

Hello everyone, just an update:
We have decided to pre-install these packages into the image. I have moved the work item to the backlog.
As I understand it, these packages don't require maintenance and should allow new use cases on the images. Also, as I understand it, this won't break any existing customers' builds/workflows. Please correct me if I am wrong.

dibir-magomedsaygitov self-assigned this Dec 7, 2020

umarcor commented Dec 7, 2020

@maxim-lobanov, I believe that those packages do register the interpreters. Therefore, users who expect those not to be registered will run into issues when registering their own: they'll need to remove the existing interpreter first. I suggest considering reworking the container provisioning instead, since that will enable other use cases too.

@maxim-lobanov
Copy link
Contributor

@umarcor, oh, I didn't catch that it would be a breaking change for those customers.
In that case, it is better to work with the actions/runner team to fix it from their side.


igagis commented Dec 7, 2020

@umarcor Why would someone expect the interpreters not to be installed? What is an example use case?


umarcor commented Dec 7, 2020

@igagis, any user who is currently using GHA for building/testing foreign applications/containers. Especially the users that load the interpreters in persistent mode, in order to run foreign containers without contaminating them. I believe that the default installation would not load them in persistent mode.


igagis commented Dec 7, 2020

@umarcor I'm not sure I fully understand that. What is "persistent mode" of the interpreters?
How can software installed on the host contaminate the container (docker image?) which is built on that host?

Though I still don't understand why someone would expect interpreters not to be registered in their build (and why someone would care at all about installed interpreters if they don't use them, i.e. run only amd64 binaries), those who do care could uninstall them with a single step:

- run: apt remove qemu-user-static binfmt-support

before doing the rest of the stuff.

While in case someone wants to use ARM images, they hit problems (see my post #2095 (comment)).


umarcor commented Dec 7, 2020

I'm not sure I fully understand that. What is "persistent mode" of the interpreters?

Please, see dbhi.github.io/qus. The last paragraph of section Context refs https://kernelnewbies.org/Linux_4.8?highlight=%28binfmt%29.

How can software installed on the host contaminate the container (docker image?) which is built on that host?

In a typical binfmt/qemu-user setup, whenever a foreign instruction/app needs to be interpreted, the interpreter needs to be executed. That is done implicitly by the kernel, but the interpreter needs to be available in the environment where the app is being executed. Hence, by default, binfmt will tell the foreign app inside the container to use an interpreter which does not exist in that environment. The path that the kernel passes corresponds to a path on the host, which was set when the interpreter was registered.

Before the persistent mode was added, users needed to make the qemu-user binaries available inside the containers. That could be done by copying them, or by binding a path from the host. That's what I meant by "contaminating the container". Copying the binaries and removing them is a requirement for building foreign containers, unless persistent mode or docker experimental features are used. See the table in dbhi.github.io/qus/#tests. All the -p n rows required making binaries available inside the container.

The great advantage of loading interpreters persistently is that you don't need to care about exposing the binaries inside the containers. You run a single setup command and then you can use and build as many foreign containers/apps as you want, without tweaking each of them.
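Under the hood, persistence corresponds to the `F` ("fix binary") flag of the kernel's binfmt_misc interface: when it is set, the kernel opens the interpreter at registration time and keeps it in memory, so the binary no longer has to exist inside the container. A minimal sketch of the registration string (the magic/mask values are the standard 32-bit ARM ELF ones used by qemu-binfmt-conf.sh; the script only prints the entry, since writing to /proc requires root):

```shell
#!/bin/sh
# Build a binfmt_misc registration entry for 32-bit ARM ELF binaries.
# Format: :name:type:offset:magic:mask:interpreter:flags
name='qemu-arm'
magic='\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x28\x00'
mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'
interpreter='/usr/bin/qemu-arm-static'
flags='F'   # F = persistent: the interpreter is opened now and held in memory

entry=":${name}:M::${magic}:${mask}:${interpreter}:${flags}"
echo "$entry"
# On a real host (as root) this would be registered with:
#   echo "$entry" > /proc/sys/fs/binfmt_misc/register
```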

Though I still don't understand why would someone expect interpreters to be not registered in their build (and why would someone care at all about installed interpreters in case they don't use them, i.e. run only amd64 binaries),

I would want to remove the non-persistently loaded interpreters and load them persistently instead. That is, I want to keep using dbhi/qus. I would need to change the procedure because registering interpreters twice can fail. Is it a great problem if I need to remove them first? It's not. Is it worth the hassle? I believe it's not, because having qemu-user-static and binfmt-support installed by default would still not solve your use case.

those who care could uninstall those with just one step:
before doing the rest of the stuff.

Currently, all those who care are using qemu-user in GitHub Actions with a single one-liner. Therefore, this is not a stopper by any means. Instead of advocating for a not-so-easy patch that fixes your specific style preference, I believe it is better to address a fundamental enhancement that would allow not only your preferred style, but also other use cases which are currently not possible. The most obvious one is building an image in one step and using it in the next one (actions/runner#814).

Don't get me wrong, I understand your frustration with the GHA workflow syntax and how limiting it is. However, I believe that adding tools needs to be done carefully, and someone needs to look after them. @maxim-lobanov explicitly said that they were going to add them because they need no maintenance, which is not the case, as I have just explained.


igagis commented Dec 7, 2020

Well, in that case, we need to pre-install the interpreters in persistent mode then, to ensure the widest compatibility.
Maybe not with the qemu-user-static package, but in some other way... I found there is another package called qemu-user-binfmt which registers non-statically-linked user-mode interpreters; maybe it registers interpreters in persistent mode, though I have not investigated yet.

The great advantage of loading interpreters persistently is that you don't need to care about exposing the binaries inside the containers. You run a single setup command and then you can use and build as many foreign containers/apps as you want, without tweaking each of them.

But then, images built in such a way will always require interpreters in persistent mode when they are used. Doesn't that limit the usability of such images? Isn't it better to include a statically linked qemu for the amd64 arch in each image you build (put in /usr/bin, where else could it be?) to ensure the image can be used in case interpreters are registered in non-persistent mode on the user's host?

I would need to change the procedure because registering interpreters twice can fail

Right, and that would be just the one simple step I wrote above.

In contrast, consider the following workflow:

name: ci
on: [push, pull_request]
jobs:
  linux:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        include:
          - image: debian:buster
          - image: i386/debian:buster # i386 arch image
          - image: ubuntu:eoan
          - image: igagis/raspbian:buster # ARM arch image
    container: ${{ matrix.image }}
    steps:
      - name: git clone
        uses: actions/checkout@main
      - name: install build dependencies
        run: apt install --assume-yes devscripts equivs bla-bla-bla
      - name: build
        run: make
      - name: deploy
        run: deploy.sh
        if: startsWith(github.ref, 'refs/tags/') # only deploy for tagged build

As you can see, the build matrix lists 4 images to build on: two are amd64, one is i386 and one is ARM.
This approach currently does not work because GHA is unable to start a container for a foreign architecture; even i386 does not work.
As I understand, you suggest rewriting it using the uses: docker://${{ matrix.image }} approach. But I bet you can imagine how many more changes are needed for that, and also how ugly it will look (again, see #2095 (comment) for the list of problems).

Currently, all those who care are using qemu-user in GitHub Actions with a single one-liner.

See my example workflow above; it's not just a one-liner. And the workflow I used as an example seems pretty common to me.

I believe it is better to address a fundamental enhancement that would allow not only your preferred style

I agree that it is always better to solve the problem in the right way, but in the real world it does not look like it will be solved in the foreseeable future, considering the number of issues in the runner's issue tracker. So, pre-installing the qemu interpreters in persistent mode looks like a good compromise. And it looks like the number of users who really care about which interpreters are registered is quite limited; it would not hurt them to use the - run: apt remove qemu-user-static step.

having qemu-user-static and binfmt-support installed by default would still not solve your use case

Why do you think so? As I said, this approach worked for me on Travis-CI. Yes, in the ARM images I used, the qemu-arm-static binary is present in /usr/bin.


umarcor commented Dec 8, 2020

Well, in that case, we need to pre-install interpreters in persistent mode then, to ensure widest compatibility.

Now you are starting to grasp the complexity of this topic. Who is "we"? Someone needs to be willing to understand, implement and maintain this feature. Willingness is not enough: good knowledge of the virtual environment(s), the runner, self-hosted runners, etc. is required. The response from GitHub so far is that they are not putting any additional effort into it. I don't have the bandwidth for testing modifications to the virtual environments, and I don't have a testing environment to deploy them to. Are you up to the challenge? Note the three verbs, though: understand, implement and maintain.

Maybe not with qemu-user-static package, but in some other way... I found there is another package called qemu-user-binfmt which registers non-statically linked user mode interpreters, maybe it will register interpreters in persistent mode, I did not investigate yet though.

Registering non-statically-linked interpreters is a no-go. Binaries are loaded into memory and passed by the kernel to the container, but executed inside the container. Therefore, if you use dynamically linked binaries, you need to make all the dependencies (shared libs) available inside the container. The only exception is when the container is the same OS and arch as the host; but in that use case you don't need QEMU at all.

Your guess is not wrong: something needs to be done after installing qemu-user-static and binfmt. But that something is not installing additional packages. Please, read dbhi.github.io/qus carefully and follow the references provided there. As I said above, the main purpose of dbhi/qus is didactic. You don't need to use dbhi/qus if you don't want to, but you should learn how it works if you want to implement any alternative solution. More precisely:

  • As mentioned in Provided docker images:

    When QEMU is installed from distribution package managers, it is normally set up along with binfmt_misc. Nonetheless, in the context of this project we want to configure it with custom options, instead of relying on the defaults. A script provided by QEMU, qemu-binfmt-conf.sh, can be used to do so. Among other options, the flag that tells binfmt to hold interpreters in memory is supported in qemu-binfmt-conf.sh as -p.

    This project uses a modified version of qemu-binfmt-conf.sh, which includes the following enhancements:

  • Hence, you can study either the upstream qemu-binfmt-conf.sh or my fork (~300 lines of shell/bash in either case). While doing so, you will learn that there is a "default" registration procedure, but also specific modes for "systemd" and "debian".

  • Ubuntu is Debian based, so I would expect creating/modifying update-binfmts templates to be required. However, unlike qemu_generate_systemd and qemu_register_interpreter, which do call qemu_generate_register (where PERSISTENT is used), it seems that qemu_generate_debian does not take persistency as an option. Maybe the systemd procedure needs to be used?

  • Ubuntu/Debian packages should be checked, since they might tweak QEMU for setting up automatic registration on startup.

  • Then, and only then, a PR can be proposed/reviewed for:

    • Installing QEMU by default in virtual environments while ensuring that interpreters are loaded into memory persistently.
    • Documenting how users can remove the interpreters and add their own without some background process interfering.

But then, images built in such a way, will always require interpreters in persistent mode when using such images. Doesn't that limit usability of such built images?

Not at all! Containers are self-contained user-space environments that rely on an "external" kernel which can pass the instructions to a valid interpreter. Interpreters can be software, which is what we normally refer to when talking about QEMU user mode. However, interpreters can also be hardware, and that's what we call a CPU.

For instance, arm32v7 containers can be executed on armv7 or armv8 devices without any software interpreter, because those devices understand the instruction set natively. By the same token, i386 containers can be executed on amd64 hosts without software interpreters. Of course, there are aarch64-only and amd64-only devices too (especially in datacenters).

Therefore, containers should be agnostic to the usage of QEMU. That belongs to the host, as a workaround for the inability of a given CPU to understand some specific instruction set.

Isn't it better to include statically linked qemu for amd64 arch to each image you build (put to /usr/bin, where else can it be?) to ensure the image could be used in case interpreters are registered in non-persistent mode on the user's host?

That's a very short-sighted approach. You are assuming that foreign containers will always be used on amd64 hosts. There is a very straightforward counter-example: users building container images for Raspberry Pi, Rock960, Pine64 and other SBCs on GitHub Actions. For instance:

  • Three parallel jobs are executed, one for amd64 (without QEMU) and two for arm32v7 and arm64v8 (with QEMU).
  • In each job, a different image is built and pushed, each for a specific architecture.
  • A manifest is pushed, which collects all three images in a single tag.
  • Users on any host can pull the manifest image, and they will get the specific image for their arch, without any unnecessary binaries inside.

The procedure above is how dbhi/docker works. See a GHA run and the resulting images/manifests.
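The per-arch push plus manifest flow described above can be sketched with the docker manifest CLI (image names are placeholders; docker manifest has historically required the experimental CLI features to be enabled):

```shell
# Each job pushes its architecture-specific image first:
docker push example/app:amd64
docker push example/app:arm32v7
docker push example/app:arm64v8

# Then a single manifest tag collects all three, so `docker pull example/app:latest`
# resolves to the right image for the pulling host's architecture:
docker manifest create example/app:latest \
  example/app:amd64 example/app:arm32v7 example/app:arm64v8
docker manifest push example/app:latest
```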

On the other hand, apart from amd64 as a host, aarch64, ppc64 and s390x are also commonly used for running their "native" or amd64/arm containers. Hence, the corresponding interpreters for all of those might potentially need to be put into /usr/bin. That is also the contamination I meant before.

Nevertheless, the procedure you are suggesting is also supported in dbhi/qus. Container images and releases provide statically compiled binaries (precisely, extracted from Debian packages), which you can add to the container through volumes. That is, you can use containers with qemu-user without persistent mode and without adding binaries to the images permanently. Naturally, for docker build you need experimental features to avoid contamination, but it's possible too.

I would need to change the procedure because registering interpreters twice can fail

Right, and that would be just one simple step I wrote above.

As I said before, I agree that it's not a big problem and I could easily work around that. As I also said, my main concern is that I don't see a feasible proposal on the table yet. So, I'm not opposing to installing qemu through apt per se, but someone needs to work hard for achieving an acceptable solution.

In contrast, consider the following workflow:

As I also said before, I understand why you like that style. Still, there is nothing preventing you from using GHA and achieving your targets. Therefore, this is not critical, but a matter of taste. Conversely, there are other use cases which are currently not possible, and which would benefit from looking at this issue with a wider perspective instead of a shortest-term fix.

This is just one possible alternative style:

 name: ci
 on: [push, pull_request]
 jobs:
   linux:
     runs-on: ubuntu-latest
     strategy:
       fail-fast: false
       matrix:
         include:
           - { arch: amd64, image: debian:buster },
           - { arch: i386,  image: i386/debian:buster }, # i386 arch image
           - { arch: amd64, image: ubuntu:eoan },
           - { arch: arm,   image: igagis/raspbian:buster }, # ARM arch image
     steps:
     - uses: actions/checkout@main

     - run: register_interpreter_for_matrix.arch
       if: matrix.arch != 'amd64'

     - run: |
          docker run --rm \
            -v $(pwd):/src -w /src \
            -e IS_TAGGED="${{ startsWith(github.ref, 'refs/tags/') }}" \
            ${{ matrix.image }} \
            ./.github/test.sh
#!/usr/bin/env sh

apt install --assume-yes devscripts equivs bla-bla-bla;
make
[ "$IS_TAGGED" = "true" ] && deploy.sh || echo "Skip deploy"

The advantage of this style is that it is portable: anyone can run the test script on their host, inside a container, or in some other CI service (by copying the command from the workflow).

As I understand, you suggest rewriting it using the run: docker://${{ matrix.image }} approach. But I bet you can imagine how many more changes are needed for that, and also how ugly it would look

Using docker:// would make the example above slightly cleaner:

 name: ci
 on: [push, pull_request]
 jobs:
   linux:
     runs-on: ubuntu-latest
     strategy:
       fail-fast: false
       matrix:
         include:
           - { arch: amd64, image: debian:buster }
           - { arch: i386,  image: i386/debian:buster } # i386 arch image
           - { arch: amd64, image: ubuntu:eoan }
           - { arch: arm,   image: igagis/raspbian:buster } # ARM arch image
     steps:
     - uses: actions/checkout@main

     - run: register_interpreter_for_matrix.arch
       if: matrix.arch != 'amd64'

     - uses: docker://${{ matrix.image }}
       env:
         IS_TAGGED: ${{ startsWith(github.ref, 'refs/tags/') }}
       with:
         args: ./.github/test.sh

Unfortunately, this would not work because the runner will try to start the docker step at the beginning. That's what actions/runner#814 is about. And that's something that would allow you to have a clean enough workflow, while enabling other use cases which are currently not possible.

(again, see #2095 (comment) for the list of problems).

See #2095 (comment) for a detailed analysis of those specific problems. See also #2095 (comment). I think that reordering the service initialisation is the easiest solution to this problem.

See my example workflow above; it's not just a one-liner. And the workflow I used as an example seems pretty common to me.

Your example workflow above is not a valid reference, because it is a synthetic representation of how you envision GHA workflows should work. For that same reason, that style is far from common, because it never worked. Conversely, the pattern I find most often is the "portable" style shown above. That's probably because many projects were migrated from Travis and other services where "actions" as JavaScript/Docker modules did not exist.

NOTE: actions/checkout behaves differently depending on the git version. Be careful when using it inside a container. That is, when using a container job instead of container step(s).

Anyway, once again, don't get me wrong: I agree with your ideal style preference. I'd love it if that worked and we could have foreign-container support off the shelf. Going back to the beginning, my concern is who has the resources and willingness to implement and maintain the feature in GHA.

NOTE: In fact, all Windows 10 users of Docker Desktop have had persistent qemu-user support for ARM by default for a couple of years now, and I really like that. However, it doesn't use qemu-binfmt-conf.sh, but a golang tool (which is statically built for amd64 hosts only).
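For reference, "persistent mode" boils down to the trailing F (fix-binary) flag in the binfmt_misc registration string: the kernel opens the interpreter once at registration time, so containers need no qemu binary inside them. A minimal sketch; the magic/mask bytes are the aarch64 values shipped in qemu's qemu-binfmt-conf.sh, quoted from memory, so treat them as an assumption:

```shell
#!/usr/bin/env sh
# Build a binfmt_misc registration line for qemu-aarch64 in persistent mode.
# Format: :name:type:offset:magic:mask:interpreter:flags
# The 'F' flag at the end is what makes the registration persistent.
magic='\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7\x00'
mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'
line=":qemu-aarch64:M::${magic}:${mask}:/usr/bin/qemu-aarch64-static:F"
printf '%s\n' "$line"

# Actual registration needs root and a mounted binfmt_misc:
# printf '%s' "$line" | sudo tee /proc/sys/fs/binfmt_misc/register
```

Without the F flag, the interpreter path is resolved at exec time inside the container's mount namespace, which is exactly why non-persistent setups need the qemu binary present in the image or bind-mounted.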

I agree that it is always better to solve the problem the right way, but in the real world it does not look like it will be solved in the foreseeable future, considering the number of issues in the runner's issue tracker.

Honestly, in the real world we will revisit this in some weeks, and months, and maybe years. Human resources for all actions repos are similar, if not the same.

So, pre-installing qemu interpreters in persistent mode looks like a good compromise. And it looks like the number of users who really care about which interpreters are registered is quite limited, and it would not hurt them to add a - run: sudo apt remove qemu-user-static step.

Should you achieve pre-installing interpreters in persistent mode, I believe that'd be acceptable. Otherwise, the one-liner that users need to run should be documented very clearly, since the change breaks backwards behaviour.

having qemu-user-static and binfmt-support installed by default would still not solve your use case

Why do you think so? As I said, this approach worked for me on Travis CI. Yes, in the ARM images I used, the qemu-arm-static binary is present in /usr/bin.

I assumed that you didn't want to contaminate the containers, since that's what most users pursue. If you are ok with being forced to put the binary in the same location that GHA does (and use that same location in any future host), then installing those packages alone might work.

@AlenaSviridenko
Contributor

Hi there,
after some internal discussion we decided to stick to the plan of not pre-installing this tool on our images. It may have an unpredictable impact on existing customers, and given the tool's popularity, we'd rather not add it.

I am closing this issue. Please, feel free to create a new one in case of any other questions. Thanks.

@umarcor

umarcor commented Dec 18, 2021

Ref: community/community#9056
