
[CI:BUILD] Cirrus: Migrate multiarch build off github actions #1661

Merged 2 commits into containers:main on Jun 29, 2022

Conversation

cevich
Member

@cevich cevich commented Jun 3, 2022

The github actions workflow for this operation is complex and difficult
to maintain. For several months now, a replacement has been running well
in the podman repository. Its scripts/components are centralized,
versioned, and unit- and integration-tested. Add cirrus tasks to run the
build, and another task to allow test builds in a PR.

This also adds support for a new magic CI string: `[CI:BUILD]`.
With this string in the PR title, automation will only do basic build
verification, and enable testing of the multi-arch build process.

Otherwise, many tasks were updated to not be created when running the
cirrus-cron multi-arch image builds, since this would simply be a waste
of time and an invitation for flakes.

Lastly, since only native tooling is used in the new build process,
rename all the recipes to `Containerfile`.

Signed-off-by: Chris Evich <[email protected]>
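(For illustration only: the `[CI:BUILD]` gate boils down to a substring match on the PR title. The shell below is a hypothetical sketch; in practice Cirrus would evaluate something equivalent as a task-level condition, not a script.)

```shell
# Hypothetical sketch of the [CI:BUILD] title gate. The variable and the
# sample title are illustrative; the real check is a Cirrus task condition.
pr_title="[CI:BUILD] Cirrus: Migrate multiarch build off github actions"
case "$pr_title" in
    *"[CI:BUILD]"*) mode="build-only" ;;  # basic build verification only
    *)              mode="full-ci"    ;;  # normal test tasks run
esac
echo "$mode"
```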

@cevich cevich force-pushed the multiarch_build branch 2 times, most recently from 999c75b to 036f3d5 Compare June 7, 2022 19:49
@cevich
Member Author

cevich commented Jun 9, 2022

Note to me: Needs credentials added. Also need to add a 'multiarch' cron-job.

@cevich cevich force-pushed the multiarch_build branch from 036f3d5 to cd63cee Compare June 14, 2022 17:24
@cevich cevich changed the title from "[WIP] [CI:BUILD] Cirrus: Migrate multiarch build off github actions" to "[CI:BUILD] Cirrus: Migrate multiarch build off github actions" Jun 14, 2022
@cevich cevich marked this pull request as ready for review June 14, 2022 19:14
@cevich
Member Author

cevich commented Jun 14, 2022

This is ready for review/merge. Note: The contrib/skopeoimage/upstream build takes nearly an hour and a half. We should seriously consider fixing that somehow. Maybe by installing pre-built packages based on main instead of compiling it all from scratch?

@mtrmac
Contributor

mtrmac commented Jun 14, 2022

The contrib/skopeoimage/upstream build takes nearly an hour and a half.

And the others almost an hour, which feels even more curious to me, considering that it’s basically just some RPM installations.

Hopefully the computers have the patience for that.

We should seriously consider fixing that somehow. Maybe by installing pre-built packages based on main instead of compiling it all from scratch?

Where would the pre-built packages come from? I’d expect them to come from some cron job :)

Any idea where is most of the time spent? I can maybe imagine something like pre-building a build container every week, amortizing the time to install RPMs and maybe to pull some version of the repo (with the daily builds just pulling the last commits and actually building), but that’s more complexity and it could mean older = less secure RPMs in the image.

@mtrmac
Contributor

mtrmac commented Jun 14, 2022

pre-built packages

Uh, are we talking about RPMs or about /usr/bin/skopeo?

@cevich
Member Author

cevich commented Jun 14, 2022

And the others almost an hour

This is fairly normal; dnf itself is running in emulation mode across 3 foreign architectures. It's not fast (and neither was the old github-action workflow).
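(Roughly, the emulated builds amount to the sketch below. It assumes podman with qemu binfmt handlers registered; the architecture list is illustrative, and the commands are only echoed here rather than run.)

```shell
# Illustrative loop over foreign architectures; each real build would run
# under qemu emulation, which is where the time goes. Commands are echoed,
# not executed.
count=0
for arch in arm64 ppc64le s390x; do
    echo "podman build --arch=$arch -f Containerfile ."
    count=$((count + 1))
done
echo "$count emulated builds"
```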

Hopefully the computers have the patience for that.

Lol, they do. And people are protected: The job will only run as part of cirrus-cron, or if manually triggered AND [CI:BUILD] is in the PR subject line.

Where would the pre-built packages come from? I’d expect them to come from some cron job :)

Yeah exactly, that would be the thing to solve, and Buildah has the exact same challenge. The thing I'm most worried about is that Cirrus has a hard-coded maximum runtime of 2 hours. So spreading that work out with pre-built RPMs or binaries elsewhere would be good.

Lokesh has a fancy thing set up for podman-next in COPR, but no buildah or skopeo yet. @lsm5 is this on your radar at all for "the future" sometime?

Another option is to add a job to the cirrus-cron for main that builds multi-arch binaries. Then just consume those in the Containerfile (would work for manual image builds too).
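(A sketch of that option. The URL and artifact layout are pure placeholders; no such endpoint exists yet, and the command is only echoed.)

```shell
# Hypothetical: fetch a pre-built skopeo binary for the target arch
# instead of compiling under emulation. The URL is a placeholder, not a
# real endpoint.
arch=$(uname -m)
url="https://example.invalid/cirrus-artifacts/main/${arch}/skopeo"
echo "would fetch: $url"
```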

@cevich
Member Author

cevich commented Jun 14, 2022

Clarification: I'm not super concerned about the 1:30 build time, only mildly concerned 😀

@cevich
Member Author

cevich commented Jun 14, 2022

Note to me: Once this merges, need to add a cirrus-cron job named 'multiarch'.

@lsm5
Member

lsm5 commented Jun 15, 2022

Lokesh has a fancy thing set up for podman-next in COPR, but no buildah or skopeo yet. @lsm5 is this on your radar at all for "the future" sometime?

If we're talking about building the latest stuff merged upstream, then rhcontainerbot/podman-next would be the copr to use. I don't have packages for skopeo and buildah there yet, but I can work on those within the next few days.

Would we need this for Ubuntu tests as well? I can also try to enable bleeding-edge Ubuntu packages.

@cevich
Member Author

cevich commented Jun 16, 2022

Would we need this for Ubuntu tests as well? I can also try to enable bleeding-edge Ubuntu packages.

I don't really see a need there yet, but thanks for asking.

@cevich
Member Author

cevich commented Jun 16, 2022

Clarification: I believe this is ready to go.

@mtrmac
Contributor

mtrmac commented Jun 16, 2022

I’m sorry — I have spent most of this week on that OCI artifact hydra. I should be able to review tomorrow.

@cevich
Member Author

cevich commented Jun 16, 2022

No worries, this is replacing something that's "working" so not much rush.

@cevich
Member Author

cevich commented Jun 28, 2022

@mtrmac is it tomorrow yet? 😁
(No worries, I'm sure you're busy)

@mtrmac mtrmac (Contributor) left a comment

@mtrmac is it tomorrow yet? 😁

It is not tomorrow (that will be tomorrow), but it is totally my fault. I’m sorry.

LGTM, and I very much appreciate the improved error detection in the Containerfile scripts.

On the shadow-utils question, feel free to just merge as is if it is not something you happen to know about.

```
# directories used by dnf that are just taking
# up space.
RUN dnf -y update && \
    rpm --setcaps shadow-utils 2>/dev/null && \
```
Contributor

Do you happen to have a reference to why this is necessary? I could only find RHEL bugs and at least https://bugzilla.redhat.com/show_bug.cgi?id=1995337#c3 (private) suggests this should have been fixed, for RHEL.

(Absolutely non-blocking, we didn’t have a reference before.)

Member Author

This is something Dan knows about; we do the exact same thing in the podman and buildah images. It replaces an old restorecon command we had for a long while. I think I might be able to dig up a reference discussion at least (which may be important for your bugzilla finding)...

Member Author

...ahh, it was Nalin: containers/podman#14437 (comment), and it is Fedora-specific. Though on second reading, we probably don't need this in a skopeo image. I'll remove it to save anybody from wondering or looking for related issues/documentation.
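(One hedged way to check whether the workaround is still needed on a given image: look at whether newuidmap kept its file capabilities. Assumes the `getcap` utility from libcap is installed; if it prints nothing for the binary, the base image dropped the capabilities and `rpm --setcaps shadow-utils` would restore them from package metadata.)

```shell
# Check whether newuidmap kept its file capabilities in this image.
path=/usr/bin/newuidmap
if command -v getcap >/dev/null 2>&1 && [ -e "$path" ]; then
    msg="caps: $(getcap "$path")"   # empty caps => workaround still needed
else
    msg="skipping: getcap or $path not available"
fi
echo "$msg"
```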

Contributor

… and looking at https://src.fedoraproject.org/container/fedora-container-image/blob/rawhide/f/image-build.conf (although the F30 references are quite suspect, so I might be looking at the wrong place), the necessary tar options seem to be missing. Thanks, that explains things.


As a maintainer, my primary concern with a workaround is being able to maintain it over time. I.e. “how do I know that the workaround broke, and how can I verify that it was fixed again”. This thread definitely helps.

Contributor

@jnovy do you know if https://bugzilla.redhat.com/show_bug.cgi?id=1995337#c3 still affects Fedora, or where to confirm?

@cevich cevich force-pushed the multiarch_build branch from cd63cee to 6ae06f8 Compare June 28, 2022 21:43
@cevich
Member Author

cevich commented Jun 28, 2022

Force-push: Rebased, removed the `rpm --setcaps` lines, as nobody should ever need to actually run a nested container inside a skopeo image.

@mtrmac
Contributor

mtrmac commented Jun 28, 2022

Force-push: Rebased, removed the `rpm --setcaps` lines, as nobody should ever need to actually run a nested container inside a skopeo image.

Skopeo does have a case (copies from/to containers-storage: when the process doesn’t have enough capabilities) where it would trigger a rootless run, AFAICS needing newuidmap and the like.

I have no data to suggest whether it’s used frequently, rarely, or not at all.

But actually a fairly large part of the Containerfile deals with making local storage usable (configuring FUSE, and especially setting up /etc/subuid) — so dropping just the newuidmap part seems inconsistent. Admittedly that’s not quite evidence that it was intended to be used, maybe it was just copy&pasted…

Still, maybe, now that we know what it was for, it might be easier to leave the workaround in than to worry too much about whether we need it, and to revisit that every time the three projects’ Containerfiles are compared.
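(For context, the /etc/subuid part of that setup boils down to an entry like the one sketched below. The user name and the 100000:65536 range are illustrative; 100000:65536 is a common distro default, not necessarily what the skopeo image ships.)

```shell
# Illustrative subordinate-ID entry for a rootless-capable image; the same
# line would typically go in both /etc/subuid and /etc/subgid, and is what
# newuidmap/newgidmap consult when setting up the user namespace.
user=containers
entry="${user}:100000:65536"
echo "$entry"
```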

@cevich
Member Author

cevich commented Jun 29, 2022

@mtrmac Ack. I can put it back in with a comment and references if needed. Then do the same thing for both podman and buildah (which also use this same workaround).

@mtrmac
Contributor

mtrmac commented Jun 29, 2022

At this point, I’m indifferent between putting that back, and waiting for #1661 (comment) so that we know whether it’s necessary (and possibly to nudge Fedora WRT fixing this in the images). Up to you; I don’t know how much merging this PR makes your work easier/harder.

These changes substantially mirror similar updates made recently to both
podman and buildah.  Besides renaming `Dockerfile` -> `Containerfile`,
there are much-needed updates to the docs and the build instructions.

Signed-off-by: Chris Evich <[email protected]>
@cevich cevich force-pushed the multiarch_build branch from 6ae06f8 to 2024e2e Compare June 29, 2022 15:42
@cevich
Member Author

cevich commented Jun 29, 2022

I don’t know how much merging this PR makes your work easier/harder.

Keeping the Containerfiles consistent with podman & buildah is probably my only argument for putting it back. Certainly adding a comment with references also seems important. Also, you raised that there is a case where a rootless container is run (news to me).

@mtrmac mtrmac merged commit bc84a02 into containers:main Jun 29, 2022
@mtrmac
Contributor

mtrmac commented Jun 29, 2022

For future reference, at least https://www.redhat.com/sysadmin/how-run-skopeo-container documents using containers-storage: with the Skopeo container, but not relying on the rootless mode feature; instead, the container is run with --privileged to avoid that mode.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 19, 2023