Builds in docker fail with 'Invalid argument' when doing RUN after VOLUME on overlayfs #3281

Closed
mh21 opened this issue Jun 3, 2021 · 20 comments · Fixed by #3750

Comments

@mh21

mh21 commented Jun 3, 2021

When running buildah in GitLab CI (docker), using RUN after VOLUME fails since buildah 1.20.1.

Dockerfile:

FROM registry.fedoraproject.org/fedora:33

RUN mkdir -p /logs
VOLUME /logs

RUN mkdir -p /data

CMD ["/bin/bash"]

Reproducer: https://gitlab.com/cki-project/experimental/buildah-1-20-1-fc33-volume-issue-reproducer

Works/fails (https://gitlab.com/cki-project/experimental/buildah-1-20-1-fc33-volume-issue-reproducer/-/pipelines/314126067):

  • quay.io/buildah/stable:v1.19.6: OK
  • quay.io/buildah/stable:v1.19.8: OK
  • quay.io/buildah/stable:v1.20.1: FAILS
  • quay.io/buildah/stable:v1.21.0: OK
  • registry.fedoraproject.org/fedora:33 with 1.20.1: FAILS
  • registry.fedoraproject.org/fedora:34 with 1.21.0: FAILS

The important difference between the 1.21.0 quay.io image and the plain fedora:34 seems to be the /var/lib/containers volume (https://quay.io/repository/buildah/stable/manifest/sha256:da34723916c0b53186230af839ff339c6a4f4857ca54c1984e6e927fb4100478). This causes /var/lib/containers to be backed by ext4 (GitLab/docker) instead of the root overlayfs.
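One way to check which filesystem backs the storage directory inside a job container is sketched below. This is not part of the original report: the helper name `backing_fs` is hypothetical, and it assumes GNU coreutils `stat` is installed in the image.

```shell
# Print the filesystem type backing a path (hypothetical helper).
# Assumes GNU coreutils stat. If the path does not exist yet, walk up
# to the nearest existing parent so the check still works.
backing_fs() {
    path=$1
    while [ ! -e "$path" ] && [ "$path" != "/" ]; do
        path=$(dirname "$path")
    done
    stat -f -c %T "$path"
}

backing_fs /var/lib/containers
```

With GNU `stat`, an overlay root is typically reported as `overlayfs`, while ext4 usually shows up as `ext2/ext3` because those filesystems share a magic number.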

Without the volume, the error message is (registry.fedoraproject.org/fedora:34 with 1.21.0, from https://gitlab.com/cki-project/experimental/buildah-1-20-1-fc33-volume-issue-reproducer/-/jobs/1314718416#L331):

$ buildah bud -f Dockerfile .
STEP 1: FROM registry.fedoraproject.org/fedora:33
Trying to pull registry.fedoraproject.org/fedora:33...
Getting image source signatures
Copying blob sha256:2df8ffe0bf4632dc6b3c749a45145c30ba9d92f47853ce3769c94e7c476c1fcc
Copying config sha256:5ca3a10c6c8f69d82c1e7f0cf61e3aef2967874a1a7dc10660a11517a1a9d31a
Writing manifest to image destination
Storing signatures
STEP 2: RUN mkdir -p /logs
STEP 3: VOLUME /logs
STEP 4: RUN mkdir -p /data
error running container: error from crun creating container for [/bin/sh -c mkdir -p /data]: mount `/var/lib/containers/storage/overlay-containers/b023bcc6ce049afe129148bd6c6fd4b576b0e84f682f479bf89dfc97e7315e6b/userdata/overlay/100391584/merge` to `/logs`: Invalid argument
: exit status 1
error building at STEP "RUN mkdir -p /data": error while running runtime: exit status 1

Originally posted by @inakimalerba in #3153 (comment)

@github-actions

github-actions bot commented Jul 4, 2021

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Jul 4, 2021

@giuseppe PTAL

@mh21
Author

mh21 commented Jul 23, 2021

Still fails on rawhide with buildah 1.22.0-0.13.dev.gitec35bc4.fc35.

@rhatdan
Member

rhatdan commented Jul 25, 2021

@giuseppe what do you think the "Invalid argument" failure is?

@savitojs

savitojs commented Aug 2, 2021

This affects us as well. One of our customers would like to use the buildah image in GitLab CI (docker executor) to build images.

@savitojs

savitojs commented Aug 2, 2021

Environment details

docker-ce-rootless-extras-20.10.7-3.el7.x86_64
libseccomp-2.3.1-4.el7.x86_64
docker-ce-cli-20.10.7-3.el7.x86_64
docker-ce-20.10.7-3.el7.x86_64
containers-common-0.1.40-12.el7_9.x86_64
docker-scan-plugin-0.8.0-3.el7.x86_64
containerd.io-1.4.6-3.1.el7.x86_64
libseccomp-devel-2.3.1-4.el7.x86_64
gitlab-runner-14.0.1-1.x86_64
container-selinux-2.119.2-1.911c772.el7_8.noarch

Kernel:

$ uname -r

3.10.0-1160.25.1.el7.x86_64

OS Details: Red Hat Enterprise Linux Server release 7.9 (Maipo)

We have an internal GitLab runner on which anyone can run jobs by selecting specific tags. The GitLab Runner project supports docker as a container platform. Podman is not yet supported; however, Red Hat and GitLab are working together to add podman as a supported runner option.

Important:

  • The GitLab runners do not mount the docker socket for DinD.
  • DinD is not supported in any way due to the risks involved.
  • The runners use docker in privileged=false mode.

Problem Statement

One of our customers would like to build docker images using buildah within a GitLab CI pipeline. They are using the latest buildah docker image.

Steps to reproduce

  1. Pull and run the docker image
$ docker run --rm -it -v /var/lib/containers4:/var/lib/containers:z -v /root/Dockerfile:/Dockerfile --security-opt seccomp=unconfined -v  /dev/fuse:/dev/fuse --device /dev/fuse --security-opt seccomp=unconfined --security-opt label=disabled quay.io/buildah/stable bash
  2. Create any Dockerfile such as:
FROM registry.fedoraproject.org/fedora:32

RUN useradd sav
  3. Try to build the image
buildah bud -t tft-cnc -f Dockerfile .
  4. Build error:
buildah bud -t tft-cnc -f Dockerfile .
STEP 1: FROM registry.fedoraproject.org/fedora:32
STEP 2: RUN useradd sav
error building at STEP "RUN useradd sav": error mounting container "3fb88b7046d033103702d16757a7ad7e274d483dc6511d9746240573bd398beb": error mounting build container "3fb88b7046d033103702d16757a7ad7e274d483dc6511d9746240573bd398beb": failed to canonicalise path for "/var/lib/containers/storage/overlay/27bc8611f3d6b7a1ae7715d23c5afb8bb28d08c807ece11eda102d89759e3a67/merged": lstat /var/lib/containers/storage/overlay/27bc8611f3d6b7a1ae7715d23c5afb8bb28d08c807ece11eda102d89759e3a67/merged: invalid argument
ERRO exit status 125

@mh21
Author

mh21 commented Aug 3, 2021

Related to #3263.

@rhatdan
Member

rhatdan commented Aug 3, 2021

@giuseppe is this one of the new flags being added when it should not be?

@giuseppe
Member

giuseppe commented Aug 4, 2021

could be a dup of containers/fuse-overlayfs#311

@jerryTJ

jerryTJ commented Jan 25, 2022

buildah bud --tls-verify=false --layers --format docker -f Dockerfile -t docker-image-name
buildah: v1.20.1
docker: 19.03.12
(two attached screenshots not preserved)

@giuseppe
Member

I've just tried with buildah 1.23 from Fedora and I cannot reproduce the issue:

# uname -a
Linux centos-s-1vcpu-2gb-fra1-01 3.10.0-1160.45.1.el7.x86_64 #1 SMP Wed Oct 13 17:20:51 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

# docker --version
Docker version 1.13.1, build 7d71120/1.13.1

# cat /root/Dockerfile 
FROM registry.fedoraproject.org/fedora:33

RUN mkdir -p /logs
VOLUME /logs

RUN mkdir -p /data

CMD ["/bin/bash"]

# docker run --rm -it -v /var/lib/containers4:/var/lib/containers:z -v  /root/Dockerfile:/Dockerfile --security-opt seccomp=unconfined -v  /dev/fuse:/dev/fuse --device /dev/fuse --security-opt seccomp=unconfined --security-opt label=disable quay.io/buildah/stable bash

# buildah bud --tls-verify=false --layers --format docker -f Dockerfile              
STEP 1/5: FROM registry.fedoraproject.org/fedora:33
Trying to pull registry.fedoraproject.org/fedora:33...
Getting image source signatures
Copying blob 2861e4f9ec1a done  
Copying config 3e786a3571 done  
Writing manifest to image destination
Storing signatures
STEP 2/5: RUN mkdir -p /logs
--> 5bc988056fe
STEP 3/5: VOLUME /logs
--> 506e6b0b3c5
STEP 4/5: RUN mkdir -p /data
--> c6aa1e56a45
STEP 5/5: CMD ["/bin/bash"]
COMMIT
--> e3178a7e482
e3178a7e4824c44402c4876b0fc8e5d21f32777ff80cfd0347108e9b247e0ccc

@mh21
Author

mh21 commented Jan 25, 2022

Hi @giuseppe, iiuc you are passing through /var/lib/containers from the host, which means it will not be hosted on overlayfs?

@mh21
Author

mh21 commented Jan 25, 2022

I updated the reproducer to FC35 at https://gitlab.com/cki-project/experimental/buildah-1-20-1-fc33-volume-issue-reproducer/-/jobs/2013629982, still failing (as expected).

@giuseppe
Member

sorry I got confused by the reproducer here: #3281 (comment)

Let me try without a volume

@giuseppe
Member

I still cannot reproduce locally.

I think you need a cp /usr/share/containers/storage.conf /etc/containers/storage.conf before you run sed though. /etc/containers/storage.conf is not created by default.
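The copy step suggested above could be made safe to re-run with something like the following sketch. The helper name `ensure_storage_conf` is hypothetical; the default paths are the two named in the comment, and the reproducer's actual `sed` expression is not shown in this thread, so it is left out here.

```shell
# Copy the packaged default storage.conf into place unless one already
# exists (hypothetical helper; both paths can be overridden).
ensure_storage_conf() {
    src=${1:-/usr/share/containers/storage.conf}
    dst=${2:-/etc/containers/storage.conf}
    if [ ! -e "$dst" ]; then
        mkdir -p "$(dirname "$dst")" && cp "$src" "$dst"
    fi
}
```

Calling `ensure_storage_conf` before any in-place edit of /etc/containers/storage.conf means the edit has a file to work on, and an existing customized file is left untouched.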

@giuseppe
Member

also, how is the /dev/fuse device passed inside the container? That is required before you can use fuse-overlayfs
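A quick way to answer that from inside the job container is sketched below. The helper name `has_fuse_device` is hypothetical; it only checks that the node is a readable and writable character device, not that fuse-overlayfs can actually mount with it.

```shell
# Succeed if the given path (default /dev/fuse) is a character device
# that the current user can read and write (hypothetical helper).
has_fuse_device() {
    dev=${1:-/dev/fuse}
    [ -c "$dev" ] && [ -r "$dev" ] && [ -w "$dev" ]
}

if has_fuse_device; then
    echo "/dev/fuse is available"
else
    echo "/dev/fuse is missing or not accessible" >&2
fi
```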

@mh21
Author

mh21 commented Jan 25, 2022

for the fuse device, gitlab-runner uses privileged containers on single-use VMs in GCP.

@mh21
Author

mh21 commented Jan 25, 2022

@mh21
Author

mh21 commented Jan 25, 2022

As another data point, moving /var/lib/containers to ext4 via a symlink to /builds (which is a docker volume on GitLab jobs) fixes the FC35 failure: https://gitlab.com/cki-project/experimental/buildah-1-20-1-fc33-volume-issue-reproducer/-/jobs/2014709415
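A minimal sketch of that symlink workaround follows. The helper name `relocate_storage` and the `/builds/containers` target are assumptions for illustration; the linked job may do this differently.

```shell
# Point a storage directory at a location on another filesystem via a
# symlink (hypothetical helper). An existing symlink is replaced; an
# existing directory is only removed if it is empty.
relocate_storage() {
    link=$1
    target=$2
    mkdir -p "$target" || return 1
    if [ -L "$link" ]; then
        rm "$link" || return 1
    elif [ -d "$link" ]; then
        rmdir "$link" || return 1   # fails on purpose if not empty
    fi
    mkdir -p "$(dirname "$link")" && ln -s "$target" "$link"
}

# Example for a GitLab job, where /builds is a docker volume
# (the subdirectory name is an assumption):
# relocate_storage /var/lib/containers /builds/containers
```

Refusing to remove a non-empty directory is deliberate, so existing image storage is never deleted silently.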

giuseppe added a commit to giuseppe/buildah that referenced this issue Jan 26, 2022
pass down the graph options to the chroot backend, so that if the
storage driver is configured to use a mount program for overlay, it is
honored for volumes as well.

Closes: containers#3281

Signed-off-by: Giuseppe Scrivano <[email protected]>
@giuseppe
Member

thanks, I think I see what is going on.

Opened a PR: #3750
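Since the fix concerns setups where the overlay driver is configured with a mount program, it can help to confirm whether a given environment sets one. The sketch below assumes the `mount_program = "..."` key used in containers storage.conf; the helper name `configured_mount_program` is hypothetical, and the parsing is a simple line match rather than a real TOML parser.

```shell
# Print the overlay mount program set in a storage.conf, if any
# (hypothetical helper; commented-out lines are ignored).
configured_mount_program() {
    conf=${1:-/etc/containers/storage.conf}
    sed -n 's/^[[:space:]]*mount_program[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' "$conf"
}
```

An empty result means no mount program is set in that file, in which case the kernel overlay driver would normally be used.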

giuseppe added a commit to giuseppe/buildah that referenced this issue Jan 27, 2022
if a mountProgram is specified, use it also in rootfull mode.

Closes: containers#3281

Signed-off-by: Giuseppe Scrivano <[email protected]>
giuseppe added a commit to giuseppe/buildah that referenced this issue Jan 27, 2022
if a mountProgram is specified, use it also in rootfull mode.

Closes: containers#3281

[NO NEW TESTS NEEDED]

Signed-off-by: Giuseppe Scrivano <[email protected]>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 2, 2023