containerd's TestPodUserNS fails with runc v1.2 (succeeds with crun) on SELinux distro: setxattr /[...]/dev/mqueue: operation not permitted #4466
Sorry, "succeeds with crun" wasn't true. It wasn't tested with crun + SELinux.
Just for the record, the bug seems to be in containerd; see containerd/containerd#10877 (comment) for more info. The summary is: containerd recently changed how user namespaces are created, and this seems to be a bug in that change. Before rc5 it asked runc to create the userns (and all the other namespaces), but in rc5 containerd changed to create the userns itself (this makes sense for several reasons, although making the change at rc5 is more questionable). However, the userns and netns are created together in one unshare call, and that seems to break on SELinux distros when we later want to do the mqueue mount. When runc created the namespaces, it had special handling for that (apparently a kernel bug that would be nice to fix at some point): runc/libcontainer/nsenter/nsexec.c, lines 866 to 871 in e37371e
If we can do integration test with
I actually ran the test with crun + SELinux, and it seems green.
fix: opencontainers#4466. In containerd, the net and user ns are created before the container starts, and runc is asked to join these two ns when starting the init process. This works on normal systems, but not on systems with SELinux enabled and a mount label configured. We can resolve it in two steps: 1. Join the user ns after joining all other namespaces; 2. If we have joined a user ns path, also become root in the namespace, as we do when unsharing a new user ns. Signed-off-by: lifubang <[email protected]>
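Step 1 of the commit message above (join the user ns only after all other namespaces) can be illustrated with a small ordering helper. This is a minimal sketch, not runc's actual code: `nsEntry`, `userNSLast`, and the `/proc/1234/ns/...` paths are hypothetical names chosen for the example, and the real join happens in runc's C code (nsexec.c), not in Go.

```go
package main

import (
	"fmt"
	"sort"
)

// nsEntry is a hypothetical stand-in for a namespace the runtime is asked to
// join: a type name plus the path of an already-created namespace.
type nsEntry struct {
	Type string // e.g. "user", "network", "ipc"
	Path string // e.g. "/proc/1234/ns/user" (illustrative only)
}

// userNSLast returns a copy of entries with every "user" entry moved to the
// end, keeping the relative order of all other entries unchanged.
func userNSLast(entries []nsEntry) []nsEntry {
	out := make([]nsEntry, len(entries))
	copy(out, entries)
	// Stable sort with a two-class ordering: non-user entries sort before
	// user entries; entries within the same class keep their original order.
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Type != "user" && out[j].Type == "user"
	})
	return out
}

func main() {
	// Assumed input: the runtime (here, containerd) pre-created the user and
	// network namespaces and passes their paths in arbitrary order.
	requested := []nsEntry{
		{Type: "user", Path: "/proc/1234/ns/user"},
		{Type: "network", Path: "/proc/1234/ns/net"},
	}
	for _, ns := range userNSLast(requested) {
		fmt.Printf("join %s via %s\n", ns.Type, ns.Path)
	}
}
```

The sketch covers only the ordering; step 2 (becoming root inside the joined user namespace) requires privileged syscalls and is not shown here.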
@AkihiroSuda Could you please help to see whether #4473 has fixed your issue or not?
```bash
root@iZrj92lfz91pzit984cd5tZ:~/go/src/github.com/containerd/containerd# /usr/local/go/bin/go test --count=1 -v -test.v -timeout 30s -run ^TestPodUserNS$ github.com/containerd/containerd/v2/integration
=== RUN TestPodUserNS
=== RUN TestPodUserNS/userns_gid_mapping
pod_userns_linux_test.go:246: Create a sandbox with userns
time="2024-10-24T23:15:45+08:00" level=info msg="Using the following image list: {Alpine:ghcr.io/containerd/alpine:3.14.0 BusyBox:ghcr.io/containerd/busybox:1.36 Pause:registry.k8s.io/pause:3.10 ResourceConsumer:registry.k8s.io/e2e-test-images/resource-consumer:1.10 VolumeCopyUp:ghcr.io/containerd/volume-copy-up:2.2 VolumeOwnership:ghcr.io/containerd/volume-ownership:2.1 ArgsEscaped:cplatpublic.azurecr.io/args-escaped-test-image-ns:1.0 DockerSchema1:registry.k8s.io/busybox@sha256:4bdd623e848417d96127e16037743f0cd8b528c026e9175e22a84f639eca58ff}"
main_test.go:731: Image "ghcr.io/containerd/busybox:1.36" already exists, not pulling.
pod_userns_linux_test.go:274: Create a container for userns
pod_userns_linux_test.go:283: Start the container
pod_userns_linux_test.go:286: Wait for container to finish running
pod_userns_linux_test.go:301: Running check function
=== RUN TestPodUserNS/rootfs_permissions
pod_userns_linux_test.go:246: Create a sandbox with userns
main_test.go:731: Image "ghcr.io/containerd/busybox:1.36" already exists, not pulling.
pod_userns_linux_test.go:274: Create a container for userns
pod_userns_linux_test.go:283: Start the container
pod_userns_linux_test.go:286: Wait for container to finish running
pod_userns_linux_test.go:301: Running check function
=== RUN TestPodUserNS/volumes_permissions
pod_userns_linux_test.go:246: Create a sandbox with userns
main_test.go:731: Image "ghcr.io/containerd/busybox:1.36" already exists, not pulling.
pod_userns_linux_test.go:274: Create a container for userns
pod_userns_linux_test.go:283: Start the container
pod_userns_linux_test.go:286: Wait for container to finish running
pod_userns_linux_test.go:301: Running check function
=== RUN TestPodUserNS/fails_with_several_mappings
pod_userns_linux_test.go:246: Create a sandbox with userns
E1024 23:15:49.408259 255944 remote_runtime.go:132] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to create network namespace for sandbox "b2c5d638eff48193d7d4188a22c938845f4dacaa7828196d03668d4534d540e8": required only one uid mapping, but got 2 uid mapping(s)
=== RUN TestPodUserNS/userns_uid_mapping
pod_userns_linux_test.go:246: Create a sandbox with userns
main_test.go:731: Image "ghcr.io/containerd/busybox:1.36" already exists, not pulling.
pod_userns_linux_test.go:274: Create a container for userns
pod_userns_linux_test.go:283: Start the container
pod_userns_linux_test.go:286: Wait for container to finish running
pod_userns_linux_test.go:301: Running check function
--- PASS: TestPodUserNS (5.91s)
--- PASS: TestPodUserNS/userns_gid_mapping (1.51s)
--- PASS: TestPodUserNS/rootfs_permissions (1.45s)
--- PASS: TestPodUserNS/volumes_permissions (1.46s)
--- PASS: TestPodUserNS/fails_with_several_mappings (0.01s)
--- PASS: TestPodUserNS/userns_uid_mapping (1.48s)
PASS
ok github.com/containerd/containerd/v2/integration 5.938s
root@iZrj92lfz91pzit984cd5tZ:~/go/src/github.com/containerd/containerd# cat /etc/containerd/config.toml
version = 2
[plugins]
```
Process contexts:
File contexts:
fix: opencontainers#4466. In containerd, for a user ns pod, the net and user ns are created before the container starts, and runc is asked to join these two ns when starting the init process. This works on normal systems, but not on systems with SELinux enabled and a mount label configured. We can resolve it in two steps: 1. Join the user ns after joining all other namespaces, since some namespaces may not be owned by the user namespace; 2. Also become root in the namespace if we have joined a user ns path, as we do when unsharing a new user ns. Signed-off-by: lifubang <[email protected]>
This is just to run CI in order to check if opencontainers/runc#4474 fixes opencontainers/runc#4466. Signed-off-by: Kir Kolyshkin <[email protected]>
Containerd pre-creates userns and netns before calling runc, which results in the current code not working when SELinux is enabled, resulting in the following error:
> runc create failed: unable to start container process: error during container init: error mounting "mqueue" to rootfs at "/dev/mqueue": setxattr /path/to/rootfs/dev/mqueue: operation not permitted
The solution is to become root in the user namespace right after we join it. Fixes opencontainers#4466.
Co-authored-by: Wei Fu <[email protected]> Co-authored-by: Kir Kolyshkin <[email protected]> Co-authored-by: Aleksa Sarai <[email protected]> Signed-off-by: lifubang <[email protected]> (cherry picked from commit c78f3f2) Signed-off-by: Kir Kolyshkin <[email protected]>
On Fedora 40 and Rocky Linux 9, containerd's TestPodUserNS fails with the following change on top of the main branch of containerd (containerd/containerd@bc3ce87):
Failure:
https://github.com/containerd/containerd/actions/runs/11457221604/job/31880030218?pr=10877
This failure does not happen after reverting:
- setgroups in user namespaces containerd/containerd#10741
- internal/cri: simplify netns setup with pinned userns containerd/containerd#10607
However, as the same test has been passing for crun without reverting them, this issue probably has to be fixed on runc's side.