Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libct/cg: fix an error of cgroup path removal #4520

Closed
wants to merge 3 commits into from

Conversation

lifubang
Copy link
Member

For cgroup path removal, when we try a fast way to remove the cgroup
path, there may be a small possibility race to return an error, though
the path was removed successfully, for example: EINTR, or other errors.
Then we will fall back to the traditional path walk removal, but there
is a regression if the path has been sucessfully removed, which was
introduced by d3d7f7d in #4102. We should erase the ErrNotExist error
when we open the cgroup path.

Fix: #4518

@lifubang lifubang changed the title fix an error of cgroup path removal libct/cg: fix an error of cgroup path removal Nov 10, 2024
@lifubang
Copy link
Member Author

@AkihiroSuda Could you please help to see this patch has fixed the issue #4518 or not?

@AkihiroSuda
Copy link
Member

AkihiroSuda commented Nov 10, 2024

Sorry, I'm on trip to kubecon, can't test this PR right now

cc @samiam (the original reporter of the issue)

if os.IsNotExist(err) {
// Please keep this error eraser, or else it will return ErrNotExist
// for cgroupv2.
// Please see https://github.com/opencontainers/runc/issues/4518
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to have a bats test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it’s hard to simulate the similar error.
But maybe we can split this func, and add a go unit test. I don’t know it’s worth to do or not. But we should verify this patch is correct or not at first.

@AkihiroSuda AkihiroSuda added the backport/1.2-todo A PR in main branch which needs to be backported to release-1.2 label Nov 10, 2024
@samiam
Copy link

samiam commented Nov 10, 2024

It appears to work via buildkit. But I'm new to buildkit, so my testing could be flawed.

Buildkit changes include changing the Dockerfile and the name of the image to ensure I was using my local one.

$ git diff
diff --git a/Dockerfile b/Dockerfile
index e18057683..1cd8ccba8 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -48,7 +48,8 @@ RUN --mount=target=/root/.cache,type=cache \
 FROM gobuild-base AS runc
 WORKDIR $GOPATH/src/github.com/opencontainers/runc
 ARG RUNC_VERSION
-ADD --keep-git-dir=true "https://github.com/opencontainers/runc.git#$RUNC_VERSION" .
+#ADD --keep-git-dir=true "https://github.com/opencontainers/runc.git#$RUNC_VERSION" .
+ADD --keep-git-dir=true "https://github.com/lifubang/runc.git#fix-cgroup-rm-notexist" .
 ARG TARGETPLATFORM
 # gcc is only installed for libgcc
 # lld has issues building static binaries for ppc so prefer ld for it
diff --git a/Makefile b/Makefile
index 24ef2b1ff..e236630fb 100644
--- a/Makefile
+++ b/Makefile
@@ -22,8 +22,8 @@ cross:
 .PHONY: images
 images:
 # moby/buildkit:local and moby/buildkit:local-rootless are created on Docker
-       hack/images local moby/buildkit
-       TARGET=rootless hack/images local moby/buildkit
+       hack/images local moby/buildkittst
+       TARGET=rootless hack/images local moby/buildkittst

 .PHONY: install
 install:

Build local images:

docker buildx prune
make images

Run the tests again:

$ cat Dockerfile
FROM alpine
RUN mkdir /tmp/empty_directory

$ docker run --name buildkitd-local-rootless-tst -d --security-opt seccomp=unconfined   --security-opt apparmor=unconfined   moby/buildkittst:local-rootless --oci-worker-no-process-sandbox

$ buildctl --addr docker-container://buildkitd-local-rootless-tst build --frontend dockerfile.v0 --local context=. --local dockerfile=. --no-cache
[+] Building 0.4s (5/5) FINISHED
 => [internal] load build definition from Dockerfile                                                                                                                                                   0.0s
 => => transferring dockerfile: 80B                                                                                                                                                                    0.0s
 => [internal] load metadata for docker.io/library/alpine:latest                                                                                                                                       0.3s
 => [internal] load .dockerignore                                                                                                                                                                      0.0s
 => => transferring context: 2B                                                                                                                                                                        0.0s
 => CACHED [1/2] FROM docker.io/library/alpine:latest@sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d                                                                          0.0s
 => => resolve docker.io/library/alpine:latest@sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d                                                                                 0.0s
 => [2/2] RUN mkdir /tmp/empty_directory
$

Who knew removing a directory could be so complicated? 😁

Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lifubang
Copy link
Member Author

Who knew removing a directory could be so complicated?

Yes, so complicated, because cgroupfs is a pseudo-filesystem, not a real physical fs. Thanks for your check.

// Please keep this error eraser, or else it will return ErrNotExist
// for cgroupv2.
// Please see https://github.com/opencontainers/runc/issues/4518
err = nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we return nil here? Resetting the err combined with the comment made me go look if the err was used in a defer (or if there was something special); returning nil here makes it more clear it's a regular "we can ignore this".

Copy link
Member

@thaJeztah thaJeztah Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On that matter, we should do the same below in the infos loop, and also return early if there's an error;

for _, info := range infos {
	if info.IsDir() {
		// We should remove subcgroup first.
		if err := RemovePath(filepath.Join(path, info.Name())); err != nil {
			return err
		}
	}
}

That would allow use to remove the if err == nil check outside of the loop, and just;

return rmdir(path, true)

❓ Question though; should the infos loop also ignore os.IsNotExist() ?

for _, info := range infos {
	if !info.IsDir() {
		continue
	}
	// We should remove subcgroup first.
	if err := RemovePath(filepath.Join(path, info.Name())); err != nil && !os.IsNotExist(err) {
		return err
	}
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question though; should the infos loop also ignore os.IsNotExist() ?

I think there is no chance we will get ErrNotExist error in here.

On that matter, we should do the same below in the infos loop, and also return early if there's an error;

👍

@lifubang lifubang force-pushed the fix-cgroup-rm-notexist branch 2 times, most recently from 80008d7 to af8c523 Compare November 11, 2024 10:13
@kolyshkin
Copy link
Contributor

kolyshkin commented Nov 11, 2024

  1. Can we split this into two commits? First one is the minimal fix (making RemovePath not return ENOENT from ReadDir), second one is a refactor suggested by @thaJeztah. Otherwise it's slightly complex to review.

  2. One interesting thing to learn is what was the original error from rmdir. It's not EINTR because rmdir retries on that. I'll try to reproduce this.

@kolyshkin
Copy link
Contributor

kolyshkin commented Nov 11, 2024

2. One interesting thing to learn is what was the original error from rmdir. It's not EINTR because rmdir retries on that. I'll try to reproduce this.

And the error from rmdir is ... 🥁🥁🥁 ... EROFS ("read-only file system").

Indeed, /sys/fs/cgroup is mounted read-only in the container in question:

cgroup on /sys/fs/cgroup type cgroup2 (ro,nosuid,nodev,noexec,relatime)

I'm not quite sure how the cgroup directory is created in the first place in such case, but it's definitely there. If someone can shed some light that would be great.

My guess is we need some special handing for such case. It's read only meaning we can't remove it no matter what. But it can be gone if there are no processes, so we can wait.

(the readdir logic is obviously wrong and need to be fixed nevertheless)

@lifubang
Copy link
Member Author

If someone can shed some light that would be great.

How does runc write the init process pid to a ro cgroup?

@lifubang
Copy link
Member Author

My guess is we need some special handing for such case. It's read only meaning we can't remove it no matter what. But it can be gone if there are no processes, so we can wait.

Yes, it indeed retry for any errors(except ErrNotExist) in ‘release-1.1’, how about sleep and retry in rmdir for all errors except ENOENT? @kolyshkin

@kolyshkin
Copy link
Contributor

Some of my analysis above are wrong. But let's continue the discussion in the issue (#4518).

lifubang and others added 3 commits November 12, 2024 03:57
When fall back to the traditional path walk removal after rmdir, there
is an error if the path suddenly gone, we should ignore this ErrNotExist
error when we open the cgroup path.

Signed-off-by: lfbzhm <[email protected]>
Co-authored-by: Sebastiaan van Stijn <[email protected]>
Signed-off-by: lifubang <[email protected]>
If we remove a non-exist dir in a ro mount point, it will
return EROFS in `unix.Rmdir`, so we need to check first.

Test step:
```bash
root@acmubuntu:/opt/bb# mkdir from to
root@acmubuntu:/opt/bb# touch from/test
root@acmubuntu:/opt/bb# mount -o bind,ro from to
root@acmubuntu:/opt/bb# ls to
test
root@acmubuntu:/opt/bb# touch to/test1
touch: cannot touch 'to/test1': Read-only file system
root@acmubuntu:/opt/bb# mkdir to/test1
mkdir: cannot create directory ‘to/test1’: Read-only file system
root@acmubuntu:/opt/bb# ls to/test1
ls: cannot access 'to/test1': No such file or directory
root@acmubuntu:/opt/bb# rmdir to/test1
rmdir: failed to remove 'to/test1': Read-only file system
root@acmubuntu:/opt/bb#
```

Signed-off-by: lifubang <[email protected]>
@lifubang lifubang removed the backport/1.2-todo A PR in main branch which needs to be backported to release-1.2 label Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants