cgroupv2 fs driver does not work with default cgroups path #2298

Closed
kolyshkin opened this issue Apr 7, 2020 · 5 comments · Fixed by #2305

Comments

@kolyshkin
Contributor

When the cgroupv2 fs driver is used without setting cgroupsPath, it picks up a path from /proc/self/cgroup. On a host with systemd, such a path looks like /user.slice/user-1000.slice/session-4.scope.
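
For illustration, here is a minimal Go sketch of how a cgroup v2 path can be extracted from the contents of /proc/self/cgroup. The function name parseUnifiedPath is hypothetical and this is not runc's actual code; it only assumes the documented "hierarchy-ID:controller-list:path" line format, where the v2 entry has ID 0 and an empty controller list.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// parseUnifiedPath (hypothetical helper) returns the cgroup v2 path from the
// contents of a /proc/<pid>/cgroup file. The v2 entry has hierarchy ID 0 and
// an empty controller list, so it looks like "0::/some/path".
func parseUnifiedPath(contents string) (string, error) {
	sc := bufio.NewScanner(strings.NewReader(contents))
	for sc.Scan() {
		// Each line is "hierarchy-ID:controller-list:path".
		parts := strings.SplitN(sc.Text(), ":", 3)
		if len(parts) == 3 && parts[0] == "0" && parts[1] == "" {
			return parts[2], nil
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return "", errors.New("no cgroup v2 entry found")
}

func main() {
	// Sample input copied from the report above.
	p, err := parseUnifiedPath("0::/user.slice/user-1000.slice/session-4.scope\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(p)
}
```

The returned path is relative to the cgroup2 mount point (typically /sys/fs/cgroup), which is how the session-4.scope directory in the error below comes about.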

runc then fails to enable controllers for the sub-cgroup. The error from the kernel is either EBUSY or EOPNOTSUPP. It probably has something to do with systemd, but I'm not sure.

Here's the error (with more debug added):

# runc run -d --console-socket /tmp/console.sock test_cgroups_permissions
CreateCgroupPath: res="+cpuset +cpu +io +memory +hugetlb +pids"
CreateCgroupPath: echo $res > "/sys/fs/cgroup/user.slice/user-1000.slice/session-4.scope/cgroup.subtree_control"
# time="2020-04-07T12:57:57-07:00" level=warning msg="signal: killed"
# time="2020-04-07T12:57:57-07:00" level=error msg="container_linux.go:349: starting container process caused \"process_linux.go:306: applying cgroup configuration for process caused \\\"write /sys/fs/cgroup/user.slice/user-1000.slice/session-4.scope/cgroup.subtree_control: device or resource busy\\\"\""

Here's the same thing, reproduced manually from a shell:

$ sudo -s

# cat /proc/self/cgroup 
0::/user.slice/user-1000.slice/session-4.scope
# cd /sys/fs/cgroup/user.slice/user-1000.slice/session-4.scope

# cat cgroup.controllers 
cpuset cpu io memory hugetlb pids
# CONTROLS="+cpuset +cpu +io +memory +hugetlb +pids"
# echo $CONTROLS > cgroup.subtree_control
-bash: echo: write error: Device or resource busy

# for C in $CONTROLS; do echo $C; echo $C > cgroup.subtree_control; done
+cpuset
+cpu
+io
-bash: echo: write error: Operation not supported
+memory
-bash: echo: write error: Operation not supported
+hugetlb
-bash: echo: write error: Operation not supported
+pids

# uname -a
Linux f31-test 5.6.2-300.fc32.x86_64 #1 SMP Thu Apr 2 18:34:20 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
@kolyshkin
Contributor Author

@AkihiroSuda do you have any idea why that is?

@kolyshkin
Contributor Author

Another repro, fresh boot, older kernel.

[kir@f31-test ~]$ uname -a
Linux f31-test 5.3.7-301.fc31.x86_64 #1 SMP Mon Oct 21 19:18:58 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[kir@f31-test ~]$ sudo -s
[sudo] password for kir: 
[root@f31-test kir]# cat /proc/self/cgroup 
0::/user.slice/user-1000.slice/session-2.scope
[root@f31-test kir]# cd /sys/fs/cgroup//user.slice/user-1000.slice/session-2.scope
[root@f31-test session-2.scope]# cat cgroup.controllers 
memory pids
[root@f31-test session-2.scope]# echo +memory > cgroup.subtree_control 
bash: echo: write error: Device or resource busy
[root@f31-test session-2.scope]# tail cgroup.*
==> cgroup.controllers <==
memory pids

==> cgroup.events <==
populated 1
frozen 0

==> cgroup.freeze <==
0

==> cgroup.max.depth <==
max

==> cgroup.max.descendants <==
max

==> cgroup.procs <==
1528
1552
1559
1599
1603
1677

==> cgroup.stat <==
nr_descendants 0
nr_dying_descendants 0

==> cgroup.subtree_control <==

==> cgroup.threads <==
1528
1552
1559
1599
1603
1677

==> cgroup.type <==
domain

@kolyshkin
Contributor Author

I suspect this has to do with systemd, but I don't understand how (and so, how to fix it).

@kolyshkin
Contributor Author

This has been partially addressed by the last commit in #2299, but the issue is still there.

@kolyshkin
Contributor Author

OK, the reason for EBUSY is that the cgroup we're trying to create a sub-cgroup in already has processes in it.

From kernel/cgroup/cgroup.c:

        /*
         * Controllers can't be enabled for a cgroup with tasks to avoid
         * child cgroups competing against tasks.
         */
        if (cgroup_has_tasks(cgrp))
                return -EBUSY;

The reason for EOPNOTSUPP is similar (see cgroup_vet_subtree_control_enable() in the kernel sources).

The fix is coming.

kolyshkin added a commit to kolyshkin/runc that referenced this issue Apr 9, 2020
When the cgroupv2 fs driver is used without setting cgroupsPath,
it picks up a path from /proc/self/cgroup. On a host with systemd,
such a path can look like (examples from my machines):

 - /user.slice/user-1000.slice/session-4.scope
 - /user.slice/user-1000.slice/user@1000.service/gnome-launched-xfce4-terminal.desktop-4260.scope
 - /user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service

This cgroup already contains processes, which prevents enabling
controllers for a sub-cgroup (writing to cgroup.subtree_control fails
with EBUSY or EOPNOTSUPP).

Obviously, a parent cgroup (which does not contain tasks) should be used.

Fixes opencontainers/issues/2298

Signed-off-by: Kir Kolyshkin <[email protected]>
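
One way to picture the idea in the commit message above is to walk up from the process's own cgroup until an ancestor without processes is found. The Go sketch below is only an illustration under that assumption and is not necessarily what the fix in #2305 does; firstEmptyAncestor and the hasProcs callback are hypothetical names, and the "busy" set in main is made up for the demo.

```go
package main

import (
	"errors"
	"fmt"
	"path"
)

// firstEmptyAncestor (hypothetical helper) walks up from cgPath and returns
// the nearest ancestor for which hasProcs reports false. hasProcs is supplied
// by the caller; it could, for example, read that cgroup's cgroup.procs file.
func firstEmptyAncestor(cgPath string, hasProcs func(string) (bool, error)) (string, error) {
	p := path.Clean(cgPath)
	for p != "/" && p != "." {
		p = path.Dir(p)
		busy, err := hasProcs(p)
		if err != nil {
			return "", err
		}
		if !busy {
			return p, nil
		}
	}
	return "", errors.New("no ancestor cgroup without processes")
}

func main() {
	// Made-up state: only the session scope itself contains processes.
	busy := map[string]bool{
		"/user.slice/user-1000.slice/session-4.scope": true,
	}
	hasProcs := func(p string) (bool, error) { return busy[p], nil }

	parent, err := firstEmptyAncestor("/user.slice/user-1000.slice/session-4.scope", hasProcs)
	if err != nil {
		panic(err)
	}
	fmt.Println(parent)
}
```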
kolyshkin added a commit to kolyshkin/runc that referenced this issue Apr 14, 2020
dims pushed a commit to dims/libcontainer that referenced this issue Oct 19, 2024
kolyshkin added a commit to kolyshkin/containerd-cgroups that referenced this issue Nov 6, 2024