systemctl daemon-reload causes cpu.cfs_quota_us of existing containers to be set to a wrong value #1605

Open
goyalankit opened this issue Oct 6, 2017 · 10 comments


@goyalankit

goyalankit commented Oct 6, 2017

In the current version of runc, when using --systemd-cgroup, the cpu.cfs_quota_us of an existing container gets reset to an incorrect value after a systemctl daemon-reload followed by the creation of a new container.


Steps to reproduce

Relevant parts of config.json (full config.json attached as a file)

{
  ...
  "linux": {
    "cgroupsPath": "custom.slice:custom:app_1",
    "resources": {
      "cpu": {
        "period": 200000,
        "shares": 1024,
        "quota": 200000
      },
  ...
}
// Create a container with above config
sh-4.2# runc --systemd-cgroup create container_1

// Check the cpu.cfs_quota_us (looks good)
sh-4.2# cat /sys/fs/cgroup/cpu/custom.slice/custom-app_1.scope/cpu.cfs_quota_us
200000

// Do a daemon reload
sh-4.2# systemctl daemon-reload

// Create another container (after changing the cgroupsPath in config.json)
sh-4.2# runc --systemd-cgroup create container_2

// Check the cpu.cfs_quota_us of app1, it gets set to a different value
sh-4.2# cat /sys/fs/cgroup/cpu/custom.slice/custom-app_1.scope/cpu.cfs_quota_us
100000

// Quota for new app looks good though.
sh-4.2# cat /sys/fs/cgroup/cpu/custom.slice/custom-app_2.scope/cpu.cfs_quota_us
200000
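
The manual check above can be scripted so that drift is easier to spot. This is a minimal sketch and not part of runc: `scope_dir` and `cpu_limit_percent` are hypothetical helper names, and the `slice:prefix:name` to `slice/prefix-name.scope` mapping is inferred only from the paths shown in this report. A cgroup v1 hierarchy mounted at /sys/fs/cgroup/cpu is assumed.

```shell
# Hypothetical helpers for watching cpu.cfs_quota_us drift; not part of runc.

# Map a systemd-style cgroupsPath ("slice:prefix:name") to the scope directory
# seen in this report, e.g.
#   custom.slice:custom:app_1 -> custom.slice/custom-app_1.scope
scope_dir() {
  sd_slice=${1%%:*}        # text before the first ':'
  sd_rest=${1#*:}          # everything after the first ':'
  sd_prefix=${sd_rest%%:*}
  sd_name=${sd_rest#*:}
  printf '%s/%s-%s.scope\n' "$sd_slice" "$sd_prefix" "$sd_name"
}

# Print the CPU limit as a percentage of one CPU (quota * 100 / period).
# $1 is the cgroup directory; a negative quota means "no limit".
cpu_limit_percent() {
  cl_quota=$(cat "$1/cpu.cfs_quota_us")
  cl_period=$(cat "$1/cpu.cfs_period_us")
  if [ "$cl_quota" -lt 0 ]; then
    echo unlimited
  else
    echo $(( cl_quota * 100 / cl_period ))
  fi
}
```

For example, `cpu_limit_percent "/sys/fs/cgroup/cpu/$(scope_dir custom.slice:custom:app_1)"` would print 100 for the configured quota and period of 200000 each. If the period is still 200000 after the reload (the report does not show it), the observed quota of 100000 corresponds to 50.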

Runc version:

# runc --version
runc version 1.0.0-rc4+dev
commit: 0351df1c5a66838d0c392b4ac4cf9450de844e2d
spec: 1.0.0

Same issue with rc3, rc4 and head versions.

Config.json:
https://gist.github.com/goyalankit/4382be41c66579ad7833a983e634aa32#file-config-json

@goyalankit changed the title from "systemd-reload causes cpu.cfs_quota_us of existing containers to be set to a wrong value" to "systemctl daemon-reload causes cpu.cfs_quota_us of existing containers to be set to a wrong value" on Oct 6, 2017
@cyphar
Member

cyphar commented Oct 7, 2017

As much as you're not going to like this response, this is likely caused by using --systemd-cgroup. Usually the default cgroup driver (cgroupfs) is more stable because we don't tell systemd about our container processes -- we've had a very large number of bugs in the past caused by systemd trying (and failing) to be clever.

@goyalankit
Author

Right, it works fine without the --systemd-cgroup.

However, it's a good feature to have since it gives a clean way to put all runc containers in their own separate slice without moving processes around different cgroups. I guess the workaround for now is to move the process running the runc command to custom.slice before creating containers.
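
One way the workaround could look, as a rough sketch only: on cgroup v1 a process is moved into a cgroup by writing its PID to that cgroup's cgroup.procs file. `join_cgroup` is a hypothetical helper name. The sketch assumes the default cgroupfs driver and a config.json whose cgroupsPath is relative (or unset), so that the container's cgroup ends up beneath the cgroup of the process that runs runc. It would need to be repeated for every controller hierarchy of interest.

```shell
# Hypothetical helper: move a process into a cgroup (cgroup v1).
# $1: cgroup directory, e.g. /sys/fs/cgroup/cpu/custom.slice
# $2: PID to move (defaults to the current shell)
join_cgroup() {
  # On cgroupfs, mkdir creates the cgroup if it does not exist yet.
  mkdir -p "$1" && printf '%s\n' "${2:-$$}" > "$1/cgroup.procs"
}
```

Usage would be along the lines of `join_cgroup /sys/fs/cgroup/cpu/custom.slice` followed by `runc create container_1` from the same shell.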

@cyphar
Member

cyphar commented Oct 7, 2017

You could try explicitly stating the full slice path in cgroupsPath. What version of systemd are you using, and does it have Delegate support?

@goyalankit
Author

goyalankit commented Oct 8, 2017

sh-4.2# systemctl --version
systemd 219

Do you mean something like this?

 "cgroupsPath": "/sys/fs/cgroup/cpu/custom2.slice/app_1.scope",

The above config change behaves oddly: runc created the container successfully, but I don't see the cgroup being created.

// Changed the config with a full cgroups path
sh-4.2# runc create container_1

// list the container to get the PID
sh-4.2# runc list
ID            PID         STATUS      BUNDLE                             CREATED                          OWNER
container_1   20604       created     runc_trunk   2017-10-07T23:56:38.950953923Z   root

// Check that cgroup path
sh-4.2# cat /proc/20604/cgroup
5:cpuacct,cpu:/sys/fs/cgroup/cpu/custom.slice/app_1.scope
4:memory:/sys/fs/cgroup/cpu/custom.slice/app_1.scope
3:blkio:/sys/fs/cgroup/cpu/custom.slice/app_1.scope
1:name=systemd:/sys/fs/cgroup/cpu/custom.slice/app_1.scope
...

sh-4.2# ls /sys/fs/cgroup/cpu/custom.slice/app_1.scope
ls: cannot access /sys/fs/cgroup/cpu/custom.slice/app_1.scope: No such file or directory

The cgroup doesn't exist, but /proc says it does. Not sure what happened there 😱

@goyalankit
Author

So I tried giving this as the path, and that works:

"cgroupsPath": "/custom.slice/app_1.scope"

So that should provide the same behavior as using --systemd-cgroup flag.

Thanks!

@cyphar
Member

cyphar commented Oct 12, 2017

@goyalankit The behavior is slightly different (with --systemd-cgroup we actually create a TransientUnit with systemd, and systemd doesn't always put processes in the cgroup that you tell it to) but it should work in the same way. But yes, you don't pass /sys/fs/cgroup, you just pass the name inside the cgroup hierarchy.
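
To illustrate the last point, here is a small sketch that turns a filesystem path under the cgroup v1 mount into the in-hierarchy path that cgroupsPath expects. `to_cgroups_path` is a hypothetical helper, and the /sys/fs/cgroup/<controller>/ layout is assumed.

```shell
# Hypothetical helper: strip the mount point and controller name, e.g.
#   /sys/fs/cgroup/cpu/custom.slice/app_1.scope -> /custom.slice/app_1.scope
# Paths that are not under /sys/fs/cgroup/<controller>/ are printed unchanged.
to_cgroups_path() {
  case "$1" in
    /sys/fs/cgroup/*/*)
      tc_rest=${1#/sys/fs/cgroup/}     # drop the mount point
      printf '/%s\n' "${tc_rest#*/}"   # drop the controller name
      ;;
    *)
      printf '%s\n' "$1"
      ;;
  esac
}
```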

@runcom
Member

runcom commented Oct 12, 2017

Do we know why this is happening though? I can reproduce this on RHEL 7.4 as well 😕

@cyphar
Member

cyphar commented Oct 15, 2017

@runcom It's some kind of systemd bug. We've seen bugs like this before Delegate was implemented, and I'm a little surprised that it happens now. I would ask upstream systemd about this -- it looks like they're not obeying Delegate=true with TransientUnits. This is even more worrying because we tell systemd about cpu.cfs_quota_us which means that this is a breakage of something systemd is supposed to support (so even without Delegate it still shouldn't break!).
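
Before reporting upstream, it may help to record what systemd itself believes about the transient unit. The sketch below assumes the unit is named custom-app_1.scope (inferred from the cgroup path in the report) and that this systemd version exposes the Delegate and CPUQuotaPerSecUSec properties through `systemctl show`. `unit_property` is a hypothetical helper.

```shell
# Hypothetical helper: read "Key=Value" lines (as printed by `systemctl show`)
# on stdin and print the value of the property named in $1.
# Prints nothing if the property is absent.
unit_property() {
  sed -n "s/^$1=//p"
}

# Example (unit name is an assumption, see above):
#   systemctl show -p Delegate custom-app_1.scope | unit_property Delegate
#   systemctl show -p CPUQuotaPerSecUSec custom-app_1.scope | unit_property CPUQuotaPerSecUSec
```

Comparing these values before and after systemctl daemon-reload would show whether systemd's view of the unit changes, or only the value written to cpu.cfs_quota_us.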

*shouts at monitor* This really is getting quite tiring. Maybe we should stop supporting --systemd-cgroup -- it's caused us nothing but trouble.

@runcom
Member

runcom commented Oct 15, 2017

@rhatdan ptal, same issue we're having

@rhatdan
Contributor

rhatdan commented Oct 16, 2017

Uli has done a lot of analysis on this issue in Bugzilla; see https://bugzilla.redhat.com/show_bug.cgi?id=1455071 Does this help at all?


4 participants