Change default cgroup driver to systemd and verify parity w/docker on preflight #1394

Closed
timothysc opened this issue Feb 7, 2019 · 15 comments · Fixed by kubernetes/website#12638
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.
Milestone

Comments

@timothysc
Member

timothysc commented Feb 7, 2019

Default installs of docker still use cgroupfs, and most of our supported user base is on systemd systems. We should change the defaults and update the instructions.

New Installs:
1 - Update the instructions for installation of docker to use --exec-opt native.cgroupdriver=systemd
2 - Verify that the kubelet.service file passes the systemd cgroup driver flag (--cgroup-driver=systemd) to the kubelet.

Upgrades:
TBD

Preflight:
1 - If the system runs systemd, verify that the docker flags are correct; if they are not, start with a warning
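
A minimal sketch of what such a warning-only check could look like, written as a shell function. This is not kubeadm's actual preflight code; the function name `check_cgroup_driver_parity` and its three arguments are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper, not kubeadm's real preflight check.
# Arguments: init process name, docker cgroup driver, kubelet cgroup driver.
# Prints a warning to stderr on a mismatch and always returns 0 (warning only).
check_cgroup_driver_parity() {
  local init_name="$1"
  local docker_driver="$2"
  local kubelet_driver="$3"

  # The check only applies to hosts where systemd is the init system.
  if [ "$init_name" != "systemd" ]; then
    return 0
  fi

  if [ "$docker_driver" != "systemd" ] || [ "$kubelet_driver" != "systemd" ]; then
    echo "[WARNING] host runs systemd, but docker uses '$docker_driver' and kubelet uses '$kubelet_driver' as the cgroup driver" >&2
  fi
  return 0
}

# Possible wiring on a real host (assumes docker is installed and running):
#   init_name="$(ps -p 1 -o comm=)"
#   docker_driver="$(docker info --format '{{.CgroupDriver}}')"
#   check_cgroup_driver_parity "$init_name" "$docker_driver" "systemd"
```

Keeping the comparison separate from the commands that collect the values makes the logic easy to exercise without a running docker daemon.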

@timothysc timothysc added kind/bug Categorizes issue or PR as related to a bug. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Feb 7, 2019
@timothysc timothysc added this to the v1.14 milestone Feb 7, 2019
@neolit123
Member

we already have docs for setting systemd as the driver for both ubuntu and centos:
https://kubernetes.io/docs/setup/cri/

i think this preflight check should be warning only.

@rosti

rosti commented Feb 7, 2019

Actually, there is no mention there of what we do or why we do it.

# Setup daemon.
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
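
As a rough sketch of how one might confirm that a daemon.json like the one above selects the systemd driver, the helper below greps the file for the exec-opt. The function name `docker_daemon_uses_systemd` is hypothetical, and a plain grep is only an approximation; a JSON-aware tool would be more robust.

```shell
# Hypothetical helper: succeeds if the given daemon.json contains the
# systemd cgroup driver exec-opt. Defaults to /etc/docker/daemon.json.
docker_daemon_uses_systemd() {
  local config="${1:-/etc/docker/daemon.json}"
  # -F treats the pattern as a fixed string, -q suppresses output.
  [ -f "$config" ] && grep -qF '"native.cgroupdriver=systemd"' "$config"
}

# After changing the file, docker has to be restarted before the running
# daemon picks up the setting, for example:
#   systemctl restart docker
#   docker info --format '{{.CgroupDriver}}'
```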

@neolit123
Member

yes, we need an expert to write a paragraph on why it's really needed and what could break.

@timothysc timothysc removed the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Feb 7, 2019
@neolit123 neolit123 added the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label Feb 8, 2019
@neolit123
Member

/assign @mauilion
to drop us a quick write up why the users should match "systemd" everywhere.
i can send a docs PR for that.

@k8s-ci-robot
Contributor

@neolit123: GitHub didn't allow me to assign the following users: mauilion.

Note that only kubernetes members and repo collaborators can be assigned and that issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @mauilion
to drop us a quick write up why the users should match "systemd" everywhere.
i can send a docs PR for that.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mauilion

mauilion commented Feb 12, 2019

When systemd is chosen as the init system for a Linux distribution, the init process generates and consumes a root cgroup and acts as a cgroup manager. Systemd has a tight integration with cgroups and will allocate a cgroup per systemd unit. While it's possible to configure docker and kubelet to use cgroupfs, this means that there will then be two different cgroup managers. At the end of the day, cgroups are used to allocate and constrain resources that are allocated to processes. A single cgroup manager will simplify the view of what resources are being allocated and will by default have a more consistent view of the resources available / in use. When we have two managers we end up with two views of those available resources. We have seen cases in the field where nodes that are configured to use cgroupfs for kubelet and docker and systemd for the rest can become unstable under resource pressure. Changing the settings such that docker and kubelet use systemd as the cgroup driver stabilized the systems.

An issue opened with systemd that discusses this at some length:
systemd/systemd#8645

@mauilion

mauilion commented Feb 13, 2019

/assign @timothysc please take a look

@timothysc
Member Author

lgtm, @mauilion did you want to make the PR?

@neolit123
Member

@timothysc @mauilion i will do the PR later tonight.

@neolit123
Member

/assign
/kind documentation

@vsxen

vsxen commented Apr 4, 2019

why use systemd?

the kubelet default is cgroupfs

--cgroup-driver string Driver that the kubelet uses to manipulate cgroups on the host. Possible values: 'cgroupfs', 'systemd' (default "cgroupfs")
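
To illustrate how that default interacts with the flag, here is a small sketch that works out the driver implied by a kubelet argument string. The function name `kubelet_cgroup_driver` is hypothetical, the fallback value is the default from the help text quoted above, and the sketch ignores anything set through a kubelet config file.

```shell
# Hypothetical helper: prints the cgroup driver implied by a kubelet
# argument string, falling back to the documented default "cgroupfs".
kubelet_cgroup_driver() {
  local args="$1"
  local driver="cgroupfs"
  local prev=""
  local arg
  # Intentionally unquoted so the string is split into individual flags.
  for arg in $args; do
    # Handles the "--cgroup-driver systemd" form (value in the next word).
    if [ "$prev" = "--cgroup-driver" ]; then
      driver="$arg"
    fi
    # Handles the "--cgroup-driver=systemd" form.
    case "$arg" in
      --cgroup-driver=*) driver="${arg#--cgroup-driver=}" ;;
    esac
    prev="$arg"
  done
  echo "$driver"
}

# Example: kubelet_cgroup_driver "--cgroup-driver=systemd --v=2"
```

So unless the flag is passed explicitly, a kubelet started this way ends up on cgroupfs even when the host and docker are on systemd.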

@neolit123
Member

please read here:
#1394 (comment)

@ssup2

ssup2 commented May 12, 2020

@mauilion Thanks for the information about the cgroup driver. But I have a question about your statement: "We have seen cases in the field where nodes that are configured to use cgroupfs for kubelet and docker and systemd for the rest can become unstable under resource pressure."

Why is using both unstable? I think that, from the kernel's point of view, a cgroup created using the cgroupfs driver looks the same as a cgroup created by systemd. Are there any keywords I can use to track this issue? Or could you share the history or status of this issue? Thanks

@itkroplis

itkroplis commented May 17, 2020

we already have docs for setting systemd as the driver for both ubuntu and centos:
https://kubernetes.io/docs/setup/cri/

i think this preflight check should be warning only.

And what are the steps for Ubuntu 20? I looked at this page, but nothing I do works for me.
Maybe this FAQ only works for a new installation, where dockerd isn't installed yet?

@b10s

b10s commented Dec 21, 2020

@mauilion great explanation! Everything seems correct except the word "view" in the following part:

A single cgroup manager will simplify the view of what resources are being allocated and will by default have a more consistent view of the resources available / in use. When we have two managers we end up with two views of those available resources

If we replace "view" with "read", it becomes clearer why this is not fully correct.

The problem is that the cgroup manager is not only a reader but also a writer.
If my understanding is correct : )

By writing a cgroup such as kubepods under /sys/fs/cgroup/systemd/, we confuse systemd.

Therefore, allowing users to run the cgroupfs driver on systemd machines should not merely be discouraged; in my humble opinion, it should be prohibited by kubelet with a preflight check.
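
One rough way to tell which driver wrote a given pod cgroup is the shape of its path. The sketch below assumes the commonly observed layouts, `kubepods.slice` units for the systemd driver and a plain `/kubepods` directory for cgroupfs; both the function name `cgroup_path_driver` and those naming patterns are assumptions for illustration, not something stated in this thread.

```shell
# Hypothetical helper: guesses the cgroup driver from a cgroup path,
# such as the path column of /proc/<pid>/cgroup.
cgroup_path_driver() {
  case "$1" in
    # Assumed systemd-driver layout: slice units under kubepods.slice.
    *kubepods.slice*) echo "systemd" ;;
    # Assumed cgroupfs-driver layout: a plain kubepods directory.
    /kubepods|/kubepods/*) echo "cgroupfs" ;;
    *) echo "unknown" ;;
  esac
}

# Example: cgroup_path_driver "/kubepods/burstable/podabc/123"
```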
