Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cgroupv2: don't enable threaded mode by default #2390

Merged
merged 1 commit into from
May 13, 2020

Conversation

lifubang
Copy link
Member

@lifubang lifubang commented May 8, 2020

After this commit: 60c647e
Runc enable threaded mode in cgroup v2 by default.
If the cgroupPath is set to a absolute path like /docker/********, the memory subsystem can't be used by this mode.
So, I think we should use domain mode by default.

Signed-off-by: lifubang [email protected]

@lifubang lifubang changed the title cgroupv2: only enable threaded mode when rootless cgroupv2: enable threaded mode only in rootless May 8, 2020
@lifubang
Copy link
Member Author

lifubang commented May 8, 2020

@AkihiroSuda PTAL

@cyphar
Copy link
Member

cyphar commented May 8, 2020

I'm not sure I understand why we're putting cgroupv2 into threaded mode -- @AkihiroSuda is this to work around some odd permission issue (and why does it fix an EPERM)? The description in 60c647e and #2340 is pretty sparse.

// Otherwise ENOTSUP may happen.
cgType := filepath.Join(current, "cgroup.type")
_ = ioutil.WriteFile(cgType, []byte("threaded"), 0644)
if rootless {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cgType := filepath.Join(current, "cgroup.type")
cgTypeB, _ := ioutil.ReadFile(cgType)
if strings.TrimSpace(string(cgTypeB)) == "domain invalid" {
  _ = ioutil.WriteFile(cgType, []byte("threaded"), 0644)
}

@AkihiroSuda
Copy link
Member

@cyphar
Writing threaded to cgroup.type is required when cgroup.type becomes domain invalid, mostly in nested environment

$ sudo podman run -it --rm --privileged --runtime=crun alpine
/ # cd /sys/fs/cgroup/
/sys/fs/cgroup # cat cgroup.controllers 
cpuset cpu io memory pids
/sys/fs/cgroup # cat cgroup.subtree_control 
/sys/fs/cgroup # echo +cpu > cgroup.subtree_control
/sys/fs/cgroup # mkdir foo
/sys/fs/cgroup # cat foo/cgroup.type
domain invalid

@cyphar
Copy link
Member

cyphar commented May 9, 2020

Okay, but "domain invalid" means that the cgroup is in an invalid state (meaning that one of the cgroup rules has been violated -- most likely the no-internal-processes rule). Putting a cgroup into threaded mode doesn't fix that -- it switches it into an alternative mode which only allows controllers which are thread-aware to be enabled (such as cpu). This is why we can't enable the memory controller -- it isn't thread-aware.

IMHO, a more complete solution would be to figure out how to deal with the parent cgroup having child processes (which is a bit dodgy if we're going to move other programs on the system between cgroups) or to simply give an error if we hit domain invalid and explain what the issue is.

@lifubang
Copy link
Member Author

lifubang commented May 9, 2020

or to simply give an error if we hit domain invalid and explain what the issue is.

Personally, I prefer this one. @AkihiroSuda WDYT? Because in man7 cgroups:

2. "Internal" processes are not permitted.  With the exception of the
          root cgroup, processes may reside only in leaf nodes (cgroups that
          do not themselves contain child cgroups).  The details are
          somewhat more subtle than this, and are described below.
Cgroups v2 "no internal processes" rule
       Cgroups v2 enforces a so-called "no internal processes" rule.
       Roughly speaking, this rule means that, with the exception of the
       root cgroup, processes may reside only in leaf nodes (cgroups that do
       not themselves contain child cgroups).  This avoids the need to
       decide how to partition resources between processes which are members
       of cgroup A and processes in child cgroups of A.

       For instance, if cgroup /cg1/cg2 exists, then a process may reside in
       /cg1/cg2, but not in /cg1.  This is to avoid an ambiguity in cgroups
       v1 with respect to the delegation of resources between processes in
       /cg1 and its child cgroups.  The recommended approach in cgroups v2
       is to create a subdirectory called leaf for any nonleaf cgroup which
       should contain processes, but no child cgroups.  Thus, processes
       which previously would have gone into /cg1 would now go into
       /cg1/leaf.  This has the advantage of making explicit the relation‐
       ship between processes in /cg1/leaf and /cg1's other children.

       The "no internal processes" rule is in fact more subtle than stated
       above.  More precisely, the rule is that a (nonroot) cgroup can't
       both (1) have member processes, and (2) distribute resources into
       child cgroups—that is, have a nonempty cgroup.subtree_control file.
       Thus, it is possible for a cgroup to have both member processes and
       child cgroups, but before controllers can be enabled for that cgroup,
       the member processes must be moved out of the cgroup (e.g., perhaps
       into the child cgroups).

       With the Linux 4.14 addition of "thread mode" (described below), the
       "no internal processes" rule has been relaxed in some cases.

@AkihiroSuda
Copy link
Member

SGTM, but when no domain controller is enabled, we can write "threaded" without retuning error

@lifubang lifubang force-pushed the threadedordomain branch 2 times, most recently from fd7d8a4 to eea6058 Compare May 10, 2020 03:29
@lifubang lifubang changed the title cgroupv2: enable threaded mode only in rootless cgroupv2: don't enable threaded mode by default May 10, 2020
@lifubang lifubang force-pushed the threadedordomain branch 2 times, most recently from 106fdd8 to 7ca4d5a Compare May 10, 2020 05:32
if strings.TrimSpace(string(cgType)) == "domain invalid" {
cgTypeParentFile := filepath.Join(current, "../cgroup.type")
cgTypeParent, _ := ioutil.ReadFile(cgTypeParentFile)
if bytes.HasPrefix(cgTypeParent, []byte("domain")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC we don't need to check parent, we just need to check whether the current config contains domain controller

@lifubang lifubang force-pushed the threadedordomain branch 2 times, most recently from 1e97975 to 757bb65 Compare May 10, 2020 15:52
@lifubang
Copy link
Member Author

lifubang commented May 10, 2020

echo threaded > "$CGROUP_MOUNT/$CGROUP_PATH/cgroup.type"

I think rootless doesn't need threaded mode in default?
I don't know whether my opinion is right or not. @AkihiroSuda

@AkihiroSuda
Copy link
Member

Let's check invalid cgroup.type and set threaded conditionally

echo threaded > "$CGROUP_MOUNT/$CGROUP_PATH/cgroup.type"
if grep -qw invalid "$CGROUP_MOUNT/$CGROUP_PATH/cgroup.type"; then
echo threaded > "$CGROUP_MOUNT/$CGROUP_PATH/cgroup.type"
fi
# Make sure cgroup.type doesn't contain "invalid". Otherwise write ops will fail with ENOTSUP.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop L106-L110 now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lifubang ^^^

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been removed before LGTM.
I think we should keep these comments to let other people know why we need to write threaded to cgroups.type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the echo threaded > still needed? Seems a bit odd to run the entire test suite under threaded.

@AkihiroSuda
Copy link
Member

AkihiroSuda commented May 10, 2020

LGTM

Approved with PullApprove

Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few minor fixes, otherwise good

@AkihiroSuda
Copy link
Member

AkihiroSuda commented May 11, 2020

LGTM

Approved with PullApprove

@AkihiroSuda
Copy link
Member

@kolyshkin LGTY?

tests/rootless.sh Outdated Show resolved Hide resolved
Because in threaded mode, we can't enable the memory controller -- it isn't thread-aware.

Signed-off-by: lifubang <[email protected]>
@cyphar
Copy link
Member

cyphar commented May 13, 2020

LGTM.

Approved with PullApprove

@giuseppe
Copy link
Member

@AkihiroSuda @kolyshkin PTAL

@kolyshkin
Copy link
Contributor

kolyshkin commented May 13, 2020

LGTM

Approved with PullApprove

@kolyshkin kolyshkin merged commit 85d4264 into opencontainers:master May 13, 2020
@lifubang lifubang deleted the threadedordomain branch May 28, 2020 01:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants