[RFC] runc exec --cgroup #3040
Comments
The use case for this is DPDK applications that fully occupy a subset of the cpuset configured for the main container and can't tolerate k8s probes (runc exec) running on those CPUs.
How will this relate to subtree_controllers?
I think it should be orthogonal (except for the "runtime-spec will add an ability to define sub-cgroups" item which is actually not part of this proposal, and only given as an example). The |
How will this work with cgroup v2? Processes can only live in leaf nodes, so do we need to ensure the existing processes are moved to a sibling cgroup first?
I don't think it's a runc task (unless the sub-cgroups are created by runc itself, but that's not what this RFC proposes). Currently if runc exec process fails to join the top-level container cgroup, it retries with the cgroup of the container init. I guess this fallback should be disabled when |
An initial implementation is available (#3059). I will work on more tests next week, but it's good enough to take a look at.
So does it expect the cgroupfs to be mounted writable? Otherwise how would the container process live in a separate sub-cgroup?
Yes (although in the future we might implement sub-cgroup support in the runtime spec, in which case runc will pre-create those sub-cgroups, and a writable cgroupfs won't be needed if the cgroup tree is static).
Obviously the use case of that (with read-only cgroupfs) would be limited to putting container init in a non-top cgroup, and |
Any container can have sub-cgroups
This is a proposal to add a feature to have `runc exec` executed in a sub-cgroup of a container, rather than in its top-level cgroup as happens now (except for cgroup v2, which has a fallback to join container init's cgroup).

For example, passing `--cgroup foo/bar` when exec'ing `cmd args` in container `CID` will run `cmd args` inside `CID`, putting `cmd` in the container's `foo/bar` cgroup, relative to the container's top-level cgroup.

Obviously, the default value for `--cgroup` is `/`, which is how it works now.

In a similar manner, runtime-spec's `Process` structure needs to gain a `Cgroup` field with the same meaning as runc exec's `--cgroup`. For container init it doesn't make sense to have `Cgroup` set to any value other than `/`. For other execs, it can be changed.

One other implementation detail: I guess `runc exec --cgroup` should NOT create the sub-cgroup if it does not exist, but return an error.
should NOT create the sub-cgroup if it does not exist, but return an error.The text was updated successfully, but these errors were encountered: