Make use of systemd delegate cgroup when possible. #18211
Comments
Another challenge with setting delegation is that we have a |
This was the bit from that document:
In an experiment I'm hacking on, turns out this doesn't really matter because we're not creating a "slice unit", we're just creating our own cgroup directory that happens to be called "slice". If we did want to create a slice unit in the package (which we will if we want to allow less-privileged Nomad agents), we'd instead want to have 2 |
Nomad clients manage a cpuset cgroup for each task to reserve or share CPU cores. But Docker owns its own cgroups, and attempting to set a parent cgroup that Nomad manages conflicts with how runc manages cgroups via systemd. Therefore Nomad must run as root for cpuset management to be compatible with Docker.

However, some users running in unsupported configurations felt that the changes we made in Nomad 1.7.0 to ensure Nomad was running correctly represented a regression. This changeset disables cpuset management for non-root Nomad clients. When running Nomad as non-root, the driver will no longer reconcile cpusets with Nomad and `resources.cores` will behave incorrectly (but the driver will still run).

Although this is one small step along the way to supporting a rootless Nomad client, running Nomad as non-root is still unsupported. This PR is insufficient by itself to have a secure and properly-working rootless Nomad client.

Ref: #18211
Ref: #13669
Ref: https://hashicorp.atlassian.net/browse/NET-10652
Ref: https://github.com/opencontainers/runc/blob/main/docs/systemd.md
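The root-only gate described in this changeset can be pictured with a minimal Go sketch. The function names `manageCpusets` and `currentProcessManagesCpusets` are hypothetical and are not taken from the Nomad codebase; the only assumption is that "running as root" means an effective UID of 0.

```go
package main

import "os"

// manageCpusets reports whether cpuset reconciliation should run for a
// process with the given effective UID. Only root (UID 0) qualifies.
// Hypothetical helper for illustration, not Nomad's actual code.
func manageCpusets(euid int) bool {
	return euid == 0
}

// currentProcessManagesCpusets applies the check to the running process.
func currentProcessManagesCpusets() bool {
	return manageCpusets(os.Geteuid())
}
```

Passing the UID as a parameter keeps the decision testable without actually running as root.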
During Nomad client initialization with cgroups v2, we assert that the required cgroup controllers are available in the root `cgroup.subtree_control` file by idempotently writing to the file. But if Nomad is running with delegated cgroups, this write fails file permission checks even if the subtree control file already has the controllers we need.

Update the initialization to first check whether any controllers are missing before attempting to write them. This allows cgroup delegation so long as the cluster administrator has pre-created a Nomad-owned cgroups tree and set the `Delegate` option in a systemd override. If not, initialization fails in the existing way.

Although this is one small step along the way to supporting a rootless Nomad client, running Nomad as non-root is still unsupported. I've intentionally not documented setting up cgroup delegation in this PR, as this PR is insufficient by itself to have a secure and properly-working rootless Nomad client.

Ref: #18211
Ref: #13669
Per https://systemd.io/CGROUP_DELEGATION/ (at the bottom)
Currently Nomad does exactly this: it creates the `nomad.slice` cgroup under the root cgroup regardless of whether systemd is in use. We should modify our Linux packaging to set the `Delegate` option in the systemd unit file so that we are in line with the expected usage of systemd.

However, we'll need to continue supporting the mode of operation we have today: not all Linux operating systems use systemd (and thus have no delegate mechanism), and not all users use our Linux packaging. We'll also want to update our production documentation to make recommendations for such users.