Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runc delete -f: fix for cg v1 + paused container #3217

Merged
merged 1 commit into from
Sep 20, 2021

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Sep 15, 2021

runc delete -f is not working for a paused container, since in cgroup v1
SIGKILL does nothing if a process is frozen (unlike cgroup v2, in which
you can kill a frozen process with a fatal signal).

Theoretically, we only need this for v1, but doing it for v2 as well is
OK.

Proposed changelog entry

Bugfixes:
* runc delete -f now succeeds (rather than timeouts) on a paused container (#3134)

Fixes: #3134

runc delete -f is not working for a paused container, since in cgroup v1
SIGKILL does nothing if a process is frozen (unlike cgroup v2, in which
you can kill a frozen process with a fatal signal).

Theoretically, we only need this for v1, but doing it for v2 as well is
OK.

Signed-off-by: Kir Kolyshkin <[email protected]>
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

if s, ok := s.(unix.Signal); ok && s == unix.SIGKILL {
_ = c.cgroupManager.Freeze(configs.Thawed)
}
}
return nil
}
return ErrNotRunning
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't tried, but what happens currently if -f is used and it's not running? (thinking: -f should ignore the case and just proceed?) I see we have a special case for all

Copy link
Member

@cyphar cyphar Sep 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for not allowing kill even if you force it is that on paper you could trick runc to kill an arbitrary process. I think an actual attack that uses this is a bit theoretical but shrug. You can't signal a process that doesn't exist.

Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

// For cgroup v1, killing a process in a frozen cgroup
// does nothing until it's thawed. Only thaw the cgroup
// for SIGKILL.
if s, ok := s.(unix.Signal); ok && s == unix.SIGKILL {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would've been nice to check if the cgroup manager is v1 or v2 here, but still LGTM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are going to use cgroup.kill for v2 and hybrid hierarchies, this code is just to fix the issue

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I get that -- my point is more that it would've been nice if we didn't thaw the cgroup unless we really had to.

@kolyshkin kolyshkin merged commit 03244ef into opencontainers:master Sep 20, 2021
@kolyshkin
Copy link
Contributor Author

Fixes: #3134

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

runc delete -f fails to remove a paused container on cgroup v1
3 participants