-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
drivers/exec: Fix handling of capabilities for unprivileged tasks #16643
Conversation
Currently, the `exec` driver is only setting the Bounding set, which is not sufficient to actually enable the requisite capabilities for the task process. In order for the capabilities to survive `execve` performed by libcontainer, the `Permitted`, `Inheritable`, and `Ambient` sets must also be set. Per CAPABILITIES (7): > Ambient: This is a set of capabilities that are preserved across an > execve(2) of a program that is not privileged. The ambient capability > set obeys the invariant that no capability can ever be ambient if it > is not both permitted and inheritable. Fixes: hashicorp#16642
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great @elprans! I was worried we'd have found a vuln here but it looks like even in cases where we did something like cap_drop = ["all"]
we're not getting any capabilities needed to actually do anything with it. So these additional caps have just been plain broken forever.
If you run make cl
it'll create a changelog entry file (the description should be phrased something like "driver/exec: Fixed a bug where cap_drop
and cap_add
would not expand capabilities". I've also left a comment about expanding the test so that we know we've got the right values when the task config is set.
CapInh: 0000000000000400 | ||
CapPrm: 0000000000000400 | ||
CapEff: 0000000000000400 | ||
CapBnd: 0000000000000400 | ||
CapAmb: 0000000000000400`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was doing a little testing locally without Nomad to compare this approach to what Docker does, and I was a bit surprised to see they seem to have the same problem we do!
$ docker run -it \
--user nobody \
--sysctl net.ipv4.ip_unprivileged_port_start=1024 \
--cap-drop ALL \
--cap-add NET_BIND_SERVICE \
busybox:1 nc -lvp 443
nc: bind: Permission denied
When I dug into that, I found that it looks like they decided not to fix this moby/moby#38664 moby/moby#8460 in lieu of telling folks to set the net.ipv4.ip_unprivileged_port_start
, which is obviously not the only capability that people want (Docker sets it to 0 by default!). I think we're doing the right thing here but I want to pull in @picatz and @shoenig for a consultation on this just to make sure I'm not missing something obvious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which is obviously not the only capability that people want
Yes, another example of where this issue hits us: we sometimes need to use a template
to generate a wrapper shell script over an artifact
-downloaded binary. There is no way to specify the desired file mode in the artifact
stanza, so the binary ends up without an executable bit set. Consequently, without a working cap_add = ["fowner"]
the exec
driver is effectively broken for this use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened #16692 with some notes about how this same issue impacts the docker
driver and can't be fixed because the problem is upstream, which will result in some docs updates. I'm going to give this one more pass and make sure it works as I'd expect with a restrictive allow_caps
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, the
exec
driver is only setting the Bounding set, which isnot sufficient to actually enable the requisite capabilities for the
task process. In order for the capabilities to survive
execve
performed by libcontainer, the
Permitted
,Inheritable
, andAmbient
sets must also be set.
Per CAPABILITIES (7):
Fixes: #16642