-
Notifications
You must be signed in to change notification settings - Fork 550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ambient and bounding capability support #675
Conversation
schema/config-schema.json
Outdated
"$ref": "defs-linux.json#/definitions/Capability" | ||
} | ||
"type": "object", | ||
"properties": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want to reformat this to get consistent indents (make -C schema fmt
).
schema/config-schema.json
Outdated
"properties": { | ||
"bounding": { | ||
"id": "https://opencontainers.org/schema/bundle/process/linux/capabilities/bounding", | ||
"$ref": "defs-linux.json#/definitions/Capability" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (and ambient
) should still be arrays. So:
"type": "array",
"items": {
"$ref": "defs-linux.json#/definitions/Capability"
}
instead of the line I'm commenting on.
config.md
Outdated
Valid values are the strings for capabilities defined in [the man page](http://man7.org/linux/man-pages/man7/capabilities.7.html). | ||
capabilities contains the following properties: | ||
* **`bounding`** (array of strings, OPTIONAL) - the 'bounding' field is the whitelist of bounding capabilities that are kept for the process. | ||
* **`ambient`** (array of strings, OPTIONAL) - the 'ambient' field is the whitelist of ambient capabilities that are kept for the process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent indent? And you can probably drop “the '…' field is the” and just start in:
ambient
(array of strings, OPTIONAL) A whitelist of …
d9d6641
to
7e6e147
Compare
config.md
Outdated
"CAP_NET_BIND_SERVICE" | ||
], | ||
"capabilities": { | ||
"bounding": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent indent (all the rest of the examples use four spaces).
LGTM |
While this looks good, we do manipulate the inherited, permitted and effective sets as well in runc. Should we make a statement about that or rename bounding to something else? |
@mrunalp I guess if we want to be uber explicit with the settings we can add all the fields but I don't think its totally useful for most containers. Everyone will set the additional fields the same anyways so keeping a bounding set makes things much more straightforward. |
I think I can imagine a situation where you want permitted but not effective or inherited, eg you want to allow some suid executables to work, but the capabilities to be only executed by those, not by the initial process. So I think its probably better to have all four sets, even if most applications just set e=i=p. |
@mrunalp what do you think? Should we add the rest and remove bounding? Anyone else have any input on this? |
@crosbymichael We need ambient and bounding for sure. If we add the other 3 sets as well then we will have to make a statement that the runtime will apply these to the init process before it execs the user specified process. The capabilities could change on the basis of being root or not, executed binary being setuid or not and having any filesystem capabilities. |
@mrunalp not much of a difference from today right? |
@crosbymichael Yes, that's what we do today. I think we can just add all the sets and leave it up to the user to set them correctly. For e.g. whatever is set in ambient has to be in both permitted and inherited. We can just point the user to the man page for capabilities. |
e4e08fc
to
c119f26
Compare
Ok, I updated this PR with the changes |
config.md
Outdated
* **`effective`** (array of strings, OPTIONAL) - the 'bounding' field is the whitelist of bounding capabilities that are kept for the process. | ||
* **`inheritable`** (array of strings, OPTIONAL) - the 'bounding' field is the whitelist of bounding capabilities that are kept for the process. | ||
* **`permitted`** (array of strings, OPTIONAL) - the 'bounding' field is the whitelist of bounding capabilities that are kept for the process. | ||
* **`ambient`** (array of strings, OPTIONAL) - the 'ambient' field is the whitelist of ambient capabilities that are kept for the process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need bounding as well.
359352a
to
b26f15e
Compare
config.md
Outdated
"CAP_AUDIT_WRITE", | ||
"CAP_KILL", | ||
], | ||
"ambient": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for correctness, we will have set permitted/inheritable as well to atleast the same caps as ambient. We could set p/i/e to same as bounding in this example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually ambient has a different cap so would need to include that as well in p/i and optionally in e as well.
5d7188d
to
84955e1
Compare
"capabilities": { | ||
"bounding": [ | ||
"CAP_AUDIT_WRITE", | ||
"CAP_KILL", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the example should have CAP_NET_BIND_SERVICE
in the bounding set or it cannot be applied as an ambient capability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CAP_NET_BIND_SERVICE
also needs to be added to permitted/inherited.
From capabilities(7)
The ambient capability set obeys the invariant that no capability can ever be ambient if it is not both permitted and inheritable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mrunalp does that mean we should or should not have it in bounding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@crosbymichael We should have it in bounding, permitted and inheritable.
@mrunalp fixed. |
config.md
Outdated
@@ -132,7 +132,13 @@ For Windows, see links for details about [mountvol](http://ss64.com/nt/mountvol. | |||
* **`env`** (array of strings, OPTIONAL) with the same semantics as [IEEE Std 1003.1-2001's `environ`][ieee-1003.1-2001-xbd-c8.1]. | |||
* **`args`** (array of strings, REQUIRED) with similar semantics to [IEEE Std 1003.1-2001 `execvp`'s *argv*][ieee-1003.1-2001-xsh-exec]. | |||
This specification extends the IEEE standard in that at least one entry is REQUIRED, and that entry is used with the same semantics as `execvp`'s *file*. | |||
* **`capabilities`** (array of strings, OPTIONAL) is an array that specifies the set of capabilities of the process(es) inside the container. Valid values are platform-specific. For example, valid values for Linux are defined in the [CAPABILITIES(7)](http://man7.org/linux/man-pages/man7/capabilities.7.html) man page. | |||
* **`capabilities`** (object of strings, OPTIONAL) is an array that specifies the set of capabilities of the process(es) inside the container. Valid values are platform-specific. For example, valid values for Linux are defined in the [CAPABILITIES(7)](http://man7.org/linux/man-pages/man7/capabilities.7.html) man page. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any precedence for “object of …”, and would recommend just using “object” (which is what we use for annotations
).
config.md
Outdated
* **`capabilities`** (array of strings, OPTIONAL) is an array that specifies the set of capabilities of the process(es) inside the container. Valid values are platform-specific. For example, valid values for Linux are defined in the [CAPABILITIES(7)](http://man7.org/linux/man-pages/man7/capabilities.7.html) man page. | ||
* **`capabilities`** (object of strings, OPTIONAL) is an array that specifies the set of capabilities of the process(es) inside the container. Valid values are platform-specific. For example, valid values for Linux are defined in the [CAPABILITIES(7)](http://man7.org/linux/man-pages/man7/capabilities.7.html) man page. | ||
capabilities contains the following properties: | ||
* **`effective`** (array of strings, OPTIONAL) - the 'effective' field is the whitelist of effective capabilities that are kept for the process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually use backticks around property names, so “'effective'” → “effective
” (and similarly in the following lines). Grepping for '
in our *.md
, I see no examples of single-quoted property names in master:
$ git grep "'[^' ]*'" origin/master -- *.md
origin/master:config.md:For Solaris, the mount entry corresponds to the 'fs' resource in zonecfg(8).
origin/master:config.md: As an example, the ['no_new_privs' kernel doc](https://www.kernel.org/doc/Documentation/prctl/no_new_privs.txt) has more information on how this is achieved using a prctl system call on Linux.
Both of those are references to external stuff (not OCI properties), and I think they should have been backticked as well ;).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the new “whitelist” language would require the removal of this runtime-tools check and give configuration authors no way to clearly drop a given capability. I'd rather we leave these arrays as “this will be the set of process capabilities after create
finishes” and just focus on adding the additional sets in this PR. We can address the change from “this set” to “at least this set” in a separate PR.
// Capabilities are Linux capabilities that are kept for the container. | ||
Capabilities []string `json:"capabilities,omitempty" platform:"linux"` | ||
// Capabilities are Linux capabilities that are kept for the process. | ||
Capabilities *LinuxCapabilities `json:"capabilities,omitempty" platform:"linux"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jlbutler just removed the Linux-specific-ness of process.capabilities
in #673 (at least in the Markdown spec). With this change, it becomes even less clear to me how other OSes are supposed to configure their capabilities (although I wasn't clear on that since #673, so it's not a lot worse ;).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No other systems have capabilities; there was a withdrawn Posix draft that the Linux implementation was based on, and then customised for Linux. They should have remained Linux specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No other systems have capabilities…
capabilies(7) references this for the withdrawn POSIX.1e draft, and that has:
Why Posix.1e was abandoned is difficult to understand from today's (July 2014) point of view. Solaris, Irix, Linux, and probably other Unices seemed to recognize the standard. On the other hand the FreeBSD project found counter arguments and didn't integrate capabilities ('fine grained privileges') by default.
I don't have access to Solaris or Irix for testing, or enough interest in them to find relevant docs, but I'd caution against making this Linux-specific (again) without consulting someone with Solaris access. On the other hand, maybe the policy is “unless you document your valid values, we may drop support for this property on your system”, and I'd be +1 for that, giving the Solaris folks some reasonable period to provide the valid-values docs, and then dropping Solaris capability support if they don't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But here we are adding the Linux specific types such as ambient. Solaris capabilities are a bit similar by the look of it (they have E, I, P but different capabilities) but that doesn't mean they should be at the top level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solaris docs are http://docs.oracle.com/cd/E23824_01/html/821-1456/prbac-2.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solaris capabilities are a bit similar by the look of it (they have E, I, P but different capabilities) but that doesn't mean they should be at the top level.
Is that “I don't think there is enough overlap to be worth sharing the same namespace”? If the Solaris folks are ok using bounding
instead of their limited
, we could just make ambient
Linux-specific. In other situations where only the values are different (e.g. mounts
entries), the maintainer preference seems to be “share the same property, but define the different per-platform values”.
On the other hand, if Windows has nothing equivalent, the pro-namespacing policy here suggests namespacing the whole capabilities
structure to something POSIX-ish-specific. I tried to raise the “how will this be handled on Windows?” issue in #673, but am still not clear on if/whether Windows supports something capability-like.
config.md
Outdated
@@ -132,7 +132,13 @@ For Windows, see links for details about [mountvol](http://ss64.com/nt/mountvol. | |||
* **`env`** (array of strings, OPTIONAL) with the same semantics as [IEEE Std 1003.1-2001's `environ`][ieee-1003.1-2001-xbd-c8.1]. | |||
* **`args`** (array of strings, REQUIRED) with similar semantics to [IEEE Std 1003.1-2001 `execvp`'s *argv*][ieee-1003.1-2001-xsh-exec]. | |||
This specification extends the IEEE standard in that at least one entry is REQUIRED, and that entry is used with the same semantics as `execvp`'s *file*. | |||
* **`capabilities`** (array of strings, OPTIONAL) is an array that specifies the set of capabilities of the process(es) inside the container. Valid values are platform-specific. For example, valid values for Linux are defined in the [CAPABILITIES(7)](http://man7.org/linux/man-pages/man7/capabilities.7.html) man page. | |||
* **`capabilities`** (object, OPTIONAL) is an array that specifies the set of capabilities of the process(es) inside the container. Valid values are platform-specific. For example, valid values for Linux are defined in the [CAPABILITIES(7)](http://man7.org/linux/man-pages/man7/capabilities.7.html) man page. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reword is an array
to is an object containing arrays of capability types
or something similar?
config.md
Outdated
* **`bounding`** (array of strings, OPTIONAL) - the `bounding` field is a array of bounding capabilities that are kept for the process. | ||
* **`inheritable`** (array of strings, OPTIONAL) - the `inheritable` field is a array of inheritable capabilities that are kept for the process. | ||
* **`permitted`** (array of strings, OPTIONAL) - the `permitted` field is a array of permitted capabilities that are kept for the process. | ||
* **`ambient`** (array of strings, OPTIONAL) - the `ambient` field is a array of ambient capabilities that are kept for the process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM when typo is fixed
Closes opencontainers#668 Signed-off-by: Michael Crosby <[email protected]>
2 similar comments
Fix a JSON typo which snuck in with eb114f0 (Add ambient and bounding capability support, 2017-02-02, opencontainers#675). Signed-off-by: W. Trevor King <[email protected]>
…gain) Roll back the genericization from 718f9f3 (minor narrative cleanup regarding config compatibility, 2017-01-30, opencontainers#673). Lifting the restriction there seems to have been motivated by "Solaris supports capabilities", but that was before the split into a capabilities object which happened in eb114f0 (Add ambient and bounding capability support, 2017-02-02, opencontainers#675). It's not clear if Solaris supports ambient caps, or what Solaris API rlimits or noNewPrivileges were punting to [1]. And John Howard has recently confirmed that Windows does not support capabilities and is unlikely to do so in the future [2]. John's statement didn't directly address rlimits or noNewPrivileges, but we can always restore any of these properties to the Solaris/Windows platforms if/when we get docs about which API we're punting to on those platforms. Also add some backticks, remove the hyphens in "OPTIONAL) - the", standardize lines I touch to use "the process" [3], and use four-space indents here to keep Pandoc happy (see 7795661 (runtime.md: Fix sub-bullet indentation, 2016-06-08, opencontainers#495). [1]: opencontainers#673 (comment) [2]: opencontainers#810 (comment) [3]: opencontainers#809 (comment) Signed-off-by: W. Trevor King <[email protected]>
Roll back the genericization from 718f9f3 (minor narrative cleanup regarding config compatibility, 2017-01-30, opencontainers#673). Lifting the restriction there seems to have been motivated by "Solaris supports capabilities", but that was before the split into a capabilities object which happened in eb114f0 (Add ambient and bounding capability support, 2017-02-02, opencontainers#675). It's not clear if Solaris supports ambient caps, or what Solaris API noNewPrivileges were punting to [1]. And John Howard has recently confirmed that Windows does not support capabilities and is unlikely to do so in the future [2]. He also confirmed that Windows does not support rlimits [3]. John's statement didn't directly address noNewPrivileges, but we can always restore any of these properties to the Solaris/Windows platforms if/when we get docs about which API we're punting to on those platforms. Also add some backticks, remove the hyphens in "OPTIONAL) - the", standardize lines I touch to use "the process" [4], and use four-space indents here to keep Pandoc happy (see 7795661 (runtime.md: Fix sub-bullet indentation, 2016-06-08, opencontainers#495). [1]: opencontainers#673 (comment) [2]: opencontainers#810 (comment) [3]: opencontainers#835 (comment) [4]: opencontainers#809 (comment) Signed-off-by: W. Trevor King <[email protected]>
Roll back the genericization from 718f9f3 (minor narrative cleanup regarding config compatibility, 2017-01-30, opencontainers#673). Lifting the restriction there seems to have been motivated by "Solaris supports capabilities", but that was before the split into a capabilities object which happened in eb114f0 (Add ambient and bounding capability support, 2017-02-02, opencontainers#675). It's not clear if Solaris supports ambient caps, or what Solaris API noNewPrivileges were punting to [1]. And John Howard has recently confirmed that Windows does not support capabilities and is unlikely to do so in the future [2]. He also confirmed that Windows does not support rlimits [3]. John's statement didn't directly address noNewPrivileges, but we can always restore any of these properties to the Solaris/Windows platforms if/when we get docs about which API we're punting to on those platforms. Also add some backticks, remove the hyphens in "OPTIONAL) - the", standardize lines I touch to use "the process" [4], and use four-space indents here to keep Pandoc happy (see 7795661 (runtime.md: Fix sub-bullet indentation, 2016-06-08, opencontainers#495). [1]: opencontainers#673 (comment) [2]: opencontainers#810 (comment) [3]: opencontainers#835 (comment) [4]: opencontainers#809 (comment) Signed-off-by: W. Trevor King <[email protected]>
Roll back the genericization from 718f9f3 (minor narrative cleanup regarding config compatibility, 2017-01-30, opencontainers#673). Lifting the restriction there seems to have been motivated by "Solaris supports capabilities", but that was before the split into a capabilities object which happened in eb114f0 (Add ambient and bounding capability support, 2017-02-02, opencontainers#675). It's not clear if Solaris supports ambient caps, or what Solaris API noNewPrivileges were punting to [1]. And John Howard has recently confirmed that Windows does not support capabilities and is unlikely to do so in the future [2]. He also confirmed that Windows does not support rlimits [3]. John's statement didn't directly address noNewPrivileges, but we can always restore any of these properties to the Solaris/Windows platforms if/when we get docs about which API we're punting to on those platforms. Also add some backticks, remove the hyphens in "OPTIONAL) - the", standardize lines I touch to use "the process" [4], and use four-space indents here to keep Pandoc happy (see 7795661 (runtime.md: Fix sub-bullet indentation, 2016-06-08, opencontainers#495). [1]: opencontainers#673 (comment) [2]: opencontainers#810 (comment) [3]: opencontainers#835 (comment) [4]: opencontainers#809 (comment) Signed-off-by: W. Trevor King <[email protected]>
Closes #668
Signed-off-by: Michael Crosby [email protected]