runtime should WARN / ignore capabilities that cannot be granted #1094

Merged on Mar 26, 2021 (1 commit)

Commits on Mar 9, 2021

  1. Proposal: runtime should ignore capabilities that cannot be granted

    Currently, the specification requires runtimes to produce a (fatal) error if a
    container configuration requests capabilities that cannot be granted (either
    the capability is "unknown" to the runtime, not supported by the kernel version
    in use, or not available in the environment that the runtime operates in).
    
    This causes problems in situations where the runtime is running in a restricted
    environment (for example, docker-in-docker), or if there is a mismatch between
    the list of capabilities known by higher-level runtimes and the OCI runtime.
    
    Some examples:
    
    - Kernel 5.8 introduced CAP_PERFMON, CAP_BPF, and CAP_CHECKPOINT_RESTORE
      capabilities. Docker 20.10.0 ("higher level runtime") shipped with
      an updated list of capabilities, and when creating a "privileged" container,
      would determine what capabilities are known by the kernel in use, and request
      all those capabilities (by including them in the container config).
      However, runc did not yet have an updated list of capabilities, and therefore
      rejected the container specification, producing an error because the new
      capabilities were "unknown".
    - When running nested containers, for example, when running docker-in-docker,
      the "inner" container may be using a more recent version of docker than the
      "outer" container. In this situation, the "outer" container may be missing
      capabilities that the inner container expects to be supported (based on
      kernel version). As a result, starting the container would fail, because the
      OCI runtime could not grant those capabilities (they are not available in
      the environment it's running in).
    
    Workarounds, and motivation
    -------------------------------------
    
    In the current situation, the responsibility for detecting which capabilities
    are supported is left to the "higher level" runtimes. As an example, containerd
    recently added code to dynamically adjust the list of requested capabilities
    by attempting to detect which capabilities are available in the environment
    it's running in. This is only a partial solution, as it will not address
    mismatches between the list of capabilities _known_ by the higher-level and
    lower-level runtime (which cannot be detected).
    
    Not only does this workaround only provide a *partial* fix, it also introduces
    additional complexity in every higher-level runtime.
    
    Proposal: WARN (but otherwise ignore) capabilities that cannot be granted
    -------------------------------------
    
    This patch changes the specification to have runtimes WARN (but otherwise
    ignore) capabilities that are requested in the container config, but cannot
    be granted.
    
    Moving this responsibility to the lower-level (OCI) runtime makes more sense,
    as the OCI runtime _already_ is responsible for interacting with the kernel
    (detecting what capabilities are supported, and performing conversion), and
    only the lower-level runtime knows which capabilities it supports.
    Making the lower-level runtime responsible for handling "unknown" or "unavailable"
    capabilities keeps the logic central.
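
    As an illustration only, the proposed behavior could be sketched as below.
    This is not runc's implementation: `filterCapabilities` and the `available`
    set are hypothetical names, and a real runtime would build that set from its
    own capability table and from what the kernel and environment allow.

    ```go
    package main

    import (
    	"fmt"
    	"os"
    )

    // filterCapabilities splits the requested capability names into those that
    // can be granted and those that cannot. The "available" set is an assumption
    // of this sketch; how it is populated is up to the runtime.
    func filterCapabilities(requested []string, available map[string]bool) (granted, ignored []string) {
    	for _, c := range requested {
    		if available[c] {
    			granted = append(granted, c)
    		} else {
    			ignored = append(ignored, c)
    		}
    	}
    	return granted, ignored
    }

    func main() {
    	// Hypothetical environment that predates the kernel 5.8 capabilities.
    	available := map[string]bool{"CAP_CHOWN": true, "CAP_NET_ADMIN": true}
    	requested := []string{"CAP_CHOWN", "CAP_BPF", "CAP_NET_ADMIN", "CAP_PERFMON"}

    	granted, ignored := filterCapabilities(requested, available)
    	for _, c := range ignored {
    		// WARN, but do not fail container creation.
    		fmt.Fprintf(os.Stderr, "warning: ignoring capability %s: cannot be granted\n", c)
    	}
    	fmt.Println("granted:", granted)
    }
    ```

    Because the list is only ever narrowed, the container never receives a
    capability that was not requested.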
    
    Impact on security
    -------------------------------------
    
    Given that `capabilities` is an "allow-list", ignoring unknown capabilities will
    not pose a security risk; worst case, a container does not get all requested
    capabilities granted and, as a result, some actions may fail.
    
    Backward-compatibility
    -------------------------------------
    
    Changing this behavior should be backward compatible. Higher-level runtimes that
    already dynamically adjust the list of requested capabilities can continue to do
    so. Runtimes that do not adjust will see an improvement (containers can start
    even if some of the requested capabilities are not granted). Container processes
    MAY fail (as described in "Impact on security"), but users can debug this
    situation either by looking at the warnings produced by the OCI runtime, or using
    tools such as `capsh` / `libcap` to get the list of actual capabilities in the
    container.
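
    As an alternative to `capsh`, a process inside the container could inspect
    its own effective capability set. The sketch below is an assumption-laden
    example, not part of this patch: it reads the `CapEff` line from
    `/proc/self/status` (Linux only) and tests a few bits. The helper names are
    hypothetical; the bit numbers follow the kernel's `linux/capability.h`.

    ```go
    package main

    import (
    	"bufio"
    	"fmt"
    	"os"
    	"strconv"
    	"strings"
    )

    // Capability bit numbers as defined in linux/capability.h.
    const (
    	capSysAdmin          = 21
    	capPerfmon           = 38
    	capBPF               = 39
    	capCheckpointRestore = 40
    )

    // parseCapEff extracts the effective capability mask from the contents of
    // a /proc/<pid>/status file.
    func parseCapEff(status string) (uint64, error) {
    	sc := bufio.NewScanner(strings.NewReader(status))
    	for sc.Scan() {
    		fields := strings.Fields(sc.Text())
    		if len(fields) == 2 && fields[0] == "CapEff:" {
    			return strconv.ParseUint(fields[1], 16, 64)
    		}
    	}
    	return 0, fmt.Errorf("CapEff line not found")
    }

    // hasCap reports whether the given capability bit is set in mask.
    func hasCap(mask uint64, bit uint) bool {
    	return mask&(uint64(1)<<bit) != 0
    }

    func main() {
    	data, err := os.ReadFile("/proc/self/status")
    	if err != nil {
    		fmt.Fprintln(os.Stderr, "cannot read status:", err)
    		os.Exit(1)
    	}
    	mask, err := parseCapEff(string(data))
    	if err != nil {
    		fmt.Fprintln(os.Stderr, err)
    		os.Exit(1)
    	}
    	fmt.Println("CAP_SYS_ADMIN:", hasCap(mask, capSysAdmin))
    	fmt.Println("CAP_PERFMON:", hasCap(mask, capPerfmon))
    	fmt.Println("CAP_BPF:", hasCap(mask, capBPF))
    	fmt.Println("CAP_CHECKPOINT_RESTORE:", hasCap(mask, capCheckpointRestore))
    }
    ```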
    
    Signed-off-by: Sebastiaan van Stijn <[email protected]>
    thaJeztah committed Mar 9, 2021
    8c363e8