Merge pull request kubernetes#27957 from saschagrunert/seccomp-default
Add documentation about `SeccompDefault` feature
k8s-ci-robot authored Jun 28, 2021
2 parents 165247a + 27a74df commit 48c2535
Showing 3 changed files with 70 additions and 8 deletions.
@@ -156,6 +156,7 @@ different Kubernetes components.
| `RotateKubeletServerCertificate` | `false` | Alpha | 1.7 | 1.11 |
| `RotateKubeletServerCertificate` | `true` | Beta | 1.12 | |
| `RunAsGroup` | `true` | Beta | 1.14 | |
| `SeccompDefault` | `false` | Alpha | 1.22 | |
| `ServiceInternalTrafficPolicy` | `false` | Alpha | 1.21 | |
| `ServiceLBNodePortControl` | `false` | Alpha | 1.20 | |
| `ServiceLoadBalancerClass` | `false` | Alpha | 1.21 | |
@@ -783,6 +784,8 @@ Each feature gate is designed for enabling/disabling a specific feature:
instead of the DaemonSet controller.
- `SCTPSupport`: Enables the _SCTP_ `protocol` value in Pod, Service,
Endpoints, EndpointSlice, and NetworkPolicy definitions.
- `SeccompDefault`: Enables the use of `RuntimeDefault` as the default seccomp profile for all workloads.
The seccomp profile is specified in the `securityContext` of a Pod and/or a Container.
- `ServerSideApply`: Enables the [Server Side Apply (SSA)](/docs/reference/using-api/server-side-apply/)
feature on the API Server.
- `ServiceAccountIssuerDiscovery`: Enable OIDC discovery endpoints (issuer and
@@ -514,6 +514,7 @@ RemoveSelfLink=true|false (BETA - default=true)<br/>
RootCAConfigMap=true|false (BETA - default=true)<br/>
RotateKubeletServerCertificate=true|false (BETA - default=true)<br/>
RunAsGroup=true|false (BETA - default=true)<br/>
SeccompDefault=true|false (ALPHA - default=false)<br/>
ServerSideApply=true|false (BETA - default=true)<br/>
ServiceAccountIssuerDiscovery=true|false (BETA - default=true)<br/>
ServiceLBNodePortControl=true|false (ALPHA - default=false)<br/>
@@ -1073,6 +1074,13 @@ WindowsEndpointSliceProxying=true|false (ALPHA - default=false)<br/>
<td></td><td style="line-height: 130%; word-wrap: break-word;">Timeout of all runtime requests except long running request - `pull`, `logs`, `exec` and `attach`. When timeout exceeded, kubelet will cancel the request, throw out an error and retry later. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's `--config` flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)</td>
</tr>

<tr>
<td colspan="2">--seccomp-default RuntimeDefault&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Default: `false`</td>
</tr>
<tr>
<td></td><td style="line-height: 130%; word-wrap: break-word;">&lt;Warning: Alpha feature&gt; Enable the use of RuntimeDefault as the default seccomp profile for all workloads. The SeccompDefault feature gate must be enabled to allow this flag, which is disabled by default.</td>
</tr>

<tr>
<td colspan="2">--seccomp-profile-root string&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Default: `/var/lib/kubelet/seccomp`</td>
</tr>
67 changes: 59 additions & 8 deletions content/en/docs/tutorials/clusters/seccomp.md
@@ -3,17 +3,18 @@ reviewers:
- hasheddan
- pjbgf
- saschagrunert
title: Restrict a Container's Syscalls with Seccomp
title: Restrict a Container's Syscalls with seccomp
content_type: tutorial
weight: 20
min-kubernetes-server-version: v1.22
---

<!-- overview -->

{{< feature-state for_k8s_version="v1.19" state="stable" >}}

Seccomp stands for secure computing mode and has been a feature of the Linux
kernel since version 2.6.12. It can be used to sandbox the privileges of a
kernel since version 2.6.12. It can be used to sandbox the privileges of a
process, restricting the calls it is able to make from userspace into the
kernel. Kubernetes lets you automatically apply seccomp profiles loaded onto a
Node to your Pods and containers.
@@ -35,16 +36,66 @@ profiles that give only the necessary privileges to your container processes.

## {{% heading "prerequisites" %}}

{{< version-check >}}

In order to complete all steps in this tutorial, you must install
[kind](https://kind.sigs.k8s.io/docs/user/quick-start/) and
[kubectl](/docs/tasks/tools/). This tutorial will show examples
with both alpha (pre-v1.19) and generally available seccomp functionality, so
both alpha (new in v1.22) and generally available seccomp functionality. You should
make sure that your cluster is [configured
correctly](https://kind.sigs.k8s.io/docs/user/quick-start/#setting-kubernetes-version)
for the version you are using.

<!-- steps -->

## Enable the use of `RuntimeDefault` as the default seccomp profile for all workloads

{{< feature-state state="alpha" for_k8s_version="v1.22" >}}

`SeccompDefault` is an optional kubelet
[feature gate](/docs/reference/command-line-tools-reference/feature-gates)
with a corresponding `--seccomp-default`
[command line flag](/docs/reference/command-line-tools-reference/kubelet).
Both have to be enabled simultaneously to use the feature.

If enabled, the kubelet will use the `RuntimeDefault` seccomp profile by default, which is
defined by the container runtime, instead of using the `Unconfined` (seccomp disabled) mode.
The default profiles aim to provide a strong set
of security defaults while preserving the functionality of the workload. It is
possible that the default profiles differ between container runtimes and their
release versions, for example when comparing those from CRI-O and containerd.

Some workloads may require fewer syscall restrictions than others.
This means that they can fail at runtime even with the `RuntimeDefault`
profile. To mitigate such a failure, you can:

- Run the workload explicitly as `Unconfined`.
- Disable the `SeccompDefault` feature for the nodes, and make sure that
  workloads get scheduled on nodes where the feature is disabled.
- Create a custom seccomp profile for the workload.
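
As a minimal sketch of the first option, a Pod can opt out of the default by setting `seccompProfile.type` to `Unconfined` in its `securityContext`. The Pod name, container name, image, and command below are placeholders chosen for illustration and are not part of this change.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: unconfined-pod        # hypothetical name
spec:
  securityContext:
    seccompProfile:
      type: Unconfined        # explicitly disable seccomp for this Pod
  containers:
    - name: app               # hypothetical container
      image: busybox:1.36     # placeholder image
      command: ["sleep", "3600"]
```

Setting the profile on the Pod's `securityContext` applies it to every container in the Pod; a profile set on a single container's `securityContext` takes precedence for that container.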

If you were introducing this feature into a production-like cluster, the Kubernetes project
recommends that you enable this feature gate on a subset of your nodes and then
test workload execution before rolling the change out cluster-wide.
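
One way to steer a test workload onto that subset is a `nodeSelector`. The sketch below assumes you have labeled the nodes where the feature is enabled with a label of your own choosing; the label key `example.com/seccomp-default`, the Pod name, and the image are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-default-test            # hypothetical name
spec:
  nodeSelector:
    example.com/seccomp-default: "enabled"   # hypothetical label you apply to the test nodes
  containers:
    - name: app                         # hypothetical container
      image: busybox:1.36               # placeholder image
      command: ["sleep", "3600"]
```

Because the Pod sets no `seccompProfile`, it runs with whatever default the selected node's kubelet applies, which is the behavior you want to test.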

More detailed information about a possible upgrade and downgrade strategy can be
found in the [related Kubernetes Enhancement Proposal (KEP)](https://github.com/kubernetes/enhancements/tree/a70cc18/keps/sig-node/2413-seccomp-by-default#upgrade--downgrade-strategy).

Since the feature is in alpha state, it is disabled by default. To enable it,
pass the flags `--feature-gates=SeccompDefault=true --seccomp-default` to the
`kubelet` CLI or enable it via the [kubelet configuration
file](/docs/tasks/administer-cluster/kubelet-config-file/). To enable the
feature gate in [kind](https://kind.sigs.k8s.io), ensure that `kind` provides
the minimum required Kubernetes version and enables the `SeccompDefault` feature
[in the kind configuration](https://kind.sigs.k8s.io/docs/user/quick-start/#enable-feature-gates-in-your-cluster):

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  SeccompDefault: true
```
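
For clusters configured through a kubelet configuration file rather than CLI flags, the equivalent settings might look like the sketch below. It assumes the `kubelet.config.k8s.io/v1beta1` `KubeletConfiguration` API and its `seccompDefault` field; check the kubelet configuration reference for your Kubernetes version before relying on it.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  SeccompDefault: true   # same effect as --feature-gates=SeccompDefault=true
seccompDefault: true     # same effect as --seccomp-default
```

As with the flags, both settings are needed: the feature gate only permits the option, and `seccompDefault` turns it on.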
## Create Seccomp Profiles
The contents of these profiles will be explored later on, but for now go ahead
@@ -108,7 +159,7 @@ docker exec -it 6a96207fed4b ls /var/lib/kubelet/seccomp/profiles
audit.json fine-grained.json violation.json
```
## Create a Pod with a Seccomp profile for syscall auditing
## Create a Pod with a seccomp profile for syscall auditing
To start off, apply the `audit.json` profile, which will log all syscalls of the
process, to a new Pod.
@@ -208,7 +259,7 @@ kubectl delete pod/audit-pod
kubectl delete svc/audit-pod
```
## Create Pod with Seccomp Profile that Causes Violation
## Create Pod with seccomp Profile that Causes Violation
For demonstration, apply a profile to the Pod that does not allow for any
syscalls.
@@ -255,7 +306,7 @@ kubectl delete pod/violation-pod
kubectl delete svc/violation-pod
```
## Create Pod with Seccomp Profile that Only Allows Necessary Syscalls
## Create Pod with seccomp Profile that Only Allows Necessary Syscalls
If you take a look at the `fine-pod.json`, you will notice some of the syscalls
seen in the first example where the profile set `"defaultAction":
@@ -339,7 +390,7 @@ kubectl delete pod/fine-pod
kubectl delete svc/fine-pod
```
## Create Pod that uses the Container Runtime Default Seccomp Profile
## Create Pod that uses the Container Runtime Default seccomp Profile
Most container runtimes provide a sane set of default syscalls that are allowed
or not. The defaults can easily be applied in Kubernetes by using the
@@ -364,5 +415,5 @@ The default seccomp profile should provide adequate access for most workloads.
Additional resources:
* [A Seccomp Overview](https://lwn.net/Articles/656307/)
* [A seccomp Overview](https://lwn.net/Articles/656307/)
* [Seccomp Security Profiles for Docker](https://docs.docker.com/engine/security/seccomp/)
