diff --git a/keps/sig-storage/1710-selinux-relabeling/README.md b/keps/sig-storage/1710-selinux-relabeling/README.md
new file mode 100644
index 00000000000..9b4d853f386
--- /dev/null
+++ b/keps/sig-storage/1710-selinux-relabeling/README.md
@@ -0,0 +1,714 @@
+# Skip SELinux relabeling of volumes
+
+## Table of Contents
+
+
+- [Release Signoff Checklist](#release-signoff-checklist)
+- [Summary](#summary)
+- [Motivation](#motivation)
+ - [SELinux intro](#selinux-intro)
+ - [SELinux context assignment](#selinux-context-assignment)
+ - [Volumes](#volumes)
+ - [Goals](#goals)
+ - [Non-Goals](#non-goals)
+- [Proposal](#proposal)
+ - [Implementation Details/Notes/Constraints [optional]](#implementation-detailsnotesconstraints-optional)
+    - [mount -o context](#mount--o-context)
+ - [New Kubernetes behavior](#new-kubernetes-behavior)
+ - [Shared volumes](#shared-volumes)
+  - [CSIDriver.Spec.SELinuxMountSupported](#csidriverspecselinuxmountsupported)
+ - [Examples](#examples)
+ - [User Stories [optional]](#user-stories-optional)
+ - [Story 1](#story-1)
+ - [Story 2](#story-2)
+ - [Story 3](#story-3)
+ - [Risks and Mitigations](#risks-and-mitigations)
+- [Design Details](#design-details)
+ - [Test Plan](#test-plan)
+ - [Graduation Criteria](#graduation-criteria)
+ - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
+ - [Version Skew Strategy](#version-skew-strategy)
+- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
+ - [Feature enablement and rollback](#feature-enablement-and-rollback)
+ - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
+ - [Monitoring requirements](#monitoring-requirements)
+ - [Dependencies](#dependencies)
+ - [Scalability](#scalability)
+ - [Troubleshooting](#troubleshooting)
+- [Infrastructure Needed (optional)](#infrastructure-needed-optional)
+- [Implementation History](#implementation-history)
+- [Drawbacks [optional]](#drawbacks-optional)
+- [Alternatives [optional]](#alternatives-optional)
+  - [FSGroupChangePolicy approach](#fsgroupchangepolicy-approach)
+ - [Change container runtime](#change-container-runtime)
+ - [Move SELinux label management to kubelet](#move-selinux-label-management-to-kubelet)
+  - [Merge FSGroupChangePolicy and SELinuxRelabelPolicy](#merge-fsgroupchangepolicy-and-selinuxrelabelpolicy)
+
+
+## Release Signoff Checklist
+
+- [x] kubernetes/enhancements issue in release milestone, which links to KEP (this should be a link to the KEP location in kubernetes/enhancements, not the initial KEP PR)
+- [ ] KEP approvers have set the KEP status to `implementable`
+- [ ] Design details are appropriately documented
+- [ ] Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
+- [ ] Graduation criteria is in place
+- [ ] "Implementation History" section is up-to-date for milestone
+- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
+- [ ] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
+
+## Summary
+
+This KEP speeds up the way volumes (incl. persistent volumes) are made available to Pods on systems with SELinux in enforcing mode.
+The current approach requires a recursive relabeling of all files on a volume before a container can be started, which is slow for large volumes.
+
+## Motivation
+
+### SELinux intro
+On Linux machines with SELinux in enforcing mode, SELinux tries to prevent users who escaped from a container from accessing the host OS and other containers running on the host.
+It does so by running each container with a unique *SELinux context* (such as `system_u:system_r:container_t:s0:c309,c383`) and labeling all content on all its volumes with the corresponding label (`system_u:object_r:container_file_t:s0:c309,c383`).
+Only a process with the context `...:container_t:s0:c309,c383` can access files with the label `container_file_t:s0:c309,c383`, even if the process runs as root.
+Therefore a rogue user cannot access potentially secret data of other containers, because the volumes of each container have a different label.
+
+In the text below, we shorten both `system_u:system_r:container_t:s0:c309,c383` (the context of a process) and `system_u:object_r:container_file_t:s0:c309,c383` (the label of a file) to `s0:c309,c383`.
+
+See [SELinux documentation](https://selinuxproject.org/page/NB_MLS) for more details.
+
+### SELinux context assignment
+In Kubernetes, the SELinux context of a pod is assigned in two ways:
+1. Either it is set by the user in the PodSpec or Container: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/.
+1. If not set in the Pod/Container, the container runtime allocates a new unique SELinux context and assigns it to the pod (container) by itself.
+
+### Volumes
+Currently, Kubernetes *knows*(*) which volume plugins support SELinux (i.e. support extended attributes on the filesystem the plugin provides).
+If SELinux is supported for a volume, kubelet passes the volume to the container runtime with the ":Z" option ("private unshared").
+The container runtime then **recursively relabels** all files on the volume, either to the label set in the PodSpec/Container or to the random value allocated by the container runtime itself.
+
+**This relabeling needs to traverse through the whole volume, and it can be slow for volumes with a large amount of files.**
+
+*) These in-tree volume plugins don't support SELinux: Azure File, CephFS, GlusterFS, HostPath, NFS, Portworx and Quobyte.
+All other volume plugins support it.
+This knowledge is hardcoded in in-tree volume plugins (e.g. [NFS](https://github.com/kubernetes/kubernetes/blob/0c5c3d8bb97d18a2a25977e92b3f7a49074c2ecb/pkg/volume/nfs/nfs.go#L235)).
+
+For CSI, kubelet uses the following heuristic (sketched in code after the list):
+
+1. Mount the volume (via `NodeStage` + `NodePublish` CSI calls).
+2. Check the mount options of the volume's mount dir. If and only if they contain the `seclabel` mount option, the volume supports SELinux.
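+
+A minimal sketch of step 2, assuming the volume is already mounted at `mountPoint` (`selinuxSupported` and the sample path are hypothetical; kubelet uses its mount utility library rather than parsing `/proc/mounts` by hand):
+
+```go
+package main
+
+import (
+    "fmt"
+    "os"
+    "strings"
+)
+
+// selinuxSupported reports whether the filesystem mounted at mountPoint
+// carries the "seclabel" mount option, i.e. supports SELinux labels.
+func selinuxSupported(mountPoint string) (bool, error) {
+    data, err := os.ReadFile("/proc/mounts")
+    if err != nil {
+        return false, err
+    }
+    for _, line := range strings.Split(string(data), "\n") {
+        // /proc/mounts format: device mountpoint fstype options dump pass
+        fields := strings.Fields(line)
+        if len(fields) < 4 || fields[1] != mountPoint {
+            continue
+        }
+        for _, opt := range strings.Split(fields[3], ",") {
+            if opt == "seclabel" {
+                return true, nil
+            }
+        }
+    }
+    return false, nil
+}
+
+func main() {
+    supported, err := selinuxSupported("/var/lib/kubelet/pods/uid/volumes/foo/vol")
+    fmt.Println(supported, err)
+}
+```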
+
+### Goals
+
+Optionally (when chosen by the user), do not recursively relabel the content of the volumes.
+
+### Non-Goals
+
+Change container runtimes / CRI.
+
+## Proposal
+
+Offer an option in `Pod.Spec.PodSecurityContext` to *mount* volumes with the right SELinux context instead of recursively relabeling them:
+
+```go
+type SELinuxRelabelPolicy string
+
+const (
+    // OnVolumeMount: kubelet mounts volumes with the right SELinux context
+    // and skips the recursive relabeling.
+    OnVolumeMount SELinuxRelabelPolicy = "OnVolumeMount"
+    // AlwaysRelabel: the container runtime recursively relabels all volumes
+    // (the current behavior).
+    AlwaysRelabel SELinuxRelabelPolicy = "Always"
+)
+
+type PodSecurityContext struct {
+    // seLinuxRelabelPolicy ← new field
+    // Defines behavior of changing SELinux labels of the volume before being exposed inside Pod.
+    // Valid values are "OnVolumeMount" and "Always". If not specified, "Always" is used.
+    // "Always" policy recursively changes SELinux labels on all files on all volumes used by the Pod.
+    // "OnVolumeMount" tries to mount volumes used by the Pod with the right context and skips the
+    // recursive relabeling. Kubernetes may fall back to policy "Always" if a storage backend does
+    // not support this policy.
+    // This field is ignored for Pod's volumes that do not support SELinux.
+    // +optional
+    SELinuxRelabelPolicy *SELinuxRelabelPolicy
+
+    // For context:
+    // fsGroupChangePolicy defines behavior of changing ownership and permission of the volume
+    // before being exposed inside Pod. This field will only apply to
+    // volume types which support fsGroup based ownership(and permissions).
+    // It will have no effect on ephemeral volume types such as: secret, configmaps
+    // and emptydir.
+    // Valid values are "OnRootMismatch" and "Always". If not specified defaults to "Always".
+    // +optional
+    FSGroupChangePolicy *PodFSGroupChangePolicy `json:"fsGroupChangePolicy,omitempty" protobuf:"bytes,9,opt,name=fsGroupChangePolicy"`
+
+    ...
+}
+```
+
+See https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/20200120-skip-permission-change.md for a similar API for the fsGroup ownership change.
+This KEP follows the fsGroup API closely; however, the implementation is different (`mount` here vs. recursive `chown` in the other KEP).
+
+In order to allow `SELinuxRelabelPolicy: OnVolumeMount` for volumes provided by CSI drivers, kubelet must know whether a CSI driver supports SELinux mount options:
+
+```go
+// In storage.k8s.io/v1:
+
+// CSIDriverSpec is the specification of a CSIDriver.
+type CSIDriverSpec struct {
+    // SELinuxMountSupported specifies if the CSI driver supports the
+    // "-o context" mount option.
+    //
+    // When "true", Kubernetes may call NodeStage / NodePublish with the
+    // "-o context=xyz" mount option for volumes of a pod with
+    // podSecurityContext.seLinuxRelabelPolicy = "OnVolumeMount".
+    //
+    // When "false", Kubernetes won't pass any special SELinux mount options to the driver.
+    // podSecurityContext.seLinuxRelabelPolicy = "OnVolumeMount" is then silently ignored.
+    //
+    // Default is "false".
+    SELinuxMountSupported *bool
+    ...
+}
+
+// For context:
+type CSIDriver struct {
+    Spec CSIDriverSpec
+}
+```
+
+### Implementation Details/Notes/Constraints [optional]
+
+#### `mount -o context`
+The Linux kernel, with SELinux compiled in, allows `mount -o context=s0:c309,c383` to mount a volume and pretend that all files on the volume have the given SELinux label.
+This works only for the first mount of the volume!
+It does not work for bind-mounts or any subsequent mount of the same volume.
+
+Note that volumes mounted with `-o context` don't have `seclabel` in their mount options.
+In addition, calling `chcon` on them fails with `Operation not supported`.
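+
+For illustration, a first mount with an explicit context could look like the sketch below (`mountWithContext` and the device/path names are hypothetical; kubelet uses its mount utility library rather than exec'ing `mount` directly):
+
+```go
+package main
+
+import (
+    "fmt"
+    "os/exec"
+)
+
+// mountWithContext mounts dev at target and instructs the kernel to
+// present all files on the volume with the given SELinux label.
+func mountWithContext(dev, target, fstype, context string) error {
+    // Quote the context: it contains commas that would otherwise be
+    // parsed as mount option separators.
+    opts := fmt.Sprintf("context=%q", context)
+    out, err := exec.Command("mount", "-t", fstype, "-o", opts, dev, target).CombinedOutput()
+    if err != nil {
+        return fmt.Errorf("mount failed: %v: %s", err, out)
+    }
+    return nil
+}
+
+func main() {
+    err := mountWithContext("/dev/sdb", "/mnt/vol", "ext4",
+        "system_u:object_r:container_file_t:s0:c309,c383")
+    fmt.Println(err)
+    // A second mount of the same volume with a different context fails,
+    // as described above.
+}
+```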
+
+### New Kubernetes behavior
+
+* If kubelet *knows* the SELinux context of a pod / container to run (i.e. the Pod/Container contains at least `SELinuxOptions.Level`):
+  * And the pod's `SELinuxRelabelPolicy` is `OnVolumeMount`:
+    * And the in-tree volume plugin supports SELinux / `CSIDriver.Spec.SELinuxMountSupported` is explicitly `true`:
+      * Kubelet tries to mount the volume for the Pod with the given SELinux context using `mount -o context=XYZ`.
+        * Kubelet makes sure the option is passed to the first mount in all in-tree volume plugins (incl. ephemeral volumes like Secrets).
+        * Kubelet passes it as a mount option to all CSI calls for the given volume.
+      * After the volume is mounted, kubelet checks that the root of the volume has the expected SELinux label, i.e. that the volume was mounted correctly.
+        * If the volume root has the expected label, kubelet passes the volume to the container runtime without any ":z" or ":Z" options - no relabeling is necessary.
+        * If the volume root has an unexpected label, for example when the CSI driver did not apply `-o context` correctly or the volume was already mounted with a different context,
+          the volume plugin reports an error and kubelet fails to start the pod.
+          It is the CSI driver's fault that it advertises SELinux support and then fails to apply it.
+
+* Nothing changes when `CSIDriver.Spec.SELinuxMountSupported` is `false` or not set:
+  * The CSI volume plugin calls CSI without any special SELinux mount options and autodetects whether the volume supports SELinux by the presence of the `seclabel` mount option.
+    This is the current kubelet behavior.
+
+* Nothing changes if kubelet does not know the SELinux context of a pod (`SELinuxOptions.Level` is empty) or the pod's `SELinuxRelabelPolicy` is `Always`.
+  * The volume is mounted without any SELinux options and passed to the container runtime with or without ":Z", depending on whether the volume plugin supports SELinux (by checking the `seclabel` mount option).
+    The container runtime allocates a new SELinux context and recursively relabels all files on the volume.
+    This is the current kubelet behavior.
+
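+The bullets above condense into roughly the following per-volume decision, sketched here with simplified stand-in types (the real kubelet code will differ):
+
+```go
+// Simplified stand-ins for the relevant Pod and CSIDriver fields.
+type podSELinux struct {
+    level         string // pod's SELinuxOptions.Level, "" if unset
+    relabelPolicy string // "", "Always" or "OnVolumeMount"
+}
+
+// useContextMount decides whether kubelet mounts a volume with
+// "-o context=..." (true) or leaves relabeling to the container
+// runtime via ":Z" / no option (false).
+func useContextMount(pod podSELinux, driverSupportsSELinuxMount bool) bool {
+    if pod.level == "" {
+        return false // kubelet does not know the pod's SELinux context
+    }
+    if pod.relabelPolicy != "OnVolumeMount" {
+        return false // "Always" (the default) keeps the current behavior
+    }
+    // The volume plugin / CSI driver must explicitly support "-o context".
+    return driverSupportsSELinuxMount
+}
+```
+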
+Validation:
+
+* Kubernetes checks that the `SELinuxRelabelPolicy` field is used in a pod only when at least `SELinuxOptions.Level` is set, as sketched below.
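+
+A minimal sketch of this validation rule (a hypothetical helper, not the actual apiserver code):
+
+```go
+import "errors"
+
+// validateSELinuxRelabelPolicy rejects pods that set the new field
+// without specifying at least seLinuxOptions.level.
+func validateSELinuxRelabelPolicy(policy *string, level string) error {
+    if policy != nil && level == "" {
+        return errors.New("seLinuxRelabelPolicy requires seLinuxOptions.level to be set")
+    }
+    return nil
+}
+```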
+
+When a Pod specifies an incomplete SELinux label (i.e. omits `SELinuxOptions.User`, `.Role` or `.Type`), kubelet fills in the blanks from the system defaults provided by [ContainerLabels() from the go-selinux bindings](https://github.com/opencontainers/selinux/blob/621ca21a5218df44259bf5d7a6ee70d720a661d5/go-selinux/selinux_linux.go#L770), as in the sketch below.
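+
+A sketch of this filling, assuming the go-selinux bindings (`fillSELinuxDefaults` is a hypothetical helper; the default process label typically looks like `system_u:system_r:container_t:s0`):
+
+```go
+import (
+    "strings"
+
+    selinux "github.com/opencontainers/selinux/go-selinux"
+)
+
+// fillSELinuxDefaults completes a partial user:role:type:level label;
+// empty parts are taken from the system defaults.
+func fillSELinuxDefaults(user, role, typ, level string) string {
+    processLabel, _ := selinux.ContainerLabels()
+    defaults := strings.SplitN(processLabel, ":", 4)
+    parts := []string{user, role, typ, level}
+    if len(defaults) == 4 {
+        for i, p := range parts {
+            if p == "" {
+                parts[i] = defaults[i]
+            }
+        }
+    }
+    return strings.Join(parts, ":")
+}
+```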
+
+### Shared volumes
+
+If a single PV that supports SELinux labels is shared by multiple pods, all of these pods must have the same SELinux context.
+Currently, a running pod with context `A` will lose access to all files on a volume if a pod with context `B` starts and uses the same volume, because the container runtime relabels the volume for pod `B`.
+This behavior changes with this KEP: kubelet mounts the volume with `-o context=A` for the first pod.
+It then tries to do the same for the second pod with `-o context=B`; however, the volume has already been mounted and `mount -o context=B` fails.
+Pod `B` can't start on the same node until pod `A` dies and kubelet unmounts its volumes.
+
+We don't think that this is a bug in the design.
+Only one pod will have access to the volume, this KEP only changes the selection.
+
+The only behavior change is when two pods with different SELinux contexts use the same volume but different SubPaths. This works with the `Always` policy, because the container runtime relabels only the subpaths; with `OnVolumeMount`, the whole volume must have the same context.
+
+### `CSIDriver.Spec.SELinuxMountSupported`
+
+The new field `CSIDriver.Spec.SELinuxMountSupported` is important so kubelet knows whether mounts of volumes provided by the driver are independent of each other.
+There are CSI drivers that actually use a single [NFS](https://github.com/kubernetes-incubator/external-storage/tree/master/nfs-client)
+or [GlusterFS](https://github.com/kubernetes-incubator/external-storage/tree/master/gluster/glusterfs)
+export and provide subdirectories of this export as individual PVs.
+If kubelet mounts such PV (i.e. a subdirectory) with `-o context=A`, all subsequent mounts of the same NFS/Gluster export must have the same SELinux context, despite being different PVs from Kubernetes perspective.
+
+Since kubelet cannot know about such a limitation of a CSI driver, `CSIDriver.Spec.SELinuxMountSupported=false` (or `nil`) is needed to turn off mounting with `-o context`.
+
+### Examples
+
+The following table captures the interaction between the actual filesystem on a volume and the newly introduced fields. Hypothetical iSCSI and NFS CSI drivers are used as examples of a volume based on a block device and on a shared filesystem, respectively.
+
+| Volume       | CSIDriver.SELinuxMountSupported | Pod.SELinuxRelabelPolicy | mount opts | docker run -v | Note |
+|--------------|---------------------------------|--------------------------|------------|---------------|------|
+| iscsi + ext4 | * | Always | - | :Z | 1) |
+| | | | | | |
+| iscsi + ext4 | false / nil | OnVolumeMount | - | :Z | 2) |
+| iscsi + ext4 | true | OnVolumeMount | -o context | - | 3) |
+| | | | | | |
+| iscsi + ntfs | true | OnVolumeMount | -o context | - | 3) |
+| iscsi + ntfs | false / nil | OnVolumeMount | - | - | 4) |
+| iscsi + ntfs | * | Always | - | - | 5) |
+| | | | | | |
+| nfs | true | OnVolumeMount | -o context | - | 6) |
+| nfs | false / nil | OnVolumeMount | - | - | 7) |
+
+1) Using `:Z`, because `seclabel` was autodetected in mount options (ext4 supports SELinux).
+2) `OnVolumeMount` is ignored when `SELinuxMountSupported` is `false`.
+   While iscsi + ext4 supports `mount -o context`, either the cluster admin has not updated the CSIDriver yet (e.g. when upgrading from an older cluster) or has another reason for this.
+ Using `:Z`, because `seclabel` was autodetected in mount options.
+3) CSI driver supports `-o context` and pod asks for it.
+4) `OnVolumeMount` is ignored when `SELinuxMountSupported` is `false`.
+ Using no `:Z`, because `seclabel` was not detected in mount options (ntfs does not support SELinux).
+5) ntfs mount does not have `seclabel` option, so kubelet won’t pass `:Z` to CRI.
+
+NFS behaves largely like iscsi + ntfs; however, these two cases are interesting:
+
+6) Here CSI driver vendor says that all volumes are independent and `mount -o context` is safe. For example, when all volumes are separate NFS shares.
+7) CSI driver vendor explicitly declares that mount of a volume with context `A` may affect mounts of other volumes provided by this driver with different context. For example, when all the volumes are subdirectories of a single NFS share.
+
+### User Stories [optional]
+
+#### Story 1
+
+User does not configure anything special in their pods:
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+ name: testpod
+spec:
+ containers:
+ - image: nginx
+ name: nginx
+ volumeMounts:
+ - name: vol
+ mountPath: /mnt/test
+ volumes:
+ - name: vol
+ persistentVolumeClaim:
+ claimName: myclaim
+```
+
+No change from current Kubernetes behavior:
+
+1. Kubelet does not see any `SELinuxRelabelPolicy` configured in the pod, thus it mounts the `myclaim` PVC as usual, and if the underlying volume supports SELinux, it passes the volume to the container runtime with ":Z".
+   Kubelet also passes the implicit Secret volume with the service account token with ":Z".
+2. The container runtime allocates a new unique SELinux label for the pod and recursively relabels all volumes with ":Z" to this label.
+
+
+
+#### Story 2
+
+User (or something else, e.g. an admission webhook) configures SELinux label for a pod.
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+ name: testpod
+spec:
+ securityContext:
+ seLinuxOptions:
+ level: s0:c10,c0
+ containers:
+ - image: nginx
+ name: nginx
+ volumeMounts:
+ - name: vol
+ mountPath: /mnt/test
+ volumes:
+ - name: vol
+ persistentVolumeClaim:
+ claimName: myclaim
+```
+
+No change from current Kubernetes behavior:
+
+1. Kubelet does not see any `SELinuxRelabelPolicy` configured in the pod, thus it mounts the `myclaim` PVC as usual, and if the underlying volume supports SELinux, it passes the volume to the container runtime with ":Z".
+   Kubelet also passes the implicit Secret volume with the service account token with ":Z".
+2. The container runtime uses the SELinux label "s0:c10,c0", as instructed by Kubernetes. It recursively relabels all volumes with ":Z" to this label.
+
+#### Story 3
+
+User (or something else, e.g. an admission webhook) configures SELinux label for a pod.
+User chooses `SELinuxRelabelPolicy: "OnVolumeMount"`, because they expect a potentially large volume to be used by the pod.
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+ name: testpod
+spec:
+ securityContext:
+ seLinuxOptions:
+ level: s0:c10,c0
+ seLinuxRelabelPolicy: OnVolumeMount
+ containers:
+ - image: nginx
+ name: nginx
+ volumeMounts:
+ - name: vol
+ mountPath: /mnt/test
+ volumes:
+ - name: vol
+ persistentVolumeClaim:
+ claimName: myclaim
+```
+
+In this case, kubelet tries to mount all of the pod's volumes with the `-o context=s0:c10,c0` mount option.
+If it succeeds, it passes the volumes to the container runtime without ":Z" and the container runtime does not relabel them.
+See [New Kubernetes behavior](#new-kubernetes-behavior) for error cases.
+
+
+
+### Risks and Mitigations
+
+## Design Details
+
+### Test Plan
+
+* Unit tests:
+  * API validation (all permutations of missing / present `PodSecurityContext.SELinuxOptions`, `SELinuxRelabelPolicy` and `Container.SecurityContext.SELinuxOptions`).
+ * Passing mount options from kubelet to volume plugins.
+* E2e tests:
+  * Check that no recursive `chcon` is done on a volume when not needed.
+  * Check that a recursive `chcon` is done on a volume when needed (with a matrix of SELinuxOptions / SELinuxRelabelPolicy).
+* Prepare an e2e job that runs with SELinux in enforcing mode.
+
+### Graduation Criteria
+
+* Alpha:
+  * All tests defined above are implemented and passing; the feature is gated by the `SELinuxRelabelPolicy` feature gate, which defaults to `false`.
+  * Documentation exists.
+* Beta: based on discussions in SIG-Storage regarding the success of deployments. A metric will be added to report the time taken to relabel a volume. The feature gate `SELinuxRelabelPolicy` defaults to `true`.
+* GA: all known issues fixed.
+
+### Upgrade / Downgrade Strategy
+
+`SELinuxRelabelPolicy` becomes "invisible" or is dropped in a downgraded cluster. Container runtimes will get ":Z" on volumes and will do the slow recursive relabeling as they do today.
+
+### Version Skew Strategy
+
+## Production Readiness Review Questionnaire
+
+### Feature enablement and rollback
+
+_This section must be completed when targeting alpha to a release._
+
+* **How can this feature be enabled / disabled in a live cluster?**
+ - [X] Feature gate (also fill in values in `kep.yaml`)
+ - Feature gate name: SELinuxRelabelPolicy
+ - Components depending on the feature gate: apiserver (API validation only), kubelet
+ - [ ] Other
+ - Describe the mechanism:
+ - Will enabling / disabling the feature require downtime of the control
+ plane?
+ - Will enabling / disabling the feature require downtime or reprovisioning
+ of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled).
+
+* **Does enabling the feature change any default behavior?**
+ Any change of default behavior may be surprising to users or break existing
+ automations, so be extremely careful here.
+
+ No, default behavior is the same as before.
+
+* **Can the feature be disabled once it has been enabled (i.e. can we rollback
+ the enablement)?**
+ Also set `rollback-supported` to `true` or `false` in `kep.yaml`.
+ Describe the consequences on existing workloads (e.g. if this is runtime
+ feature, can it break the existing applications?).
+
+  Yes, it can be disabled / rolled back. The corresponding API fields are cleared or ignored and Kubernetes uses the previous SELinux label handling.
+
+* **What happens if we reenable the feature if it was previously rolled back?**
+
+ Nothing special happens.
+
+* **Are there any tests for feature enablement/disablement?**
+ The e2e framework does not currently support enabling and disabling feature
+ gates. However, unit tests in each component dealing with managing data created
+ with and without the feature are necessary. At the very least, think about
+ conversion tests if API types are being modified.
+
+  We plan unit tests with the feature enabled and disabled.
+
+### Rollout, Upgrade and Rollback Planning
+
+_This section must be completed when targeting beta graduation to a release._
+
+* **How can a rollout fail? Can it impact already running workloads?**
+ Try to be as paranoid as possible - e.g. what if some components will restart
+ in the middle of rollout?
+
+ Running workloads are not affected during rollout, because they don't use the new API fields.
+
+* **What specific metrics should inform a rollback?**
+
+* **Were upgrade and rollback tested? Was upgrade->downgrade->upgrade path tested?**
+ Describe manual testing that was done and the outcomes.
+  Longer term, we may want to require automated upgrade/rollback tests, but we
+  are missing a bunch of machinery and tooling and can't do that now.
+
+* **Is the rollout accompanied by any deprecations and/or removals of features,
+ APIs, fields of API types, flags, etc.?**
+ Even if applying deprecation policies, they may still surprise some users.
+
+### Monitoring requirements
+
+_This section must be completed when targeting beta graduation to a release._
+
+* **How can an operator determine if the feature is in use by workloads?**
+  Ideally, this should be a metric. Operations against the Kubernetes API (e.g.
+ checking if there are objects with field X set) may be last resort. Avoid
+ logs or events for this purpose.
+
+* **What are the SLIs (Service Level Indicators) an operator can use to
+ determine the health of the service?**
+ - [ ] Metrics
+ - Metric name:
+ - [Optional] Aggregation method:
+ - Components exposing the metric:
+ - [ ] Other (treat as last resort)
+ - Details:
+
+* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
+  At a high level this usually will be in the form of "high percentile of SLI
+  per day <= X". It's impossible to provide comprehensive guidance, but at the very
+  high level (these need more precise definitions) those may be things like:
+  - per-day percentage of API calls finishing with 5XX errors <= 1%
+  - 99th percentile over day of absolute value from (job creation time minus expected
+    job creation time) for cron job <= 10%
+  - 99.9% of /health requests per day finish with 200 code
+
+* **Are there any missing metrics that would be useful to have to improve
+  observability of this feature?**
+ Describe the metrics themselves and the reason they weren't added (e.g. cost,
+ implementation difficulties, etc.).
+
+### Dependencies
+
+_This section must be completed when targeting beta graduation to a release._
+
+* **Does this feature depend on any specific services running in the cluster?**
+ Think about both cluster-level services (e.g. metrics-server) as well
+ as node-level agents (e.g. specific version of CRI). Focus on external or
+ optional services that are needed. For example, if this feature depends on
+ a cloud provider API, or upon an external software-defined storage or network
+ control plane.
+
+  For each of these, fill in the following, thinking both about running user workloads
+ and creating new ones, as well as about cluster-level services (e.g. DNS):
+ - [Dependency name]
+ - Usage description:
+ - Impact of its outage on the feature:
+ - Impact of its degraded performance or high error rates on the feature:
+
+
+### Scalability
+
+_For alpha, this section is encouraged: reviewers should consider these questions
+and attempt to answer them._
+
+_For beta, this section is required: reviewers must answer these questions._
+
+_For GA, this section is required: approvers should be able to confirm the
+previous answers based on experience in the field._
+
+* **Will enabling / using this feature result in any new API calls?**
+ Describe them, providing:
+ - API call type (e.g. PATCH pods)
+ - estimated throughput
+ - originating component(s) (e.g. Kubelet, Feature-X-controller)
+ focusing mostly on:
+ - components listing and/or watching resources they didn't before
+ - API calls that may be triggered by changes of some Kubernetes resources
+ (e.g. update of object X triggers new updates of object Y)
+ - periodic API calls to reconcile state (e.g. periodic fetching state,
+ heartbeats, leader election, etc.)
+
+  No new API calls are required. Kubelet / the CSI volume plugin already has a CSIDriver informer.
+
+* **Will enabling / using this feature result in introducing new API types?**
+ Describe them providing:
+ - API type
+ - Supported number of objects per cluster
+ - Supported number of objects per namespace (for namespace-scoped objects)
+
+ No new API types.
+
+* **Will enabling / using this feature result in any new calls to cloud
+ provider?**
+
+ No new calls to cloud providers.
+
+* **Will enabling / using this feature result in increasing size or count
+ of the existing API objects?**
+ Describe them providing:
+ - API type(s):
+ - Estimated increase in size: (e.g. new annotation of size 32B)
+ - Estimated amount of new objects: (e.g. new Object X for every existing Pod)
+
+  CSIDriver gets one new field. We expect only a few CSIDriver objects in a cluster.
+
+ Pod gets one new field.
+
+* **Will enabling / using this feature result in increasing time taken by any
+ operations covered by [existing SLIs/SLOs][]?**
+ Think about adding additional work or introducing new steps in between
+ (e.g. need to do X to start a container), etc. Please describe the details.
+
+  Each CSI volume setup (mount) may introduce a mount check (for `seclabel`),
+  i.e. parsing the whole /proc/mounts. This should be OK, since we already do
+  such mount checks in most volume plugins.
+
+* **Will enabling / using this feature result in non-negligible increase of
+ resource usage (CPU, RAM, disk, IO, ...) in any components?**
+ Things to keep in mind include: additional in-memory state, additional
+ non-trivial computations, excessive access to disks (including increased log
+ volume), significant amount of data send and/or received over network, etc.
+  Think through this both in small and large cases, again with respect to the
+  [supported limits][].
+
+ No.
+
+### Troubleshooting
+
+Troubleshooting section serves the `Playbook` role as of now. We may consider
+splitting it into a dedicated `Playbook` document (potentially with some monitoring
+details). For now we leave it here though.
+
+_This section must be completed when targeting beta graduation to a release._
+
+* **How does this feature react if the API server and/or etcd is unavailable?**
+
+* **What are other known failure modes?**
+ For each of them fill in the following information by copying the below template:
+ - [Failure mode brief description]
+ - Detection: How can it be detected via metrics? Stated another way:
+    how can an operator troubleshoot without logging into a master or worker node?
+ - Mitigations: What can be done to stop the bleeding, especially for already
+ running user workloads?
+ - Diagnostics: What are the useful log messages and their required logging
+ levels that could help debugging the issue?
+ Not required until feature graduated to Beta.
+ - Testing: Are there any tests for failure mode? If not describe why.
+
+* **What steps should be taken if SLOs are not being met to determine the problem?**
+
+[supported limits]: https://git.k8s.io/community/sig-scalability/configs-and-limits/thresholds.md
+[existing SLIs/SLOs]: https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos
+
+## Infrastructure Needed (optional)
+
+
+
+
+
+## Implementation History
+
+* 1.19: Alpha
+
+## Drawbacks [optional]
+
+* This KEP changes the behavior of volumes shared by multiple pods, where each pod has a different SELinux label. See [Shared Volumes](#shared-volumes) for details.
+* The API is slightly different from `FSGroupChangePolicy`, which may create confusion.
+
+## Alternatives [optional]
+
+### `FSGroupChangePolicy` approach
+The same approach & API as in `FSGroupChangePolicy` can be used.
+**This is a viable option!**
+
+If kubelet knows the SELinux context that should be applied to a volume && a hypothetical `SELinuxChangePolicy` is `OnRootMismatch`, it would check the context only of the top-level directory of a volume and recursively `chcon` all files only when the top-level dir does not match (see the sketch below).
+This could be done together with the recursive change for `fsGroup`.
+Kubelet would not use ":Z" when passing the volume to the container runtime.
+
+With `SELinuxChangePolicy: Always`, the usual ":Z" is passed to the container runtime and it relabels all volumes recursively.
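+
+A sketch of such an `OnRootMismatch`-style check, assuming the go-selinux bindings (`relabelIfNeeded` is a hypothetical helper):
+
+```go
+import selinux "github.com/opencontainers/selinux/go-selinux"
+
+// relabelIfNeeded checks only the top-level directory of the volume and
+// relabels the whole volume recursively when the label does not match.
+func relabelIfNeeded(volumeRoot, wantLabel string) error {
+    got, err := selinux.FileLabel(volumeRoot)
+    if err != nil {
+        return err
+    }
+    if got == wantLabel {
+        return nil // top-level label matches, skip the recursive walk
+    }
+    return selinux.Chcon(volumeRoot, wantLabel, true /* recurse */)
+}
+```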
+
+Advantages:
+* Simplicity, both to users and implementation-wise. Follow `FSGroupChangePolicy` approach and do `chcon` instead of `chown`.
+
+Disadvantages:
+* Speed: Kubernetes must recursively `chcon` all files on a volume when the volume is used for the first time.
+  With `mount -o context`, no relabeling is needed.
+
+### Change container runtime
+
+We considered implementing something like `SELinuxChangePolicy: OnRootMismatch` in the container runtime.
+It would do the same as `PodFSGroupChangePolicy: OnRootMismatch` in [fsGroup KEP], however, in the container runtime.
+
+This approach cannot work because of `SubPath`.
+If a Pod uses a volume with SubPath, the container runtime gets only a subdirectory of the volume.
+It could check the top level of this subdirectory and recursively change the SELinux context there; however, this could leave different subdirectories of the volume with different SELinux labels, so checking only the top-level directory does not work.
+With the solution implemented in kubelet, we can always check the top-level directory of the whole volume and change the context of the whole volume too.
+
+
+### Move SELinux label management to kubelet
+Right now, it is the container runtime that assigns labels to containers that don't have any specific `SELinuxOptions`.
+We could move SELinux label assignment to kubelet.
+This change would require significant changes both in kubelet (to manage the contexts) and in CRI (to list the used contexts after kubelet restart).
+As a benefit, kubelet would mount volumes for *all* pods quickly, not only for those that have explicit `SELinuxOptions`.
+We are not sure if it's possible to change the default behavior to `OnVolumeMount` without any new field in `PodSecurityContext`.
+
+### Merge `FSGroupChangePolicy` and `SELinuxRelabelPolicy`
+With this API, user could ask for any shortcuts that are available regarding SELinux relabeling and ownership change for FSGroup:
+
+```go
+const (
+    // The heuristic policy acts like setting both the OnVolumeMount policy and the OnRootMismatch policy.
+    HeuristicVolumeChangePolicy VolumeChangePolicy = "Heuristic"
+    // The recursive policy acts like setting both the Always policies.
+    RecursiveVolumeChangePolicy VolumeChangePolicy = "Recursive"
+)
+
+type PodSecurityContext struct {
+ ...
+ VolumeChangePolicy *VolumeChangePolicy
+ ...
+}
+```
+
+In the vast majority of cases this is what users want.
+
+However, this field is not flexible enough to accommodate special cases.
+If supported by the storage backend and the volume is consumed as a whole, `SELinuxRelabelPolicy: OnVolumeMount` always works.
+At the same time, `FSGroupChangePolicy: OnRootMismatch` may not be desirable for volumes that are modified outside of Kubernetes,
+where various files on the volume may get random owners.
+
+With a single `VolumeChangePolicy`, the user has to fall back to the `Recursive` policy and SELinux labels would be changed unnecessarily.
diff --git a/keps/sig-storage/1710-selinux-relabeling/kep.yaml b/keps/sig-storage/1710-selinux-relabeling/kep.yaml
new file mode 100644
index 00000000000..a54c8c42f13
--- /dev/null
+++ b/keps/sig-storage/1710-selinux-relabeling/kep.yaml
@@ -0,0 +1,34 @@
+title: Skip SELinux relabeling of volumes
+kep-number: 1710
+authors:
+ - "@jsafrane"
+owning-sig: sig-storage
+participating-sigs:
+ - sig-node
+status: implementable
+creation-date: 2020-02-18
+reviewers:
+ - "@msau42"
+ - "@gnufied"
+ - "@rhatdan"
+ - "@haircommander"
+ - "@saschagrunert"
+ - "@tallclair"
+approvers:
+ - "@saad-ali"
+see-also:
+ - /keps/sig-storage/20200120-skip-permission-change.md
+stage: alpha
+latest-milestone: "v1.19"
+milestone:
+ alpha: "v1.19"
+ beta: "v1.20"
+ stable: "v1.22"
+feature-gate:
+ name: SELinuxRelabelPolicy
+ components:
+ - kube-apiserver
+ - kubelet
+disable-supported: true
+metrics:
+ # TODO: fill at beta