diff --git a/keps/draft-20180426-namespace-template.md b/keps/draft-20180426-namespace-template.md
new file mode 100644
index 00000000000..0f0c25fd91b
--- /dev/null
+++ b/keps/draft-20180426-namespace-template.md
@@ -0,0 +1,513 @@
+---
+kep-number: draft-20180426
+title: Namespace Population
+authors:
+  - "@easeway"
+owning-sig: sig-auth
+participating-sigs:
+  - sig-api-machinery
+  - sig-cluster-lifecycle
+reviewers:
+  - "@davidopp"
+  - "@tallclair"
+  - "@ericchiang"
+  - "@liggitt"
+  - "@roberthbailey"
+approvers:
+  - "@tallclair"
+  - "@ericchiang"
+  - "@liggitt"
+  - "@roberthbailey"
+editor: "@easeway"
+creation-date: 2018-04-26
+status: provisional
+---
+
+# Namespace Population
+
+## Table of Contents
+
+* [Table of Contents](#table-of-contents)
+* [Summary](#summary)
+* [Motivation](#motivation)
+    * [Goals](#goals)
+    * [Non-Goals](#non-goals)
+* [Proposal](#proposal)
+    * [User Stories](#user-stories)
+        * [Security Defaults](#security-defaults)
+    * [Implementation Details](#implementation-details)
+        * [The Controller](#the-controller)
+        * [Namespace Match](#namespace-match)
+        * [Apply the template](#apply-the-template)
+        * [Schema validation](#schema-validation)
+        * [Opt-out](#opt-out)
+        * [Self-serviced namespace creation](#self-serviced-namespace-creation)
+        * [Readiness of a namespace](#readiness-of-a-namespace)
+    * [Security Consideration](#security-consideration)
+        * [Opt-out is privileged](#opt-out-is-privileged)
+        * [Race Condition](#race-condition)
+        * [Manual Update of objects](#manual-update-of-objects)
+* [Graduation Criteria](#graduation-criteria)
+
+## Summary
+
+Namespace Population is an automated mechanism to make sure the predefined policy objects
+(e.g. NetworkPolicy, Role, RoleBinding) are present in selected namespaces.
+
+## Motivation
+
+Kubernetes users create namespaces (self-service namespace creation) and want to
+ensure security/isolation policies are enforced between them and other users.
+Just as the addon manager creates cluster-scope policy objects such as PodSecurityPolicy,
+ClusterRole and ClusterRoleBinding at cluster creation time,
+this mechanism creates namespace-scope policy objects at namespace creation time.
+
+### Goals
+
+The namespace population mechanism described in this proposal is effective for
+populating identical configurations into a large number of namespaces.
+It's also effective for populating different sets of predefined configurations into
+different sets of namespaces.
+The mechanism is agnostic to object types as long as they are namespace scoped.
+The mechanism works in a single cluster.
+
+### Non-Goals
+
+The namespace population mechanism is NOT effective for customizing namespaces
+individually.
+Dynamic objects (generating the object definition during population,
+e.g. by string substitution) are not supported: all objects are defined statically,
+except for the `$(CREATOR)` and `$(NAMESPACE)` placeholders.
+No validation is performed over static object definitions until they are applied.
+The mechanism does NOT work across clusters.
+
+## Proposal
+
+Namespace population is performed by defining one or more cluster-scope
+Custom Resources of kind `NamespaceTemplate`
+and deploying a controller in the cluster.
+
+The Custom Resource Definition of `NamespaceTemplate`:
+
+```yaml
+apiVersion: apiextensions.k8s.io/v1beta1
+kind: CustomResourceDefinition
+metadata:
+  name: namespacetemplates.policy
+spec:
+  group: policy
+  version: v1alpha1
+  scope: Cluster
+  names:
+    plural: namespacetemplates
+    singular: namespacetemplate
+    kind: NamespaceTemplate
+    shortNames:
+      - nstpl
+```
+
+An example of `NamespaceTemplate`:
+
+```yaml
+apiVersion: policy/v1alpha1
+kind: NamespaceTemplate
+metadata:
+  name: default
+spec:
+  namespaces:
+    # labelSelector selects namespaces by labels to populate defined objects
+    labelSelector: {}
+    # excludes lists names of namespaces which must be excluded from being applied
+    # with current namespace template.
+    excludes: []
+  common:
+    # labels to be injected into all objects defined in templates
+    labels: {}
+    # annotations to be injected into all objects defined in templates
+    annotations: {}
+  templates:
+    # list of objects to be populated into namespaces defined here
+    - apiVersion: networking.k8s.io/v1
+      kind: NetworkPolicy
+      metadata:
+        name: default
+      spec:
+        podSelector: {}
+        policyTypes: ["Ingress", "Egress"]
+    - apiVersion: rbac.authorization.k8s.io/v1
+      kind: RoleBinding
+      metadata:
+        name: use-podsecuritypolicy
+      roleRef:
+        apiGroup: rbac.authorization.k8s.io
+        kind: ClusterRole
+        name: use-psp-default
+      subjects:
+        - apiGroup: rbac.authorization.k8s.io
+          kind: Group
+          name: system:serviceaccounts
+    - apiVersion: rbac.authorization.k8s.io/v1
+      kind: RoleBinding
+      metadata:
+        name: creator
+      roleRef:
+        apiGroup: rbac.authorization.k8s.io
+        kind: ClusterRole
+        name: admin
+      subjects:
+        - apiGroup: rbac.authorization.k8s.io
+          kind: User
+          name: '$(CREATOR)'
+```
+
+### User Stories
+
+#### Security Defaults
+
+The cluster administrator manages a shared Kubernetes cluster and allows
+developers to create namespaces by themselves (self-service), but with predefined
+NetworkPolicy, RoleBinding using PodSecurityPolicy, ResourceQuota and a few
+ServiceAccounts and Roles present in the namespace as security defaults.
+
+The cluster administrator creates a `NamespaceTemplate` with all these objects
+in `templates`.
+With the help of the `NamespaceTemplate` enforcement controller, these objects are
+created automatically when a user creates a new namespace.
+
+### Implementation Details
+
+#### The Controller
+
+A controller is deployed in the cluster (recommended in the `kube-system` namespace).
+It runs with a service account with privilege to
+
+- get/list/watch Namespace objects
+- get/list/watch NamespaceTemplate objects
+- get/list/create/update/patch/delete namespace-scope objects in all namespaces, except `kube-system`
+
+The controller watches Namespace and NamespaceTemplate objects.
+When a namespace is added/updated, it applies the objects in namespace templates matching the namespace;
+when a namespace template is added/updated, it applies objects in the template to all matching namespaces.
+
+#### Namespace Match
+
+The `NamespaceTemplate` uses `labelSelector` and `excludes` in the `namespaces` property to select namespaces.
+The selection process first uses `labelSelector`, which is the common mechanism used in Kubernetes,
+to filter eligible namespaces, and then uses `excludes` to further filter the result.
+
+The property `excludes` is a list, with each item identifying the name of a namespace.
+Wildcards (`*` and `?`) are allowed, and are handled the same way as when matching a file name.
+The reason to use `excludes` as complementary to `labelSelector` is that `labelSelector` only works with labels,
+not names.
+Sometimes it's very difficult to select/de-select a namespace if there are no labels.
+
+An alternative is to match namespaces based on the user who creates the namespace.
+This can be consistent with RBAC, which defines the permission between users and the `NamespaceTemplate` that
+a user can apply to the namespaces they create:
+
+```yaml
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: use-namespace-template-default
+rules:
+  - apiGroups: ["policy"]
+    resources: ["namespacetemplates"]
+    resourceNames: ["default"]
+    verbs: ["use"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: namespace-template-usage
+roleRef:
+  kind: ClusterRole
+  name: use-namespace-template-default
+  apiGroup: rbac.authorization.k8s.io
+subjects:
+- kind: User
+  name: "user1"
+  apiGroup: rbac.authorization.k8s.io
+```
+
+This approach changes the namespace defaulting across the cluster: by default (without RBAC), nothing will be populated.
+Both approaches can exist at the same time, with an annotation on the `NamespaceTemplate` deciding which approach to use.
+
+```
+policy/namespace-template-activation=rbac
+```
+
+This annotation selects the RBAC based approach.
+
+MVP scope only covers the `labelSelector` based approach.
+
+#### Apply the template
+
+The list of `templates` in `NamespaceTemplate` is extracted and concatenated as multi-document YAML
+(with a `---` line separating two objects).
+The labels/annotations from `common` are injected during this process, together with
+
+```
+policy/namespace-template-name=
+```
+
+which indicates the namespace template that created the object.
+
+Then `kubectl apply` is used to apply the template.
+The reason to use `kubectl` is that the apply logic (3-way merge) is very complicated,
+and is currently performed by `kubectl` (not in the API server).
+Once the logic is available in the API server, the APIs will be used directly.
+The `--prune` option is used together with `-l` to clean up old objects that are no longer defined when the namespace template changes.
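+
+As an illustration of the rendering step, the sketch below shows what the controller might feed to
+`kubectl apply` for the first object of the example template above.
+It assumes a hypothetical target namespace `team-a`, assumes the template name marker is injected as a label
+(since `-l` selects on labels), and assumes `metadata.namespace` is set on each object during rendering.
+None of these details are fixed by this proposal.
+
+```yaml
+# Hypothetical rendered output for NamespaceTemplate "default" in namespace "team-a"
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+  name: default
+  namespace: team-a                            # assumed: set during rendering
+  labels:
+    policy/namespace-template-name: default    # assumed: injected as a label
+spec:
+  podSelector: {}
+  policyTypes: ["Ingress", "Egress"]
+```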
+
+When multiple templates match the same namespace,
+the templates are concatenated in the alphabetic order of the name of `NamespaceTemplate`.
+
+A `NamespaceTemplate` can be disabled by an explicit annotation:
+
+```
+policy/namespace-template-apply=disable
+```
+
+#### Schema validation
+
+As `NamespaceTemplate` is defined as a CRD (Custom Resource Definition),
+currently there's no effective way to validate the schema of a CR (Custom Resource)
+if it contains complex types or encapsulates existing object types.
+This problem is out-of-scope here.
+As a result, the `templates` are not validated during the creation of `NamespaceTemplate`,
+and population will fail (via `kubectl apply`) if anything is written incorrectly.
+The failure will be reported as Kubernetes Events in the target namespace.
+
+An alternative is to create a template namespace,
+and put all these objects into the namespace as real resources.
+When a new namespace is created, the objects in the template namespace are copied over.
+As real objects are created, they are effective, and may cause side effects,
+though for the known object types there is no impact from these objects being effective.
+
+There are other alternatives in which the objects are defined somewhere other than in `NamespaceTemplate`.
+The implementation can be pluggable (e.g. adding a `source` property) to support different sources of templates,
+while the initial proposal focuses on inlined objects.
+
+The MVP scope covers inlined templates only.
+
+#### Opt-out
+
+The cluster administrator is able to opt some namespaces out of being populated by the controller.
+`kube-system` is always opted out.
+For other namespaces, there are two options to opt out of automated population:
+
+1. Put a label that is excluded in the `labelSelector` of the `NamespaceTemplate`
+2. Annotate the namespace with `policy/namespace-template-opt-out=true`
+
+Note: the objects previously populated by the controller will be left as-is in the namespace when it's opted out.
+
+Normally, there's no _exclusive_ condition in `labelSelector` of the `NamespaceTemplate`, so option 2 is recommended.
+
+Some ephemeral namespaces created by privileged controllers may need to be opted out
+as they are not accessible by users and will be managed completely by the controllers.
+As these controllers don't have knowledge about `NamespaceTemplate`,
+or it's almost impossible to customize how they create namespaces,
+an automated opt-out mechanism is required.
+The mechanism is based on the service account running the controllers.
+With a `MutatingAdmissionWebhook`,
+a namespace created by these controllers will be automatically annotated for opt-out.
+This mechanism is out of MVP scope.
+
+#### Self-serviced namespace creation
+
+Self-serviced namespace creation allows certain users/groups to create namespaces
+by themselves using the Kubernetes API (or kubectl) directly.
+As only the cluster admin has the privilege to create namespaces by default, an extra
+`ClusterRole` is created to allow the _create_ verb on namespaces:
+
+```yaml
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: namespace-creator
+rules:
+  - apiGroups: [""]
+    resources: ["namespaces"]
+    verbs: ["create"]
+```
+
+After the namespace is created,
+the creator should be granted further permissions to manage the namespace.
+The following objects are recommended to be included in the `templates` of `NamespaceTemplate`:
+
+```yaml
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: namespace-owner
+rules:
+  - apiGroups: [""]
+    resources: ["namespaces"]
+    resourceNames: ["$(NAMESPACE)"]
+    verbs: ["get", "list", "watch", "update", "patch", "delete"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: admin
+roleRef:
+  kind: ClusterRole
+  name: admin
+  apiGroup: rbac.authorization.k8s.io
+subjects:
+- kind: User
+  name: "$(CREATOR)"
+  apiGroup: rbac.authorization.k8s.io
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: namespace-owner
+roleRef:
+  kind: Role
+  name: namespace-owner
+  apiGroup: rbac.authorization.k8s.io
+subjects:
+- kind: User
+  name: "$(CREATOR)"
+  apiGroup: rbac.authorization.k8s.io
+```
+
+One issue is that when the controller is notified to create objects inside a new
+namespace from `NamespaceTemplate`, it has no idea who created the namespace.
+To be able to substitute `$(CREATOR)`, a `MutatingAdmissionWebhook` is involved in the namespace
+creation request.
+It adds an annotation `policy/namespace-creator=name` to the namespace being created.
+This annotation is immutable: further update/patch requests cannot change it.
+
+#### Readiness of a namespace
+
+A newly created namespace is not ready before all objects in `NamespaceTemplate`
+are fully populated.
+To determine the readiness of a namespace, simply inspect `phase` in the `status`
+subresource of the namespace. It's ready when the `phase` is `Active`.
+Alternatively, the `NamespaceTemplate` controller is registered as an _Initialization Controller_
+and uses `Initializer` admission control to hide the namespace before it's fully populated.
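+
+For illustration, a namespace created by a hypothetical user `alice` might look like the sketch below
+after the webhook has added the creator annotation.
+The namespace name `team-a` and the user name are assumptions made for this example only.
+
+```yaml
+# Hypothetical namespace after creation by user "alice"
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: team-a
+  annotations:
+    policy/namespace-creator: alice   # added by the MutatingAdmissionWebhook
+status:
+  phase: Active                       # the field inspected to determine readiness
+```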
+
+### Security Consideration
+
+#### Opt-out is privileged
+
+A user who has the permission to create a namespace can add the opt-out annotation,
+though with proper RBAC setup,
+the user may not have further permissions in that namespace if it is not populated.
+The situation can be handled in a better way with a `ValidatingAdmissionWebhook` checking
+a specific RBAC binding:
+
+```yaml
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: namespace-template-optout
+rules:
+  - apiGroups: ["policy"]
+    resources: ["namespacetemplates"]
+    verbs: ["optout"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: namespace-template-optout
+roleRef:
+  kind: ClusterRole
+  name: namespace-template-optout
+  apiGroup: rbac.authorization.k8s.io
+subjects:
+- kind: User
+  name: "permitted-user"
+  apiGroup: rbac.authorization.k8s.io
+```
+
+The webhook will fail the request if the opt-out annotation is present without the RBAC binding for the requestor.
+
+#### Race Condition
+
+When a namespace is created, there's a gap before all objects in the namespace
+templates are populated.
+During the gap, users may have access to create objects in the namespace.
+
+To avoid the situation, a few mechanisms can be used:
+
+##### Creating namespace through service
+
+Users don't have permission to create namespaces directly using the Kubernetes API.
+The namespace creation is performed using a custom service which runs under a
+service account with enough privilege to create namespaces and objects inside namespaces.
+The service will also grant the current user privilege to manage the namespace.
+
+This approach requires additional development effort and introduces a non-standard way
+of creating namespaces.
+
+##### Setup RBAC properly
+
+See above [Self-serviced namespace creation](#self-serviced-namespace-creation)
+for details of RBAC setup.
+
+This approach requires pre-setup work from the cluster admin: a ClusterRoleBinding to grant
+users/groups the privilege to create namespaces.
+Beyond that, it is the simplest and recommended approach.
+
+##### Use of Initializers
+
+The _NamespaceTemplate_ enforcement controller becomes a namespace initializer.
+A newly created namespace stays _uninitialized_ until the controller creates
+all predefined objects.
+
+However, the Initializers admission control only hides the namespace without
+blocking actual access to the namespace.
+As long as a user knows the name of the namespace, it's still possible to create
+objects inside.
+
+##### Use of ValidatingAdmissionWebhook
+
+An external _ValidatingAdmissionWebhook_ will block all create/update/patch/delete
+operations in an _uninitialized_ namespace, except those from the _NamespaceTemplate_
+enforcement controller (by whitelisting the service account running the controller).
+
+The webhook works similarly to _Initializers_.
+The service account running the controller for initialization must be whitelisted.
+The recommended mechanism for whitelisting is using RBAC:
+
+```yaml
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: namespace-initialization
+rules:
+  - apiGroups: [""]
+    resources: ["namespaces"]
+    verbs: ["initialize"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: namespace-initializer
+roleRef:
+  kind: ClusterRole
+  name: namespace-initialization
+  apiGroup: rbac.authorization.k8s.io
+subjects:
+# ServiceAccount subjects take a bare name plus a namespace, and no RBAC apiGroup;
+# this is the account known to the API server as system:serviceaccount:kube-system:name
+- kind: ServiceAccount
+  name: "name"
+  namespace: kube-system
+```
+
+Using an external admission webhook increases complexity as well as latency.
+As the webhook must be reliable all the time, it must be deployed in HA.
+It's recommended to create a built-in admission controller for namespace initialization.
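+
+For reference, registering such an external webhook could look roughly like the sketch below.
+The configuration name, webhook name, service name and path are hypothetical,
+and the `v1beta1` API version is chosen only to match the other manifests in this document
+(the later `admissionregistration.k8s.io/v1` API additionally requires `sideEffects` and `admissionReviewVersions`).
+
+```yaml
+# Hypothetical registration of the namespace initialization webhook
+apiVersion: admissionregistration.k8s.io/v1beta1
+kind: ValidatingWebhookConfiguration
+metadata:
+  name: namespace-initialization-guard
+webhooks:
+  - name: namespace-initialization.policy.example.com
+    failurePolicy: Fail               # reject requests when the webhook is unavailable
+    clientConfig:
+      service:
+        namespace: kube-system
+        name: namespace-initialization-guard
+        path: /validate
+      # caBundle omitted in this sketch
+    rules:
+      # patch requests reach admission as UPDATE operations
+      - operations: ["CREATE", "UPDATE", "DELETE"]
+        apiGroups: ["*"]
+        apiVersions: ["*"]
+        resources: ["*"]
+```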
+
+#### Manual Update of objects
+
+The objects created by namespace population may be altered manually or through
+other ways.
+The recommended way to mitigate the risk is setting up RBAC properly.
+The proposal doesn't watch updates of objects already created.
+An update may be reverted when a template re-apply is triggered (by a namespace change or template change).
+
+## Graduation Criteria
+
+Namespaces can be securely populated.