Commit

Address comments
liggitt committed May 4, 2017
1 parent 11b89bf commit 97e839b
Showing 1 changed file with 82 additions and 32 deletions.
114 changes: 82 additions & 32 deletions contributors/design-proposals/kubelet-authorizer.md
@@ -4,22 +4,54 @@ Author: Jordan Liggitt ([email protected])

## Overview

Kubelets are responsible for:
Kubelets are primarily responsible for:
* creating and updating status of their Node API object
* updating status of Pod API objects bound to their node
* running and updating status of Pod API objects bound to their node
* creating/deleting "mirror pod" API objects for statically-defined pods running on their node
* reading Secret and ConfigMap objects referenced by pod specs bound to their node

Currently, kubelets have read/write access to all Node and Pod objects, and read access to all Secret and ConfigMap objects.
This means that compromising a node gives access to credentials with power to make escalating API calls that affect other nodes.
To run a pod, a kubelet must have read access to the following objects referenced by the pod spec:
* Secrets
* ConfigMaps
* PersistentVolumeClaims (and any bound PersistentVolume or referenced StorageClass object)

This proposal limits kubelets' API access using a new node authorizer and admission plugin:
As of 1.6, kubelets have read/write access to all Node and Pod objects, and
read access to all Secret, ConfigMap, PersistentVolumeClaim, and PersistentVolume objects.
This means that compromising a node gives access to credentials that allow modifying other nodes,
pods belonging to other nodes, and accessing confidential data unrelated to the node's pods.

This document proposes limiting a kubelet's API access using a new node authorizer and admission plugin:
* Node authorizer
* Authorizes requests from nodes using the existing policy rules developed for the `system:node` cluster role
* Further restricts secret and configmap requests to only authorize reading objects referenced by pods bound to the node
* Authorizes requests from nodes using a fixed policy identical to the default RBAC `system:node` cluster role
* Further restricts secret and configmap access to only allow reading objects referenced by pods bound to the node making the request
* Node admission
* Limits a node to mutating its own Node API object
* Limits a node to mutating pods bound to itself
* Limits nodes to only be able to mutate their own Node API object
* Limits nodes to only be able to mutate pods bound to themselves
* Limits nodes to only be able to create mirror pods
* Prevents creating mirror pods that reference API objects (secrets, configmaps, persistent volume claims)
* Prevents creating mirror pods that are not bound to nodes
* Prevents removing mirror pod annotations

## Alternatives considered

**Can this just be enforced by authorization?**

Authorization does not have access to request bodies (or the existing object, for update requests),
so it could not restrict access based on fields in the incoming or existing object.

**Can this just be enforced by admission?**

Admission is only called for mutating requests, so it could not restrict read access.

**Can an existing authorizer be used?**

Only one authorizer (RBAC) has in-tree support for dynamically programmable policy.

Manifesting RBAC policy rules to give each node access to individual objects within namespaces
would require large numbers of frequently-modified roles and rolebindings, resulting in
significant write-multiplication.

Additionally, not all clusters will use RBAC, but all useful clusters will have nodes.
A node-specific authorizer allows cluster admins to continue to use their authorization mode of choice.

## Node identification

@@ -42,10 +74,12 @@ The default `NodeIdentifier` implementation:
* `isNode` - true if the user groups contain the `system:nodes` group
* `nodeName` - populated if `isNode` is true, and the user name is in the format `system:node:<nodeName>`

This group and user name format match the identity created for each kubelet as part of [kubelet TLS bootstrapping](https://kubernetes.io/docs/admin/kubelet-tls-bootstrapping/).
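
The identification rules above can be sketched in Go as follows. This is a minimal illustration, not the proposal's actual implementation: the `userInfo` type and the `identifyNode` function name are hypothetical, and only the group name `system:nodes` and the user name format `system:node:<nodeName>` come from the text.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	nodesGroup     = "system:nodes" // group named in the proposal
	nodeUserPrefix = "system:node:" // user name format named in the proposal
)

// userInfo is a hypothetical stand-in for the authenticated user of a request.
type userInfo struct {
	Name   string
	Groups []string
}

// identifyNode returns isNode=true if the user is in the system:nodes group.
// nodeName is populated only if isNode is true and the user name has the
// form system:node:<nodeName>.
func identifyNode(u userInfo) (nodeName string, isNode bool) {
	for _, g := range u.Groups {
		if g == nodesGroup {
			isNode = true
			break
		}
	}
	if !isNode {
		return "", false
	}
	if strings.HasPrefix(u.Name, nodeUserPrefix) {
		nodeName = strings.TrimPrefix(u.Name, nodeUserPrefix)
	}
	return nodeName, true
}

func main() {
	users := []userInfo{
		{Name: "system:node:node-1", Groups: []string{"system:nodes"}},
		{Name: "kubelet", Groups: []string{"system:nodes"}},
		{Name: "system:node:node-1", Groups: []string{"system:authenticated"}},
	}
	for _, u := range users {
		name, isNode := identifyNode(u)
		fmt.Printf("user=%q isNode=%t nodeName=%q\n", u.Name, isNode, name)
	}
}
```

The second user in `main` corresponds to the "undifferentiated usernames" case discussed under migration considerations: it is recognized as a node, but no node name is available.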

## Node authorizer

A new node authorizer will be inserted into the authorization chain:
* API server authorizer (authorizes "loopback" API clients used by components within the API server)
* API server authorizer (existing, authorizes "loopback" API clients used by components within the API server)
* Node authorizer (new)
* User-configured authorizers... (e.g. ABAC, RBAC, Webhook)

@@ -64,56 +98,72 @@ Requests from identifiable nodes (`IdentifyNode()` returns nodeName != "") for s
* Requests for configmaps are limited to `get`, and the requested configmap must be related to the requesting node by one of the following relationships:
* node -> pod -> configmap

Requests that do not meet those conditions are forbidden by this authorizer.
Subsequent authorizers in the chain can run and choose to allow the request.
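
As a rough sketch of how this could fit together, the Go example below models an authorization chain in which the first authorizer to allow a request wins, and a node authorizer that allows a configmap `get` only when the configmap is reachable via node -> pod -> configmap. All types and names here (`request`, `nodeAuthorizer`, `configMapsByNode`, `authorizeChain`) are hypothetical, the pod layer of the graph is flattened into a per-node index for brevity, and the handling of other resources under the `system:node` rules is omitted.

```go
package main

import "fmt"

// objectRef identifies a namespaced object.
type objectRef struct{ Namespace, Name string }

// request is a hypothetical, simplified view of request attributes.
type request struct {
	NodeName  string // result of node identification; "" if not identifiable
	Verb      string
	Resource  string
	Namespace string
	Name      string
}

// authorizer returns true to allow; false means "not allowed by me".
type authorizer interface{ authorize(r request) bool }

type authorizerFunc func(r request) bool

func (f authorizerFunc) authorize(r request) bool { return f(r) }

// nodeAuthorizer holds a hypothetical index: node name -> configmaps
// referenced by pods bound to that node.
type nodeAuthorizer struct {
	configMapsByNode map[string]map[objectRef]bool
}

func (a *nodeAuthorizer) authorize(r request) bool {
	if r.NodeName == "" || r.Resource != "configmaps" {
		return false // this sketch only covers configmap requests from identifiable nodes
	}
	if r.Verb != "get" {
		return false // configmap requests are limited to get
	}
	ref := objectRef{Namespace: r.Namespace, Name: r.Name}
	return a.configMapsByNode[r.NodeName][ref] // missing entries read as false
}

// authorizeChain allows the request if any authorizer in order allows it.
func authorizeChain(chain []authorizer, r request) bool {
	for _, a := range chain {
		if a.authorize(r) {
			return true
		}
	}
	return false
}

func main() {
	nodeAuthz := &nodeAuthorizer{configMapsByNode: map[string]map[objectRef]bool{
		"node-1": {objectRef{Namespace: "default", Name: "app-config"}: true},
	}}
	// A stand-in for a user-configured authorizer later in the chain.
	later := authorizerFunc(func(r request) bool { return r.Resource == "events" })
	chain := []authorizer{nodeAuthz, later}

	reqs := []request{
		{NodeName: "node-1", Verb: "get", Resource: "configmaps", Namespace: "default", Name: "app-config"},
		{NodeName: "node-2", Verb: "get", Resource: "configmaps", Namespace: "default", Name: "app-config"},
		{NodeName: "node-1", Verb: "list", Resource: "configmaps", Namespace: "default"},
		{NodeName: "node-2", Verb: "create", Resource: "events", Namespace: "default"},
	}
	for _, r := range reqs {
		fmt.Printf("%s %s %s/%s by %s: allowed=%t\n",
			r.Verb, r.Resource, r.Namespace, r.Name, r.NodeName, authorizeChain(chain, r))
	}
}
```

The last request shows the second sentence above in action: the node authorizer does not allow it, but a subsequent authorizer in the chain does.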

## Node admission

A new node admission plugin is made available that enforces the following:

Limits `create` of node resources by identifiable nodes:
* only allow the node object corresponding to the node making the API request

Limits `create` of pod resources by identifiable nodes:
* only allow pods with mirror pod annotations
* only allow pods with nodeName set to the node making the API request
* only allow pods with no secret references

Limits `update`,`patch`,`delete` of node and nodes/status resources by identifiable nodes:
* only allow modifying the node object corresponding to the node making the API request

Limits `update`,`patch`,`delete` of pod and pod/status resources by identifiable nodes:
* only allow modifying pods with nodeName set to the node making the API request (requires fetching the pod on delete)
* do not allow removing a mirror pod annotation
For requests made by identifiable nodes:
* Limits `create` of node resources:
* only allow the node object corresponding to the node making the API request
* Limits `create` of pod resources:
* only allow pods with mirror pod annotations
* only allow pods with nodeName set to the node making the API request
* Limits `update`,`delete` of node and nodes/status resources:
* only allow modifying the node object corresponding to the node making the API request
* Limits `update`,`delete` of pod and pods/status resources:
* only allow modifying pods with nodeName set to the node making the API request (requires fetching the pod on delete)

For requests made by any user:
* Limits `create` of pod resources with mirror pod annotations:
* Must specify a nodeName
* Must not reference any secrets, serviceaccounts, configmaps, or persistentvolumeclaims
* Limits `update` of pod resources with mirror pod annotations:
* Must not modify the mirror pod annotation
* Must not modify the nodeName
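
The pod `create` rules above could be sketched as a single admission check, shown below in Go. This is an illustrative assumption rather than the plugin's real code: the `pod` type, the `ReferencesAPIObjects` field, and the `admitPodCreate` function are hypothetical, and the annotation key `kubernetes.io/config.mirror` is not named in this document (it is the key kubelets use to mark mirror pods).

```go
package main

import (
	"errors"
	"fmt"
)

// mirrorAnnotation is assumed here; the proposal only says "mirror pod annotation".
const mirrorAnnotation = "kubernetes.io/config.mirror"

// pod is a hypothetical, simplified pod representation.
type pod struct {
	Annotations map[string]string
	NodeName    string
	// ReferencesAPIObjects is true if the spec references any secrets,
	// serviceaccounts, configmaps, or persistentvolumeclaims.
	ReferencesAPIObjects bool
}

// admitPodCreate applies the create rules. requestingNode is the identified
// node name, or "" if the requester is not an identifiable node.
func admitPodCreate(requestingNode string, p pod) error {
	_, isMirror := p.Annotations[mirrorAnnotation]

	// Rules for requests made by identifiable nodes.
	if requestingNode != "" {
		if !isMirror {
			return errors.New("nodes may only create mirror pods")
		}
		if p.NodeName != requestingNode {
			return fmt.Errorf("node %q may only create pods bound to itself", requestingNode)
		}
	}

	// Rules for mirror pods created by any user.
	if isMirror {
		if p.NodeName == "" {
			return errors.New("mirror pods must specify a nodeName")
		}
		if p.ReferencesAPIObjects {
			return errors.New("mirror pods must not reference API objects")
		}
	}
	return nil
}

func main() {
	mirror := map[string]string{mirrorAnnotation: "abc123"}
	cases := []struct {
		desc string
		node string
		p    pod
	}{
		{"node creates own mirror pod", "node-1", pod{Annotations: mirror, NodeName: "node-1"}},
		{"node creates non-mirror pod", "node-1", pod{NodeName: "node-1"}},
		{"node creates mirror pod for another node", "node-1", pod{Annotations: mirror, NodeName: "node-2"}},
		{"user creates mirror pod referencing a secret", "", pod{Annotations: mirror, NodeName: "node-1", ReferencesAPIObjects: true}},
		{"user creates ordinary pod", "", pod{ReferencesAPIObjects: true}},
	}
	for _, c := range cases {
		fmt.Printf("%s: err=%v\n", c.desc, admitPodCreate(c.node, c.p))
	}
}
```

Splitting the check into a node-specific part and an any-user part mirrors the two groups of rules above: the mirror pod restrictions apply regardless of who makes the request.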

## API Changes

None

## RBAC Changes

Currently, the `system:node` cluster role is automatically bound to the `system:nodes` group.
As of 1.6, the `system:node` cluster role is automatically bound to the `system:nodes` group when using RBAC.

Because the node authorizer accomplishes the same purpose, with the benefit of additional restrictions
on secret and configmap access, this binding is no longer needed, and will no longer be set up automatically.

The `system:node` cluster role will continue to be created, for compatibility with deployment
methods that bind other users or groups to that role.
The `system:node` cluster role will continue to be created when using RBAC,
for compatibility with deployment methods that bind other users or groups to that role.

## Migration considerations

### Kubelets outside the `system:nodes` group

Kubelets outside the `system:nodes` group would not be authorized by the node authorizer,
and would need to continue to be authorized via whatever mechanism currently authorizes them.
The node admission plugin would also ignore requests from these kubelets.
The node admission plugin would not restrict requests from these kubelets.

### Kubelets with undifferentiated usernames

In some deployments, kubelets have credentials that place them in the `system:nodes` group,
but do not identify the particular node they are associated with.
Those kubelets would be broadly authorized by the node authorizer, but would not have secret and configmap
requests restricted, since the specific node name would not be available.
The node admission plugin would ignore requests from these kubelets.
Those kubelets would be broadly authorized by the node authorizer,
but would not have secret and configmap requests restricted.
The node admission plugin would not restrict requests from these kubelets.

### Upgrades from previous versions

Versions prior to 1.7 that have the `system:node` cluster role bound to the `system:nodes` group would need to
remove that binding in order for the node authorizer restrictions on secret and configmap access to be effective.

## Future work

Node and pod mutation, and secret and configmap read access are the most critical permissions to restrict.
Future work could further limit a kubelet's API access:
* only get persistent volume claims and persistent volumes referenced by a bound pod
* only write events with the kubelet set as the event source
* only get/list/watch pods bound to the kubelet's node (requires additional list/watch authorization capabilities)
* only get/list/watch its own node object (requires additional list/watch authorization capabilities)
