-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CSI: send pod information in NodePublishVolumeRequest #2439
Conversation
* `csi.kubernetes.io/pod.name`: name of the pod that wants the volume. | ||
* `csi.kubernetes.io/pod.namespace`: namespace of the pod that wants the volume. | ||
* `csi.kubernetes.io/pod.uid`: uid of the pod that wants the volume. | ||
* `csi.kubernetes.io/serviceAccount.name`: name of the service account under which the pod operates. Namespace is the same as `pod.namespace`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be the annotation names be like x.y.z/w1-w2-w3
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not annotation.
As I wrote, I follow names introduced in Flex, but I don't have strong opinion about the format. CSI itself does not describe how volume_attributes should look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, looks very good. :) A couple of things inline.
## Detailed design | ||
|
||
### CSI enhancement | ||
We don't need to change CSI protocol in any way. It allows kubelet to pass `pod.name`, `pod.uid` and `pod.spec.serviceAccountName` in [`NodePublish` call as `volume_attributes`]((https://github.com/container-storage-interface/spec/blob/master/spec.md#nodepublishvolume)). `NodePublish` is roughly equivalent to Flex `mount` call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The NodeServer has two sets of calls that I think are applicable. NodePublishVolume/Unpublish, but also NodeStageVolume/Unstage https://github.com/container-storage-interface/spec/blob/master/spec.md#nodestagevolume. Those calls also need the additional info to function properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case several pods use the same volume on the same node, NodeStageVolume is called only once. It would include just the first Pod. NodePublishVolume is called separately for each pod that uses the volume and CSI driver can do the right thing there.
In addition, Flex driver did not get any pod in mountdevice
call (which is equivalent of NodeStageVolume in CSI and nobody complained.
Flex did not provide pod information in unmount
(=NodeUnpublishVolume) and nobody complained. Do you have concrete use case when pod can be useful in NodeUnpublishVolume?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, really? I thought stage was called once per pod, but mount was called per container mounting it? I guess I was mistaken. So I'll have to rewrite the image driver to have those semantics then. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsafrane For inline CSI volume, what would be the equivalent of flexvolume's options
? Currently we can only pass in volumeAttributes
to csi
as part of a PersistentVolume deployment. Is there a way to support volumeAttributes
as part of inline CSI volume as part of the pod spec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
9. CSI volume plugin stores this flag for the CSI driver. | ||
|
||
When a volume is being mounted for a pod: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional csi stage call here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added NodeStageVolume to the example, with a note that it won't get any pod information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Thanks. The clarity will help others writing drivers.
|
||
### CSI driver wants pod information in `NodePublish` | ||
1. CSI driver author provides documentation and templates that the driver need pod information in `NodePublish`. | ||
2. Admin installs CSI driver with CSI Driver Registrar `--requires-publish-pod`, e.g. by applying helm template from CSI diver author. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/diver/driver
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks
We're considering using cluster-global API object to hold "configuration" of CSI driver in Kubernetes, i.e. how Kubernetes talks to the driver. It should contain also a flag if kubelet should pass pod name/namespace/uid in NodePublish. We won't need new gRPC API in kubelet this way. The design is it flux right now, stay tuned. |
hmm. cluster global info might allow for pod inline validation to happen earlier too. +1. |
I reworked the proposal to use |
And I kept the original commit there, just in case we need to revert to it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me. just a couple of typo's to fix.
## High-level design | ||
We decided to pass the pod information as `NodePublishVolumeRequest.volume_attributes`. | ||
|
||
* Kubernetes passes pod information only to CSI drivers that explicitly require that information in their [`CSIDriver` instance](https://github.com/kubernetes/community/pull/2523). These drivers are tightly coupled to Kubernetes and may not work or may require reconfiguration on other cloud orchestrators. It is expected (but not limited to) that these drivers will provider ephemeral volumes similar to Secrets or ConfigMap, extending Kubernetes secret or configuration sources. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo at 'provider ephemeral'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks!
We decided to pass the pod information as `NodePublishVolumeRequest.volume_attributes`. | ||
|
||
* Kubernetes passes pod information only to CSI drivers that explicitly require that information in their [`CSIDriver` instance](https://github.com/kubernetes/community/pull/2523). These drivers are tightly coupled to Kubernetes and may not work or may require reconfiguration on other cloud orchestrators. It is expected (but not limited to) that these drivers will provider ephemeral volumes similar to Secrets or ConfigMap, extending Kubernetes secret or configuration sources. | ||
* Kubernetes will not pass pod information to CSI drivers that don't know or don't care about pods and service accounts. It is expected (but not limited to) that these drivers will provide real persistent storage. Such CSI driver would reject a CSI call with pod information as invalid. This is current behavior of Kubernete and it will be the default behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kubernete typo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add one more that I think will be useful?
csi.kubernetes.io/volume.name: name of the volume as specified in the spec.
ex:
volumes:
- name: foo
csi:
driver: image
it would be "foo" in this case.
@kfox1111, do you have concrete use case for |
* `csi.kubernetes.io/pod.name`: name of the pod that wants the volume. | ||
* `csi.kubernetes.io/pod.namespace`: namespace of the pod that wants the volume. | ||
* `csi.kubernetes.io/pod.uid`: uid of the pod that wants the volume. | ||
* `csi.kubernetes.io/serviceAccount.name`: name of the service account under which the pod operates. Namespace of the service account is the same as `pod.namespace`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The strings above define sort of API between Kubernetes and kubernetes-aware CSI drivers. @liggitt, does it need any special approval? It will be set to stone when we implement this proposal.
Background: volume_attributes
is generic map[string]string
, without any prescribed format of keys in CSI standard. I used the above keys because similar ones are already used in flex volumes. Now is the time to change the keys if we don't like them.
At very least, volume.name + pod.id is a guaranteed unique identifier for the volume instance. So could be useful to drivers that are pod aware and want to track lifecycle? There may be other uses too. Haven't thought too much about it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other then one comment.
We don't need to change CSI protocol in any way. It allows kubelet to pass `pod.name`, `pod.uid` and `pod.spec.serviceAccountName` in [`NodePublish` call as `volume_attributes`]((https://github.com/container-storage-interface/spec/blob/master/spec.md#nodepublishvolume)). `NodePublish` is roughly equivalent to Flex `mount` call. | ||
|
||
The only thing we need to do is to **define** names of the `volume_attributes` keys that CSI drivers can expect: | ||
* `csi.kubernetes.io/pod.name`: name of the pod that wants the volume. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
csi.storage.kubernetes.io
to align with CSIDriver
group name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to csi.storage.k8s.io/*
Renamed parameters to |
b97efb4
to
5068303
Compare
And added feature gate |
Implementation: kubernetes/kubernetes#67945 |
Automatic merge from submit-queue (batch tested with PRs 68171, 67945, 68233). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. CSI pod information in NodePublish **What this PR does / why we need it**: This is implementation of kubernetes/community#2439. It brings CSI closer to Flex volume features, CSI driver can (optionally) get name/namespace/UID/ServiceAccount of the pod that requires the CSI volume mounted. This allows CSI drivers to either do AAA before the volume is mounted or tailor content of volume to the pod. Work in progress: * contains #67803 to get `CSIDriver` API. Ignore the first commit. Related to #64984 #66362 (fixes only part of these issues) /sig storage cc: @saad-ali @vladimirvivien @verult @msau42 @gnufied **Release note**: ```release-note CSI NodePublish call can optionally contain information about the pod that requested the CSI volume. ```
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
This was implemented as alpha in k8s 1.12 (kubernetes/kubernetes#67945). Merging the doc so that it can be updated. /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jsafrane, saad-ali The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
CSI: send pod information in NodePublishVolumeRequest
This proposal adds possibility to send pod information in CSI
NodePublishVolumeRequest
call.Fixes kubernetes/kubernetes#64984
It depends on #2514
/sig storage