
KEP-596: promote CSIInlineVolume to GA #3158

Merged
merged 1 commit into kubernetes:master from dobsonj:kep-596-ga on Feb 3, 2022

Conversation

dobsonj
Member

@dobsonj dobsonj commented Jan 18, 2022

  • One-line PR description:
    KEP-596: promote CSIInlineVolume to GA
  • Other comments:

KEP Preview: https://github.com/dobsonj/enhancements/tree/kep-596-ga/keps/sig-storage/596-csi-inline-volumes
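To ground the discussion, this is roughly what a CSI ephemeral inline volume looks like in a pod spec (a minimal sketch; the driver name and attributes below are placeholders, not taken from this KEP):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inline-volume-demo
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      csi:                           # ephemeral inline volume: defined in the pod spec, lives and dies with the pod
        driver: inline.example.com   # placeholder driver name
        volumeAttributes:            # opaque, driver-specific parameters
          size: 1Gi
```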

/cc @jsafrane

@k8s-ci-robot k8s-ci-robot requested a review from jsafrane January 18, 2022 15:46
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/storage Categorizes an issue or PR as relevant to SIG Storage. labels Jan 18, 2022
@dobsonj dobsonj force-pushed the kep-596-ga branch 3 times, most recently from 18e4aad to f3e11c8 Compare January 19, 2022 17:02
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 25, 2022
@dobsonj dobsonj force-pushed the kep-596-ga branch 2 times, most recently from 97fe988 to d3aa8d0 Compare January 25, 2022 22:01
@dobsonj dobsonj changed the title KEP-596: update milestones, template, security considerations KEP-596: promote CSIInlineVolume to GA Jan 26, 2022
Comment on lines 294 to 298
#### GA

- Remove dependency on deprecated `PodSecurityPolicy` and document new strategy
- Upgrade / downgrade manual testing
- Conformance tests implemented
Contributor

Would it be useful to leave the feature in beta and publish updated docs that don't rely on using PodSecurityPolicy, then get feedback for at least 1 release to make sure that folks are happy with how it works?

Member Author

I added a line to the GA graduation criteria to make sure the docs get updated before graduation. I'm not so sure it's necessary to wait another release cycle after the docs are updated since this feature has already been in beta for >2 years, but I'll defer that decision to the sig-storage leads.

@sftim
Contributor

sftim commented Jan 27, 2022

A question: if I'm a cluster admin and volume creation is relatively expensive, but my CSI driver makes it possible, what controls can / should I put in place to stop a single namespace from having a wider impact on my storage?

Are these controls different from the restrictions I would need to put in place to stop people using the PersistentVolumeClaim mechanism to achieve the same kind of denial-of-service / spend-all-the-budget impact?

My point really is that if this does introduce a new way to turn namespace-scoped access into a shocking cloud bill, we should take that into account.
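For comparison, PVC-based consumption can already be capped per namespace with a ResourceQuota; a minimal sketch (the name and namespace are illustrative). Note that CSI ephemeral inline volumes do not create PVC objects, so limits like these do not apply to them:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota              # illustrative name
  namespace: team-a                # illustrative namespace
spec:
  hard:
    persistentvolumeclaims: "10"   # max number of PVCs in this namespace
    requests.storage: 100Gi        # total storage requestable through PVCs
```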

@jsafrane
Member

My point really is that if this does introduce a new way to turn namespace-scoped access into a shocking cloud bill, we should take that into account.

This has been discussed in sig-auth meetings several times; the guidance we got is that you should not install "unsafe" CSI drivers in the first place. Every CSI driver that allows "Ephemeral" use is a suspicious one: it should either be safe by default or come with something that makes it safe, such as an admission plugin. We solve this with documentation and by fixing CSI drivers that are under the kubernetes-sigs umbrella.

There is nothing in Kubernetes that would prevent an unsafe (or broken) driver from doing anything with your storage.

@ehashman
Member

/assign @wojtek-t

@sftim
Contributor

sftim commented Jan 30, 2022

This has been discussed in sig-auth meetings several times; the guidance we got is that you should not install "unsafe" CSI drivers in the first place. Every CSI driver that allows "Ephemeral" use is a suspicious one: it should either be safe by default or come with something that makes it safe, such as an admission plugin. We solve this with documentation and by fixing CSI drivers that are under the kubernetes-sigs umbrella.

Thanks, that makes sense.
Once that advice is part of the docs on selecting CSI drivers and operating them securely as part of your cluster, I'd have no objection to seeing this feature graduate to stable. At the moment, I don't think the docs cover this.

We will update the documentation to include the security aspects of inline CSI volumes and recommend that CSI driver vendors not implement inline volumes for persistent storage unless they also provide a third-party pod admission plugin.

This is consistent with the proposal by sig-auth in [KEP-2579](https://github.com/kubernetes/enhancements/blob/787515fbfa386bed95ff4d21e472474f61d1c536/keps/sig-auth/2579-psp-replacement/README.md?plain=1#L512-L519) regarding how inline CSI volumes should be handled.

Contributor

Does the Secret Store CSI Driver handle the security aspects properly?

Member

Sort of.

  • The CSI driver needs a user to set up a SecretProviderClass CR to provide any secrets to Pods. This is well documented: https://secrets-store-csi-driver.sigs.k8s.io/concepts.html#secretproviderclass
  • What is not documented is that a cluster admin should grant permissions to create (update, delete, read) SecretProviderClasses only to trustworthy users. Depending on the secret provider underneath and its permissions, a user who can create SecretProviderClasses can obtain any secret that the underlying secret provider has access to. (See the sketch below.)
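A minimal sketch of the two objects involved, to make that trust boundary concrete (the provider and class name are placeholders; see the driver docs linked above for real values):

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: app-secrets              # whoever can create/update this CR controls what gets mounted
spec:
  provider: vault                # placeholder; depends on the installed provider plugin
  parameters: {}                 # provider-specific and opaque to Kubernetes itself
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
      volumeMounts:
        - name: secrets
          mountPath: /mnt/secrets
          readOnly: true
  volumes:
    - name: secrets
      csi:
        driver: secrets-store.csi.k8s.io     # the Secrets Store CSI driver
        readOnly: true
        volumeAttributes:
          secretProviderClass: app-secrets   # binds the pod to the class above
```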

Member

@wojtek-t wojtek-t left a comment

I added a bunch of PRR-related questions, but I would also like to see SIG-level approval first.


###### Are there any tests for feature enablement/disablement?

Yes, the unit tests will test with the feature gate enabled and disabled.
Member

Do you have any tests that exercise the "switch" (i.e., a test that starts with the feature gate enabled and disables it during the test, or vice versa)?

Member Author

The description for this question in the template states:

The e2e framework does not currently support enabling or disabling feature
gates. However, unit tests in each component dealing with managing data, created
with and without the feature, are necessary. At the very least, think about
conversion tests if API types are being modified.

I see quite a few unit tests exercising objects "created with and without the feature", for example the strategy and mounter tests. Is this what you're looking for? If not, is there an example you can point me to?

Member

I meant a test that does something, switches the feature gate, then does something else and checks the result.
This one is a nice (and simple) example:
https://github.com/kubernetes/kubernetes/blob/b7c82bb83c1b3933b99fbc5fdcffa59fd6441617/pkg/registry/networking/networkpolicy/strategy_test.go#L246-L281

[I'm not going to block this GA on this at this point, just checking where we are and building awareness that things are at least to some extent possible. FWIW, I noted an action item for myself to extend the comment for that question in the template.]

Member Author

Ack, thanks for the example. I added a TODO item in the "Ephemeral inline volumes unit tests" section to follow up on this. It should be possible to add something similar for this feature.

Member

Thanks!

also needs to support ephemeral inline volumes, and the pod spec would need
to be updated to make use of the feature.

A rollback (or disabling the feature flag) may impact running workloads,
Member

This paragraph seems to be focused on "new workloads we're trying to start".
What about pods that use CSI inline volumes which are already running? Is there anything that could lead to their failure, either under normal circumstances or when they fail for some reason and need to be restarted by kubelet (e.g. due to an Always restart policy)?

Member Author

I reworded the rollout section to (hopefully) clarify. I can't think of any reason why rollout would impact running pods, even in the case of pod restarts.

Member

OK - the previous paragraph now looks ok to me.

Regarding this one ("A rollback...") I would make it more explicit that once these workloads are already running, they are fine, and that it may be problematic if we want to start new ones or potentially need to restart running ones (if kubelet no longer has the feature gate enabled, then it will not make the appropriate mount/unmount calls, right?).

So (assuming I'm correct - please challenge me if I'm missing something), I would change it to something like:

A rollback (or disabling the feature flag) will not impact already running workloads; however, starting new ones (as well as potentially restarting ones that failed for some reason) would fail. Additionally, kubelet will fail to fully clean up after pods that were running and taking advantage of inline volumes.
*It is highly recommended to delete any pods using this feature before disabling it*.

Does that make sense? Is that correct?

Member Author

It makes sense, with one caveat. If the feature flag is disabled, kubelet will still attempt to unmount existing volumes when requested to do so (i.e. there is no feature flag check when unmounting), but it will not attempt to mount new volumes using this feature:
https://github.com/kubernetes/kubernetes/blob/66445662ad7b65ffdb144b9665608f7b6a4b57a6/pkg/volume/csi/csi_mounter.go#L380-L382
I changed your wording slightly to "kubelet may fail to fully clean up", since kubelet will request NodeUnpublishVolume but cleanup then depends on the individual CSI driver, and we still recommend deleting pods that use this feature before disabling it.


###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

Compared to CSI persistent volumes, there should be no increase in the amount of time
Member

nit: it's worth clarifying that mount time is on the critical path for the pod startup SLO:
https://github.com/kubernetes/community/blob/master/sig-scalability/slos/pod_startup_latency.md

And config maps/secrets are actually (at least as of now) part of stateless pods, which means that we may see a regression.
So, following up on that (wearing my scalability hat now): do we have any numbers on this?

Member Author

nit: it's worth clarifying that mount time is on the critical path for the pod startup SLO:

I've added a note to this in the KEP with a link to the pod startup latency SLO.

And config maps/secrets are actually (at least as of now) part of stateless pods, which means that we may see a regression.

Ah, but the definition in the pod startup latency SLO states:

[2] A stateless pod is defined as a pod that doesn't mount volumes with sources other than secrets, config maps, downward API and empty dir.

If a pod uses CSI inline volumes, that is still a type of volume. So it's not really a stateless pod in that case.

So, following up on that (wearing my scalability hat now): do we have any numbers on this?

No, I don't have any numbers readily available. What kind of comparison would be interesting to see?

Member

If a pod uses CSI inline volumes, that is still a type of volume. So it's not really a stateless pod in that case.

OK - you're right. Thanks for correcting me.

No, I don't have any numbers readily available. What kind of comparison would be interesting to see?

So what I'm looking for is a comparison: e.g. if I have a pod with 5 secrets, it takes X seconds now, but Y seconds if we mount those secrets via CSI inline volumes.

[Again, non-blocking, as it's not a replacement but a new way of doing things. I'm asking more to build awareness of it.]

Member Author

So what I'm looking for is a comparison: e.g. if I have a pod with 5 secrets, it takes X seconds now, but Y seconds if we mount those secrets via CSI inline volumes.

Ok, I'd expect this to vary based on the CSI driver used, but I agree it would be useful to document this comparison so we at least have some idea of how big the difference is for a basic use case like this, to set expectations for users of this feature. I've added a TODO line in this section to follow up on it during GA testing.

Member

Thanks!

@dobsonj
Member Author

dobsonj commented Feb 2, 2022

/assign @xing-yang
for sig-storage approval

Member

@wojtek-t wojtek-t left a comment

Just a few more nits, but overall I'm fine with this PRR.
However, I will wait to approve until this has been approved by the SIG leads.



@dobsonj
Member Author

dobsonj commented Feb 2, 2022

I want to collapse these changes into a single commit before the PR merges, but I'll do that later today, in case anyone is looking at the incremental changes.

@xing-yang
Contributor

@dobsonj Can you squash your commits?

@xing-yang
Contributor

/approve

@xing-yang
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 2, 2022
@xing-yang
Contributor

/assign @wojtek-t

- Feature flag enabled by default
- CSI drivers can support both ephemeral inline volumes and persistent volumes

#### GA
Member

With the new security guidance, should we also add an item to make sure our sponsored NFS and SMB drivers align with this?

Member Author

@msau42 @xing-yang can you re-lgtm? That one addition was the only change I made.

Contributor

Sure.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 2, 2022
@xing-yang
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 3, 2022
Member

@wojtek-t wojtek-t left a comment

This looks great now - thanks!



Additionally, kubelet may fail to fully clean up after pods that were
running and taking advantage of inline volumes.

*It is highly recommended to delete any pods using this feature before disabling it*.
Member

Looks great - thanks!



@wojtek-t
Member

wojtek-t commented Feb 3, 2022

/lgtm
/approve PRR

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dobsonj, wojtek-t, xing-yang


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 3, 2022
@k8s-ci-robot k8s-ci-robot merged commit 41906ee into kubernetes:master Feb 3, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.24 milestone Feb 3, 2022
@dobsonj dobsonj mentioned this pull request Feb 3, 2022