
PV Protection breaks deleting #195

Closed
rmb938 opened this issue Dec 20, 2018 · 21 comments

Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@rmb938 commented Dec 20, 2018

If a user deletes a PV while it still has a PVC bound to it, the pv-protection finalizer keeps the PV around, but the CSI driver is never asked to delete the backing volume when the PVC is later deleted.

@rmb938 (Author) commented Jan 7, 2019

kubernetes-sigs/sig-storage-lib-external-provisioner#6 has been merged, so I'm just waiting for a release before opening a PR to update the library here.

@msau42 (Collaborator) commented Jan 8, 2019

Thanks for finding this issue. Could you also look into adding an e2e test for this in k/k? This may be a good place: https://sourcegraph.com/github.com/kubernetes/kubernetes/-/blob/test/e2e/storage/csi_volumes.go
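
If it helps, here is a rough sketch (not actual k/k e2e framework code) of the scenario such a test would need to exercise, written against a pre-1.18 client-go so the calls take no context argument; the function name, timeouts, and the assumption that a bound PVC is passed in are all mine:

```go
package storage

import (
	"testing"
	"time"

	v1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// testPVDeletedBeforePVC reproduces the bug: delete the PV first, then the PVC,
// and expect the provisioner to still delete the backing volume.
func testPVDeletedBeforePVC(t *testing.T, cs kubernetes.Interface, pvc *v1.PersistentVolumeClaim) {
	// 1. Wait for the claim to be bound so a dynamically provisioned PV exists.
	var pvName string
	if err := wait.Poll(2*time.Second, 2*time.Minute, func() (bool, error) {
		c, err := cs.CoreV1().PersistentVolumeClaims(pvc.Namespace).Get(pvc.Name, metav1.GetOptions{})
		if err != nil {
			return false, err
		}
		pvName = c.Spec.VolumeName
		return c.Status.Phase == v1.ClaimBound, nil
	}); err != nil {
		t.Fatalf("claim never bound: %v", err)
	}

	// 2. Delete the PV first; pv-protection keeps it around in Terminating.
	if err := cs.CoreV1().PersistentVolumes().Delete(pvName, &metav1.DeleteOptions{}); err != nil {
		t.Fatal(err)
	}

	// 3. Delete the PVC; the provisioner should still be asked to delete the
	// backing volume (DeleteVolume for CSI) before the PV object goes away.
	if err := cs.CoreV1().PersistentVolumeClaims(pvc.Namespace).Delete(pvc.Name, &metav1.DeleteOptions{}); err != nil {
		t.Fatal(err)
	}

	// 4. The PV object should eventually be gone; a real test would also verify
	// through the driver/cloud API that the backend volume was deleted.
	if err := wait.Poll(2*time.Second, 5*time.Minute, func() (bool, error) {
		_, err := cs.CoreV1().PersistentVolumes().Get(pvName, metav1.GetOptions{})
		if apierrors.IsNotFound(err) {
			return true, nil
		}
		return false, err
	}); err != nil {
		t.Fatalf("PV was not removed: %v", err)
	}
}
```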

Also, I think this is an issue for in-tree volumes too. Should we consider fixing in-tree behavior? cc @jsafrane @wongma7

k8s-ci-robot added a commit that referenced this issue Jan 8, 2019
update kubernetes-sigs/sig-storage-lib-external-provisioner to fix #195
@rmb938 (Author) commented Jan 8, 2019

@msau42 I will look into adding tests. I am not too familiar with the Kubernetes e2e tests, but I will try to figure it out!

@msau42 (Collaborator) commented Jan 8, 2019

Feel free to ping me if you need help. Try following this guide: https://github.com/kubernetes/community/blob/master/contributors/devel/e2e-tests.md

@msau42 (Collaborator) commented Apr 8, 2019

Hi, looking back at this PR, I'm not sure if this is the right way to fix the issue.

I think this issue also exists for in-tree volumes, so we should see if a fix is possible at the PV controller level. At the very least, we should make this behavior opt-in, and maybe gate it behind an alpha flag at the CSI provisioner level while we investigate other solutions.

cc @jsafrane @wongma7

@wongma7 (Contributor) commented Apr 8, 2019

I'll make it opt-in.

I don't think the exact same issue occurs, but I agree this needs another look. The fix is a bit heavy-handed: it fixes behaviour for this specific issue so that it's the same for in-tree & out-of-tree, but it also makes other behaviour different, which I didn't follow up on...

If a user deletes a PVC+PV while they are Bound, the PV's protection finalizer will not be removed until the PVC is deleted and the PV leaves Bound and goes to Released.

  • The in-tree controller will always see that the PVC has been deleted first, then change the PV's state to Released, then try to reclaim the PV. Maybe the in-tree controller is wrong to try to reclaim even if there is already a deletionTimestamp on the PV? And if that reclaim attempt fails and the PV is actually deleted in the meantime, does it retry? @jsafrane
  • The out-of-tree controller had an optimization/check to avoid immediately doing a reDelete (https://github.com/kubernetes-sigs/sig-storage-lib-external-provisioner/pull/6/files#diff-18de8fd9ec37aa97da8831f9408ce6aeL860; see the sketch just below), and so, unlike in-tree, it didn't try to reclaim PVs if they already had a deletionTimestamp. Now it does, so in that way it's like in-tree, but the finalizer also means users can't immediately delete PVs, unlike in-tree.
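
For context, the removed check looked roughly like this. This is a simplified paraphrase of the library logic, not the verbatim shouldDelete; the real function also checks provisioner identity and annotations:

```go
// Simplified sketch of the pre-#6 check in the provisioning library (paraphrased,
// not the verbatim code; identity/annotation checks omitted).
package sketch

import v1 "k8s.io/api/core/v1"

func shouldDelete(volume *v1.PersistentVolume) bool {
	// The old optimization: a PV that already carries a deletionTimestamp is
	// skipped, so a PV deleted while still protected is never reclaimed once
	// the PVC goes away and the PV turns Released.
	if volume.ObjectMeta.DeletionTimestamp != nil {
		return false
	}
	if volume.Status.Phase != v1.VolumeReleased {
		return false
	}
	if volume.Spec.PersistentVolumeReclaimPolicy != v1.PersistentVolumeReclaimDelete {
		return false
	}
	return true
}
```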

I liked the idea of having a finalizer because it makes it clear that the provisioner is responsible for deleting the back-end volume, and users can still delete the PV by removing the finalizer first. Why they would want to delete a dynamically provisioned PV manually when the provisioner will do it automatically, IDK (it could be an honest mistake/accident, like @rmb938 kind of suggests in kubernetes-sigs/sig-storage-lib-external-provisioner#5), but I think in any case they expect the back-end volume to be deleted when they delete the PV, because the ReclaimPolicy of the PV is Delete. We document that users should delete only the PVC, but clearly the ReclaimPolicy can cause confusion.

Anyway, all that said, I like consistency with in-tree more than I like the finalizer so I hope @jsafrane can help.

@rmb938 (Author) commented Apr 8, 2019

Yea, this whole issue came about when I accidentally deleted the PV while the PVC was still around. Then, when I went to delete the PVC, the CSI driver never got called to clean up the backing volume for the PV.

I actually never tested this issue on the in-tree provisioners so my expectations of this not happening with them could have been wrong.

However, if this issue also happens with in-tree provisioners, I think we should have a discussion about changing this behaviour so that backend volumes always get cleaned up when the ReclaimPolicy is Delete. I certainly do not want tons of leftover volumes lying around when someone accidentally deletes the PV resource first.

@msau42 (Collaborator) commented Apr 9, 2019

I'm thinking about this possible race:

  1. PV is deleted, but has the storage protection finalizer due to the PVC
  2. PVC is deleted
  3. PV controller moves the PV to released phase
  4. Before it is able to reclaim, the storage protection finalizer is removed from the PV, which causes the object to be deleted
  5. PV controller tries to delete the volume, but fails to Get the PV.

To properly fix this, I'm thinking we may need some new phase/condition/annotation after Released, and to keep the storage protection finalizer until the PV reaches that deleted phase. I think that would be tricky to deal with across version skew, though.
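
For example (purely hypothetical, not existing pv-protection controller code, and the "volume-reclaimed" marker is invented for illustration), the protection controller would then only drop the finalizer once reclaim has been recorded:

```go
package sketch

import v1 "k8s.io/api/core/v1"

// Hypothetical sketch of the proposal: keep the storage protection finalizer on a
// deleted PV until reclaim has actually happened, instead of removing it as soon
// as the bound PVC is gone. The real fix could be a new phase, condition, or
// annotation; the annotation key below is made up.
func canRemoveProtectionFinalizer(pv *v1.PersistentVolume, boundPVCExists bool) bool {
	if boundPVCExists {
		// Still bound to a PVC; never remove the finalizer.
		return false
	}
	if pv.Spec.PersistentVolumeReclaimPolicy != v1.PersistentVolumeReclaimDelete {
		// Nothing to delete on the storage backend; safe to let the PV go.
		return true
	}
	// Only let the object be deleted once the PV controller has recorded that
	// the backend volume was reclaimed.
	return pv.Annotations["example.com/volume-reclaimed"] == "true"
}
```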

@wongma7 (Contributor) commented Apr 9, 2019

Another way to do it could be to add a finalizer right before step 3 ("PV controller moves the PV to Released phase"); then the protection controller wouldn't have to change. No API change is needed (unlike a phase/condition), and it's cleaner than an annotation.
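
Very roughly, something like this (hypothetical finalizer name and helper, not existing PV-controller code; in reality phase lives in status and is updated separately):

```go
package sketch

import v1 "k8s.io/api/core/v1"

// Hypothetical finalizer the PV controller would own; the name is made up.
const reclaimFinalizer = "kubernetes.io/pv-reclaim"

// markReleased sketches the idea: add the reclaim finalizer right before the PV
// is flipped to Released, so the object cannot be garbage-collected until the
// reclaim (the Delete call to the volume plugin / CSI driver) has succeeded,
// at which point the PV controller removes the finalizer again.
func markReleased(pv *v1.PersistentVolume) *v1.PersistentVolume {
	pv = pv.DeepCopy()
	if !hasFinalizer(pv, reclaimFinalizer) {
		pv.Finalizers = append(pv.Finalizers, reclaimFinalizer)
	}
	pv.Status.Phase = v1.VolumeReleased
	return pv
}

func hasFinalizer(pv *v1.PersistentVolume, name string) bool {
	for _, f := range pv.Finalizers {
		if f == name {
			return true
		}
	}
	return false
}
```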

So basically there are 2 outcomes to the race, and we prefer a) because we want to avoid dangling volumes.

a) The PV is reclaimed; happens if the PV controller wins the race.
b) The PV is not reclaimed; happens if the protection controller & garbage collector win the race, because the Get above fails.

Are we 100% sure a) is the fix/correct behaviour? ...Basically the question IMO is: should manually deleting a PV mean "ignore the ReclaimPolicy, delete the object without reclaiming the volume"? Because making a) standard means that if the PV has a storage protection finalizer the ReclaimPolicy is not ignored, and otherwise it is. That unpredictability is maybe worse than the accidental/mistaken dangling volumes issue we're trying to solve.

@wongma7 (Contributor) commented Apr 9, 2019

Adding a phase/condition/annotation/finalizer will probably mean the out-of-tree provisioner has to become aware of it, so yes, there could be some version skew issues to solve; otherwise the PV would stay around with that phase/condition/annotation/finalizer forever.

Also, my "if the PV has a storage protection finalizer the ReclaimPolicy is not ignored, and otherwise it is" situation is pretty contrived; there's no way it's "worse." Worded differently, my issue is: if I delete a PV while it's Bound and remove the finalizer, it's clear I don't want the ReclaimPolicy to kick in when the PVC is deleted (the PVC will become Lost first). But if I delete a PV while it's Bound without removing the finalizer, then what should happen when the PVC is deleted?

@jsafrane (Contributor) commented Apr 9, 2019

Yes, upgrades and downgrades get problematic if we introduce a new finalizer / condition.

IMO we should focus on the use case first. Why do users delete PVs? They should delete PVCs; that's what we want to support. Reclaim policy was always tied to PVC deletion and never to PV deletion. It's just a coincidence that the reclaim policy is applied once when both the PV and the PVC are deleted and the PV finalizer kicks in.

So, why do users delete PVs again?

@wongma7 (Contributor) commented Apr 9, 2019

So, why do users delete PVs again?

If the PV is Bound: either it's an accident/mistake or they really want to prevent people from using it.

If it's an accident/mistake, then the protection finalizer helps to ensure that new/restarted pods can still use it. But, IMO, the protection finalizer doesn't say anything about whether the reclaim policy applies. Like you say, it's a coincidence that it makes the reclaim policy apply once. A user will appreciate that, but IMO having the provisioner finalizer too would be even better because it's more explicit: it tells the user that the provisioner is the owner and still has unfinished work to do (reclaim).

If they want to prevent new/restarted pods from using it, then the object needs to be gone ASAP, so they will simply remove all finalizers from it. They probably don't want the reclaim policy to be applied; a provisioner finalizer will force them to consider that for a second. However, they could achieve the same thing by changing the PV's reclaim policy and deleting the PVC.

If the PV is Released, reclaim may already be in progress. The PV protection finalizer is gone at this point. Either the user's deletion is an accident/mistake, or reclaim is failing repeatedly and they want to give up. If it's an accident/mistake, the provisioner finalizer will help ensure reclaim is retried. If reclaim is failing, the user can remove the finalizer.

@msau42 (Collaborator) commented Apr 9, 2019

I'm not sure if it's worth having to give the provisioner an additional update-PV permission just to better handle a situation where the user makes a mistake; maybe you don't want the finalizer anyway.

@wongma7 (Contributor) commented Apr 9, 2019

I am very liberal about adding permissions and features, especially if a user reported/contributed it, like in this case. There's evidence the mistake happens, so mitigating it surely helps. The provisioner already has create and delete, so from a security perspective I don't think it is a big issue. But RBAC has bad UX and changes to it are disruptive, so that part could be weighed more heavily. If there were a better solution for surfacing RBAC changes to users, that would be ideal, but I doubt anything is happening there. (Like if there were an API for a pod to request permissions and prompt the user to kubectl-add them to the ClusterRole.)

In the future, when adding permissions for objects, maybe they should be grouped together as R (get, list, watch) and W (create, update, patch, delete) to avoid starting too granular and having to add more later. But that could be an issue too, with some permissions being unnecessary.

@wongma7 (Contributor) commented Apr 9, 2019

Maybe the health check should be tied to permissions, so the pod will fail immediately if it doesn't have everything. Plus, in this case the finalizer thing is clearly optional, so it would be good to have it fall back if the permission is not present. (Still working on making it off by default anyway.)

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Jul 8, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Aug 7, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot (Contributor)

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@deitch commented Oct 16, 2019

Aha. I had been looking for this; I almost opened a new issue.

  1. Create a pvc, which automatically creates a pv and calls CreateVolume.
  2. Delete the pv, which puts it in Terminating state and awaits the pvc's deletion because of protection.
  3. Delete the pvc; the pv is deleted, but DeleteVolume is never called, leaving the volume hanging around.

FYI, I tried this against the latest 1.3.1 version of the provisioner.

I get that users perhaps should delete the pvc and not the pv that was created secondarily. But as long as the k8s API enables them to do it, we probably should support the correct behaviours. If it is truly "wrong" to delete it directly, then should k8s itself behave differently and return an error?

Can we reopen this?

@pohly (Contributor) commented Jan 12, 2021

Instead of reopening this issue, let's continue in #546 because it has a clearer description of what the issue is about.

kbsonlong pushed a commit to kbsonlong/external-provisioner that referenced this issue Dec 29, 2023
Avoid logging PV name,where PV created by other CSI