
Allow user to opt-in to Node draining during Cluster delete #9692

Open
dlipovetsky opened this issue Nov 9, 2023 · 20 comments
Labels: help wanted, kind/api-change, kind/feature, priority/backlog, triage/accepted

@dlipovetsky (Contributor) commented Nov 9, 2023

What would you like to be added (User Story)?

As an operator, I would like to opt in to Node draining during Cluster delete, so that I can delete Clusters managed by infrastructure providers that impose additional conditions on deleting the underlying VMs, e.g. detaching all secondary storage (which in turn can require all Pods using this storage to be evicted).

Detailed Description

When it reconciles a Machine resource marked for deletion, the Machine controller attempts to drain the corresponding Node. However, if the Cluster resource is itself marked for deletion, it skips the drain. This behavior was added in #2746 to decrease the time it takes to delete a cluster.
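A minimal sketch of that skip-drain check, assuming the Cluster API Go types from sigs.k8s.io/cluster-api/api/v1beta1 (this illustrates the behavior, it is not the actual controller code):

```go
package sketch

import (
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// shouldDrain illustrates the behavior from #2746: once the owning Cluster
// is itself marked for deletion, the Machine controller skips the Node
// drain entirely as a speed optimization.
func shouldDrain(cluster *clusterv1.Cluster, machine *clusterv1.Machine) bool {
	if !cluster.DeletionTimestamp.IsZero() {
		return false // Cluster delete in progress: skip drain.
	}
	// Otherwise drain only if the Machine has an associated Node.
	return machine.Status.NodeRef != nil
}
```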

As part of reconciling a Machine marked for deletion, the Machine controller marks the referenced InfraMachine resource for deletion. The infrastructure provider's InfraMachine controller reconciles that delete, and it can refuse to delete the underlying infrastructure until some condition is met. That condition could be that the corresponding Node must be drained.

For example, if I create a cluster with the VMware Cloud Director infrastructure provider (CAPVCD), and use secondary storage ("VCD Named Disk Volumes"), then the delete cannot proceed until I drain all Nodes with Pods that use this storage. Cluster API does not drain the Nodes.

CAPVCD requires the Node corresponding to the InfraMachine to be drained. This is because CAPVCD requires all secondary storage to be detached from the VM underlying the InfraMachine. Detaching this storage is the responsibility of the VCD CSI driver, which refuses to detach until all volumes using this storage are unmounted; that, in turn, means all Pods using these volumes must be evicted from the Node.

Because Cluster API does not drain Nodes during Cluster delete, the volumes are not unmounted, and CAPVCD refuses to delete the underlying VMs. The Cluster delete stalls until the Nodes are drained manually.

Anything else you would like to add?

As @lubronzhan helpfully points out below, draining on Cluster delete is possible by implementing and deploying a PreDrainDeleteHook.
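For reference, that hook is driven by an annotation on the Machine: per the machine-deletion-phase-hooks proposal, the Machine controller pauses deletion before the drain step while a pre-drain annotation is present, and an external controller can use that pause to drain the Node itself before removing the annotation. A minimal sketch of a hook owner setting the annotation (the hook name "drain-on-cluster-delete" and owner value "example-controller" are made up for this sketch):

```go
package sketch

import (
	"context"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// addPreDrainHook marks a Machine so that its deletion flow pauses before
// drain until the annotation is removed again by the hook owner.
func addPreDrainHook(ctx context.Context, c client.Client, machine *clusterv1.Machine) error {
	if machine.Annotations == nil {
		machine.Annotations = map[string]string{}
	}
	machine.Annotations[clusterv1.PreDrainDeleteHookAnnotationPrefix+"/drain-on-cluster-delete"] = "example-controller"
	return c.Update(ctx, machine)
}
```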

It is my understanding that the Machine controller skips drain on Cluster delete as an optimization, not for correctness. For some infrastructure providers, draining is required for correctness. Given that, I think Cluster API itself should allow users to disable this optimization in favor of correctness, and I think requiring users to implement and deploy a webhook is too high a bar. I would prefer a simpler solution, e.g., an annotation, or an API field.

Label(s) to be applied

/kind feature
One or more /area labels. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 9, 2023
@k8s-ci-robot (Contributor)

This issue is currently awaiting triage.

If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dlipovetsky (Contributor, Author)

/cc @erkanerol

@dlipovetsky (Contributor, Author)

On a related note, there is an ongoing conversation in Kubernetes slack with CAPVCD maintainers about whether the InfraMachine controller should impose the condition described above.

@lubronzhan (Contributor)

I think it's already doable if you implement the MachineDeletionPhaseHook https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20200602-machine-deletion-phase-hooks.md#changes-to-machine-controller

@dlipovetsky (Contributor, Author)

> I think it's already doable if you implement the MachineDeletionPhaseHook https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20200602-machine-deletion-phase-hooks.md#changes-to-machine-controller

Thanks for this helpful context! I amended my feature request to explain why I think Cluster API should provide a solution that doesn't require the user to implement and deploy a webhook.

@vincepri (Member)

This seems like a reasonable ask. I'd like to see an option on the Cluster object, under a field we could call deletionStrategy or similar, to avoid adding more annotations.
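A hypothetical shape for such a field, purely as an illustration (all names below are made up, not an accepted design):

```go
package sketch

// ClusterDeletionStrategy is a hypothetical type that could live under
// the Cluster's spec.deletionStrategy.
type ClusterDeletionStrategy struct {
	// NodeDrain selects whether Nodes are drained while the Cluster itself
	// is being deleted. "Skip" matches today's behavior; "Drain" opts in
	// to draining each Node first.
	// +optional
	NodeDrain NodeDrainPolicy `json:"nodeDrain,omitempty"`
}

// NodeDrainPolicy enumerates the hypothetical options.
type NodeDrainPolicy string

const (
	NodeDrainPolicySkip  NodeDrainPolicy = "Skip"
	NodeDrainPolicyDrain NodeDrainPolicy = "Drain"
)
```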

@sbueringer (Member)

Sounds reasonable. Shouldn't be necessary to implement a hook for this case.

@dlipovetsky (Contributor, Author)

I think there are two obvious deletion strategies. Feel free to suggest others, or alternate names for these.

  1. Graceful. Drain a Node before deleting its corresponding Machine.
  2. Forced. Just delete the Machine.

Today, we support the Forced strategy only.

I think we could support the Graceful strategy, but this support is limited by how drain works. Notably, drain does not evict Pods managed by DaemonSets.

Let's say a CAPVCD cluster runs a DaemonSet that mounts VCD CSI-backed persistent volumes. If we use the Graceful strategy to delete the cluster, CAPI will drain each Node. However, after a Node is drained, the DaemonSet Pods continue to run, and their volumes remain mounted.
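That limitation comes from the drain logic itself: kubectl-style drain tolerates DaemonSet-managed Pods rather than evicting them. A sketch using k8s.io/kubectl/pkg/drain, which Cluster API's Machine controller has historically built on, shows where that filter lives (the function and parameter values are illustrative, not Cluster API's exact settings):

```go
package sketch

import (
	"context"
	"os"
	"time"

	"k8s.io/client-go/kubernetes"
	"k8s.io/kubectl/pkg/drain"
)

// drainNode mirrors "kubectl drain": DaemonSet-managed Pods are tolerated
// (IgnoreAllDaemonSets), never evicted, which is why their volumes stay
// mounted even after a drain completes successfully.
func drainNode(ctx context.Context, cs kubernetes.Interface, nodeName string) error {
	h := &drain.Helper{
		Ctx:                 ctx,
		Client:              cs,
		IgnoreAllDaemonSets: true, // skip DaemonSet Pods instead of evicting them
		DeleteEmptyDirData:  true,
		GracePeriodSeconds:  -1, // use each Pod's own grace period
		Timeout:             2 * time.Minute,
		Out:                 os.Stdout,
		ErrOut:              os.Stderr,
	}
	return drain.RunNodeDrain(h, nodeName)
}
```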

@dlipovetsky (Contributor, Author)

In summary, I think supporting a Graceful strategy may make sense, but it won't address the problem that motivated this idea. I'm fine with waiting to see what the community thinks.

/priority awaiting-more-evidence

@k8s-ci-robot k8s-ci-robot added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Jan 10, 2024
@fabriziopandini (Member)

/triage accepted
/kind api-change
/priority backlog

IMO there are a couple of things to unwind (and eventually split into subtasks), but the discussion is interesting:

  • Add a configuration knob that allows draining on cluster deletion
  • Add a configuration knob that allows draining DaemonSets as well (TBD how, whether only on cluster deletion, etc.)
  • Consider whether to allow configuration of drain knobs (and probably other knobs) at the cluster level, thus introducing a sort of "hierarchical configuration" model

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 29, 2024
@fabriziopandini (Member)

triage-party:
/remove-priority awaiting-more-evidence
because we are not waiting for logs/data, but the issue needs refinement on the requirements/way forward.

/remove-triage accepted
because the issue is not actionable in this state

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Apr 16, 2024
@k8s-ci-robot (Contributor)

@fabriziopandini: Those labels are not set on the issue: priority/

In response to this:

> triage-party:
> /remove-priority awaiting-more-evidence
> because we are not waiting for logs/data, but the issue needs refinement on the requirements/way forward.
>
> /remove-triage accepted
> because the issue is not actionable in this state

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Apr 16, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 15, 2024
@fabriziopandini fabriziopandini added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 31, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 31, 2024
@sbueringer (Member)

/help

@k8s-ci-robot (Contributor)

@sbueringer:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

> /help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Aug 28, 2024
@lubronzhan (Contributor)

hands up

@sbueringer (Member)

Thank you!! Feel free to assign yourself the issue

@enxebre (Member) commented Aug 29, 2024

> I think we could support the Graceful strategy, but this support is limited by how drain works. Notably, drain does not evict Pods managed by DaemonSets.
> Let's say a CAPVCD cluster runs a DaemonSet that mounts VCD CSI-backed persistent volumes. If we use the Graceful strategy to delete the cluster, CAPI will drain each Node. However, after a Node is drained, the DaemonSet Pods continue to run, and their volumes remain mounted.

Also, I understand Graceful assumes there are no PDBs in the data plane? Otherwise deletion is perpetually blocked.
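To illustrate the concern with an ordinary PDB in the workload cluster (all names here are illustrative): with every Node draining at once during full-cluster deletion, evicted Pods have nowhere to reschedule, so a budget like this can never be satisfied.

```go
package sketch

import (
	policyv1 "k8s.io/api/policy/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// examplePDB requires at least one "app=db" Pod at all times. During a
// whole-cluster drain there is no surviving Node to satisfy it, so the
// eviction API keeps refusing and drain never finishes.
func examplePDB() *policyv1.PodDisruptionBudget {
	minAvailable := intstr.FromInt32(1)
	return &policyv1.PodDisruptionBudget{
		ObjectMeta: metav1.ObjectMeta{Name: "db-pdb", Namespace: "default"},
		Spec: policyv1.PodDisruptionBudgetSpec{
			MinAvailable: &minAvailable,
			Selector:     &metav1.LabelSelector{MatchLabels: map[string]string{"app": "db"}},
		},
	}
}
```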

@sbueringer (Member)

Just to mention it.

Once we've fixed propagation of timeouts across the board, even for objects that are being deleted (#10753, plus e.g. Cluster topology and KCP), it might be possible to just set NodeDrainTimeout + NodeVolumeDetachTimeout to a very low value when deleting the Cluster, and we'd end up with something very similar to "skip drain".
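A sketch of that idea, assuming the timeout propagation fix lands (NodeDrainTimeout and NodeVolumeDetachTimeout are the real MachineSpec fields; the helper itself is illustrative):

```go
package sketch

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// setNearZeroTimeouts makes drain and volume detach time out almost
// immediately on a deleting Cluster's Machines, which is close to
// "skip drain" without a dedicated knob.
func setNearZeroTimeouts(spec *clusterv1.MachineSpec) {
	short := metav1.Duration{Duration: time.Second}
	spec.NodeDrainTimeout = &short
	spec.NodeVolumeDetachTimeout = &short
}
```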

@lubronzhan (Contributor)

/assign
