use --disable-eviction option when drain node #6929

Closed
Bo0km4n opened this issue Jul 15, 2022 · 8 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@Bo0km4n
Contributor

Bo0km4n commented Jul 15, 2022

User Story
If a cluster user has created a PDB resource, machine deletion can get stuck in the node drain process.
So after a few drain attempts, I would like the cluster-api machine controller to retry the drain with the --disable-eviction option.

This problem often occurs when a user performs a RollingUpdate of a MachineDeployment.

Detailed Description
I propose the following idea to implement the behavior described above.

Check the node drain timeout using nodeDrainTimeoutExceeded.
Next, if the elapsed drain time has exceeded the timeout, the machine controller enables the --disable-eviction option in the drainNode function.


/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 15, 2022
@killianmuldoon
Contributor

On first look it seems strange to use Cluster API to implicitly overwrite user intention expressed in a pod disruption budget. It seems like a better solution to this issue would be to surface the reason for the lack of rollout of machines at the Cluster API level so users can better configure their workloads.

@Bo0km4n have you got a toy example I could test to see the impact of PDB blocking rollouts?

@sbueringer
Member

sbueringer commented Jul 15, 2022

@killianmuldoon I think you can just deploy something like this:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cluster-api
spec:
  minAvailable: 1
  selector:
    matchLabels:
      cluster.x-k8s.io/provider: cluster-api

The selector has to match some pods that you have (e.g. just let it match the capi controller on a self-hosted cluster). If you want an always-blocking PDB, just use minAvailable: 1 and a Deployment with 1 replica.
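
For reference, a self-contained sketch of that always-blocking setup could look like the manifests below (the pdb-block name/label and the pause image are placeholders I picked for illustration, not anything referenced in this thread):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pdb-block
spec:
  replicas: 1              # single replica on purpose
  selector:
    matchLabels:
      app: pdb-block
  template:
    metadata:
      labels:
        app: pdb-block
    spec:
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: pdb-block
spec:
  minAvailable: 1          # with only 1 replica, every eviction of this pod is blocked
  selector:
    matchLabels:
      app: pdb-block

Draining the node this pod lands on should then keep failing, matching the behavior described below.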

Current behavior is probably:

  • without nodeDrainTimeout: drain will just fail indefinitely
  • with nodeDrainTimeout: drain will fail until the timeout, then the node is deleted

@Bo0km4n
Contributor Author

Bo0km4n commented Jul 15, 2022

@killianmuldoon
@sbueringer's example is what I meant.

with nodeDrainTimeout: drain will fail until the timeout, then the node is deleted

I didn't know that. If I set the timeout on the machine I want to drain, is it possible to forcibly delete a Node that is running a Pod that cannot be evicted?
If that's true, my problem will be solved by setting nodeDrainTimeout.

@sbueringer
Member

I didn't know that. If I set the timeout on the machine I want to drain, is it possible to forcibly delete a Node that is running a Pod that cannot be evicted?

That is my understanding, yes.

@Bo0km4n
Contributor Author

Bo0km4n commented Jul 18, 2022

Thanks @sbueringer.
But one point of concern: if the pod being evicted is using a PVC, will the volume mounted on the deleted node become an orphaned volume?

@sbueringer
Member

Not sure. This might depend on your infrastructure. CAPI will just delete the node object and then the corresponding infra.

@enxebre
Member

enxebre commented Jul 18, 2022

Not sure. This might depend on your infrastructure. CAPI will just delete the node object and then the corresponding infra.

To clarify, CAPI will always ensure the underlying infra is gone before deleting the Node, to avoid potential stateful issues, #2565.

But one point of concern: if the pod being evicted is using a PVC, will the volume mounted on the deleted node become an orphaned volume?

At the moment CAPI will wait indefinitely for volumes to be detached (#4945); your KCM cloud provider should take care of it. There's also an ongoing discussion about enabling an optional timeout while waiting for the volumes (#6285).

I didn't know that. If I set the timeout on the machine I want to drain, is it possible to forcibly delete a Node that is running a Pod that cannot be evicted?
If that's true, my problem will be solved by setting nodeDrainTimeout.

Yes. This issue is asking for the behaviour already supported via nodeDrainTimeout.
Note that changing this in an existing MachineDeployment is unlikely to help, as it would try to trigger a rolling upgrade. That's a suboptimal UX we want to improve, #5880.
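
As a concrete reference for the note above, a minimal sketch of where nodeDrainTimeout is set on a MachineDeployment, assuming the v1beta1 API; all names, the version, the timeout value, and the referenced bootstrap/infrastructure templates are placeholders:

apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: my-md
spec:
  clusterName: my-cluster
  replicas: 3
  template:
    spec:
      clusterName: my-cluster
      version: v1.24.2
      # After draining has been attempted for this long, the controller stops
      # waiting for evictions (including ones blocked by a PDB) and proceeds
      # with deleting the infrastructure and then the Node, as described above.
      nodeDrainTimeout: 10m
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: my-md-bootstrap
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        name: my-md-infra

Since the field lives under spec.template.spec, editing it on an existing MachineDeployment changes the machine template, which is why it would trigger the rolling upgrade mentioned above.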

@Bo0km4n if this makes sense we can close this, as it's supported by nodeDrainTimeout, and keep any related discussion in the issues linked above.

@Bo0km4n
Contributor Author

Bo0km4n commented Jul 18, 2022

@enxebre Thank you for the information. I will try to join the discussion in the issues above.
And I will check the volume-related behaviors above in my environment.

Thank you, guys. I'm closing this issue.

@Bo0km4n Bo0km4n closed this as completed Jul 18, 2022