
Custom drain flow #740

Closed
dorsegal opened this issue Oct 20, 2022 · 18 comments
Labels
deprovisioning: Issues related to node deprovisioning
kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@dorsegal

Tell us about your request

Add a rollout flag to use during drain. It would be used once consolidation and the native termination handler (aws/karpenter-provider-aws#2546) are ready.
The custom drain flow would look like this (a rough sketch follows the list):

  1. Cordon the node
  2. Do a rolling restart of the deployments that have pods running on the node.
  3. Drain the node.
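
A minimal shell sketch of that flow, assuming the node name is passed as an argument, that the affected workloads are owned by Deployments (via ReplicaSets), and that recovering the Deployment name by stripping the ReplicaSet's trailing pod-template-hash is acceptable:

```bash
#!/usr/bin/env bash
# Rough sketch only. Assumes pods on the node are owned by Deployments via
# ReplicaSets; the Deployment name is recovered heuristically by stripping the
# trailing pod-template-hash from the ReplicaSet (owner) name.
set -euo pipefail
NODE="$1"

# 1. Cordon the node so no new pods are scheduled onto it.
kubectl cordon "$NODE"

# 2. Rolling-restart every Deployment that has pods on the node; the surge pods
#    come up on other nodes before the old ones are removed.
kubectl get pods --all-namespaces --field-selector "spec.nodeName=$NODE" \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.ownerReferences[0].name}{"\n"}{end}' \
  | sed 's|/\(.*\)-[a-z0-9]\+$|/\1|' \
  | sort -u \
  | while IFS=/ read -r ns deploy; do
      kubectl -n "$ns" rollout restart "deployment/$deploy" || true
      kubectl -n "$ns" rollout status  "deployment/$deploy" --timeout=5m || true
    done

# 3. Drain whatever is left on the node.
kubectl drain "$NODE" --ignore-daemonsets --delete-emptydir-data
```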

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Currently, when using the consolidation feature or aws-node-termination-handler, the current implementation of kubectl drain can result in downtime or heavy performance degradation.

The current drain terminates all workloads on a node; the scheduler then tries to place those workloads on available nodes, and if none have capacity, Karpenter provisions a new node.
Even with a PDB there is some level of degradation.

Are you currently working around this issue?

We have a custom bash script that implements an alternative to kubectl drain:

https://gist.github.com/juliohm1978/1f24f9259399e1e1edf092f1e2c7b089

Additional Context

kubectl drain leads to downtime even with a PodDisruptionBudget kubernetes/kubernetes#48307

Attachments

No response

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@dorsegal dorsegal added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 20, 2022
@tzneal
Contributor

tzneal commented Oct 20, 2022

Sorry, I'm not following with respect to consolidation; it always pre-spins a replacement node, so you should never need to wait for a node to provision.

Regarding PDBs, why are they not sufficient? They will slow the rate at which the pods are evicted.

@dorsegal
Author

There are cases where an application takes time to load, so even if you pre-spin a node, the application takes time to become available.
PDBs have the same problem: they first terminate a pod (or pods) and Kubernetes schedules new ones. If PDBs are defined with 99% availability or only allow a small number of pod disruptions, that also slows the rate at which the pods are evicted.

We want to achieve as close to 100% uptime as possible using spot instances, and currently the drain behavior is what is holding us back.

@tzneal
Contributor

tzneal commented Oct 21, 2022

It sounds like you're using the max surge on the restart to temporarily launch more pods. Instead, you can permanently scale the deployment to your desired baseline plus whatever surge you want, then use a PDB to limit maxUnavailable for that deployment to the surge amount. This ensures you always have your baseline desired capacity without incurring extra restarts.
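
As a concrete, hypothetical illustration of that suggestion, with a baseline of 10 replicas and a surge of 2 (the deployment name and labels are placeholders):

```bash
# Scale the Deployment to baseline (10) plus surge (2) permanently.
kubectl scale deployment/my-app --replicas=12

# Cap voluntary disruptions (e.g. evictions during a drain) at the surge
# amount, so at least the 10 baseline pods stay available.
kubectl apply -f - <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  maxUnavailable: 2
  selector:
    matchLabels:
      app: my-app
EOF
```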

@bwagner5
Contributor

You could also try catching SIGTERM within your pod and keeping it from shutting down immediately, so that the new pod has time to initialize while the other pod is terminating.
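
As a rough illustration (not something from this thread), an entrypoint wrapper that traps SIGTERM and delays shutdown could look like the sketch below; DRAIN_DELAY and the wrapped command are placeholders, and terminationGracePeriodSeconds must cover the delay plus the application's own shutdown time:

```bash
#!/usr/bin/env bash
# Illustrative entrypoint wrapper: delay shutdown on SIGTERM so a replacement
# pod has time to become ready elsewhere, then forward the signal.
DRAIN_DELAY="${DRAIN_DELAY:-30}"

"$@" &                # start the real application in the background
child=$!

on_term() {
  echo "SIGTERM received; waiting ${DRAIN_DELAY}s before forwarding it" >&2
  sleep "$DRAIN_DELAY"
  kill -TERM "$child" 2>/dev/null
}
trap on_term TERM

wait "$child"         # returns early when SIGTERM arrives and the trap runs
trap - TERM
wait "$child"         # let the application finish its own graceful shutdown
```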

@dorsegal
Author

You could also try catching SIGTERM within your pod and keeping it from shutting down immediately, so that the new pod has time to initialize while the other pod is terminating.

We thought about it. The problem is that, when using 3rd-party images, it would require changing the source code of every application we use. Besides, it is recommended to handle SIGTERM as a graceful shutdown, not to suspend your application until Kubernetes kills it.

This request would provide a solution that works for all pods.

We had a new idea for a custom flow that does not use rollouts: use labels to detach pods from their controllers (ReplicaSets) by adding/removing a label on all pods on the node.

So the new drain flow would look like this:

  1. Cordon the node.
  2. Change labels on all pods on that node.
  3. Wait 90 seconds (when a spot instance is interrupted we need to handle this in no more than 120 seconds).
  4. Drain the node.

It's not perfect, but it will reduce the impact of draining nodes (rough sketch below).
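
A shell sketch of that label-based flow, under the assumption that the pods are Deployment-managed (so each ReplicaSet selector includes pod-template-hash, and removing that label detaches the pod without touching the labels a Service selects on); NODE, the timing, and the drain flags are placeholders:

```bash
#!/usr/bin/env bash
# Rough sketch of the label-detach flow. Removing pod-template-hash makes the
# owning ReplicaSet stop matching the pod (so it creates a replacement) while
# the old pod keeps its app labels and continues to receive Service traffic.
set -euo pipefail
NODE="$1"

# 1. Cordon the node.
kubectl cordon "$NODE"

# 2. Detach every pod on the node from its ReplicaSet.
kubectl get pods --all-namespaces --field-selector "spec.nodeName=$NODE" \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}' \
  | while read -r ns pod; do
      kubectl -n "$ns" label pod "$pod" pod-template-hash- || true
    done

# 3. Give the replacement pods time to come up; a spot interruption leaves
#    roughly 120 seconds in total.
sleep 90

# 4. Drain the node. --force is needed because the detached pods no longer
#    have a controller.
kubectl drain "$NODE" --ignore-daemonsets --delete-emptydir-data --force
```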

@ellistarn
Contributor

What about a pre-stop command? https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/

@dorsegal
Author

What about a pre-stop command? https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/

Since pre-stop does not put the container into a terminating state, the Kubernetes scheduler does not know to spin up a new pod.

@ellistarn
Contributor

ellistarn commented Oct 31, 2022

IIUC, it should go into terminating, which will trigger the pod's replicaset to create a new one.

PreStop hooks are not executed asynchronously from the signal to stop the Container; the hook must complete its execution before the TERM signal can be sent. If a PreStop hook hangs during execution, the Pod's phase will be Terminating and remain there until the Pod is killed after its terminationGracePeriodSeconds expires.
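
For reference, a preStop delay is typically wired up as below (a hedged example; the Deployment name, image, and durations are made up), with terminationGracePeriodSeconds sized to cover the hook plus the application's own shutdown:

```bash
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      terminationGracePeriodSeconds: 60   # must cover the preStop delay + shutdown
      containers:
      - name: my-app
        image: registry.example.com/my-app:latest
        lifecycle:
          preStop:
            exec:
              command: ["sh", "-c", "sleep 30"]   # delay SIGTERM by 30s
EOF
```

As the next comment points out, the pod is already Terminating while the hook runs, so it is removed from Service endpoints during the delay.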

@dorsegal
Author

dorsegal commented Nov 4, 2022

IIUC, it should go into terminating, which will trigger the pod's replicaset to create a new one.

PreStop hooks are not executed asynchronously from the signal to stop the Container; the hook must complete its execution before the TERM signal can be sent. If a PreStop hook hangs during execution, the Pod's phase will be Terminating and remain there until the Pod is killed after its terminationGracePeriodSeconds expires.

It even makes it worse :) since the pod is terminating, requests no longer reach that pod, which means we get degradation until the new pods are available.

The idea is to not terminate pods until new pods are available, just like a rollout restart (see the sketch below).
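
For context, the surge-before-terminate behavior of a rollout restart is controlled by the Deployment's update strategy; a hedged example (the deployment name is a placeholder), with the caveat that a node drain goes through the eviction API and is not bound by this strategy:

```bash
# Never drop below the desired replica count during rollouts: bring a surge
# pod up first, then terminate the old one.
kubectl patch deployment/my-app --type merge \
  -p '{"spec":{"strategy":{"type":"RollingUpdate","rollingUpdate":{"maxSurge":1,"maxUnavailable":0}}}}'

# A rollout restart then replaces pods without reducing availability.
kubectl rollout restart deployment/my-app
```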

@tath81

tath81 commented Mar 14, 2023

This is a similar issue we're also running into, where the node(s) terminate before the scheduled pods are in a running state on the new node(s).

@sftim

sftim commented Jun 5, 2023

I actually think a better approach here is to move https://www.medik8s.io/maintenance-node/ to be an official (out of tree, but official) Kubernetes API, and then use that when it's available in a cluster.

You could customize behavior by using your own controller rather than the default one, and keep the API the same for other parties such as kubectl and Karpenter.

Yes, it's a big change. However, it's easier than solving the n-to-m relationship between all the things that might either drain a node or watch a drain happen.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 18, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 17, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned (won't fix, can't repro, duplicate, stale) May 17, 2024
@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@Bharath509

Bharath509 commented Jun 4, 2024

I'm also facing the same problem. Please reopen this issue.

@k8s-ci-robot
Contributor

@jsamuel1: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@vainkop

vainkop commented Sep 13, 2024

It would be nice to see a proper solution.
