Machine pool nodes are not drained during upgrade #2170

Closed
bdehri opened this issue Mar 20, 2023 · 5 comments
Assignees
Labels: area/kaas, kind/bug, provider/cluster-api-aws, topic/capi

Comments

bdehri commented Mar 20, 2023

Issue

Machine pool nodes are not drained during a cluster upgrade.

It's probably the root cause of #1993.

@calvix changed the title from "during and machine pool update, the nodes are tear down without draining" to "during a machine pool update, the nodes are tear down without draining" on Mar 21, 2023

fiunchinho commented Mar 22, 2023

I did a bit of research on the topic. Draining of worker nodes when using AWSMachinePools is not currently implemented in CAPA. The upstream issue tracking it is "AWSMachinePool graceful scale down".

The feature is blocked by the upstream "Graduation of EventBridge Feature". There is also an ADR explaining the proposal.

There was a PR to implement the feature but it was closed because of the mentioned "Graduation of EventBridge Feature".

I believe the final implementation would involve deploying this component in our clusters: https://github.com/aws/aws-node-termination-handler

fiunchinho commented

There is also kubernetes-sigs/cluster-api-provider-aws#2023


fiunchinho commented Mar 23, 2023

There is a built-in Kubernetes mechanism that we could try: https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta/

The settings needed to configure it (like ShutdownGracePeriod and ShutdownGracePeriodCriticalPods) are kubelet configuration fields and are not available as kubelet flags, so they can't be set using kubeletExtraFlags. That means it wouldn't work with our current approach of configuring the kubelet through kubeadm and the kubeletExtraFlags field. I opened kubernetes-sigs/cluster-api#8348 upstream for this.
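
As a rough illustration of the two settings mentioned above, the sketch below builds a minimal kubelet configuration document containing them and shows how the grace period is split: shutdownGracePeriod is the total delay, and shutdownGracePeriodCriticalPods is the part of that delay reserved for critical pods. The lowercase field names and the v1beta1 API group come from the upstream KubeletConfiguration type. The helper names and the 30s/10s values are assumptions chosen for illustration, not what was deployed.

```python
import json


def parse_seconds(value: str) -> int:
    """Parse a whole-second duration such as "30s" (only the "s" suffix is handled)."""
    if not value.endswith("s") or not value[:-1].isdigit():
        raise ValueError(f"unsupported duration: {value!r}")
    return int(value[:-1])


def shutdown_budget(total: str, critical: str) -> dict:
    """Return how many seconds regular and critical pods get during node shutdown."""
    total_s = parse_seconds(total)
    critical_s = parse_seconds(critical)
    if critical_s > total_s:
        raise ValueError("critical pod grace period cannot exceed the total grace period")
    return {"regular_pods_s": total_s - critical_s, "critical_pods_s": critical_s}


def kubelet_config(total: str, critical: str) -> dict:
    """Build a minimal KubeletConfiguration document with the shutdown settings."""
    shutdown_budget(total, critical)  # validate the pair before emitting it
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        "shutdownGracePeriod": total,
        "shutdownGracePeriodCriticalPods": critical,
    }


if __name__ == "__main__":
    # Example values only: 30s in total, of which the last 10s are for critical pods.
    print(shutdown_budget("30s", "10s"))
    print(json.dumps(kubelet_config("30s", "10s"), indent=2))
```

With these example values, regular pods get the first 20 seconds and critical pods the remaining 10.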

@alex-dabija added the area/kaas, team/hydra, topic/capi, provider/cluster-api-aws, and kind/bug labels on Mar 27, 2023
@alex-dabija changed the title from "during a machine pool update, the nodes are tear down without draining" to "Machine pool nodes are not drain during upgrade" on Mar 27, 2023
@mnitchev self-assigned this on Mar 29, 2023
mnitchev commented

Regarding ShutdownGracePeriod above: kubeadm allows patching the kubelet config (see here), but this was only added in 1.25. I'll see if this can be done with a pre/post kubeadm command.
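
For versions before 1.25, one way a pre/post kubeadm command could approach this is to append the settings to the kubelet config file on the node and restart the kubelet so it picks them up. The following is a minimal sketch under that assumption, not the approach that was eventually released. The function name, the example values, and the line-based handling of the YAML text are all hypothetical.

```python
# Hypothetical example values; the real values would depend on the workloads.
SHUTDOWN_SETTINGS = {
    "shutdownGracePeriod": "30s",
    "shutdownGracePeriodCriticalPods": "10s",
}


def ensure_settings(config_text: str, settings: dict) -> str:
    """Append top-level `key: value` lines for settings that are not already present.

    This is a deliberately simple line-based approach: it only looks at
    unindented, non-comment lines and never rewrites existing values.
    """
    present = {
        line.split(":", 1)[0]
        for line in config_text.splitlines()
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line
    }
    missing = [f"{key}: {value}" for key, value in settings.items() if key not in present]
    if not missing:
        return config_text
    # Make sure the existing content ends with a newline before appending.
    body = config_text if config_text.endswith("\n") or not config_text else config_text + "\n"
    return body + "\n".join(missing) + "\n"


if __name__ == "__main__":
    # Stand-in for the contents of the kubelet config file that kubeadm writes
    # (by default /var/lib/kubelet/config.yaml).
    existing = "apiVersion: kubelet.config.k8s.io/v1beta1\nkind: KubeletConfiguration\n"
    print(ensure_settings(existing, SHUTDOWN_SETTINGS), end="")
```

The function is idempotent, so running the command again on a node that already has the settings leaves the file unchanged.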

@mnitchev changed the title from "Machine pool nodes are not drain during upgrade" to "Machine pool nodes are not drained during upgrade" on Apr 3, 2023

mnitchev commented Apr 6, 2023

Released in [email protected]
A bit more info on how it works in the PR: giantswarm/cluster-aws#276
