ASG node rotation after launch configuration / template change #333
I'm not going to want my entire cluster to redeploy at the same time. The way I've always done it, and the way I believe it should be done, is to manually drain and terminate one node at a time. The way you have it up there, you will still get an outage because Kubernetes doesn't get the opportunity to gracefully drain the nodes. Once the new auto scaling group comes up, the old one will be terminated with no grace period.
There is also a long-standing bug in Terraform that means the suggested workflow no longer works. The ASG resource does not get correctly tagged for recreation if the launch config changes name. This results in the "cannot delete launch_configuration, resource still in use" error and then a messy Terraform state. hashicorp/terraform#17855
1000% no 😅
This is correct.
Exactly. This is desired. When an ASG is deleted and all the instances are terminated, there's no graceful shutdown of pods. And some pods have complicated or time-consuming processes that need to happen. These processes are correctly handled by the draining process, or when a pod is gracefully stopped. Some info here, but specifically think about StatefulSets with Container Lifecycle Hooks. Here are some notable examples from some common Helm charts:
Personally, for me, changes to the LC are quite a big deal, e.g. a new AMI. In this scenario I will manually drain nodes and wait for the pods to be rescheduled. So perhaps we could implement the more aggressive option of ASG replacement as an option? Currently I think not, as you can't have interpolation in lifecycle rules, unfortunately. @sc250024 perhaps we could solve your problem a different way? Perhaps you could define your ASG outside of this module? Perhaps there are some missing outputs that would make this easy to implement?
An even simpler example: just a database that has an EBS volume mounted in the pod might need some time to shut down.
This is a bigger issue when the module deletes and recreates a launch template. The ASGs do not get updated with the new launch template, leaving ASGs that are basically zombies.
Thanks everyone for the feedback! I guess this is a controversial feature to have, and it seems right now is not the time to think about it. Personally, I'm coming from using Kops, and one feature it has (that @RothAndrew and @max-rocket-internet mentioned) is the ability to rotate nodes out, one at a time, until the change you want is done. Does anyone have a link to an open-source project that does what Kops does, i.e. rotate nodes out one at a time, safely?
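Lacking a ready-made tool, the one-at-a-time rotation described above can be sketched as a simple loop. This is a sketch only: `drain_node` and `terminate_node` are hypothetical placeholders that a real script would back with `kubectl drain` and an AWS CLI termination call.

```shell
#!/bin/bash
set -euo pipefail

# Placeholder implementations; a real rotation would call, for example:
#   kubectl drain "$1" --ignore-daemonsets --delete-local-data --force
#   aws autoscaling terminate-instance-in-auto-scaling-group --instance-id ...
drain_node()     { echo "draining $1"; }
terminate_node() { echo "terminating $1"; }

rotate_nodes() {
  # Process nodes strictly one at a time so capacity never drops by more than one.
  for node in "$@"; do
    drain_node "$node"
    terminate_node "$node"
    # A real script would wait here until the ASG's replacement node is Ready.
  done
}

rotate_nodes "node-a" "node-b"
```

The important property is the sequencing: each node is fully drained before it is terminated, and the next node is not touched until the previous one has been replaced.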
What if we use CloudFormation?
@barryib I think that's more in line with what |
While CloudFormation gives you a lot of control over the update process, e.g. replacing nodes one by one, you'd still need to cordon and drain nodes. Unfortunately, that's not possible using vanilla CloudFormation.
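For reference, the CloudFormation rolling-update behavior mentioned here comes from an `UpdatePolicy` on the ASG resource. A minimal sketch of driving this from Terraform (all resource names and sizes are illustrative, not part of this module) might look like:

```hcl
# Sketch only: manage the ASG through a CloudFormation stack so its
# UpdatePolicy performs a rolling replacement, one instance at a time,
# when the launch configuration changes.
# Note: CloudFormation replaces instances but does NOT cordon/drain them.
resource "aws_cloudformation_stack" "workers" {
  name = "eks-worker-asg"

  template_body = <<-YAML
    Resources:
      WorkerGroup:
        Type: AWS::AutoScaling::AutoScalingGroup
        UpdatePolicy:
          AutoScalingRollingUpdate:
            MaxBatchSize: 1            # replace one instance per batch
            MinInstancesInService: 1   # keep capacity while updating
            PauseTime: PT5M            # wait between batches
        Properties:
          MinSize: "1"
          MaxSize: "5"
          LaunchConfigurationName: ${aws_launch_configuration.workers.name}
          VPCZoneIdentifier: ${jsonencode(var.subnet_ids)}
  YAML
}
```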
https://github.com/aws-samples/amazon-k8s-node-drainer sounds very interesting. Going to give it a try. 👍 At this point, I'm using eksctl to deploy EKS clusters and update worker node groups, mainly because the upgrade path is too cumbersome with Terraform (which I prefer otherwise).
After fixing a bug, I can report that the node drainer works as expected.
@mlafeldt I'll give it a try as well |
@mlafeldt Thanks for your feedback.
Yes, this is expected behavior. I was thinking of using SQS for notifications, with a batch size of 1, so the Lambda function drains nodes one by one.
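The SQS wiring described here might look roughly like this in Terraform. The resource names and the drainer Lambda/IAM role are assumptions for illustration; the point is the lifecycle hook feeding SQS and `batch_size = 1` on the event source mapping.

```hcl
# Sketch: send ASG termination lifecycle events to SQS and let a drainer
# Lambda consume them with batch_size = 1, so nodes are drained one by one.
resource "aws_sqs_queue" "node_termination" {
  name = "eks-node-termination"
}

resource "aws_autoscaling_lifecycle_hook" "drain" {
  name                    = "drain-node"
  autoscaling_group_name  = aws_autoscaling_group.workers.name
  lifecycle_transition    = "autoscaling:EC2_INSTANCE_TERMINATING"
  heartbeat_timeout       = 900        # time the drainer has before termination proceeds
  default_result          = "CONTINUE"
  notification_target_arn = aws_sqs_queue.node_termination.arn
  role_arn                = aws_iam_role.asg_notifications.arn # assumed to exist
}

resource "aws_lambda_event_source_mapping" "drainer" {
  event_source_arn = aws_sqs_queue.node_termination.arn
  function_name    = aws_lambda_function.node_drainer.arn # assumed to exist
  batch_size       = 1 # drain one node per invocation
}
```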
Great.
One thing that's also possible with Terraform is to add a provisioner script to the launch configuration that will drain nodes of the old ASG before deleting it. The script might use node labels added via user data (e.g. `ami_id`). Here's an example script I used for testing:

```shell
#!/bin/bash
set -e
set -x

new_ami_id="$1"

# Cordon all nodes that are not running the new AMI.
kubectl cordon --selector "ami_id!=${new_ami_id}"

# Drain the old nodes one at a time, giving pods time to reschedule.
nodes=$(kubectl get node --selector "ami_id!=${new_ami_id}" -o json | jq -r '.items[].metadata.name')
for node in $nodes; do
  kubectl drain "$node" --ignore-daemonsets=true --delete-local-data --force
  sleep 60
done
```

However, unless this script is super robust (which it clearly isn't at the moment), I consider this a hack more than anything else.
@mlafeldt That's pretty much what I do locally; let this module update the ASGs, then run something like that script. As for the CloudFormation/SQS setup, maybe that could be documented in this repository? Whether or not it's actually implemented in the module is up for debate, but it'd be super cool if there were a document describing how to do it in general.
You can also just filter by
The advantage of calling the script via `local-exec` is that you would never have to run it manually, decoupled from Terraform. At least in theory. 😁 I'm still not sure it's a viable path.
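A `local-exec` hookup for the drain script shown earlier could be sketched like this. The `null_resource` trigger, script path, and variable names are illustrative assumptions, not module features:

```hcl
# Sketch: re-run the drain script whenever the launch configuration changes,
# so old nodes are drained as part of the same terraform apply.
resource "null_resource" "drain_old_nodes" {
  triggers = {
    launch_config = aws_launch_configuration.workers.id
  }

  provisioner "local-exec" {
    command = "./drain-old-nodes.sh ${var.worker_ami_id}"
  }
}
```

One caveat with this design: `local-exec` runs on whatever machine executes `terraform apply`, so that machine needs working `kubectl` credentials for the cluster.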
FWIW, I tested the Lambda function with SQS (the auto scaling group can send lifecycle hook events into a queue). I was able to spawn new nodes and to drain old nodes in batches (one by one in my case). For now, everything looks good. I'll try to test more in the next couple of days. PS: I found another Lambda for EKS draining, https://github.com/awslabs/amazon-eks-serverless-drainer, which is written in a few lines of bash.
@barryib, do you have an example setup for this? |
We've put some thought into this too, but what we're leaning toward is creating ASG scheduled actions in Terraform to handle this rotation for us.
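Scheduled actions in Terraform are `aws_autoscaling_schedule` resources. A hedged sketch of a scale-up-then-down pair (times, sizes, and names are illustrative) might look like:

```hcl
# Sketch: temporarily double capacity overnight, then scale back down.
# Combined with an "OldestInstance" termination policy on the ASG, this
# tends to cycle old nodes out, though it still does not cordon/drain them.
resource "aws_autoscaling_schedule" "scale_up" {
  scheduled_action_name  = "rotation-scale-up"
  autoscaling_group_name = aws_autoscaling_group.workers.name
  recurrence             = "0 2 * * *" # 02:00 UTC daily
  min_size               = 6
  max_size               = 12
  desired_capacity       = 6
}

resource "aws_autoscaling_schedule" "scale_down" {
  scheduled_action_name  = "rotation-scale-down"
  autoscaling_group_name = aws_autoscaling_group.workers.name
  recurrence             = "0 3 * * *" # an hour later
  min_size               = 3
  max_size               = 12
  desired_capacity       = 3
}
```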
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
I have issues
I'm posting this issue before I open a PR around this in order to have a discussion around how to best handle this problem in the context of this module. Currently the module updates launch configurations, and the auto scaling group is updated with this new configuration. However, the old nodes aren't rotated out, and it's left up to the user as to how to rotate out their nodes with the old configuration.
I'm submitting a...
What is the current behavior?
My example came from changing my `t3.medium` worker group to use `t3.large`. While the launch configuration and ASG were updated, the old nodes persist without any change.

What's the expected behavior?
The nodes should be rotated out by the `terraform apply` command using the `create_before_destroy` lifecycle hook to ensure that new nodes are first brought up before the old ones are killed.

There's a great Medium article (Using Terraform for zero downtime updates of an Auto Scaling group in AWS) that has the approach I think should be taken. Essentially, the name of the launch template is used as a prefix for the `aws_autoscaling_group` resource that's created. I would update the module to use this approach for launch configurations / launch templates in order to handle node rotation.
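The pattern from that article can be sketched as follows (attribute values are illustrative): the launch configuration uses `name_prefix`, the ASG's `name` interpolates the launch configuration's generated name, and `create_before_destroy` on both forces Terraform to create the replacement ASG before destroying the old one.

```hcl
resource "aws_launch_configuration" "workers" {
  name_prefix   = "eks-workers-"
  image_id      = var.worker_ami_id
  instance_type = "t3.large"

  # A new LC is created (with a fresh generated name) before the old one
  # is destroyed, avoiding the "resource still in use" error.
  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "workers" {
  # Interpolating the launch configuration's name means any LC change
  # renames the ASG, which forces a full ASG replacement.
  name                 = aws_launch_configuration.workers.name
  launch_configuration = aws_launch_configuration.workers.name
  vpc_zone_identifier  = var.subnet_ids
  min_size             = 3
  max_size             = 6

  lifecycle {
    create_before_destroy = true
  }
}
```

Note that, as several commenters point out above, this replacement still terminates the old instances without cordoning or draining them, so it is not sufficient on its own for graceful pod shutdown.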
Are you able to fix this problem and submit a PR? Link here if you have already.
Yes, but I am curious as to how other people currently solve this issue, or if there's an approach already in the module I haven't considered.
Environment details