-
-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Instance Refresh for EC2 Auto Scaling #929
Comments
Blocked first on support being available in the AWS provider. Looks like there's a WIP PR over there: hashicorp/terraform-provider-aws#13791 The wiring for this would be complicated if you want to avoid service interruption. You need to inform kubernetes that a node is about to go away via a drain call. Ideally you need spare capacity or the replacement node already running and fully joined to the cluster to avoid downtime. How would the notification for new-node-ready be linked to something as primitive as this instance refresh? A rolling refresh of this type would potentially cause a lot of pod interruption unless you cordon the whole ASG first. But then what happens if the rollout failed and all your nodes are cordoned? I think this is outside of the scope of this module. Look at the complexity involved in #937 for setting up node draining. Maybe some day the managed node groups feature will be useful enough for everybody? Although you still need a system outside of terraform to trigger a rollout of them. |
@dpiddockcmp I think if the AWS node termination handler added support (aws/aws-node-termination-handler#181) then this module could provide the trigger to start the process as a user option. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
For the records:
|
@barryib thanks for the update. |
Hi all. We've just merged hashicorp/terraform-provider-aws#16678. Instance Refresh support will be shipping in AWS Provider v3.22.0 today |
Would be willing to take care of integrating the new instance refresh support into this module. I'd introduce a flag to allow to enable/disable the instance refresh. What should be the default? What would be sane defaults for warmup time and min healthy percentage? |
@brennerm guess the default should be |
@brennerm Is this work in progress? |
@ffjia Not sure about that. Think of people that are running stateful applications that can't handle being shutdown at any time. Additionally it would introduce a change to the current behavior of this module that users may not expect. Not tested it yet but I would simply make a change to the launch template/configuration and see if the instances refresh @bashims Not yet, but I'd start this weekend. FYI would go with the default values of AWS regarding warmup time and min healthy percentage. |
@brennerm thanks for the quick reply! I am mostly done with the updates since we need it asap. Mind if I lend a hand here? I agree that it should not be enabled by default. |
Sure go ahead. No need to make the work twice. |
According to the Terraform document, instance refresh will be triggered by |
I've tested the @ffjia I agree with you. ASG recreation was always a bit of a hack to work around a missing feature in AWS. Unfortunately we have incurred downtime when when transitioning between ASGs and will greatly benefit from the instance refresh feature. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This should be left open as PR #1224 will close. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
I'm submitting a...
What is the current behavior?
When this module makes a change to the worker group launch template an external action is required to update the workers to use the new launch template.
If this is a bug, how to reproduce? Please include a code sample if relevant.
n/a
What's the expected behavior?
I'm interested to see if the new Instance Refresh for EC2 Auto Scaling would be something that could be leveraged by this module to automate worker upgrades?
Currently the best solution I've found for this is the HelloFresh EKS Rolling Update but due to the fact that this is synchronous and time consuming it doesn't seem like a good fit for a Terraform module. I'd be interested to see if Instance Refresh for EC2 Auto Scaling is an alternative that can be embedded in a Terraform module.
Are you able to fix this problem and submit a PR? Link here if you have already.
n/a
Environment details
n/a
Any other relevant info
n/a
The text was updated successfully, but these errors were encountered: