
AWS AZ Rebalancing Unexpectedly Terminates Nodes #1744

Closed
cablespaghetti opened this issue Mar 1, 2019 · 5 comments
Labels: area/cluster-autoscaler, area/provider/aws, kind/documentation

Comments

@cablespaghetti
Contributor

Not strictly speaking an autoscaler issue, but I will raise a PR to at least mention this in the docs if nothing else.

We couldn't work out why some of our nodes were being terminated unexpectedly, without being drained first. Our apps need a clean SIGTERM so they can shut down gracefully, or things get a bit wobbly.

I've now traced it to a feature of AWS Auto Scaling Groups which is designed to keep an equal number of instances in each AZ. It automatically spins up a new instance in the AZ with fewer instances and terminates an instance in the AZ with too many, which is of course not very helpful in our use case. https://docs.aws.amazon.com/autoscaling/ec2/userguide/auto-scaling-benefits.html#arch-AutoScalingMultiAZ

Here are the docs on how to disable, or "suspend", this feature for your Auto Scaling Group, which I imagine anyone using this autoscaler on AWS will probably want to do: https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-suspend-resume-processes.html

@bskiba
Member

bskiba commented Mar 4, 2019

Hi there. Is this connected with the open source Cluster Autoscaler in any way or is it only a problem with AWS Autoscaling Groups?

@aleksandra-malinowska aleksandra-malinowska added area/provider/aws Issues or PRs related to aws provider sig/aws labels Mar 4, 2019
@cablespaghetti
Contributor Author

cablespaghetti commented Mar 4, 2019

In my eyes it should be in the documentation for this project, so the information is easily discoverable. I spent a while trying to work out why the K8s autoscaler was behaving oddly and not draining my nodes, when it was actually a side effect of the cluster autoscaler making my cluster "unbalanced" as it scaled down, and AWS trying to rectify that.

@bskiba
Member

bskiba commented Mar 4, 2019

Makes sense. Looks like this could go here as an AWS specific gotcha: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md#common-notes-and-gotchas
Would you be interested in contributing?

@bskiba bskiba added area/cluster-autoscaler kind/documentation Categorizes issue or PR as related to documentation. labels Mar 4, 2019
@cablespaghetti
Contributor Author

cablespaghetti commented Mar 4, 2019 via email

@mtsr

mtsr commented Mar 11, 2019

Has anyone tried whether ASG notifications work normally for rebalancing? It's relatively easy to set up an SNS notification and a Lambda to trigger draining for ECS, and I imagine the same should be doable for K8s.

See this blog post: https://aws.amazon.com/de/blogs/compute/how-to-automate-container-instance-draining-in-amazon-ecs/
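As a rough sketch of the Lambda side of that idea: the handler below parses an Auto Scaling termination notification delivered via SNS and extracts the instance ID. The field names (`Event`, `EC2InstanceId`) follow the EC2 Auto Scaling notification format; the drain step itself is a placeholder, and in practice you would likely pair this with a termination lifecycle hook so the instance waits in `Terminating:Wait` while the node drains.

```python
# Sketch of a Lambda handler reacting to ASG termination notifications via
# SNS. The drain step is a placeholder; a real implementation would map the
# instance ID to a Kubernetes node and cordon/evict its pods before the
# instance goes away (ideally gated by a termination lifecycle hook).
import json


def handler(event, context=None):
    """Return the instance IDs from any termination events in the SNS batch."""
    terminated = []
    for record in event["Records"]:
        message = json.loads(record["Sns"]["Message"])
        if message.get("Event") == "autoscaling:EC2_INSTANCE_TERMINATE":
            instance_id = message["EC2InstanceId"]
            # Placeholder: look up the node name for instance_id and drain it.
            terminated.append(instance_id)
    return terminated
```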

yaroslava-serdiuk pushed a commit to yaroslava-serdiuk/autoscaler that referenced this issue Feb 22, 2024
4 participants