
[AWS][Cluster Autoscale] Cluster Autoscaler is randomly adding and deleting nodes in the node groups, results in uneven node distribution across different zones #3082

Closed
avedpathak opened this issue Apr 23, 2020 · 6 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@avedpathak

Although all ASGs have the same labels and the same instance size, and balance-similar-node-groups is enabled, the Cluster Autoscaler does not balance or evenly distribute instances across them.

We have 4 ASGs, one ASG per AZ.
There are currently 6 instances in ASG-A, 2 in ASG-B, 1 in ASG-C, and 1 in ASG-D.

  1. Cloud provider: AWS
  2. EKS version is 1.15.6
  3. The current app version is k8s.gcr.io/cluster-autoscaler:v1.14.6 and Chart version is cluster-autoscaler-6.2.0
  4. We have multiple AZs in the EKS cluster and each AZ has one ASG. The minimum number of nodes in each ASG is set to 1 and the maximum to 6.
  5. Each ASG has the two tags the Cluster Autoscaler works with, which it uses to identify the ASGs/node groups to scale: k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/eks (as configured in --node-group-auto-discovery below).
  6. We are using a LaunchConfiguration to launch the nodes.
  7. Instance types are also similar. When the Cluster Autoscaler attempts to discover similar node groups, it requires an exact match in memory capacity (a quick way to check this is sketched after this list).
  8. All nodes have the same labels (verified with kubectl get nodes --show-labels).
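Regarding point 7, a minimal way to check whether the nodes in the different ASGs really report identical memory capacity (assuming kubectl access to the cluster):

$ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.capacity.memory}{"\n"}{end}'

If the reported capacities differ between ASGs, the autoscaler may not treat the node groups as similar.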

Container Configuration:

containers:

  - command:
    - ./cluster-autoscaler
    - --cloud-provider=aws
    - --namespace=utilities
    - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/eks
    - --balance-similar-node-groups=true
    - --logtostderr=true
    - --stderrthreshold=warning
    - --v=0
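To confirm which ASGs the --node-group-auto-discovery tags above actually match, and which min/max/desired sizes the autoscaler will see for them, something like this can be run (a sketch, assuming the AWS CLI is configured for the cluster's account and region):

$ aws autoscaling describe-auto-scaling-groups \
    --query "AutoScalingGroups[?Tags[?Key=='k8s.io/cluster-autoscaler/enabled']].[AutoScalingGroupName,MinSize,MaxSize,DesiredCapacity]" \
    --output table

All four ASGs should show up here with the expected sizes.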

Status ConfigMap contents:
status: |+
Cluster-autoscaler status at 2020-04-23 09:44:25.020239986 +0000 UTC:
Cluster-wide:
Health: Healthy (ready=12 unready=0 notStarted=0 longNotStarted=0 registered=13 longUnregistered=0)
LastProbeTime: 2020-04-23 09:44:25.016382263 +0000 UTC m=+25908.679659720
LastTransitionTime: 2020-04-23 02:33:05.172110061 +0000 UTC m=+28.835387548
ScaleUp: NoActivity (ready=12 registered=13)
LastProbeTime: 2020-04-23 09:44:25.016382263 +0000 UTC m=+25908.679659720
LastTransitionTime: 2020-04-23 09:33:51.210503791 +0000 UTC m=+25274.873781228
ScaleDown: NoCandidates (candidates=0)
LastProbeTime: 2020-04-23 09:44:25.016382263 +0000 UTC m=+25908.679659720
LastTransitionTime: 2020-04-23 09:43:44.950793146 +0000 UTC m=+25868.614070533
NodeGroups:
Name: eks-travel-qa-subnet-1e46c654-workers-NodeGroup-PBB6IJ6ZNHKF
Health: Healthy (ready=0 unready=0 notStarted=0 longNotStarted=0 registered=0 longUnregistered=0 cloudProviderTarget=0 (minSize=0, maxSize=0))
LastProbeTime: 0001-01-01 00:00:00 +0000 UTC
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
ScaleUp: NoActivity (ready=0 cloudProviderTarget=0)
LastProbeTime: 0001-01-01 00:00:00 +0000 UTC
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
ScaleDown: NoCandidates (candidates=0)
LastProbeTime: 2020-04-23 09:44:25.016382263 +0000 UTC m=+25908.679659720
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
Name: eks-travel-qa-subnet-4d4d8c2a-workers-NodeGroup-PC6EPZSXRVAT
Health: Healthy (ready=0 unready=0 notStarted=0 longNotStarted=0 registered=0 longUnregistered=0 cloudProviderTarget=0 (minSize=0, maxSize=0))
LastProbeTime: 0001-01-01 00:00:00 +0000 UTC
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
ScaleUp: NoActivity (ready=0 cloudProviderTarget=0)
LastProbeTime: 0001-01-01 00:00:00 +0000 UTC
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
ScaleDown: NoCandidates (candidates=0)
LastProbeTime: 2020-04-23 09:44:25.016382263 +0000 UTC m=+25908.679659720
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
Name: eks-travel-qa-subnet-9e48b5c2-workers-NodeGroup-J61VTAEY3A7U
Health: Healthy (ready=0 unready=0 notStarted=0 longNotStarted=0 registered=0 longUnregistered=0 cloudProviderTarget=0 (minSize=0, maxSize=0))
LastProbeTime: 0001-01-01 00:00:00 +0000 UTC
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
ScaleUp: NoActivity (ready=0 cloudProviderTarget=0)
LastProbeTime: 0001-01-01 00:00:00 +0000 UTC
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
ScaleDown: NoCandidates (candidates=0)
LastProbeTime: 2020-04-23 09:44:25.016382263 +0000 UTC m=+25908.679659720
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
Name: eks-travel-qa-subnet-faf608d4-workers-NodeGroup-NW2O6EXRYLO
Health: Healthy (ready=0 unready=0 notStarted=0 longNotStarted=0 registered=0 longUnregistered=0 cloudProviderTarget=0 (minSize=0, maxSize=0))
LastProbeTime: 0001-01-01 00:00:00 +0000 UTC
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
ScaleUp: NoActivity (ready=0 cloudProviderTarget=0)
LastProbeTime: 0001-01-01 00:00:00 +0000 UTC
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC
ScaleDown: NoCandidates (candidates=0)
LastProbeTime: 2020-04-23 09:44:25.016382263 +0000 UTC m=+25908.679659720
LastTransitionTime: 0001-01-01 00:00:00 +0000 UTC

Any help would be appreciated, Thanks.

@avedpathak avedpathak changed the title Cluster Autoscaler is randomly adding and deleting nodes in those node groups, results in uneven node distribution across different zones [AWS][Cluster Autoscale] Cluster Autoscaler is randomly adding and deleting nodes in those node groups, results in uneven node distribution across different zones Apr 23, 2020
@avedpathak avedpathak changed the title [AWS][Cluster Autoscale] Cluster Autoscaler is randomly adding and deleting nodes in those node groups, results in uneven node distribution across different zones [AWS][Cluster Autoscale] Cluster Autoscaler is randomly adding and deleting nodes in the node groups, results in uneven node distribution across different zones Apr 23, 2020
@r8474

r8474 commented Jul 20, 2020

We are seeing very similar behaviour in multiple clusters. Each cluster has 3 ASGs (3 AZs) with varying maximum and minimum instance numbers.

Some current ASG numbers:
5,4,0 (min 0, max 5)
3,2,0 (min 0, max 5)
3,0,0 (min 0, max 5)
4,2,1 (min 1, max 5)

  1. Cloud provider: AWS
  2. EKS version: 1.16
  3. CA version: k8s.gcr.io/cluster-autoscaler:v1.16.4
    Chart version: cluster-autoscaler-7.2.2
  4. Each AZ has one ASG. The minimum number of nodes in each ASG is either 0 or 1 and the maximum number of nodes varies.
  5. Each ASG has two tags which CA uses:
    k8s.io/cluster-autoscaler/enabled
    k8s.io/cluster-autoscaler/cluster-name
  6. We are using Launch Configurations
  7. Only one instance type is set per ASG
  8. All nodes have the same labels

Command:
./cluster-autoscaler
--cloud-provider=aws
--namespace=kube-system
--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/cluster-name
--balance-similar-node-groups=true
--expander=random
--leader-elect=true
--logtostderr=true
--scale-down-enabled=true
--scale-down-unneeded-time=10m
--scale-down-unready-time=10m
--scale-down-utilization-threshold=0.3
--scan-interval=10s
--skip-nodes-with-local-storage=false
--skip-nodes-with-system-pods=false
--stderrthreshold=info
--v=2
--write-status-configmap=true
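Since --write-status-configmap=true is set, the autoscaler keeps the same kind of status shown in the original report in a ConfigMap that can be inspected directly; a minimal check, assuming the default ConfigMap name:

$ kubectl -n kube-system get configmap cluster-autoscaler-status -o yaml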

@carlosjgp

carlosjgp commented Aug 5, 2020

Digging a little bit into the CA source code, I've seen this:
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/aws_cloud_provider.go#L85

// NodeGroups returns all node groups configured for this cloud provider.
func (aws *awsCloudProvider) NodeGroups() []cloudprovider.NodeGroup {
	asgs := aws.awsManager.getAsgs()
	ngs := make([]cloudprovider.NodeGroup, len(asgs))
	for i, asg := range asgs {
		ngs[i] = &AwsNodeGroup{
			asg:        asg,
			awsManager: aws.awsManager,
		}
	}

	return ngs
}

Meaning, if I understood correctly how CA works, that each ASG is a node group in itself, rather than ASGs being grouped by tags, a name pattern, or anything similar...

That said, the behaviour @r8474 (a coworker) saw on k8s.gcr.io/cluster-autoscaler:v1.16.4 is a little bit different from k8s.gcr.io/autoscaling/cluster-autoscaler:v1.16.6, released yesterday, which we tried because we thought this fix would help:
https://github.com/kubernetes/autoscaler/releases/tag/cluster-autoscaler-1.16.6

Nodes with small difference in available memory will now be considered similar for the purposes of balancing NodeGroup sizes. This should increase the reliability of NodeGroup balancing on some providers (#3124).

and I've observed these log entries (<INSTANCE-IP> and <REGION> are placeholders 😅):

 node_tree.go:93] Added node "ip-<INSTANCE-IP>.<REGION>.compute.internal" in group "<REGION>:\x00:<REGION>a" to NodeTree
 node_tree.go:93] Added node "ip-<INSTANCE-IP>.<REGION>.compute.internal" in group "<REGION>:\x00:<REGION>b" to NodeTree
 node_tree.go:93] Added node "ip-<INSTANCE-IP>.<REGION>.compute.internal" in group "<REGION>:\x00:<REGION>c" to NodeTree
...

reaching a more evenly distributed cluster across AZs. A quick test:

$ kubectl create deployment --image nginx nginx

Set resource requests and limits. I'm using 2Gi of memory and 500m of CPU:

$ kubectl edit deployments.apps nginx
$ kubectl scale deployment nginx --replicas 30
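For reference, the requests and limits can also be set non-interactively instead of via kubectl edit (assuming the same 2Gi / 500m values for both requests and limits):

$ kubectl set resources deployment nginx --requests=cpu=500m,memory=2Gi --limits=cpu=500m,memory=2Gi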

Sit back and relax...

In the end, the ASGs for the chosen instance type were scaled evenly:

3,3,3 for m5a.xlarge instance types

(Please bear in mind that we have only been running this CA version for a couple of hours... I'll repeat the test a couple of times over this week and see what happens)

At the moment we have 3 different ASGs with the same configuration, one per AZ, but maybe the proper way of doing this is to use one single ASG spanning multiple AZs:
https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_group#availability_zones

but the AWS docs say:
https://docs.aws.amazon.com/eks/latest/userguide/cluster-autoscaler.html

If you are running a stateful application across multiple Availability Zones that is backed by Amazon EBS volumes and using the Kubernetes Cluster Autoscaler, you should configure multiple node groups, each scoped to a single Availability Zone. In addition, you should enable the --balance-similar-node-groups feature. Otherwise, you can create a single node group that spans multiple Availability Zones.

which makes specific reference to StatefulSets, but do the same rules apply to other workloads...???

I hope this helps someone to shed some light on this issue.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 3, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 3, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
