Allow balancing by labels exclusively #4174

jsravn · 2021-06-29T10:28:41Z

Adds a new flag --balancing-label which allows users to balance between node groups exclusively via labels when --balance-similar-node-groups is in use.

This gives users the flexibility to specify the similarity logic themselves rather than trying to work within the existing heuristics.

This is a POC based on my idea in #4165.

An example use case is a heavily diversified AWS deployment with per-AZ node groups to accommodate zone affinity restrictions of EBS volumes. The current heuristic logic makes this difficult to accomplish in combination with the least-waste expander, as the template node info for empty node groups rarely matches existing nodes (for many reasons, as discussed in the linked issue and elsewhere). This change makes it very simple - just specify the labels you want to compare on, such as node.kubernetes.io/instance-type, and that's it.
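As a rough illustration of the use case above, the sketch below buckets per-AZ node groups by the value of a chosen label, so that groups sharing an instance type land in the same balancing set. The `nodeGroup` type and the `balancingKey` and `groupByLabels` helpers are hypothetical and are not part of cluster-autoscaler; a missing label is treated as an empty value here purely for brevity.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// nodeGroup is a hypothetical stand-in for a node group and the labels
// found on its template node.
type nodeGroup struct {
	Name   string
	Labels map[string]string
}

// balancingKey joins the values of the chosen labels into one key.
// Groups that produce the same key would be treated as similar.
func balancingKey(g nodeGroup, labels []string) string {
	parts := make([]string, 0, len(labels))
	for _, l := range labels {
		parts = append(parts, l+"="+g.Labels[l])
	}
	return strings.Join(parts, ",")
}

// groupByLabels buckets node group names by their balancing key.
func groupByLabels(groups []nodeGroup, labels []string) map[string][]string {
	out := map[string][]string{}
	for _, g := range groups {
		k := balancingKey(g, labels)
		out[k] = append(out[k], g.Name)
	}
	for _, names := range out {
		sort.Strings(names)
	}
	return out
}

func main() {
	const instanceType = "node.kubernetes.io/instance-type"
	groups := []nodeGroup{
		{Name: "m5-a", Labels: map[string]string{instanceType: "m5.large"}},
		{Name: "m5-b", Labels: map[string]string{instanceType: "m5.large"}},
		{Name: "c5-a", Labels: map[string]string{instanceType: "c5.large"}},
	}
	// Only the listed labels are compared; everything else is ignored.
	fmt.Println(groupByLabels(groups, []string{instanceType}))
}
```

The point of the design is that only the listed labels take part in the comparison, so differences in allocatable memory or other labels between template nodes and real nodes no longer matter.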

k8s-ci-robot · 2021-06-29T10:28:49Z

Welcome @jsravn!

It looks like this is your first PR to kubernetes/autoscaler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/autoscaler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

bpineau

That's a great idea!
Unpredictable kernel memory reservations on AWS instances are a real issue for balance-similar-node-groups. We unsuccessfully tried other approaches here, here, here and a few more times.

bpineau · 2021-07-02T10:44:28Z

cluster-autoscaler/processors/nodegroupset/label_nodegroups.go

+
+import (
+	klog "k8s.io/klog/v2"
+	schedulerframework "k8s.io/kubernetes/pkg/scheduler/framework/v1alpha1"


master/HEAD uses k8s.io/kubernetes/pkg/scheduler/framework (not v1alpha1 anymore, since 0fb897b)

bpineau · 2021-07-02T10:48:33Z

cluster-autoscaler/processors/nodegroupset/label_nodegroups.go

+		val1 := n1.Node().ObjectMeta.Labels[label]
+		val2 := n2.Node().ObjectMeta.Labels[label]
+		if val1 != val2 {
+			klog.V(4).Infof("%s label did not match. %s: %s, %s: %s", label, n1.Node().Name, val1, n2.Node().Name, val2)


Do we need that log emitted (at that low log level)? That will be very verbose on large clusters.

I thought it would be helpful to debug why things are not being balanced. Mostly from my struggles trying to debug the existing balancing code :). But it may not be necessary given you control the labels used here.

What level would be good? Or should I remove it completely?

I would actually like to see this loop through all labels and log all mismatches. I really like to see all the reasons, not just the first one. This often saves time.

After spending half a day debugging label mismatches, this log output would be great.
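The request above, to report every mismatching label rather than stopping at the first one, could look roughly like the sketch below. It works on plain label maps instead of NodeInfo objects, and `labelMismatches` is a hypothetical helper name, not something from the PR.

```go
package main

import "fmt"

// labelMismatches returns one description per label whose value differs
// between the two label sets, instead of returning at the first difference.
func labelMismatches(labels []string, a, b map[string]string) []string {
	var out []string
	for _, l := range labels {
		if a[l] != b[l] {
			out = append(out, fmt.Sprintf("%s: %q != %q", l, a[l], b[l]))
		}
	}
	return out
}

func main() {
	n1 := map[string]string{"example.com/pool": "a", "example.com/tier": "web"}
	n2 := map[string]string{"example.com/pool": "b", "example.com/tier": "db"}
	labels := []string{"example.com/pool", "example.com/tier"}

	// A caller could log each entry, then treat any mismatch as "not similar".
	for _, m := range labelMismatches(labels, n1, n2) {
		fmt.Println(m)
	}
}
```

Collecting the mismatches first keeps the comparison result the same (any mismatch means the groups are not similar) while leaving the choice of log level to the caller.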

jsravn · 2021-07-05T14:18:40Z

I've been testing this in my dev clusters (100+ nodes) and it is working really well. I'm getting perfect balancing across node groups with the same instance types.

tsunamishaun · 2021-07-16T17:34:03Z

I've had to disable balance similar with the priority expander, as it ignores the priority list. I have two node groups, one for spot and one for on-demand, with the same taint and workergroup labels. The only difference is the market option label, which I don't use for affinity etc. I'm pretty sure this would fix my problem, letting me re-enable balance similar and use --balancing-label set to my workergroup label/topology?

jsravn · 2021-07-20T09:11:51Z

I've had to disable balance similar with the priority expander, as it ignores the priority list. I have two node groups, one for spot and one for on-demand, with the same taint and workergroup labels. The only difference is the market option label, which I don't use for affinity etc. I'm pretty sure this would fix my problem, letting me re-enable balance similar and use --balancing-label set to my workergroup label/topology?

I haven't tested it, but assuming balancing works w/ the priority expander, then yes this should help your usecase.

syscod3 · 2021-08-23T14:49:14Z

Hello. I was wondering if this PR can be merged now? It would solve a lot of people's issues.

jsravn · 2021-08-24T12:56:25Z

I pushed changes for the review comments. I suppose it needs to be discussed at the SIG meeting?

artificial-aidan · 2021-10-27T18:45:51Z

This is working well so far in my testing. Any updates?

jaypipes

@jsravn I support this addition. I think it's a simple solution to the problem you describe in #4165 and simple == good, at least for me!

Couple things I'd like to see, however:

Please add at least a unit test for the new functionality.
A discussion, either in the areLabelsSame code comment or in the AWS FAQ section about multiple instance type node groups, of how this technique will not work with ASGs that have a mixed instance policy. The NodeInfo object is returned for the first instance in the InstanceGroup, and that instance only has a single instance-type label. Depending on which instance type each NodeGroup happened to launch first, the instance-type label may therefore differ between them, which will cause the areLabelsSame call to return false.

jsravn · 2021-11-23T16:06:11Z

@jaypipes I've addressed your comment and rebased. Let me know what you think.

jaypipes

Great stuff, @jsravn, much appreciated!

jaypipes · 2021-11-23T17:13:23Z

cluster-autoscaler/cloudprovider/aws/README.md

@@ -304,6 +304,10 @@ spec:
                  - i3.2xlarge
 ```

+Similarly, if using the `balancing-label` flag, you should only choose labels which have the same value for all nodes in
+the node group.  Otherwise you may get unexpected results, as the flag values will vary based on the nodes created by
+the ASG.


👍 thanks for adding this note, @jsravn

jaypipes · 2021-11-23T17:15:44Z

cluster-autoscaler/processors/nodegroupset/label_nodegroups_test.go

+			checkNodesSimilar(t, node1, node2, comparator, tc.isSimilar)
+		})
+	}
+}


++ nice tests.

jsravn · 2021-11-23T23:44:54Z

/assign @towca

TBBle · 2022-01-26T04:23:56Z

To avoid any confusion for users, perhaps there should be some kind of failure or complaint if both balancing-ignore-label and balancing-label are provided, even if it's just a log warning that the former will be ignored. A start-up failure with an error seems reasonable to me too. The semantics of providing both flags are pretty clear (since 'ignored labels' includes all labels not listed in balancing-labels by default, so the ignore is redundant) except if the user provides the same label in both lists for some reason, which implies a misunderstanding of the intended behaviour.
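A start-up check along these lines might look like the sketch below. It is only an illustration of the suggestion: `checkBalancingFlags` is a hypothetical helper, and the choice to warn in the redundant case but fail when a label appears in both lists is an assumption, not something the PR settled on.

```go
package main

import "fmt"

// checkBalancingFlags is a hypothetical start-up check. It returns an error
// when a label appears in both lists, and a warning when the ignore list is
// merely redundant because balancing labels are set.
func checkBalancingFlags(ignoreLabels, balancingLabels []string) (warning string, err error) {
	if len(ignoreLabels) == 0 || len(balancingLabels) == 0 {
		return "", nil
	}
	wanted := make(map[string]bool, len(balancingLabels))
	for _, l := range balancingLabels {
		wanted[l] = true
	}
	for _, l := range ignoreLabels {
		if wanted[l] {
			return "", fmt.Errorf("label %q is listed in both --balancing-label and --balancing-ignore-label", l)
		}
	}
	return "--balancing-ignore-label has no effect when --balancing-label is set", nil
}

func main() {
	warning, err := checkBalancingFlags(
		[]string{"example.com/market"},
		[]string{"node.kubernetes.io/instance-type"},
	)
	fmt.Println("warning:", warning, "error:", err)
}
```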

As a stray thought, since this is actually replacing the behaviour of --balance-similar-node-groups to bypass the check for similarity, does it contrast with that option instead, e.g., should the flag be something like --balance-labelled-node-groups?

The current implementation wouldn't need to change, i.e., something like

BalanceSimilarNodeGroups:           (*balanceSimilarNodeGroupsFlag || len(*balancingLabelsFlag) > 0)

would achieve this by making --balancing-label automatically activate --balance-similar-node-groups, and the rest of the implementation continues to work.
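To make the effect of that expression concrete, here is a small self-contained sketch using the standard `flag` package. The flag names come from this thread; the `multiString` type and `balanceEnabled` function are hypothetical and do not reflect how cluster-autoscaler actually registers its flags.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// multiString collects the values of a flag that may be repeated.
type multiString []string

func (m *multiString) String() string     { return strings.Join(*m, ",") }
func (m *multiString) Set(v string) error { *m = append(*m, v); return nil }

// balanceEnabled parses args and reports whether balancing would be active
// if --balancing-label implied --balance-similar-node-groups.
func balanceEnabled(args []string) (bool, error) {
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	balanceSimilar := fs.Bool("balance-similar-node-groups", false, "balance similar node groups")
	var balancingLabels multiString
	fs.Var(&balancingLabels, "balancing-label", "label to compare node groups on (repeatable)")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *balanceSimilar || len(balancingLabels) > 0, nil
}

func main() {
	enabled, err := balanceEnabled([]string{"--balancing-label=node.kubernetes.io/instance-type"})
	fmt.Println(enabled, err)
}
```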

TBBle · 2022-01-26T04:51:33Z

cluster-autoscaler/processors/nodegroupset/label_nodegroups.go

+		val1 := n1.Node().ObjectMeta.Labels[label]
+		val2 := n2.Node().ObjectMeta.Labels[label]


What happens if the specified label is not present on either NodeInfo? ~~Wouldn't this panic?~~ (Edit: sorry, brainfart. It'll return "", so behaves as "absence is a value" below.)

This seems like an easy use-case to hit if I'm using a custom example.com/balance-group-name label for when I have duplicated a nodegroup across AZs, but am not using it for nodegroups which are not set up that way, e.g., they are AZ-local for reasons other than EBS-selection such as locality to other resources. Such nodegroupsets would still be fed into this comparison function if they otherwise matched the pod-to-be-created, if the pod didn't require that it be deployed on a balanced nodegroup.

And once this isn't panicking, should the behaviour of 'balancing label is not present' be:

"Don't use this nodeset", i.e. the balancing should not even be attempted if the pod-match-chosen n1 doesn't have the desired label, and if the potential-match n2 lacks the label, it always returns false; or

"Absence is also a value", i.e. only match if both NodeInfos lack the label, or if both have the label and the same value for that label; or

something else?

e.g., to implement the "missing label excludes from balancing completely", I guess it'd be:

Suggested change

-		val1 := n1.Node().ObjectMeta.Labels[label]
-		val2 := n2.Node().ObjectMeta.Labels[label]
+		val1 := n1.Node().ObjectMeta.Labels[label]
+		val2 := n2.Node().ObjectMeta.Labels[label]
+		if val1 == "" {
+			klog.V(8).Infof("%s label not present on %s.", label, n1.Node().Name)
+			return false
+		}
+		if val2 == "" {
+			klog.V(8).Infof("%s label not present on %s.", label, n2.Node().Name)
+			return false
+		}

Either way, it'd be good to see the behaviour in the case of one, the other, or both node groups lacking the relevant label in the tests, so we know what was intended.

Good point! My initial thinking is that if the label is missing completely, the node group wouldn't be considered similar to another node group that is also missing the label. The absence of the label would indicate "never consider this similar".

I've added tests and your suggestion. I made a slight change to use an existence check instead of comparing on the empty string - so if a user does explicitly set an empty label value, it will be considered, allowing users to use label keys alone as ways to group nodes.
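The difference between an existence check and an empty-string comparison can be sketched with Go's comma-ok map lookup. This is a simplified illustration on plain label maps, not the merged code; `labelsMatch` is a hypothetical name.

```go
package main

import "fmt"

// labelsMatch reports whether both label sets contain every listed label
// with the same value. A label that is absent on either side never matches,
// while a label that is present with an empty value is compared normally.
func labelsMatch(labels []string, a, b map[string]string) bool {
	for _, l := range labels {
		va, okA := a[l]
		vb, okB := b[l]
		if !okA || !okB || va != vb {
			return false
		}
	}
	return true
}

func main() {
	labels := []string{"example.com/balance-group"}
	empty := map[string]string{"example.com/balance-group": ""}
	missing := map[string]string{}

	fmt.Println(labelsMatch(labels, empty, empty))     // both present, both empty
	fmt.Println(labelsMatch(labels, missing, missing)) // absent on both sides
}
```

With this shape, two groups that both carry the key with an empty value are similar, whereas two groups that both lack the key are not.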

artificial-aidan · 2022-04-15T22:42:40Z

@jsravn are you still working on getting this PR in? Anything I can do to help with the comments from TBBle?

jsravn · 2022-04-19T09:27:53Z

@artificial-aidan I've yet to have a single OWNER review this. I'm hesitant to keep working on this PR until I get some official feedback first. ping @towca

njablonski · 2022-04-28T16:25:47Z

This, as written, would solve for some challenges I have with predicting and controlling the existing similarity logic.

jsravn · 2022-07-06T08:57:49Z

As a stray thought, since this is actually replacing the behaviour of --balance-similar-node-groups to bypass the check for similarity, does it contrast with that option instead, e.g., should the flag be something like --balance-labelled-node-groups?

@TBBle interesting point. In my original thinking, we are still balancing similar node groups. It's just we're replacing the decision logic about what makes node groups similar. If we were to bubble this up into the top level option, I'd rather rename the existing option something like "balance-similar-node-groups-by-size" or something to indicate how it works (although the actual heuristics are more complex than that, and compare many things including labels). Then the user would choose either "balance-similar-node-groups-by-size" or "balance-similar-node-groups-by-labels". Given that breaks backwards compatibility though, not sure it's a great idea. I'm okay with introducing this as a new flag "balance-similar-nodes-by-labels" I think as you suggest. It is pretty straightforward to change it if the maintainers agree.

jsravn · 2022-07-06T09:37:02Z

I've rebased and addressed the edge cases @TBBle pointed out. @mwielgus would you or another reviewer be able to have a look?

jsravn · 2022-07-06T09:38:00Z

/assign @mwielgus

(just noticed towca is no longer in the OWNERS file)

grosser · 2022-07-19T22:37:42Z

FYI used this patch and it works as expected, splitting an 8-node scale-up into 3,3,2 🎉

{"level":"INFO","message":"Estimated 8 nodes needed in usw2a-AutoscalingGroup-a"}
{"level":"INFO","message":"Splitting scale-up between 3 similar node groups: {usw2a-AutoscalingGroup-a, usw2b-AutoscalingGroup-b, usw2c-AutoscalingGroup-c}"}
{"level":"INFO","message":"Final scale-up plan: [{usw2a-AutoscalingGroup-a 1->4 (max: 10)} {usw2b-AutoscalingGroup-b 1->4 (max: 10)} {usw2c-AutoscalingGroup-c 1->3 (max: 10)}]"}
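The split in these logs can be reproduced with a deliberately simplified sketch: hand out new nodes one at a time to whichever group is currently smallest and still below its maximum. This is an assumption about the general idea, not the autoscaler's actual balancing code, and the `group` type and `splitScaleUp` function are hypothetical.

```go
package main

import "fmt"

// group is a hypothetical stand-in for a node group's current and maximum size.
type group struct {
	Name string
	Size int
	Max  int
}

// splitScaleUp distributes newNodes one at a time to the smallest group that
// still has headroom. Ties go to the group that appears first.
func splitScaleUp(groups []group, newNodes int) []group {
	out := append([]group(nil), groups...) // copy so the input is not modified
	for ; newNodes > 0; newNodes-- {
		best := -1
		for i, g := range out {
			if g.Size >= g.Max {
				continue
			}
			if best == -1 || g.Size < out[best].Size {
				best = i
			}
		}
		if best == -1 {
			break // every group is already at its maximum
		}
		out[best].Size++
	}
	return out
}

func main() {
	groups := []group{
		{Name: "usw2a-AutoscalingGroup-a", Size: 1, Max: 10},
		{Name: "usw2b-AutoscalingGroup-b", Size: 1, Max: 10},
		{Name: "usw2c-AutoscalingGroup-c", Size: 1, Max: 10},
	}
	for _, g := range splitScaleUp(groups, 8) {
		fmt.Printf("%s -> %d (max: %d)\n", g.Name, g.Size, g.Max)
	}
}
```

Starting from three groups of size 1, adding 8 nodes this way ends at sizes 4, 4 and 3, which agrees with the final scale-up plan logged above.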

elmiko

this seems like a nice feature, we should probably discuss it at a SIG meeting if we can't get more review here.

elmiko · 2022-07-25T17:05:43Z

we talked about this PR at today's meeting, hopefully we'll get a few more reviews. i think it's a nice feature, code looks good to me.

/lgtm

towca · 2022-08-01T13:30:24Z

Everything looks good to me here, thanks for a very useful contribution!

/lgtm
/approve

k8s-ci-robot · 2022-08-01T13:31:15Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jaypipes, jsravn, towca

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~cluster-autoscaler/OWNERS~~ [towca]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

jbilliau-rcd · 2024-10-31T16:43:02Z

I just switched to this, since my existing use of --balancing-ignore-label apparently broke at some point (it just spun up 11 nodes in one node group, RIP). Using balancing-label: eks.amazonaws.com/nodegroup-image, I am now getting proper balancing across 3 node groups, 3 AZs again. Thanks @jsravn!

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 29, 2021

k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jun 29, 2021

k8s-ci-robot requested review from aleksandra-malinowska and Jeffwan June 29, 2021 10:29

bpineau reviewed Jul 2, 2021

jbartosik added the area/cluster-autoscaler label Jul 22, 2021

jaypipes suggested changes Nov 1, 2021

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 16, 2021

jsravn force-pushed the allow-balancing-by-labels-exclusively branch from 15d07a3 to 756d9d8 Compare November 23, 2021 15:23

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 23, 2021

jaypipes approved these changes Nov 23, 2021

k8s-ci-robot assigned towca Nov 23, 2021

thiagosantosleite mentioned this pull request Jan 25, 2022

Cluster-autoscaler not balancing similar node groups on AWS #3515

Closed

TBBle reviewed Jan 26, 2022

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 4, 2022

jsravn force-pushed the allow-balancing-by-labels-exclusively branch from ab04299 to 6710b4c Compare July 6, 2022 08:27

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 6, 2022

Allow balancing by labels exclusively

1b98b38

Adds a new flag `--balance-label` which allows users to balance between node groups exclusively via labels. This gives users the flexibility to specify the similarity logic themselves when --balance-similar-node-groups is in use.

jsravn force-pushed the allow-balancing-by-labels-exclusively branch from 6710b4c to 1b98b38 Compare July 6, 2022 09:34

k8s-ci-robot assigned mwielgus Jul 6, 2022

elmiko reviewed Jul 21, 2022

k8s-ci-robot assigned elmiko Jul 25, 2022

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 25, 2022

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 1, 2022

k8s-ci-robot merged commit 8ba1853 into kubernetes:master Aug 1, 2022

grosser mentioned this pull request Aug 4, 2022

Provide option to balance label based on the given labels #3615

Closed

navinjoy pushed a commit to navinjoy/autoscaler that referenced this pull request Oct 26, 2022

Merge pull request kubernetes#4174 from jsravn/allow-balancing-by-lab…

a9350c3

…els-exclusively Allow balancing by labels exclusively

jbilliau-rcd mentioned this pull request Oct 31, 2024

Labels match but Cluster Autoscaler says "are not similar, labels do not match" when trying to balance similar node groups. #6954

Closed
