Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to balance nodes using specified labels #3839

Closed
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions cluster-autoscaler/config/autoscaling_options.go
Original file line number Diff line number Diff line change
Expand Up @@ -135,6 +135,8 @@ type AutoscalingOptions struct {
MaxBulkSoftTaintTime time.Duration
// IgnoredTaints is a list of taints to ignore when considering a node template for scheduling.
IgnoredTaints []string
// BalancingLabels is a list of labels to use when comparing if two node groups are similar.
BalancingLabels []string
// BalancingExtraIgnoredLabels is a list of labels to additionally ignore when comparing if two node groups are similar.
// Labels in BasicIgnoredLabels and the cloud provider-specific ignored labels are always ignored.
BalancingExtraIgnoredLabels []string
Expand Down
10 changes: 8 additions & 2 deletions cluster-autoscaler/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -170,6 +170,7 @@ var (
newPodScaleUpDelay = flag.Duration("new-pod-scale-up-delay", 0*time.Second, "Pods less than this old will not be considered for scale-up.")

ignoreTaintsFlag = multiStringFlag("ignore-taint", "Specifies a taint to ignore in node templates when considering to scale a node group")
balancingLabelsFlag = multiStringFlag("balancing-label", "Specifies a label to use when comparing if two node groups are similar")
balancingIgnoreLabelsFlag = multiStringFlag("balancing-ignore-label", "Specifies a label to ignore in addition to the basic and cloud-provider set of labels when comparing if two node groups are similar")
awsUseStaticInstanceList = flag.Bool("aws-use-static-instance-list", false, "Should CA fetch instance types in runtime or use a static list. AWS only")
concurrentGceRefreshes = flag.Int("gce-concurrent-refreshes", 1, "Maximum number of concurrent refreshes per cloud object type.")
Expand Down Expand Up @@ -240,6 +241,7 @@ func createAutoscalingOptions() config.AutoscalingOptions {
Regional: *regional,
NewPodScaleUpDelay: *newPodScaleUpDelay,
IgnoredTaints: *ignoreTaintsFlag,
BalancingLabels: *balancingLabelsFlag,
BalancingExtraIgnoredLabels: *balancingIgnoreLabelsFlag,
KubeConfigPath: *kubeConfigFile,
NodeDeletionDelayTimeout: *nodeDeletionDelayTimeout,
Expand Down Expand Up @@ -309,7 +311,11 @@ func buildAutoscaler() (core.Autoscaler, error) {
opts.Processors.PodListProcessor = core.NewFilterOutSchedulablePodListProcessor()

nodeInfoComparatorBuilder := nodegroupset.CreateGenericNodeInfoComparator
if autoscalingOptions.CloudProviderName == cloudprovider.AzureProviderName {
nodeInfoComparatorBuilderArgs := autoscalingOptions.BalancingExtraIgnoredLabels
if len(autoscalingOptions.BalancingLabels) > 0 {
nodeInfoComparatorBuilder = nodegroupset.CreateGenericNodeInfoLabelComparator
nodeInfoComparatorBuilderArgs = autoscalingOptions.BalancingLabels
} else if autoscalingOptions.CloudProviderName == cloudprovider.AzureProviderName {
nodeInfoComparatorBuilder = nodegroupset.CreateAzureNodeInfoComparator
} else if autoscalingOptions.CloudProviderName == cloudprovider.AwsProviderName {
nodeInfoComparatorBuilder = nodegroupset.CreateAwsNodeInfoComparator
Expand All @@ -318,7 +324,7 @@ func buildAutoscaler() (core.Autoscaler, error) {
}

opts.Processors.NodeGroupSetProcessor = &nodegroupset.BalancingNodeGroupSetProcessor{
Comparator: nodeInfoComparatorBuilder(autoscalingOptions.BalancingExtraIgnoredLabels),
Comparator: nodeInfoComparatorBuilder(nodeInfoComparatorBuilderArgs),
}

// These metrics should be published only once.
Expand Down
20 changes: 17 additions & 3 deletions cluster-autoscaler/processors/nodegroupset/compare_nodegroups.go
Original file line number Diff line number Diff line change
Expand Up @@ -72,12 +72,13 @@ func resourceListWithinTolerance(qtyList []resource.Quantity, maxDifferenceRatio
return larger-smaller <= larger*maxDifferenceRatio
}

func compareLabels(nodes []*schedulerframework.NodeInfo, ignoredLabels map[string]bool) bool {
func compareLabels(nodes []*schedulerframework.NodeInfo, explicitLabels, ignoredLabels map[string]bool) bool {
labels := make(map[string][]string)
for _, node := range nodes {
for label, value := range node.Node().ObjectMeta.Labels {
ignore, _ := ignoredLabels[label]
if !ignore {
include, _ := explicitLabels[label]
if ( include || !ignore ) {
labels[label] = append(labels[label], value)
}
}
Expand Down Expand Up @@ -105,6 +106,19 @@ func CreateGenericNodeInfoComparator(extraIgnoredLabels []string) NodeInfoCompar
}
}

func CreateGenericNodeInfoLabelComparator(labels []string) NodeInfoComparator {
return func(n1, n2 *schedulerframework.NodeInfo) bool {
includedLabels := make(map[string]bool)
for _, l := range labels {
includedLabels[l] = true
}
if !compareLabels([]*schedulerframework.NodeInfo{n1, n2}, includedLabels, make(map[string]bool)) {
return false
}
return IsCloudProviderNodeInfoSimilar(n1, n2, make(map[string]bool))
}
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suggested, I created a new comparator to accept labels instead of extraIgnoredLabels, and compare nodes based on the specified labels. This --balancing-labels arg here bypasses the --balancing-ignored-labels flag. This also ignores the cloud provider-specific basic ignored flags. This calls for some discussion.

Rather than creating a new comparator, I'd personally prefer to edit the signature of the current comparator creators to accept an extra argument for labels in addition to extraIgnoredLabels, and compare the nodes for matching labels specified in labels if labels is not empty. If check fails, then we return, else we proceed to call IsCloudProviderNodeInfoSimilar, which takes care of ignored labels.

@MaciekPytel please share your thoughts and review here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the second implementation idea doesn't change current behavior at all. Currently every label that is not on ignored-labels list will cause the node groups to compare as different. Checking --balancing-labels first doesn't change the ultimate result, it just fails the comparison slightly earlier.

My understanding of #3615 is that specifying --balancing-labels makes comparison treat any label not on the list as ignored-label (current implementation treats any label that is not on ignored list as "balancing labels"). I think that:

  • Specifying both --balancing-labels and --balancing-ignored-labels makes no sense. What would be the behavior for any label not on either list?
    • (if you agree with this logic) we should add validation in main.go that prevents setting both flags at the same time.
  • Ignoring cloudprovider labels seems irrelevant when using --balancing-labels (since we would be ignoring all labels not on the list).
  • On the implementation side - calling IsCloudProviderNodeInfoSimilar as-is will not work as far as I can see, since it already has the label check embedded. I think the easiest way is to extract the resources check to a separate function and call that new function if --balancing-labels are specified, but there are plenty of other ways to refactor and I don't feel particularly strongly about which is the right one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to

  • Making --balancing-labels and --balancing-ignored-labels mutually exclusive
  • Ignoring cloudprovider labels irrelevant when using --balancing-labels
  • Yes, we'll need to refactor a underlying implementation of IsCloudProviderNodeInfoSimilar as you said. Since this is my first issue on CA, I was a bit hesitant to do much refactoring. Thanks for asserting my thoughts.

// IsCloudProviderNodeInfoSimilar returns true if two NodeInfos are similar enough to consider
// that the NodeGroups they come from are part of the same NodeGroupSet. The criteria are
// somewhat arbitrary, but generally we check if resources provided by both nodes
Expand Down Expand Up @@ -156,7 +170,7 @@ func IsCloudProviderNodeInfoSimilar(n1, n2 *schedulerframework.NodeInfo, ignored
return false
}

if !compareLabels(nodes, ignoredLabels) {
if !compareLabels(nodes, make(map[string]bool), ignoredLabels) {
return false
}

Expand Down