Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clusterapi provider for cluster autoscaler should expose labels and taints when scaling from zero #7685

Closed
elmiko opened this issue Dec 5, 2022 · 21 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@elmiko
Copy link
Contributor

elmiko commented Dec 5, 2022

User Story

As a user I would like to ensure that the cluster autoscaler will use the labels and taints associated with my cluster api scalable resources (MachineDeployment, MachineSet, etc) when predicting the node shape in scale from zero scenarios so that i can be sure that pod scheduling will occur as expected.

Detailed Description

Currently, the clusterapi provider for the cluster autoscaler does not have a convenient way to expose the labels and taints that will be applied to nodes created from the scalable resources. Part of this is due to the fact that the cluster api community is still coordinating about how these labels and taints will be propagated to the nodes that are created.

Anything else you would like to add:

This will require some coordination with the cluster api community to solve completely, there are a couple pieces of prior art around the labels that should be noted:

/kind feature

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 5, 2022
@fabriziopandini
Copy link
Member

/triage accepted
Let's try to nail down this before the work on label propagation starts

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 5, 2022
@sbueringer
Copy link
Member

sbueringer commented Dec 22, 2022

I'm not sure I understand what the requirement for Cluster API is. This sounds a bit like it's something for the autoscaler "I would like to ensure that the cluster autoscaler will use ". But I assume I'm misunderstanding it :)

What does this mean:

convenient way to expose the labels and annotations that will be applied to nodes created from the scalable resources

Does the autoscaler want to set labels on nodes via Cluster API resources?

Or is it about mapping the labels Cluster API will apply to nodes based on Cluster API resources to the labels which are found on the nodes by the autoscaler?

@elmiko
Copy link
Contributor Author

elmiko commented Dec 22, 2022

Or is it about mapping the labels Cluster API will apply to nodes based on Cluster API resources to the labels which are found on the nodes by the autoscaler?

it's about this, we need to agree on a way to expose those labels and annotations (although i think we already agreed on a method for the labels).

essentially, when the autoscaler is attempting a prediction for a node group with zero replicas, it needs a way to understand what the labels and annotations will be present on the nodes created from the node group. when there are replicas >= 1, then the autoscaler can inspect an actual node for this information. it uses this information to know whether the pending pods will actually be scheduled on any node that it can create.

@elmiko
Copy link
Contributor Author

elmiko commented Dec 22, 2022

@cnmcavoy has proposed a novel solution in kubernetes/autoscaler#5382

@elmiko elmiko changed the title clusterapi provider for cluster autoscaler should expose labels and annotations when scaling from zero clusterapi provider for cluster autoscaler should expose labels and taints when scaling from zero Dec 22, 2022
@elmiko
Copy link
Contributor Author

elmiko commented Dec 22, 2022

updated this to talk about labels and taints, i don't think we need the annotations from the nodes.

@sbueringer
Copy link
Member

sbueringer commented Jan 2, 2023

Thx, I think now I've got it.

I think the situation is the following:

P.S. also answered on kubernetes/autoscaler#5382

@k8s-triage-robot
Copy link

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jan 20, 2024
@fabriziopandini
Copy link
Member

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Apr 12, 2024
@fabriziopandini
Copy link
Member

@elmiko is that ok to close this issue now that we have well defined rules for label propagation from Cluster API objects down to nodes?

@sbueringer
Copy link
Member

Q: I know we have those well-defined rules for MD => MS => Machine. What is the state of MachinePools? (that are also supported by cluster-autoscaler)

@elmiko
Copy link
Contributor Author

elmiko commented May 2, 2024

@fabriziopandini i think we could close, aside from the question @sbueringer raised about MachinePools. i need to study the label propagation method though just to make sure.

@elmiko
Copy link
Contributor Author

elmiko commented May 2, 2024

i looked at the rules for label propagation and think they solve the issue for labels, but we probably want to open a card for the taints separately.

i think the restriction rules make it a little complicated because two of the options require the user to keep things in the kubernetes.io domain, but tools like karpenter have some restrictions around allowing non-wellknown labels into that domain. https://kubernetes.io/docs/reference/labels-annotations-taints/

@sbueringer
Copy link
Member

@elmiko not sure if I understood correctly. We're blocking certain labels / domains and karpenter only allows specific labels?

What would be the ones we would need to support? IIRC we already allow some specific labels that just make sense, maybe we should just allow them

@fabriziopandini
Copy link
Member

@fabriziopandini i think we could close, aside from the question @sbueringer raised about MachinePools. i need to study the label propagation method though just to make sure.

cc @mboersma @willie-yao @Jont828 for the comment about MP

/close

FYI Those are the criteria for label propagation from machines to nodes:

  • Has node-role.kubernetes.io as prefix.
  • Belongs to node-restriction.kubernetes.io domain.
  • Belongs to node.cluster.x-k8s.io domain.

The third option allows to have labels outside kubernetes.io, but let's open a separated issue if we want to expand those rules.

Same if we want to start thinking to Taint propagation, which probably requires some API design work since we don't have such a concept in core machine APIs

@k8s-ci-robot
Copy link
Contributor

@fabriziopandini: Closing this issue.

In response to this:

@fabriziopandini i think we could close, aside from the question @sbueringer raised about MachinePools. i need to study the label propagation method though just to make sure.

cc @mboersma @willie-yao @Jont828 for the comment about MP

/close

FYI Those are the criteria for label propagation from machines to nodes:

  • Has node-role.kubernetes.io as prefix.
  • Belongs to node-restriction.kubernetes.io domain.
  • Belongs to node.cluster.x-k8s.io domain.

The third option allows to have labels outside kubernetes.io, but let's open a separated issue if we want to expand those rules.

Same if we want to start thinking to Taint propagation, which probably requires some API design work since we don't have such a concept in core machine APIs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@elmiko
Copy link
Contributor Author

elmiko commented May 3, 2024

@sbueringer i don't think we have a problem yet, but i just wanted to highlight a possible bad interaction with karpenter.

for example, as Fabrizio points out in order for labels to be propagated they must contain one of the three domain prefixes: node-role.kubernetes.io, node-restriction.kubernetes.io, and node.cluster.x-k8s.io domain. if a user creates a NodePool for karpenter that requires inspecting nodes for one of the labels with the kubernetes.io domain, then it must conform to the well-known names. if it does not conform, karpenter will not allow the user to use that key as a filter.

as i said, i don't think we need to do anything immediately, but it's something to consider as we move forward.

@sbueringer
Copy link
Member

Sounds good. Our main goal with limiting the labels we propagate was to only allow it for labels that make sense + node.cluster.x-k8s.io. If we discover additional use cases that require specific additional labels we can consider expanding the list of labels that are propagated.

@chrischdi
Copy link
Member

Sounds good. Our main goal with limiting the labels we propagate was to only allow it for labels that make sense + node.cluster.x-k8s.io. If we discover additional use cases that require specific additional labels we can consider expanding the list of labels that are propagated.

Alternative to allow-listing would be make it configurable in some way (e.g. via a flag).

@elmiko
Copy link
Contributor Author

elmiko commented May 3, 2024

Our main goal with limiting the labels we propagate was to only allow it for labels that make sense + node.cluster.x-k8s.io

i think this make good sense, we just want to inform users about preferring to use node.cluster.x-k8s.io labels.

Alternative to allow-listing would be make it configurable in some way (e.g. via a flag).

+1, would be interested to hear more about this idea

@sbueringer
Copy link
Member

I think whenever possible we should avoid adding additional flags to keep CAPI as simple as possible and also so it behaves the same for everyone (as much as possible). So if there is a general use case for specific standard Kubernetes labels, let's talk about it and not make it configurable so everyone has to allowlist them by themselves.

@elmiko
Copy link
Contributor Author

elmiko commented May 7, 2024

is it worth opening another issue to capture the taint use case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Projects
None yet
Development

No branches or pull requests

6 participants