Add PodTemplates support as NodeInfoProcessor in Cluster-Autoscaler #3964
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: clamoriniere. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
What I had in mind during the sig-meeting was to implement a NodeInfoProcessor that would inject those pods directly into template nodes. My reasoning is that NodeInfoProcessor allows all sorts of modifications to template nodes or the pods that run on them. The new processor introduced in this PR seems tailored to this single use case; handling custom DS controllers is the only reason I can think of for modifying the list of DS. I'd argue that this defeats the point of having a processor:
Is there some reason why injecting into the list of daemonsets would work better than injecting directly into template nodes? Or is there some other possible use case where this processor could be useful that I'm missing?
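For reference, this is roughly the interface being discussed. The signature below is recalled from the cluster-autoscaler `processors/nodeinfos` package and may differ slightly in the exact release this PR targets:

```go
// Recalled from cluster-autoscaler/processors/nodeinfos (signature may differ slightly).
type NodeInfoProcessor interface {
	// Process can modify the per-NodeGroup template NodeInfos (and the pods
	// attached to them) before they are used in scale-up simulation.
	Process(ctx *context.AutoscalingContext,
		nodeInfosForNodeGroups map[string]*schedulerframework.NodeInfo) (map[string]*schedulerframework.NodeInfo, error)
	// CleanUp releases internal structures.
	CleanUp()
}
```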
Hi @MaciekPytel, sorry, I didn't fully understand your recommendation to use the `NodeInfoProcessor`.
I focused too much on my use case, which is to recognize other resources as daemonsets. That is why updating the Daemonsets list seemed to be a good approach: it avoids modifying the workflows that use this list, such as deciding whether a Pod belongs to a DaemonSet. But I can definitely work on a solution that implements the `NodeInfoProcessor`.
Thanks. Just for the record - I'm not saying I'm 100% certain that a NodeInfoProcessor-based solution is the best one, just that I'd like to consider it as an option and, if it's not good enough, understand why that is. I think it should cover scale-up (though I may be missing something).
@MaciekPytel I have a question about the processor. For scale-down, I think I have already covered the needs in #2483.
Regarding running multiple processors:
Force-pushed from a277580 to 277c519
Force-pushed from 277c519 to 0356b02
Force-pushed from 7975c64 to 3d0c914
Hi @MaciekPytel, I have updated the PR based on our previous discussion. I added two new commits:
Force-pushed from 228614f to c1b1594
Force-pushed from c1b1594 to 6411d70
i am not overly familiar with the processors code, but this generally makes sense to me. i did have one question though.
for _, podInfo := range baseNodeInfo.Pods {
	pods = append(pods, podInfo.Pod)
}
if err := clusterSnapshot.AddNodeWithPods(node, pods); err != nil {
this seems like a side effect of calling this function, i would expect from the name that it just returns the nodeinfo with the pod templates. why do we add the node/pods to the cluster snapshot here?
that makes sense following the daemonset pattern. i was mainly asking because i don't know this code that well and was curious, i appreciate any further findings you might share =)
Hello @elmiko,
Sorry for the late reply.
I checked why it is used like this in the `GetDaemonSetPodsForNode()` function.
From what I understand, it is because in this specific case the Node doesn't exist yet (we are in the edge case where we scale from 0), and we don't want to add this Node to the `context.AutoscalingContext.ClusterSnapshot` yet.
The other solution could have been to use `context.AutoscalingContext.ClusterSnapshot.Fork()`, but since we are not interested in the other `NodeInfo`s here, creating a new `ClusterSnapshot` with `simulator.NewBasicClusterSnapshot()` limits the memory copied.
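For anyone else following along, here is a rough sketch of the pattern described above. Only `NewBasicClusterSnapshot()`, `AddNodeWithPods()`, and the pod-copy loop come from the code under review; the helper name, parameter list, and import paths are assumptions for illustration:

```go
import (
	apiv1 "k8s.io/api/core/v1"
	"k8s.io/autoscaler/cluster-autoscaler/simulator"
	schedulerframework "k8s.io/kubernetes/pkg/scheduler/framework"
)

// snapshotForTemplateNode is a hypothetical helper illustrating why a fresh,
// throwaway snapshot is used: the simulated node never touches the live
// context.AutoscalingContext.ClusterSnapshot.
func snapshotForTemplateNode(node *apiv1.Node, baseNodeInfo *schedulerframework.NodeInfo) (simulator.ClusterSnapshot, error) {
	// Fresh snapshot, independent from the real cluster state.
	clusterSnapshot := simulator.NewBasicClusterSnapshot()

	// Copy the pods already attached to the template NodeInfo (same loop as in the PR).
	pods := make([]*apiv1.Pod, 0, len(baseNodeInfo.Pods))
	for _, podInfo := range baseNodeInfo.Pods {
		pods = append(pods, podInfo.Pod)
	}

	// The simulated node (and its pods) exist only in this throwaway snapshot;
	// scheduling predicates can then be evaluated against it.
	if err := clusterSnapshot.AddNodeWithPods(node, pods); err != nil {
		return nil, err
	}
	return clusterSnapshot, nil
}
```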
thanks @clamoriniere ! i appreciate the extra context =)
Using an empty snapshot makes nodes that are already in the cluster invisible to scheduling; in other words, you implicitly assume the pods on those nodes wouldn't impact the result of scheduling. There are some scheduling constraints (podAffinity, podTopologySpreading) that look at multiple nodes. Admittedly, using any of those on a DS or DS-like pod seems like a really bad idea, but technically I think it's more correct to go with the Fork/Revert route.
Also, Fork/Revert is used across the codebase, but I don't think we actually use BasicClusterSnapshot anywhere in CA. So you're exposing yourself to code that is not tested in production.
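A rough sketch of the Fork/Revert variant being suggested, assuming the `ClusterSnapshot` interface exposes error-returning `Fork()`/`Revert()` as it did around this time; the helper name is hypothetical:

```go
// simulateNodeWithFork adds the template node to a forked view of the live
// snapshot, so pods on existing nodes stay visible to podAffinity and
// podTopologySpread checks, and then rolls the change back.
func simulateNodeWithFork(snapshot simulator.ClusterSnapshot, node *apiv1.Node, pods []*apiv1.Pod) error {
	if err := snapshot.Fork(); err != nil {
		return err
	}
	// Revert discards the simulated node once we are done.
	defer func() {
		if err := snapshot.Revert(); err != nil {
			klog.Errorf("failed to revert cluster snapshot: %v", err)
		}
	}()

	// The node only exists inside the forked view of the snapshot.
	if err := snapshot.AddNodeWithPods(node, pods); err != nil {
		return err
	}

	// ... evaluate scheduling predicates against the forked snapshot here ...
	return nil
}
```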
Hi @MaciekPytel,
thanks for the details.
The edge case that I would like to solve is "scale from 0", for which the cluster-autoscaler looks at Daemonsets to simulate a Node that doesn't exist yet. That is why, in this case, knowing about the other existing Nodes doesn't seem useful. This is what I understood from the code used to generate Pods based on Daemonsets:
clusterSnapshot := simulator.NewBasicClusterSnapshot()
But I don't see any reason why I can't change the PodTemplate NodeInfoProcessor to use a Fork.
Fair enough, I was clearly wrong about not using BasicClusterSnapshot anywhere. I don't think the fact that we're scaling from 0 changes anything here, though. Those scheduling constraints look at all nodes in a given topology (in this case the relevant topology being the zone), and the fact that there are no nodes in a given NodeGroup does not imply there are no other nodes in the zone. That being said, it is a bit of a theoretical problem, as I can't think of a reason to use topologySpreading or podAffinity on a DS, and the fact that no one has run into problems with the existing DS logic seems to confirm that.
New changes are detected. LGTM label has been removed.
The PodTemplate Processor is used to simulate DaemonSet resources from PodTemplates that have a specific label. The processor is used during the ScaleUp operation.
`nodeInfoWithPodTemplateProcessor` implements the `NodeInfoProcessor` interface. This NodeInfoProcessor can be used to assign Pods to a NodeInfo during scale-up simulation. The processor looks at PodTemplates and generates Pods from these templates. Only PodTemplates with a specific label are selected.
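A hedged sketch of what such a processor can look like, implementing the interface quoted earlier in the thread; `listMatchingPodTemplates` and `newPodFromTemplate` are hypothetical helpers, and the actual PR additionally simulates scheduling in a throwaway snapshot before attaching the pods, as discussed above:

```go
// nodeInfoWithPodTemplateProcessor: illustrative sketch, not the PR's exact code.
type nodeInfoWithPodTemplateProcessor struct{}

func (p *nodeInfoWithPodTemplateProcessor) Process(
	ctx *context.AutoscalingContext,
	nodeInfos map[string]*schedulerframework.NodeInfo) (map[string]*schedulerframework.NodeInfo, error) {
	// Only PodTemplates carrying the opt-in label are considered.
	templates, err := listMatchingPodTemplates(ctx)
	if err != nil {
		return nodeInfos, err
	}
	for _, nodeInfo := range nodeInfos {
		for _, template := range templates {
			// Attach a pod generated from each selected template to the template node.
			nodeInfo.AddPod(newPodFromTemplate(template, nodeInfo.Node()))
		}
	}
	return nodeInfos, nil
}

func (p *nodeInfoWithPodTemplateProcessor) CleanUp() {}
```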
The `nodeinfos.builder` package contains:
* the `Register()` function, used to register a `nodeinfos.NodeInfoProcessor` implementation.
* the `Build()` function, used to instantiate the requested `nodeinfos.NodeInfoProcessor` implementation from `AutoscalerOptions`.
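The register/build split is a common Go registry idiom. A minimal, self-contained sketch under assumptions (placeholder types; the real package's names and option fields may differ):

```go
package builder

import (
	"fmt"
	"sync"
)

// Placeholder stand-ins for the real cluster-autoscaler types.
type NodeInfoProcessor interface{ CleanUp() }
type AutoscalerOptions struct{ NodeInfoProcessorName string } // field name is hypothetical

// Factory builds a NodeInfoProcessor from the autoscaler options.
type Factory func(opts AutoscalerOptions) (NodeInfoProcessor, error)

var (
	mu        sync.Mutex
	factories = map[string]Factory{}
)

// Register associates a processor name with the factory that builds it.
func Register(name string, factory Factory) {
	mu.Lock()
	defer mu.Unlock()
	factories[name] = factory
}

// Build instantiates the processor requested in the options.
func Build(opts AutoscalerOptions) (NodeInfoProcessor, error) {
	mu.Lock()
	defer mu.Unlock()
	factory, ok := factories[opts.NodeInfoProcessorName]
	if !ok {
		return nil, fmt.Errorf("unknown NodeInfoProcessor %q", opts.NodeInfoProcessorName)
	}
	return factory(opts)
}
```

A structure like this lets a build-tag-guarded file call `Register()` from its `init()`, which matches the PR's stated goal of enabling `NodeInfoProcessor` implementations behind build flags.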
Signed-off-by: cedric lamoriniere <[email protected]>
Force-pushed from 672fe3b to aee8776
@clamoriniere: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten
No activity, closing. Feel free to reopen if needed.
This Pull request implements feature request #3873.

Solution 1 described in #3873 has been implemented. Based on the feedback and advice received during the last SIG meeting (March 15, 2021), the feature has been wrapped inside a `Processor` and disabled by default. Thanks to this feature, it is possible to provide the cluster-autoscaler with `PodTemplates` that will be considered as `Daemonset`s.

Changes:

* e29a8f1: Initial implementation. Adds a new processor, `PodTemplatesProcessor`, which aims to simulate Daemonset instances from `PodTemplates`. This implementation extended the Daemonset list used during scale-up operations. It introduces a `podTemplates` processor with 2 implementations: `noOpPodTemplateListProcessor` (default) and `activePodTemplateListProcessor`, and updates the `cluster-autoscaler` helm chart with the `corev1.PodTemplate` required RBAC.
* 6294791: Reimplements `PodTemplateProcessor` behind the `NodeInfoProcessor` interface. The new `nodeInfoWithPodTemplateProcessor` updates, for each `NodeInfo`, the `NodeInfo.Pods` with `Pods` generated from the `PodTemplate`s. Thanks to this code structure it will be possible to enable `NodeInfoProcessor` implementations behind build flags.
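As a usage illustration (not taken from the PR): a `corev1.PodTemplate` that a cluster operator might create so the autoscaler treats it like a DaemonSet during scale-from-0 simulation. The opt-in label name below is hypothetical; the actual label is defined by the processor in this PR:

```go
package podtemplates

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// buildDaemonSetLikePodTemplate returns a PodTemplate carrying the opt-in label.
func buildDaemonSetLikePodTemplate() *corev1.PodTemplate {
	return &corev1.PodTemplate{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "custom-agent",
			Namespace: "kube-system",
			Labels: map[string]string{
				// Hypothetical label; the processor selects only templates that carry it.
				"cluster-autoscaler.kubernetes.io/daemonset-pod": "true",
			},
		},
		Template: corev1.PodTemplateSpec{
			Spec: corev1.PodSpec{
				Containers: []corev1.Container{
					{Name: "agent", Image: "registry.example.com/custom-agent:latest"},
				},
			},
		},
	}
}
```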