cluster-autoscaler: vendored scheduler dependency must be updated ASAP #3224

Closed
@aermakov-zalando opened this issue Jun 15, 2020 · 6 comments
Labels: lifecycle/stale (denotes an issue or PR that has remained open with no activity and has become stale)

@aermakov-zalando (Contributor) commented Jun 15, 2020:

In kubernetes/kubernetes#89222, the scheduler algorithm was tweaked to take init containers into account when calculating how much of a node's resources a pod consumes. This went into at least 1.17.6 (which we currently run) and, I assume, the corresponding 1.18.x/1.19.x versions. CA vendors the scheduler as a dependency, and there was no corresponding CA release, which means every cluster running Kubernetes 1.17.6 can easily end up in a situation where CA won't scale up even though pods can't be scheduled. This happens when a pod's initContainer requests are bigger than its main container requests: the scheduler refuses to schedule the pod, but CA also does nothing, since it thinks the pod could fit on an existing node.
Beyond updating the dependencies, I'd suggest collaborating more closely with sig-scheduling, because any change to the scheduler can affect CA, and the two may need to be released in lockstep.
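
To illustrate the mismatch, here is a minimal Go sketch of the effective-request rule the fixed scheduler applies; the function and container names are illustrative, not actual CA or scheduler code. Init containers run sequentially, so only the largest init request counts, while regular containers run concurrently, so their requests are summed; the pod's effective request is the maximum of the two.

```go
// Sketch of the effective-request rule after kubernetes/kubernetes#89222
// (names are illustrative, not the scheduler's actual code).
package main

import "fmt"

type container struct {
	name      string
	cpuMillis int64 // CPU request in millicores
}

// effectiveCPURequest returns max(largest init-container request,
// sum of regular-container requests), per documented pod semantics:
// init containers run one at a time, regular containers concurrently.
func effectiveCPURequest(initContainers, containers []container) int64 {
	var sum, maxInit int64
	for _, c := range containers {
		sum += c.cpuMillis
	}
	for _, c := range initContainers {
		if c.cpuMillis > maxInit {
			maxInit = c.cpuMillis
		}
	}
	if maxInit > sum {
		return maxInit
	}
	return sum
}

func main() {
	inits := []container{{"init-migrate", 2000}}
	apps := []container{{"app", 500}}
	// A scheduler that ignores init containers sees 500m for this pod;
	// one with #89222 sees 2000m. If CA vendors the old behavior, it
	// believes the pod fits and never triggers a scale-up.
	fmt.Printf("effective CPU request: %dm\n", effectiveCPURequest(inits, apps))
}
```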

@MaciekPytel (Contributor) commented:
Thanks for bringing this up. I'm discussing with sig-scheduling how to handle this.
It should be a straightforward dependency bump for all versions except 1.17. The 1.17 CA relies on a commit that is not on the 1.17 branch in k/k, so we can't update the dependencies there. We're looking into whether we can cherry-pick the required changes onto the 1.17 branch in k/k, but that will probably take a while.
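
For readers unfamiliar with how CA consumes the scheduler, the bump roughly amounts to moving the k8s.io/kubernetes pin in CA's go.mod (along with the matching staging repos). The excerpt below is a hedged sketch with hypothetical version numbers, not the actual diff:

```
module k8s.io/autoscaler/cluster-autoscaler

// Move the vendored scheduler to a k/k patch release that contains
// kubernetes/kubernetes#89222 (version strings here are hypothetical).
require k8s.io/kubernetes v1.18.3 // was v1.18.1

// Each k8s.io staging repo must be pinned to the matching tag.
replace k8s.io/api => k8s.io/api v0.18.3
```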

@aermakov-zalando (Contributor, Author) commented:
We run our own fork (still based on 1.12.2) and we've already fixed it there by patching the vendored dependency directly, so it's not very relevant for us. You might want to consider doing the same in CA until you can update the dependencies properly, so that it works for other users.

@MaciekPytel (Contributor) commented:
1.18.2 and 1.17.3 should have a fix for this issue.

@fejta-bot commented:
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Oct 26, 2020.
@yurrriq commented Nov 17, 2020:

This can be closed now, yeah?

@MaciekPytel (Contributor) commented:
Correct. Sorry, I forgot to close it.
