Sync upstream v1.26.0 #184

Merged · 259 commits · Mar 12, 2023

Commits (259)
c5978b6
Drop redudant parameter in utilization calculation
x13n May 20, 2022
1284ecd
Extract checks for scale down eligibility
x13n Aug 23, 2022
c2a0329
Limit amount of node utilization logging
x13n Aug 31, 2022
34e3512
Merge pull request #5157 from jayantjain93/go-cmp-upgrade
k8s-ci-robot Sep 1, 2022
65e7776
Merge pull request #5151 from jbartosik/increase-timeout
k8s-ci-robot Sep 1, 2022
5e85a7d
Merge pull request #5115 from yaroslava-serdiuk/pdb
k8s-ci-robot Sep 2, 2022
f46910a
Increase timeout for VPA E2E
jbartosik Sep 2, 2022
8641dca
Merge pull request #5159 from jbartosik/increase-timeout
k8s-ci-robot Sep 5, 2022
c38cc74
Merge pull request #5124 from omerlh/patch-1
k8s-ci-robot Sep 5, 2022
11d150e
Add podScaleUpDelay annotation support
Aug 9, 2021
f4d25df
Corrected the links for Priority in k8s API and Pod Preemption in k8s.
Shubham82 Sep 6, 2022
00fa7c2
Restrict Updater PodLister to namespace
voelzmo Sep 8, 2022
b7d68c0
Update controller-gen to latest and use go install
voelzmo Sep 8, 2022
c50c8ad
Run hack/generate-crd-yaml.sh
voelzmo Sep 8, 2022
64194ec
update owners list for cluster autoscaler azure
gandhipr Sep 9, 2022
7fc3113
Change VPA default version to 0.12.0
jbartosik Sep 9, 2022
947c670
Merge pull request #5182 from jbartosik/default012
k8s-ci-robot Sep 14, 2022
96da700
Pin controller-gen to 0.9.2
voelzmo Sep 14, 2022
cb1ba27
Merge pull request #5116 from DataDog/bp/azure-support-undescore-tags
k8s-ci-robot Sep 14, 2022
9d28c03
AWS ReadMe update
bdobay Sep 15, 2022
8dcd3b3
Merge pull request #5118 from x13n/scaledown
k8s-ci-robot Sep 15, 2022
6419abf
Move resource limits checking to a separate package
x13n Aug 25, 2022
2854870
Allow simulator to persist changes in cluster snapshot
x13n Aug 26, 2022
6017d85
Don't depend on IsNodeBeingDeleted implementation
x13n Sep 15, 2022
b042aae
Merge pull request #5191 from x13n/scaledown
k8s-ci-robot Sep 15, 2022
ab08e9a
Merge pull request #5131 from x13n/scaledown2
k8s-ci-robot Sep 15, 2022
540ff4e
Stop treating masters differently in scale down
x13n Aug 26, 2022
9de9e5d
CA - AWS - Instance List Update 2022-09-16
juanitomint Sep 16, 2022
b878168
fix typo
JamesClonk Sep 17, 2022
6edb3f2
Modifying taint removal logic on startup to consider all nodes instea…
fookenc Sep 19, 2022
8c6bd81
fix typo
JamesClonk Sep 20, 2022
bcc1cf3
Merge pull request #5200 from fookenc/remove-taints-on-startup
k8s-ci-robot Sep 20, 2022
c52316b
Update VPA compatibility for 0.12 release
jbartosik Sep 21, 2022
e6f637b
Updated the golang version for GitHub workflow.
Shubham82 Sep 21, 2022
76a5ec8
Create GCE CloudProvider Owners file
jayantjain93 Sep 21, 2022
da8c8cf
Merge pull request #5207 from jayantjain93/patch-2
k8s-ci-robot Sep 21, 2022
30a64a7
Fix error formatting in GCE client
x13n Sep 22, 2022
eea1107
Merge pull request #5208 from x13n/patch-4
k8s-ci-robot Sep 22, 2022
65b0d78
Introduce NodeDeleterBatcher to ScaleDown actuator
yaroslava-serdiuk Jul 25, 2022
b3c6b60
Merge pull request #5060 from yaroslava-serdiuk/deleting-in-batch
k8s-ci-robot Sep 22, 2022
f1b6d4d
handle directx nodes the same as gpu nodes
Flask Sep 23, 2022
70efe28
Merge pull request #5133 from x13n/scaledown3
k8s-ci-robot Sep 23, 2022
4491403
magnum: add an option to create insecure TLS connections
antonkurbatov Sep 25, 2022
07b531c
Drop unused maps
x13n Aug 26, 2022
15a2afa
Merge pull request #4799 from matthyx/flags
k8s-ci-robot Sep 26, 2022
e767044
Merge pull request #5177 from voelzmo/fix/updater-podlist-namespacing
k8s-ci-robot Sep 26, 2022
3a3ec38
Extract criteria for removing unneded nodes to a separate package
x13n Aug 30, 2022
28b92e7
skip instances on validation error
Freyert Sep 26, 2022
ca6ed15
Merge pull request #5178 from voelzmo/enh/update-controller-gen
k8s-ci-robot Sep 29, 2022
5c9cc27
cleanup unused constants in clusterapi provider
elmiko Sep 29, 2022
728bea6
Merge pull request #5222 from elmiko/capi-cleanup
k8s-ci-robot Sep 29, 2022
030a7f5
Update the example spec of civo cloudprovider
vishalanarase Sep 30, 2022
a99294d
Fix race condition in scale down test
yaroslava-serdiuk Sep 30, 2022
54239bd
Clean up stale OWNERS
x13n Sep 30, 2022
9cae42c
Merge pull request #5227 from yaroslava-serdiuk/batch-test
k8s-ci-robot Sep 30, 2022
3d9ab55
add example for multiple recommenders
matthyx Oct 3, 2022
cdf8406
Merge pull request #5226 from vishalanarase/civo-update-example-spec
k8s-ci-robot Oct 3, 2022
35e8ee8
Balancer KEP
mwielgus Sep 26, 2022
5417153
Merge pull request #5209 from helio/gpu-processor-directx
k8s-ci-robot Oct 3, 2022
7db0d94
Merge pull request #5205 from Shubham82/update_golang_version
k8s-ci-robot Oct 3, 2022
0973954
Add VPA E2E for recomemndation not exaclty matching pod
jbartosik Oct 3, 2022
ab75576
Add VPA E2E for recomemndation not exaclty matching pod with limit range
jbartosik Oct 4, 2022
e75a769
Remove units for default boot disk size
yaroslava-serdiuk Oct 4, 2022
078a6e0
Merge pull request #5233 from yaroslava-serdiuk/boot-disk
k8s-ci-robot Oct 4, 2022
9a92bcf
Merge pull request #5231 from matthyx/doc
k8s-ci-robot Oct 5, 2022
eded7ec
Merge pull request #5232 from jbartosik/e2e-test-admission-pod-recomm…
k8s-ci-robot Oct 5, 2022
f63315c
Merge pull request #5213 from Freyert/gce-409-skip
k8s-ci-robot Oct 5, 2022
ee09474
Merge pull request #5196 from JamesClonk/master
k8s-ci-robot Oct 5, 2022
ddf0fe0
Merge pull request #5193 from juanitomint/master
k8s-ci-robot Oct 5, 2022
c65a3a3
Merge pull request #5210 from antonkurbatov/bugfix/magnum-tls-insecure
k8s-ci-robot Oct 5, 2022
4ff4903
Merge pull request #5167 from Shubham82/Correct_Pod_Priority_and_Pree…
k8s-ci-robot Oct 6, 2022
5500a15
Fix accessing index out of bonds
jbartosik Sep 29, 2022
3eb5bf8
[vpa] introduce recommendation post processor
dbenque Oct 5, 2022
d1f2acf
Fixed gofmt error.
Shubham82 Oct 10, 2022
e206ae2
Merge pull request #5241 from Shubham82/update-gofmt
k8s-ci-robot Oct 11, 2022
2ee8023
Don't break scale up with priority expander config
x13n Oct 11, 2022
ae9ed65
Merge pull request #5246 from x13n/priority-expander
k8s-ci-robot Oct 11, 2022
2870e1e
added replicas count for daemonsets to prevent massive pod eviction
RomanenkoDenys Oct 11, 2022
fcdf42d
code review, move flag to boolean for post processor
dbenque Oct 11, 2022
e286a95
Add support for extended resource definition in GCE MIG template
zaymat Oct 11, 2022
8ad8cd7
Merge pull request #4978 from RomanenkoDenys/daemonset-count-replicas
k8s-ci-robot Oct 12, 2022
2128c8b
Make expander factory logic more pluggable
x13n Oct 12, 2022
0ee2a35
Add option to wait for a period of time after node tainting/cordoning
alex-matei Oct 6, 2022
d022e26
Merge pull request #4956 from damirda/feature/scale-up-delay-annotations
k8s-ci-robot Oct 13, 2022
2dfa33d
Merge pull request #5228 from x13n/owners
k8s-ci-robot Oct 13, 2022
2bff923
Merge pull request #5202 from jbartosik/vpa-0-12-compat
k8s-ci-robot Oct 13, 2022
37c4ff1
Merge pull request #5220 from jbartosik/fix-vpa-updater-recommendatio…
k8s-ci-robot Oct 13, 2022
a395e67
Merge pull request #5181 from gandhipr/prachigandhi-azure-ca-update-o…
k8s-ci-robot Oct 13, 2022
82cb726
Merge pull request #5211 from mwielgus/balancer
k8s-ci-robot Oct 14, 2022
95698da
remove the flag for Capping post-processor
dbenque Oct 14, 2022
7aba0f4
Merge pull request #5248 from x13n/expander
k8s-ci-robot Oct 14, 2022
bb015b2
remove unsupported functionality from cluster-api provider
elmiko Oct 14, 2022
a1b2c2c
Merge pull request #5249 from elmiko/remove-capi-labels-taints
k8s-ci-robot Oct 17, 2022
dc73ea9
Merge pull request #5235 from UiPath/fix_node_delete
k8s-ci-robot Oct 17, 2022
f445a6a
Merge pull request #5147 from x13n/scaledown4
k8s-ci-robot Oct 17, 2022
95fd1ed
Remove ScaleDown dependency on clusterStateRegistry
x13n Sep 1, 2022
776d731
Adding support for identifying nodes that have been deleted from clou…
fookenc Jul 27, 2022
cf67a30
Implementing new cloud provider method for node deletion detection (#1)
fookenc Oct 17, 2022
e59c044
Fixing go formatting issues with clusterstate_test
fookenc Oct 17, 2022
7fc1f6b
Fixing errors due to merge on branches.
fookenc Oct 17, 2022
ea7059f
Adjusting initial implementation of NodeExists to be consistent among…
fookenc Oct 18, 2022
fa2c245
Fix list scaling group instance pages bug
Oct 17, 2022
169c661
Format log output
Oct 18, 2022
18f2e67
Split out code from simulator package
x13n Oct 18, 2022
0a46483
Merge pull request #5252 from jwcesign/fix-pages
k8s-ci-robot Oct 19, 2022
6bf6f50
Code Review: Do not return an error on malformed extended_resource + …
zaymat Oct 19, 2022
8a4df42
huawei-cloudprovider:enable tags resolve for as
Oct 19, 2022
aeee344
Magnum provider: switch UUID dependency from satori to gofrs
tghartland Oct 19, 2022
c342fc2
change uuid dependency in cluster autoscaler kamatera provider
OriHoch Oct 19, 2022
92f5b86
Extract scheduling hints to a dedicated object
x13n Sep 2, 2022
360b193
Merge pull request #5161 from x13n/scaledown5
k8s-ci-robot Oct 20, 2022
585ad02
Remove dead code for handling simulation errors
x13n Sep 29, 2022
5298d55
Merge pull request #5256 from jwcesign/fix-tags-resolve
k8s-ci-robot Oct 20, 2022
e6ff526
Merge pull request #5229 from x13n/scaledown
k8s-ci-robot Oct 20, 2022
5e8e743
Merge pull request #5247 from DataDog/mayeul/add-extended-resource-su…
k8s-ci-robot Oct 24, 2022
bc586f8
Fix typo, move service accounts to RBAC
joelsmith Oct 24, 2022
419cbe5
VPA: Add missing --- to CRD manifests
joelsmith Oct 24, 2022
accf58f
Base parallel scale down implementation
x13n Sep 10, 2022
a703c3f
Merge pull request #5230 from x13n/scaledown6
k8s-ci-robot Oct 25, 2022
8dec202
Stop applying the beta.kubernetes.io/os and arch
pacoxu Oct 27, 2022
7cbcabc
[CA] Register recently evicted pods in NodeDeletionTracker.
olagacek Oct 26, 2022
1e40dda
Merge pull request #5274 from olagacek/master
k8s-ci-robot Oct 27, 2022
8f58f6f
Merge pull request #5260 from Kamatera/cluster-autoscaler-kamatera-ch…
k8s-ci-robot Oct 28, 2022
887810c
Merge pull request #5190 from bdobay/updateReadMe
k8s-ci-robot Oct 28, 2022
e0d4679
Merge pull request #5261 from tghartland/5218-magnum-uuid-dep
k8s-ci-robot Oct 28, 2022
7be12a1
Add KEP to introduce UpdateMode: UpscaleOnly
voelzmo Apr 27, 2022
93f5a8e
Clarify prometheus use-case
voelzmo May 2, 2022
aabb09a
Adapt to review comments
voelzmo May 30, 2022
1686541
Adapt KEP according to review
voelzmo Jul 28, 2022
9bac7d8
Add newline after header
voelzmo Sep 8, 2022
f74e054
Rename proposal directory to fit KEP title
voelzmo Sep 8, 2022
7f3ef8c
Make KEP and implementation proposal consistent
voelzmo Oct 31, 2022
398e68e
remove post-processor factory
dbenque Oct 31, 2022
cacd6b6
update test for MapToListOfRecommendedContainerResources
dbenque Oct 31, 2022
d7caed3
Merge pull request #5268 from joelsmith/rbac
k8s-ci-robot Nov 2, 2022
9501927
Update aws OWNERS
x13n Nov 2, 2022
4373c46
Add ScaleDown.Actuator to AutoscalingContext
BigDarkClown Oct 20, 2022
3ceb97a
Merge pull request #5239 from DataDog/david.benque/reco-post-processing
k8s-ci-robot Nov 2, 2022
012f61e
update the hyperlink of api-conventions.md file in comments
niconical Nov 2, 2022
c245e6f
Merge pull request #5287 from x13n/patch-5
k8s-ci-robot Nov 2, 2022
9e6d714
Merge pull request #5265 from BigDarkClown/deletion
k8s-ci-robot Nov 2, 2022
524886f
Support scaling up node groups to the configured min size if needed
liuxintong Sep 17, 2022
49bd76b
Fix: add missing RBAC permissions to magnum examples
GanjMonk Nov 3, 2022
6dae4ae
make spellchecker happy
voelzmo Nov 3, 2022
de56060
Merge pull request #5195 from liuxintong/0916-ca-su
k8s-ci-robot Nov 3, 2022
08dfc7e
Changing deletion logic to rely on a new helper method in ClusterStat…
fookenc Nov 5, 2022
bd51490
Merge pull request #5282 from niconical/master
k8s-ci-robot Nov 7, 2022
10df35a
Fix VPA deployment
jbartosik Nov 7, 2022
6067c47
Merge pull request #5299 from jbartosik/fix5298
k8s-ci-robot Nov 7, 2022
9cb12e7
Don't say that `Recreate` and `Auto` VPA modes are experimental
jbartosik Nov 3, 2022
ab4fff6
Fixing go formatting issue in cloudstack cloud provider code.
fookenc Nov 7, 2022
b5e2dca
Merge pull request #5294 from jbartosik/jbartosik-patch-1
k8s-ci-robot Nov 8, 2022
151d0fb
Add missing cloud providers to readme and sort alphabetically
AverageMarcus Nov 9, 2022
88157c3
huawei-cloudprovider: enable taints resolve for as, modify the exampl…
Nov 8, 2022
3d09af6
Update cluster-autoscaler/README.md
AverageMarcus Nov 11, 2022
6be7e80
Merge pull request #5306 from AverageMarcus/sort_providers
k8s-ci-robot Nov 14, 2022
62f29d2
cluster-autoscaler: refactor BalanceScaleUpBetweenGroups
grosser Nov 15, 2022
92bba5c
Allow forking snapshot more than 1 time
yaroslava-serdiuk Nov 3, 2022
bee2ab6
Fork ClusterSnapshot in UpdateClusterState
yaroslava-serdiuk Nov 8, 2022
01389b0
Merge pull request #5290 from yaroslava-serdiuk/scale-down
k8s-ci-robot Nov 16, 2022
d20dbb8
add logging information to FAQ
elmiko Nov 11, 2022
96b3134
Merge pull request #5301 from jwcesign/master
k8s-ci-robot Nov 21, 2022
09683d6
fix(cluster-autoscaler/hetzner): pre-existing volumes break scheduling
apricote Nov 21, 2022
5ba8279
Added RBAC Permission to Azure.
Shubham82 Nov 22, 2022
1ea4ff9
Merge pull request #5323 from Shubham82/add_RBAC_permissions_Azure
k8s-ci-robot Nov 22, 2022
5152fdf
Merge pull request #5292 from GanjMonk/magnum-examples-fix
k8s-ci-robot Nov 22, 2022
ed0e9b5
Merge pull request #5310 from elmiko/update-ca-faq
k8s-ci-robot Nov 22, 2022
d9100cd
Log node group min and current size when skipping scale down
x13n Nov 23, 2022
10d3f25
Use scheduling package in filterOutSchedulable processor
BigDarkClown Oct 19, 2022
6a4eaf5
Merge pull request #5259 from BigDarkClown/schedulable
k8s-ci-robot Nov 23, 2022
bc48327
Merge pull request #5325 from x13n/master
k8s-ci-robot Nov 24, 2022
6b7c291
Check owner reference in scale down planner to avoid double-counting
olagacek Oct 31, 2022
41e1ea6
Merge pull request #5284 from olagacek/master
k8s-ci-robot Nov 25, 2022
684184c
Add note regarding GPU label for the CAPI provider
yankcrime Nov 25, 2022
a16edea
chore(cluster-autoscaler/hetzner): add myself to OWNERS file
apricote Nov 25, 2022
a20685b
Use ScaleDownSetProcessor.GetNodesToRemove in scale down planner to
olagacek Nov 25, 2022
aa7733c
Merge pull request #5330 from olagacek/master
k8s-ci-robot Nov 29, 2022
f2ccfb5
Handle pagination when looking through supported shapes.
jlamillan Oct 27, 2022
c4c611e
Add OCI API files to handle OCI work-request operations.
jlamillan Oct 28, 2022
fd3fbd0
Fail fast if OCI instance pool is out of capacity/quota.
jlamillan Oct 28, 2022
bd2ff82
update vendor to v1.26.0-rc.1
liggitt Nov 29, 2022
170cf0f
Merge pull request #5329 from hetznercloud/ca-hcloud-owners-apricote
k8s-ci-robot Dec 1, 2022
5e74894
fix issue 5332
McGon-Fid Dec 2, 2022
6000d68
Merge pull request #5336 from liggitt/1-26-rc.0
k8s-ci-robot Dec 2, 2022
d0b14ce
Deprecate v1beta1 API
jbartosik Dec 2, 2022
7783c3e
Add note about `v1beta2` deprecation to README
jbartosik Dec 2, 2022
bcc0645
fix issue 5332 - adding suggestied change
McGon-Fid Dec 5, 2022
bae587d
Break node categorization in scale down planner on timeout.
olagacek Dec 2, 2022
b2e250f
Automatically label cluster-autoscaler PRs
x13n Dec 5, 2022
dea52d7
Merge pull request #5348 from x13n/patch-6
k8s-ci-robot Dec 5, 2022
0cb8a7d
Merge pull request #5346 from fidelity/fidelity-20221201-141629
k8s-ci-robot Dec 5, 2022
98d3aec
Merge pull request #5345 from jbartosik/deprecate-v1beta1-2
k8s-ci-robot Dec 5, 2022
e7146e4
Merge pull request #5322 from hetznercloud/fix-volume-topology-v1.x
k8s-ci-robot Dec 5, 2022
9c7bc60
Merge pull request #5328 from yankcrime/capi_gpu_label
k8s-ci-robot Dec 5, 2022
735cf98
Add missing dot
x13n Dec 5, 2022
35b7597
fix generate ec2 instance types
khizunov Dec 5, 2022
5238cbe
Introduce a formal policy for maintaining cloudproviders
MaciekPytel Sep 12, 2022
9a2844c
Introduce Cloudprovider Maintenance Request to policy
MaciekPytel Dec 5, 2022
fcb1859
feat(helm): add rancher cloud config support
24601 Dec 5, 2022
1198fbc
Updating error messaging and fallback behavior of hasCloudProviderIns…
fookenc Dec 5, 2022
c94740f
Fixing helper function to simplify for loop to retrieve deleted node …
fookenc Dec 5, 2022
df627e2
Merge pull request #5344 from olagacek/master
k8s-ci-robot Dec 6, 2022
94f1920
Use PdbRemainingDisruptions in Planner
yaroslava-serdiuk Nov 28, 2022
d1a89cf
Put risky NodeToRemove in the end of needDrain list
yaroslava-serdiuk Dec 2, 2022
a753813
Auto Label Helm Chart PRs
gjtempleton Dec 6, 2022
7d31327
psp_api
xval2307 Dec 7, 2022
ae45571
Create a Planner object if --parallelDrain=true
yaroslava-serdiuk Dec 6, 2022
73342a4
Export execution_latency_seconds metric from VPA admission controller
jbartosik Dec 7, 2022
0032a6c
Merge pull request #5353 from yaroslava-serdiuk/planner
k8s-ci-robot Dec 7, 2022
3f4851b
aws: add nodegroup name to default labels
yznima Oct 31, 2022
b4a47c3
Fix int formatting in threshold_based_limiter logs
x13n Dec 8, 2022
ee66755
Merge pull request #5333 from yaroslava-serdiuk/scale-down
k8s-ci-robot Dec 8, 2022
79c4be8
rancher-cloudprovider: Improve node group discovery
ctrox Dec 8, 2022
793012e
Merge pull request #5359 from x13n/patch-8
k8s-ci-robot Dec 9, 2022
c3d8e81
Don't add pods from drained nodes in scale-down
BigDarkClown Dec 2, 2022
2e1b04f
Add default PodListProcessor wrapper
BigDarkClown Dec 1, 2022
fb29a1d
Add currently drained pods before scale-up
BigDarkClown Dec 2, 2022
6d9fed5
set cluster_autoscaler_max_nodes_count dynamically
yasinlachiny Dec 10, 2022
5be418e
Merge pull request #5120 from khizunov/move-instance-types-to-wrapper
k8s-ci-robot Dec 11, 2022
fd2b2b0
Merge pull request #5285 from yznima/aws-ng-name
k8s-ci-robot Dec 11, 2022
fb0fee2
fix(helm): bump chart ver -> 9.21.1
24601 Dec 11, 2022
5e41418
CA - AWS - Update Hardcoded Instance Details List to 11-12-2022
gjtempleton Dec 11, 2022
2da5b73
Merge pull request #5358 from jbartosik/add-execution-latency-from-vp…
k8s-ci-robot Dec 12, 2022
313df69
Merge pull request #5349 from x13n/patch-7
k8s-ci-robot Dec 12, 2022
c78744b
Merge pull request #5354 from BigDarkClown/remove_nodes
k8s-ci-robot Dec 12, 2022
40f6447
Merge pull request #5357 from xval2307/master
k8s-ci-robot Dec 12, 2022
db4de87
Merge pull request #5361 from ninech/rancher-fix-provider-id
k8s-ci-robot Dec 12, 2022
ca6b5bc
Merge pull request #5363 from gjtempleton/AWS-Instance-List-Update-11…
k8s-ci-robot Dec 12, 2022
3806348
Merge pull request #5335 from jlamillan/jlamillan/oci-provider-fail-f…
k8s-ci-robot Dec 12, 2022
c296a14
Merge pull request #5198 from MaciekPytel/cp_policy
k8s-ci-robot Dec 14, 2022
666b4a1
Add x13n to cluster autoscaler approvers
MaciekPytel Dec 14, 2022
7a1668e
update prometheus metric min maxNodesCount and a.MaxNodesTotal
yasinlachiny Dec 14, 2022
0d6cc1d
Merge pull request #5356 from gjtempleton/Label-Helm-Chart-PRs
k8s-ci-robot Dec 15, 2022
841ede8
Merge pull request #5351 from 24601/patch-1
k8s-ci-robot Dec 15, 2022
6186d70
Merge pull request #5367 from MaciekPytel/add_x13n
k8s-ci-robot Dec 15, 2022
bcd6447
Merge pull request #5350 from MaciekPytel/cp_policy2
k8s-ci-robot Dec 16, 2022
af23e61
Merge pull request #5276 from pacoxu/master
k8s-ci-robot Dec 16, 2022
ba3b244
Merge pull request #5054 from fookenc/fix-autoscaler-node-deletion
k8s-ci-robot Dec 16, 2022
a46a095
Merge pull request #5362 from yasinlachiny/maxnodetotal
k8s-ci-robot Dec 19, 2022
d9ffb8f
Merge pull request #5317 from grosser/grosser/ref2
k8s-ci-robot Dec 19, 2022
ef126a1
CA - AWS - Update Docs all actions IAM policy
gjtempleton Dec 19, 2022
33341a1
Merge pull request #4831 from voelzmo/enh/vpa-kep-upscale-only
k8s-ci-robot Dec 19, 2022
52f25d6
Merge pull request #5373 from gjtempleton/AWS-CloudProvider-Update-Al…
k8s-ci-robot Dec 19, 2022
970874e
Cluster Autoscaler: update vendor to k8s v1.26.0
towca Dec 20, 2022
3b2e3db
Merge pull request #5376 from towca/jtuznik/ca-126
k8s-ci-robot Dec 20, 2022
bd3c393
Sync with upstream v1.26.0
elankath Mar 6, 2023
2d5d7f3
removed dotimports from framework.go
elankath Mar 7, 2023
3009fbd
fixed another dotimport
elankath Mar 7, 2023
4afd000
add missing vpa vendor,e2e/vendor to sync branch
elankath Mar 7, 2023
dd6b043
removed old files from vpa vendor to fix test
elankath Mar 7, 2023
2 changes: 1 addition & 1 deletion .github/workflows/ci.yaml
@@ -17,7 +17,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v2
with:
go-version: 1.18.1
go-version: 1.19

- uses: actions/checkout@v2
with:
4 changes: 2 additions & 2 deletions OWNERS
@@ -1,10 +1,10 @@
approvers:
- mwielgus
- maciekpytel
- bskiba
- gjtempleton
reviewers:
- mwielgus
- maciekpytel
- bskiba
- gjtempleton
emeritus_approvers:
- bskiba # 2022-09-30
7 changes: 3 additions & 4 deletions addon-resizer/OWNERS
@@ -1,8 +1,7 @@
approvers:
- bskiba
- wojtek-t
- jbartosik
reviewers:
- bskiba
- wojtek-t
- jbartosik
emeritus_approvers:
- bskiba # 2022-09-30
- wojtek-t # 2022-09-30
182 changes: 182 additions & 0 deletions balancer/proposals/balancer.md
@@ -0,0 +1,182 @@

# KEP - Balancer

## Introduction

One of the problems users face when running Kubernetes deployments is how to deploy pods
across several domains while keeping them balanced and autoscaled at the same time.
These domains may include:

* Cloud provider zones inside a single region, to ensure that the application is still up and running, even if one of the zones has issues.
* Different types of Kubernetes nodes. These may involve nodes that are spot/preemptible, or of different machine families.

A single Kubernetes deployment may either leave the placement entirely up to the scheduler
(most likely leading to something not entirely desired, like all pods going to a single domain) or
focus on a single domain (thus not achieving the goal of being in two or more domains).

Pod topology spreading solves the problem partially, but not completely. It allows only even spreading,
and once the deployment gets skewed it does nothing to rebalance it. Pod topology spreading
(with a skew and/or the ScheduleAnyway flag) is also just a hint: if a skewed placement is available and
allowed, Cluster Autoscaler is not triggered and the user ends up with a skewed deployment.
A user could specify strict pod topology spreading, but then, in case of problems, the deployment
would not move its pods to the domains that are still available. The growth of the deployment would also
be completely blocked, as the available domains would be too skewed.
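
For illustration, here is a minimal sketch of the kind of soft constraint discussed above (the deployment name and image are placeholders); with `whenUnsatisfiable: ScheduleAnyway` the spreading remains only a hint to the scheduler:

```yaml
# Hypothetical example: soft zone spreading. The scheduler may still place all
# pods in one zone if the preferred placement is not feasible, and nothing
# rebalances the pods afterwards.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway   # soft hint; skew is tolerated
        labelSelector:
          matchLabels:
            app: web
      containers:
      - name: web
        image: registry.k8s.io/pause:3.9    # placeholder image
```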

Thus, if full flexibility is needed, the only option is to have multiple deployments, each targeting
a different domain. This setup, however, creates one big problem: how to consistently autoscale multiple
deployments? The simplest idea - having multiple HPAs - is not stable: due to different loads, race
conditions and the like, some domains may grow while others are shrunk. As HPAs and deployments are
not connected in any way, the skewed setup will not fix itself automatically. It may eventually reach
a semi-balanced state, but that is not guaranteed.


Thus there is a need for some component that will:

* Keep multiple deployments aligned. For example, it may keep an equal ratio between the number of
pods in one deployment and the other, or put everything into the first and overflow to the second, and so on.
* React to individual deployment problems, be it a zone outage or a lack of spot/preemptible VMs.
* Actively try to rebalance and get to the desired layout.
* Allow all deployments to be autoscaled via a single target, while maintaining the placement policy.

## Balancer

Balancer is a stand-alone controller, living in user space (or in the control plane, if needed), that exposes
a CRD API object, also called Balancer. Each Balancer object has pointers to multiple deployments
or other pod-controlling objects that expose the Scale subresource. Balancer periodically checks
the number of running and problematic pods inside each of the targets, compares it with the desired
number of replicas, constraints, and policies, and adjusts the number of replicas on the targets
should some of them run too many or too few. To allow being an HPA target, Balancer itself
exposes the Scale subresource.

## Balancer API

```go
// Balancer is an object used to automatically keep the desired number of
// replicas (pods) distributed among the specified set of targets (deployments
// or other objects that expose the Scale subresource).
type Balancer struct {
    metav1.TypeMeta
    // Standard object metadata. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata
    // +optional
    metav1.ObjectMeta
    // Specification of the Balancer behavior.
    // More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.
    Spec BalancerSpec
    // Current information about the Balancer.
    // +optional
    Status BalancerStatus
}

// BalancerSpec is the specification of the Balancer behavior.
type BalancerSpec struct {
    // Targets is a list of targets between which Balancer tries to distribute
    // replicas.
    Targets []BalancerTarget
    // Replicas is the number of pods that should be distributed among the
    // declared targets according to the specified policy.
    Replicas int32
    // Selector that groups the pods from all targets together (and only those).
    // Ideally it should match the selector used by the Service built on top of the
    // Balancer. All pods selectable by the targets' selectors must match this selector;
    // however, a target's selector doesn't have to be a superset of this one (although
    // it is recommended).
    Selector metav1.LabelSelector
    // Policy defines how the balancer should distribute replicas among targets.
    Policy BalancerPolicy
}

// BalancerTarget is the declaration of one of the targets between which the balancer
// tries to distribute replicas.
type BalancerTarget struct {
    // Name of the target. The name can be later used to specify
    // additional balancer details for this target.
    Name string
    // ScaleTargetRef is a reference that points to a target resource to balance.
    // The target needs to expose the Scale subresource.
    ScaleTargetRef hpa.CrossVersionObjectReference
    // MinReplicas is the minimum number of replicas inside of this target.
    // Balancer will set at least this amount on the target, even if the total
    // desired number of replicas for Balancer is lower.
    // +optional
    MinReplicas *int32
    // MaxReplicas is the maximum number of replicas inside of this target.
    // Balancer will set at most this amount on the target, even if the total
    // desired number of replicas for the Balancer is higher.
    // +optional
    MaxReplicas *int32
}

// BalancerPolicyName is the name of the balancer Policy.
type BalancerPolicyName string
const (
    PriorityPolicyName     BalancerPolicyName = "priority"
    ProportionalPolicyName BalancerPolicyName = "proportional"
)

// BalancerPolicy defines Balancer policy for replica distribution.
type BalancerPolicy struct {
    // PolicyName decides how to balance replicas across the targets.
    // Depending on the name, one of the fields Priorities or Proportions must be set.
    PolicyName BalancerPolicyName
    // Priorities contains the detailed specification of how to balance when the balancer
    // policy name is set to Priority.
    // +optional
    Priorities *PriorityPolicy
    // Proportions contains the detailed specification of how to balance when
    // the balancer policy name is set to Proportional.
    // +optional
    Proportions *ProportionalPolicy
    // Fallback contains the specification of how to recognize and what to do if some
    // replicas fail to start in one or more targets. No fallback happens if not set.
    // +optional
    Fallback *Fallback
}

// PriorityPolicy contains details for Priority-based policy for Balancer.
type PriorityPolicy struct {
    // TargetOrder is the priority-based list of Balancer target names. The first target
    // on the list gets the replicas until its maxReplicas is reached (or replicas
    // fail to start). Then the replicas go to the second target and so on. MinReplicas
    // is guaranteed to be fulfilled, irrespective of the order, presence on the
    // list, and/or the total Balancer's replica count.
    TargetOrder []string
}

// ProportionalPolicy contains details for the Proportion-based policy for Balancer.
type ProportionalPolicy struct {
    // TargetProportions is a map from Balancer target names to rates. Replicas are
    // distributed so that the max difference between the current replica share
    // and the desired replica share is minimized. Once a target reaches maxReplicas
    // it is removed from the calculations and replicas are distributed with
    // the updated proportions. MinReplicas is guaranteed for a target, irrespective
    // of the total Balancer's replica count, proportions or the presence in the map.
    TargetProportions map[string]int32
}

// Fallback contains information on how to recognize and handle replicas
// that failed to start within the specified time period.
type Fallback struct {
    // StartupTimeout defines how long the Balancer will wait before considering
    // a pending/not-started pod as blocked and starting another replica in some other
    // target. Once the replica is finally started, replicas in other targets
    // may be stopped.
    StartupTimeout metav1.Duration
}

// BalancerStatus describes the Balancer runtime state.
type BalancerStatus struct {
    // Replicas is the actual number of observed pods matching the Balancer selector.
    Replicas int32
    // Selector is a query over pods that should match the replica count. This is the same
    // as the label selector but in string format, to avoid introspection
    // by clients. The string will be in the same format as the query-param syntax.
    // More info about label selectors: http://kubernetes.io/docs/user-guide/labels#label-selectors
    Selector string
    // Conditions is the set of conditions required for this Balancer to work properly,
    // and indicates whether or not those conditions are met.
    // +optional
    // +patchMergeKey=type
    // +patchStrategy=merge
    Conditions []metav1.Condition
}
```
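
To make the API above concrete, the following is a hypothetical manifest for the proportional policy. The group/version and the exact YAML field names are assumptions (the Go types above carry no JSON tags), so treat it purely as an illustration:

```yaml
# Hypothetical Balancer object: keep two zonal deployments at a 50/50 split of
# 10 replicas, and fall back to the other target after 5 minutes if pods fail
# to start. Group/version and field casing are assumed, not normative.
apiVersion: balancer.x-k8s.io/v1alpha1
kind: Balancer
metadata:
  name: web-balancer
spec:
  replicas: 10
  selector:
    matchLabels:
      app: web
  targets:
  - name: zone-a
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: web-zone-a
  - name: zone-b
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: web-zone-b
  policy:
    policyName: proportional
    proportions:
      targetProportions:
        zone-a: 50
        zone-b: 50
    fallback:
      startupTimeout: 5m
```

Because Balancer itself exposes the Scale subresource, a single HPA could then target the Balancer object rather than the individual deployments.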
7 changes: 3 additions & 4 deletions builder/OWNERS
@@ -1,10 +1,9 @@
approvers:
- aleksandra-malinowska
- losipiuk
- maciekpytel
- mwielgus
reviewers:
- aleksandra-malinowska
- losipiuk
- maciekpytel
- mwielgus
emeritus_approvers:
- aleksandra-malinowska # 2022-09-30
- losipiuk # 2022-09-30
3 changes: 3 additions & 0 deletions charts/OWNERS
@@ -2,3 +2,6 @@ approvers:
- gjtempleton
reviewers:
- gjtempleton

labels:
- helm-charts
2 changes: 1 addition & 1 deletion charts/cluster-autoscaler/Chart.yaml
@@ -11,4 +11,4 @@ name: cluster-autoscaler
sources:
- https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
type: application
version: 9.20.1
version: 9.21.1
1 change: 1 addition & 0 deletions charts/cluster-autoscaler/README.md
@@ -367,6 +367,7 @@ Though enough for the majority of installations, the default PodSecurityPolicy _
| serviceMonitor.annotations | object | `{}` | Annotations to add to service monitor |
| serviceMonitor.enabled | bool | `false` | If true, creates a Prometheus Operator ServiceMonitor. |
| serviceMonitor.interval | string | `"10s"` | Interval that Prometheus scrapes Cluster Autoscaler metrics. |
| serviceMonitor.metricRelabelings | object | `{}` | MetricRelabelConfigs to apply to samples before ingestion. |
| serviceMonitor.namespace | string | `"monitoring"` | Namespace which Prometheus is running in. |
| serviceMonitor.path | string | `"/metrics"` | The path to scrape for metrics; autoscaler exposes `/metrics` (this is standard) |
| serviceMonitor.selector | object | `{"release":"prometheus-operator"}` | Default to kube-prometheus install (CoreOS recommended), but should be set according to Prometheus install. |
3 changes: 3 additions & 0 deletions charts/cluster-autoscaler/templates/_helpers.tpl
@@ -70,10 +70,13 @@ Return the appropriate apiVersion for podsecuritypolicy.
{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }}
{{- if semverCompare "<1.10-0" $kubeTargetVersion -}}
{{- print "extensions/v1beta1" -}}
{{- if semverCompare ">1.21-0" $kubeTargetVersion -}}
{{- print "policy/v1" -}}
{{- else -}}
{{- print "policy/v1beta1" -}}
{{- end -}}
{{- end -}}
{{- end -}}

{{/*
Return the appropriate apiVersion for podDisruptionBudget.
5 changes: 5 additions & 0 deletions charts/cluster-autoscaler/templates/deployment.yaml
@@ -59,6 +59,11 @@ spec:
- --nodes={{ .minSize }}:{{ .maxSize }}:{{ .name }}
{{- end }}
{{- end }}
{{- if eq .Values.cloudProvider "rancher" }}
{{- if .Values.cloudConfigPath }}
- --cloud-config={{ .Values.cloudConfigPath }}
{{- end }}
{{- end }}
{{- if eq .Values.cloudProvider "aws" }}
{{- if .Values.autoDiscovery.clusterName }}
- --node-group-auto-discovery=asg:tag={{ tpl (join "," .Values.autoDiscovery.tags) . }}
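
A minimal sketch of chart values that exercise the new rancher branch above (the path is illustrative; `cloudProvider` and `cloudConfigPath` are the value names referenced by the template):

```yaml
# Example values for the rancher provider; renders the extra
# --cloud-config flag added to the container args above.
cloudProvider: rancher
cloudConfigPath: /etc/rancher/cloud-config.yaml  # illustrative path
```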
4 changes: 4 additions & 0 deletions charts/cluster-autoscaler/templates/servicemonitor.yaml
@@ -20,6 +20,10 @@ spec:
- port: {{ .Values.service.portName }}
interval: {{ .Values.serviceMonitor.interval }}
path: {{ .Values.serviceMonitor.path }}
{{- if .Values.serviceMonitor.metricRelabelings }}
metricRelabelings:
{{ tpl (toYaml .Values.serviceMonitor.metricRelabelings | indent 6) . }}
{{- end }}
namespaceSelector:
matchNames:
- {{.Release.Namespace}}
3 changes: 3 additions & 0 deletions charts/cluster-autoscaler/values.yaml
@@ -344,6 +344,9 @@ serviceMonitor:
path: /metrics
# serviceMonitor.annotations -- Annotations to add to service monitor
annotations: {}
## [RelabelConfig](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#monitoring.coreos.com/v1.RelabelConfig)
# serviceMonitor.metricRelabelings -- MetricRelabelConfigs to apply to samples before ingestion.
metricRelabelings: {}

## Custom PrometheusRule to be defined
## The value is evaluated as a template, so, for example, the value can depend on .Release or .Chart
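
For example, a user might drop a high-cardinality series before ingestion; a sketch, with the metric name chosen purely for illustration:

```yaml
serviceMonitor:
  enabled: true
  # metricRelabelings is rendered verbatim into the ServiceMonitor spec.
  metricRelabelings:
  - sourceLabels: [__name__]
    regex: cluster_autoscaler_function_duration_seconds_bucket  # illustrative
    action: drop
```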