Merge pull request kubernetes#184 from elankath/sync-upstream-v1.26.0
* Drop redundant parameter in utilization calculation

* Extract checks for scale down eligibility

* Limit amount of node utilization logging

* Increase timeout for VPA E2E

After kubernetes#5151, e2e tests are still failing because we're still hitting the Ginkgo timeout

* Add podScaleUpDelay annotation support
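
A minimal sketch of how this could look on a pod, assuming the annotation key is `cluster-autoscaler.kubernetes.io/pod-scale-up-delay` and the value is a duration; both are assumptions for illustration, not confirmed by this changelog:

```yaml
# Hypothetical pod opting into a scale-up delay; the annotation key and
# duration format are assumed for illustration only.
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker
  annotations:
    cluster-autoscaler.kubernetes.io/pod-scale-up-delay: "600s"
spec:
  containers:
    - name: worker
      image: busybox
      command: ["sleep", "3600"]
```

With such an annotation, Cluster Autoscaler would wait the given duration before counting the pending pod as a reason to scale up.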

* Corrected the links for Priority in k8s API and Pod Preemption in k8s.

* Restrict Updater PodLister to namespace

* Update controller-gen to latest and use go install

* Run hack/generate-crd-yaml.sh

* update owners list for cluster autoscaler azure

* Change VPA default version to 0.12.0

* Pin controller-gen to 0.9.2

* AWS ReadMe update

* Move resource limits checking to a separate package

* Allow simulator to persist changes in cluster snapshot

* Don't depend on IsNodeBeingDeleted implementation

The fact that it considers nodes as deleted only until a certain
timeout is of no concern to the eligibility.Checker.

* Stop treating masters differently in scale down

This filtering was used for two purposes:
- Excluding masters from destination candidates
- Excluding masters from calculating cluster resources

Excluding masters from destination candidates isn't useful: if pods can schedule
there, they will, so removing them from the CA simulation doesn't change
anything.
Not excluding them from cluster resource calculation actually matches scale-up
behavior, where master nodes are treated the same way as regular nodes.

* CA - AWS - Instance List Update 2022-09-16

* fix typo

* Modifying taint removal logic on startup to consider all nodes instead of ready nodes.

* fix typo

* Update VPA compatibility for 0.12 release

* Updated the golang version for GitHub workflow.

* Create GCE CloudProvider Owners file

* Fix error formatting in GCE client

%v results in a list of numbers when a byte array is passed

* Introduce NodeDeleterBatcher to ScaleDown actuator

* handle directx nodes the same as gpu nodes

* magnum: add an option to create insecure TLS connections

We use self-signed certificates in OpenStack for test purposes. It is not
always easy to provide a CA certificate, and so we ran into the problem
that the autoscaler has no option to skip checking the validity of the
certificate.

This patch adds a new option for the magnum plugin: tls-insecure

Signed-off-by: Anton Kurbatov <[email protected]>

* Drop unused maps

* Extract criteria for removing unneeded nodes to a separate package

* skip instances on validation error

If an instance is already being deleted, abandoned, or not a member, just continue.

* cleanup unused constants in clusterapi provider

this change removes some unused values and adjusts the names in the unit
tests to better reflect usage.

* Update the example spec of civo cloudprovider

Signed-off-by: Vishal Anarse <[email protected]>

* Fix race condition in scale down test

* Clean up stale OWNERS

* add example for multiple recommenders

* Balancer KEP

* Add VPA E2E for recommendation not exactly matching pod

Containers in a recommendation can be different from the containers in a pod:

- A new container can be added to a pod. At first there will be no
  recommendation for the container.
- A container can be removed from a pod. For some time the recommendation will
  still contain an entry for the old container.
- A container can be renamed. Then the recommendation will refer to the
  container under its old name.

Add tests for what VPA does in those situations.

* Add VPA E2E for recommendation not exactly matching pod with limit range

Containers in a recommendation can be different from the containers in a pod:

- A new container can be added to a pod. At first there will be no
  recommendation for the container.
- A container can be removed from a pod. For some time the recommendation will
  still contain an entry for the old container.
- A container can be renamed. Then the recommendation will refer to the
  container under its old name.

Add tests for what VPA does in those situations, when limit range exists.

* Remove units for default boot disk size

* Fix accessing index out of bounds

The function should match containers to their recommendations directly instead
of hoping their order will match.

See [this comment](kubernetes#3966 (comment))

* [vpa] introduce recommendation post processor

* Fixed gofmt error.

* Don't break scale up with priority expander config

* added replicas count for daemonsets to prevent massive pod eviction

Signed-off-by: Denis Romanenko <[email protected]>

* code review, move flag to boolean for post processor

* Add support for extended resource definition in GCE MIG template

This commit adds the possibility to define extended resources for a node group on GCE,
so that the cluster-autoscaler can account for them when making scaling decisions.

This is done through the `extended_resources` key inside the AUTOSCALER_ENV_VARS variable set on a MIG template.

Signed-off-by: Mayeul Blanzat <[email protected]>
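
As a rough illustration of where this lands (not taken from the changelog itself), the MIG instance template's `AUTOSCALER_ENV_VARS` metadata value might carry the new key next to the existing ones; the semicolon/comma-separated format and the resource names below are assumptions:

```yaml
# Hypothetical GCE instance template metadata item; the value format and the
# example resource names are illustrative assumptions only.
metadata:
  items:
    - key: AUTOSCALER_ENV_VARS
      value: "node_labels=workload=gpu;extended_resources=example.com/dongle=2,example.com/fast-nic=1"
```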

* Make expander factory logic more pluggable

* Add option to wait for a period of time after node tainting/cordoning
Node state is refreshed and checked again before deleting the node.
This gives kube-scheduler time to acknowledge that the nodes' state has
changed and to stop scheduling pods on them.

* remove the flag for Capping post-processor

* remove unsupported functionality from cluster-api provider

this change removes the code for the `Labels` and `Taints` interface
functions of the clusterapi provider when scaling from zero. The body
of these functions was added erroneously and the Cluster API community
is still deciding on how these values will be exposed to the autoscaler.

also updates the tests and readme to be clearer about the usage of
labels and taints when scaling from zero.

* Remove ScaleDown dependency on clusterStateRegistry

* Adding support for identifying nodes that have been deleted from the cloud provider but are still registered within Kubernetes. Avoids misidentifying non-autoscaled nodes as deleted. Simplified implementation to use apiv1.Node instead of a new struct. Expanded test cases to include non-autoscaled nodes and tracking deleted nodes over multiple updates.

Adding a check to the backfill loop to confirm the cloud provider node no longer exists before flagging the node as deleted. Modifying some comments to be more accurate. Replacing an erroneous line deletion.

* Implementing new cloud provider method for node deletion detection (kubernetes#1)

* Adding isNodeDeleted method to CloudProvider interface. Supports detecting whether nodes are fully deleted or are not-autoscaled. Updated cloud providers to provide initial implementation of new method that will return an ErrNotImplemented to maintain existing taint-based deletion clusterstate calculation.

* Fixing go formatting issues with clusterstate_test

* Fixing errors due to merge on branches.

* Adjusting initial implementation of NodeExists to be consistent among cloud providers to return true and ErrNotImplemented.

* Fix list scaling group instance pages bug

Signed-off-by: jwcesign <[email protected]>

* Format log output

Signed-off-by: jwcesign <[email protected]>

* Split out code from simulator package

* Code Review: Do not return an error on malformed extended_resource + add more tests

* Malformed extended resource definition should not fail the template building function. Instead, log the error and ignore extended resources
* Remove useless existence check
* Add tests around the extractExtendedResourcesFromKubeEnv function
* Add a test case to verify that malformed extended resource definition does not fail the template build function

Signed-off-by: Mayeul Blanzat <[email protected]>

* huawei-cloudprovider: enable tags resolve for as

Signed-off-by: jwcesign <[email protected]>

* Magnum provider: switch UUID dependency from satori to gofrs

Addresses issue kubernetes#5218: the satori UUID package
is unmaintained and has security vulnerabilities
affecting the generation of random UUIDs.

In the magnum cloud provider, this package was only
used to check whether a string matches a UUIDv4 or
not, so the vulnerability with generating UUIDs could
not have been exploited. (Generating UUIDs is only
done in the unit tests).

The gofrs/uuid package is currently at version 4.0.0
in go.mod, well past the point at which it was forked
and the vulnerability was fixed. It is a drop-in
replacement for verifying a UUID, and only a small
change was needed in the testing code to handle
a new returned error when generating a random UUID.

* change uuid dependency in cluster autoscaler kamatera provider

* Extract scheduling hints to a dedicated object

This removes the need for passing maps back and forth when doing
scheduling simulations.

* Remove dead code for handling simulation errors

* Fix typo, move service accounts to RBAC

* VPA: Add missing --- to CRD manifests

* Base parallel scale down implementation

* Stop applying the beta.kubernetes.io/os and arch

* [CA] Register recently evicted pods in NodeDeletionTracker.

* Add KEP to introduce UpdateMode: UpscaleOnly

* Clarify prometheus use-case

* Adapt to review comments

* Adapt KEP according to review

* Add newline after header

* Rename proposal directory to fit KEP title

* Make KEP and implementation proposal consistent

* remove post-processor factory

* update test for MapToListOfRecommendedContainerResources

* Update aws OWNERS

Set all aws cloudprovider approvers as reviewers, so that aws-specific PRs can be handled without involving global CA reviewers.

* Add ScaleDown.Actuator to AutoscalingContext

* update the hyperlink of api-conventions.md file in comments

* Support scaling up node groups to the configured min size if needed

* Fix: add missing RBAC permissions to magnum examples

Adding permissions to the ClusterRole in the example to avoid the error
messages.

* make spellchecker happy

* Changing deletion logic to rely on a new helper method in ClusterStateRegistry, and removing the old, complicated logic. Adjusting the naming of the method for cloud instance deletion from NodeExists to HasInstance.

* Fix VPA deployment

Use `kube-system` namespace for ServiceAccounts like it did before kubernetes#5268

* Don't say that `Recreate` and `Auto` VPA modes are experimental

* Fixing go formatting issue in cloudstack cloud provider code.

* Add missing cloud providers to readme and sort alphabetically

Signed-off-by: Marcus Noble <[email protected]>

* huawei-cloudprovider: enable taints resolve for as, modify the example yaml to accelerate node scale-down

Signed-off-by: jwcesign <[email protected]>

* Update cluster-autoscaler/README.md

Co-authored-by: Guy Templeton <[email protected]>

* cluster-autoscaler: refactor BalanceScaleUpBetweenGroups

* Allow forking snapshot more than 1 time

* Fork ClusterSnapshot in UpdateClusterState

* add logging information to FAQ

this change adds a section about how to increase the logging verbosity
and why you might want to do that.

* fix(cluster-autoscaler/hetzner): pre-existing volumes break scheduling

The `hcloud-csi-driver` v1.x uses the label `csi.hetzner.cloud/location`
for topology. This label was not added in the response to
`n.TemplateNodeInfo()`, causing cluster-autoscaler to not consider any
node group for scaling when a pre-existing volume was attached to the
pending pod.

This is fixed by adding the appropriately named label to the `NodeInfo`.
In practice this label is added by the `hcloud-csi-driver`.

In the upcoming v2 of the driver we migrated to using
`apiv1.LabelZoneRegionStable` for topology constraints, but this fix is
still required so customers do not have to re-create all `PersistentVolumes`.

Further details on the bug are available in the original issue:
hetznercloud/csi-driver#302

* Added RBAC Permission to Azure.

* Log node group min and current size when skipping scale down

* Use scheduling package in filterOutSchedulable processor

* Check owner reference in scale down planner to avoid double-counting
already deleted pods.

* Add note regarding GPU label for the CAPI provider

cluster-autoscaler takes into consideration the time that a node takes
to initialise a GPU resource, as long as a particular label is in place.
This label differs from provider to provider, and is documented in some
cases but not for CAPI.

This commit adds a note with the specific label that should be applied
when a node is instantiated.

* chore(cluster-autoscaler/hetzner): add myself to OWNERS file

* Use ScaleDownSetProcessor.GetNodesToRemove in scale down planner to
filter NodesToDelete.

* Handle pagination when looking through supported shapes.

* Add OCI API files to handle OCI work-request operations.

* Fail fast if OCI instance pool is out of capacity/quota.

* update vendor to v1.26.0-rc.1

* fix issue 5332

* Deprecate v1beta1 API

v1beta2 API was introduced in kubernetes#1668, it's present in VPA
[0.4.0](https://github.com/kubernetes/autoscaler/tree/vertical-pod-autoscaler-0.4.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1beta2)
but not in
[0.3.1](https://github.com/kubernetes/autoscaler/tree/vertical-pod-autoscaler-0.3.1/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1beta2).

I added comments to vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1beta2/types.go

I generated changes to
`vertical-pod-autoscaler/deploy/vpa-v1-crd-gen.yaml` with
`vertical-pod-autoscaler/hack/generate-crd-yaml.sh`

* Add note about `v1beta2` deprecation to README

* fix issue 5332 - adding suggested change

* Break node categorization in scale down planner on timeout.

* Automatically label cluster-autoscaler PRs

* Add missing dot

* fix generate ec2 instance types

* Introduce a formal policy for maintaining cloudproviders

The policy largely codifies what we've already been doing for years
(including the requirements we've already imposed on new providers).

* Introduce Cloudprovider Maintenance Request to policy

* feat(helm): add rancher cloud config support

Autoscaler 1.25.0 adds "rancher" cloud provider support, which requires setting cloudConfigPath. If the user mounts this as a secret and sets this value appropriately, this change sets the argument required to point to the mounted secret. Previously, this was only set if the cloud provider was magnum or aws.
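
A minimal values sketch under stated assumptions (the mount path and the exact way the secret gets mounted are illustrative; only `cloudProvider` and `cloudConfigPath` are chart values referenced by this change):

```yaml
# Hypothetical values.yaml excerpt for the rancher provider. The secret that
# holds the cloud config must be mounted at this path by whatever volume
# mechanism the chart offers; the path itself is an assumption.
cloudProvider: rancher
cloudConfigPath: /config/cloud-config
```

With these values, the chart renders `--cloud-config=/config/cloud-config` on the container args, mirroring the deployment template change shown in the diff below.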

* Updating error messaging and fallback behavior of hasCloudProviderInstance. Changing deletedNodes to store empty structs instead of node values, and modifying the helper function to utilize that information for tests.

* Fixing the helper function to simplify the for loop that retrieves deleted node names.

* Use PdbRemainingDisruptions in Planner

* Put risky NodeToRemove at the end of the needDrain list

* Auto Label Helm Chart PRs

* psp_api

* Create a Planner object if --parallelDrain=true

* Export execution_latency_seconds metric from VPA admission controller

Sometimes I see admissions that are slower than the rest. Logs indicate that
`AdmissionServer.admit` doesn't get slow (it's the only part with logging). I'd like
to have a metric which will tell us what's slow so that we can maybe improve
that.

* aws: add nodegroup name to default labels

* Fix int formatting in threshold_based_limiter logs

* rancher-cloudprovider: Improve node group discovery

Previously the rancher provider tried to parse the node `spec.providerID`
to extract the node group name. Instead, we now get the machines by the
node name and then use a rancher-specific label that should always be
on the machine. This should work more reliably for all the different
node drivers that rancher supports.

Signed-off-by: Cyrill Troxler <[email protected]>

* Don't add pods from drained nodes in scale-down

* Add default PodListProcessor wrapper

* Add currently drained pods before scale-up

* set cluster_autoscaler_max_nodes_count dynamically

Signed-off-by: yasin.lachiny <[email protected]>

* fix(helm): bump chart ver -> 9.21.1

* CA - AWS - Update Hardcoded Instance Details List to 11-12-2022

* Add x13n to cluster autoscaler approvers

* update prometheus metric min maxNodesCount and a.MaxNodesTotal

Signed-off-by: yasin.lachiny <[email protected]>

* CA - AWS - Update Docs all actions IAM policy

* Cluster Autoscaler: update vendor to k8s v1.26.0

* removed dotimports from framework.go

* fixed another dotimport

* add missing vpa vendor,e2e/vendor to sync branch

* removed old files from vpa vendor to fix test

---------

Signed-off-by: Anton Kurbatov <[email protected]>
Signed-off-by: Vishal Anarse <[email protected]>
Signed-off-by: Denis Romanenko <[email protected]>
Signed-off-by: Mayeul Blanzat <[email protected]>
Signed-off-by: jwcesign <[email protected]>
Signed-off-by: Marcus Noble <[email protected]>
Signed-off-by: Cyrill Troxler <[email protected]>
Signed-off-by: yasin.lachiny <[email protected]>
Co-authored-by: Daniel Kłobuszewski <[email protected]>
Co-authored-by: Kubernetes Prow Robot <[email protected]>
Co-authored-by: Joachim Bartosik <[email protected]>
Co-authored-by: Damir Markovic <[email protected]>
Co-authored-by: Shubham Kuchhal <[email protected]>
Co-authored-by: Marco Voelz <[email protected]>
Co-authored-by: Prachi Gandhi <[email protected]>
Co-authored-by: bdobay <[email protected]>
Co-authored-by: Juan Borda <[email protected]>
Co-authored-by: Fabio Berchtold <[email protected]>
Co-authored-by: Clint Fooken <[email protected]>
Co-authored-by: Jayant Jain <[email protected]>
Co-authored-by: Yaroslava Serdiuk <[email protected]>
Co-authored-by: Flavian <[email protected]>
Co-authored-by: Anton Kurbatov <[email protected]>
Co-authored-by: Fulton Byrne <[email protected]>
Co-authored-by: Michael McCune <[email protected]>
Co-authored-by: Vishal Anarse <[email protected]>
Co-authored-by: Matthias Bertschy <[email protected]>
Co-authored-by: Marcin Wielgus <[email protected]>
Co-authored-by: David Benque <[email protected]>
Co-authored-by: Denis Romanenko <[email protected]>
Co-authored-by: Mayeul Blanzat <[email protected]>
Co-authored-by: Alexandru Matei <[email protected]>
Co-authored-by: Clint <[email protected]>
Co-authored-by: jwcesign <[email protected]>
Co-authored-by: Thomas Hartland <[email protected]>
Co-authored-by: Ori Hoch <[email protected]>
Co-authored-by: Joel Smith <[email protected]>
Co-authored-by: Paco Xu <[email protected]>
Co-authored-by: Aleksandra Gacek <[email protected]>
Co-authored-by: Marco Voelz <[email protected]>
Co-authored-by: Bartłomiej Wróblewski <[email protected]>
Co-authored-by: hangcui <[email protected]>
Co-authored-by: Xintong Liu <[email protected]>
Co-authored-by: GanjMonk <[email protected]>
Co-authored-by: Marcus Noble <[email protected]>
Co-authored-by: Marcus Noble <[email protected]>
Co-authored-by: Guy Templeton <[email protected]>
Co-authored-by: Michael Grosser <[email protected]>
Co-authored-by: Julian Tölle <[email protected]>
Co-authored-by: Nick Jones <[email protected]>
Co-authored-by: jesse.millan <[email protected]>
Co-authored-by: Jordan Liggitt <[email protected]>
Co-authored-by: McGonigle, Neil <[email protected]>
Co-authored-by: Anton Khizunov <[email protected]>
Co-authored-by: Maciek Pytel <[email protected]>
Co-authored-by: Basit Mustafa <[email protected]>
Co-authored-by: xval2307 <[email protected]>
Co-authored-by: yznima <[email protected]>
Co-authored-by: Cyrill Troxler <[email protected]>
Co-authored-by: yasin.lachiny <[email protected]>
Co-authored-by: Kuba Tużnik <[email protected]>
Showing 2,789 changed files with 374,881 additions and 128,448 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yaml
@@ -17,7 +17,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v2
with:
go-version: 1.18.1
go-version: 1.19

- uses: actions/checkout@v2
with:
4 changes: 2 additions & 2 deletions OWNERS
@@ -1,10 +1,10 @@
approvers:
- mwielgus
- maciekpytel
- bskiba
- gjtempleton
reviewers:
- mwielgus
- maciekpytel
- bskiba
- gjtempleton
emeritus_approvers:
- bskiba # 2022-09-30
7 changes: 3 additions & 4 deletions addon-resizer/OWNERS
@@ -1,8 +1,7 @@
approvers:
- bskiba
- wojtek-t
- jbartosik
reviewers:
- bskiba
- wojtek-t
- jbartosik
emeritus_approvers:
- bskiba # 2022-09-30
- wojtek-t # 2022-09-30
182 changes: 182 additions & 0 deletions balancer/proposals/balancer.md
@@ -0,0 +1,182 @@

# KEP - Balancer

## Introduction

One of the problems that users face when running Kubernetes deployments is how to
deploy pods across several domains and keep them balanced and autoscaled at the same time.
These domains may include:

* Cloud provider zones inside a single region, to ensure that the application is still up and running, even if one of the zones has issues.
* Different types of Kubernetes nodes. These may involve nodes that are spot/preemptible, or of different machine families.

A single Kubernetes deployment may either leave the placement entirely up to the scheduler
(most likely leading to something not entirely desired, like all pods going to a single domain) or
focus on a single domain (thus not achieving the goal of being in two or more domains).

PodTopologySpreading solves the problem a bit, but not completely. It allows only even spreading,
and once the deployment gets skewed it doesn't do anything to rebalance. Pod topology spreading
(with skew and/or the ScheduleAnyway flag) is also just a hint: if skewed placement is available and
allowed, then Cluster Autoscaler is not triggered and the user ends up with a skewed deployment.
A user could specify strict pod topology spreading, but then, in case of problems, the deployment
would not move its pods to the domains that are available. The growth of the deployment would also
be totally blocked, as the available domains would be too skewed.

Thus, if full flexibility is needed, the only option is to have multiple deployments targeting
different domains. This setup, however, creates one big problem: how to consistently autoscale multiple
deployments? The simplest idea - having multiple HPAs - is not stable: due to different loads, race
conditions and the like, some domains may grow while the others are shrunk. As the HPAs and deployments are
not connected in any way, the skewed setup will not fix itself automatically. It may eventually come to
a semi-balanced state, but that is not guaranteed.


Thus there is a need for some component that will:

* Keep multiple deployments aligned. For example, it may keep an equal ratio between the number of
pods in one deployment and the other, or put everything into the first and overflow to the second, and so on.
* React to individual deployment problems, be it a zone outage or a lack of spot/preemptible VMs.
* Actively try to rebalance and get to the desired layout.
* Allow all deployments to be autoscaled with a single target, while maintaining the placement policy.

## Balancer

Balancer is a stand-alone controller, living in user space (or in the control plane, if needed), exposing
a CRD API object, also called Balancer. Each Balancer object has pointers to multiple deployments
or other pod-controlling objects that expose the Scale subresource. Balancer periodically checks
the number of running and problematic pods inside each of the targets, compares it with the desired
number of replicas, constraints and policies, and adjusts the number of replicas on the targets
should some of them run too many or too few. To allow it to be an HPA target, Balancer itself
exposes the Scale subresource.

## Balancer API

```go
// Balancer is an object used to automatically keep the desired number of
// replicas (pods) distributed among the specified set of targets (deployments
// or other objects that expose the Scale subresource).
type Balancer struct {
metav1.TypeMeta
// Standard object metadata. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata
// +optional
metav1.ObjectMeta
// Specification of the Balancer behavior.
// More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.
Spec BalancerSpec
// Current information about the Balancer.
// +optional
Status BalancerStatus
}

// BalancerSpec is the specification of the Balancer behavior.
type BalancerSpec struct {
// Targets is a list of targets between which Balancer tries to distribute
// replicas.
Targets []BalancerTarget
// Replicas is the number of pods that should be distributed among the
// declared targets according to the specified policy.
Replicas int32
// Selector that groups the pods from all targets together (and only those).
// Ideally it should match the selector used by the Service built on top of the
// Balancer. All pods selectable by the targets' selectors must match this selector,
// however the targets' selectors don't have to be a superset of this one (although
// it is recommended).
Selector metav1.LabelSelector
// Policy defines how the balancer should distribute replicas among targets.
Policy BalancerPolicy
}

// BalancerTarget is the declaration of one of the targets between which the balancer
// tries to distribute replicas.
type BalancerTarget struct {
// Name of the target. The name can be later used to specify
// additional balancer details for this target.
Name string
// ScaleTargetRef is a reference that points to a target resource to balance.
// The target needs to expose the Scale subresource.
ScaleTargetRef hpa.CrossVersionObjectReference
// MinReplicas is the minimum number of replicas inside of this target.
// Balancer will set at least this amount on the target, even if the total
// desired number of replicas for Balancer is lower.
// +optional
MinReplicas *int32
// MaxReplicas is the maximum number of replicas inside of this target.
// Balancer will set at most this amount on the target, even if the total
// desired number of replicas for the Balancer is higher.
// +optional
MaxReplicas *int32
}

// BalancerPolicyName is the name of the balancer Policy.
type BalancerPolicyName string
const (
PriorityPolicyName BalancerPolicyName = "priority"
ProportionalPolicyName BalancerPolicyName = "proportional"
)

// BalancerPolicy defines Balancer policy for replica distribution.
type BalancerPolicy struct {
// PolicyName decides how to balance replicas across the targets.
// Depending on the name one of the fields Priorities or Proportions must be set.
PolicyName BalancerPolicyName
// Priorities contains detailed specification of how to balance when balancer
// policy name is set to Priority.
// +optional
Priorities *PriorityPolicy
// Proportions contains detailed specification of how to balance when
// balancer policy name is set to Proportional.
// +optional
Proportions *ProportionalPolicy
// Fallback contains the specification of how to recognize and what to do if some
// replicas fail to start in one or more targets. No fallback happens if not set.
// +optional
Fallback *Fallback
}

// PriorityPolicy contains details for Priority-based policy for Balancer.
type PriorityPolicy struct {
// TargetOrder is the priority-based list of Balancer target names. The first target
// on the list gets the replicas until its maxReplicas is reached (or replicas
// fail to start). Then the replicas go to the second target and so on. MinReplicas
// is guaranteed to be fulfilled, irrespective of the order, presence on the
// list, and/or the Balancer's total replica count.
TargetOrder []string
}

// ProportionalPolicy contains details for Proportion-based policy for Balancer.
type ProportionalPolicy struct {
// TargetProportions is a map from Balancer target names to rates. Replicas are
// distributed so that the max difference between the current replica share
// and the desired replica share is minimized. Once a target reaches maxReplicas
// it is removed from the calculations and replicas are distributed with
// the updated proportions. MinReplicas is guaranteed for a target, irrespective
// of the Balancer's total replica count, proportions, or presence in the map.
TargetProportions map[string]int32
}

// Fallback contains information on how to recognize and handle replicas
// that failed to start within the specified time period.
type Fallback struct {
// StartupTimeout defines how long the Balancer will wait before considering
// a pending/not-started pod as blocked and starting another replica in some other
// target. Once the replica is finally started, replicas in other targets
// may be stopped.
StartupTimeout metav1.Duration
}

// BalancerStatus describes the Balancer runtime state.
type BalancerStatus struct {
// Replicas is the actual number of observed pods matching the Balancer selector.
Replicas int32
// Selector is a query over pods that should match the replicas count. This is the same
// as the label selector but in string format, to avoid introspection
// by clients. The string will be in the same format as the query-param syntax.
// More info about label selectors: http://kubernetes.io/docs/user-guide/labels#label-selectors
Selector string
// Conditions is the set of conditions required for this Balancer to work properly,
// and indicates whether or not those conditions are met.
// +optional
// +patchMergeKey=type
// +patchStrategy=merge
Conditions []metav1.Condition
}
```
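
To make the API above more concrete, here is a hedged sketch of a Balancer object in YAML. The API group/version and the lowerCamelCase field names are illustrative renderings of the Go types above, not a confirmed serialization:

```yaml
# Hypothetical Balancer spreading 10 replicas 50/50 across two deployments,
# with a fallback if replicas in one target fail to start in time.
apiVersion: balancer.x-k8s.io/v1alpha1   # assumed group/version
kind: Balancer
metadata:
  name: frontend-balancer
spec:
  replicas: 10
  selector:
    matchLabels:
      app: frontend
  targets:
    - name: zone-a
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: frontend-zone-a
      minReplicas: 1
    - name: zone-b
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: frontend-zone-b
      minReplicas: 1
  policy:
    policyName: proportional
    proportions:
      targetProportions:
        zone-a: 50
        zone-b: 50
    fallback:
      startupTimeout: 5m
```

An HPA could then point at the Balancer's Scale subresource instead of at the individual deployments, which is what keeps the placement policy and autoscaling consistent.
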
7 changes: 3 additions & 4 deletions builder/OWNERS
@@ -1,10 +1,9 @@
approvers:
- aleksandra-malinowska
- losipiuk
- maciekpytel
- mwielgus
reviewers:
- aleksandra-malinowska
- losipiuk
- maciekpytel
- mwielgus
emeritus_approvers:
- aleksandra-malinowska # 2022-09-30
- losipiuk # 2022-09-30
3 changes: 3 additions & 0 deletions charts/OWNERS
@@ -2,3 +2,6 @@ approvers:
- gjtempleton
reviewers:
- gjtempleton

labels:
- helm-charts
2 changes: 1 addition & 1 deletion charts/cluster-autoscaler/Chart.yaml
@@ -11,4 +11,4 @@ name: cluster-autoscaler
sources:
- https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
type: application
version: 9.20.1
version: 9.21.1
1 change: 1 addition & 0 deletions charts/cluster-autoscaler/README.md
@@ -367,6 +367,7 @@ Though enough for the majority of installations, the default PodSecurityPolicy _
| serviceMonitor.annotations | object | `{}` | Annotations to add to service monitor |
| serviceMonitor.enabled | bool | `false` | If true, creates a Prometheus Operator ServiceMonitor. |
| serviceMonitor.interval | string | `"10s"` | Interval that Prometheus scrapes Cluster Autoscaler metrics. |
| serviceMonitor.metricRelabelings | object | `{}` | MetricRelabelConfigs to apply to samples before ingestion. |
| serviceMonitor.namespace | string | `"monitoring"` | Namespace which Prometheus is running in. |
| serviceMonitor.path | string | `"/metrics"` | The path to scrape for metrics; autoscaler exposes `/metrics` (this is standard) |
| serviceMonitor.selector | object | `{"release":"prometheus-operator"}` | Default to kube-prometheus install (CoreOS recommended), but should be set according to Prometheus install. |
3 changes: 3 additions & 0 deletions charts/cluster-autoscaler/templates/_helpers.tpl
@@ -70,10 +70,13 @@ Return the appropriate apiVersion for podsecuritypolicy.
{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }}
{{- if semverCompare "<1.10-0" $kubeTargetVersion -}}
{{- print "extensions/v1beta1" -}}
{{- if semverCompare ">1.21-0" $kubeTargetVersion -}}
{{- print "policy/v1" -}}
{{- else -}}
{{- print "policy/v1beta1" -}}
{{- end -}}
{{- end -}}
{{- end -}}

{{/*
Return the appropriate apiVersion for podDisruptionBudget.
5 changes: 5 additions & 0 deletions charts/cluster-autoscaler/templates/deployment.yaml
@@ -59,6 +59,11 @@ spec:
- --nodes={{ .minSize }}:{{ .maxSize }}:{{ .name }}
{{- end }}
{{- end }}
{{- if eq .Values.cloudProvider "rancher" }}
{{- if .Values.cloudConfigPath }}
- --cloud-config={{ .Values.cloudConfigPath }}
{{- end }}
{{- end }}
{{- if eq .Values.cloudProvider "aws" }}
{{- if .Values.autoDiscovery.clusterName }}
- --node-group-auto-discovery=asg:tag={{ tpl (join "," .Values.autoDiscovery.tags) . }}
4 changes: 4 additions & 0 deletions charts/cluster-autoscaler/templates/servicemonitor.yaml
@@ -20,6 +20,10 @@ spec:
- port: {{ .Values.service.portName }}
interval: {{ .Values.serviceMonitor.interval }}
path: {{ .Values.serviceMonitor.path }}
{{- if .Values.serviceMonitor.metricRelabelings }}
metricRelabelings:
{{ tpl (toYaml .Values.serviceMonitor.metricRelabelings | indent 6) . }}
{{- end }}
namespaceSelector:
matchNames:
- {{.Release.Namespace}}
3 changes: 3 additions & 0 deletions charts/cluster-autoscaler/values.yaml
@@ -344,6 +344,9 @@ serviceMonitor:
path: /metrics
# serviceMonitor.annotations -- Annotations to add to service monitor
annotations: {}
## [RelabelConfig](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#monitoring.coreos.com/v1.RelabelConfig)
# serviceMonitor.metricRelabelings -- MetricRelabelConfigs to apply to samples before ingestion.
metricRelabelings: {}

## Custom PrometheusRule to be defined
## The value is evaluated as a template, so, for example, the value can depend on .Release or .Chart