Merge pull request #1555 from ingvagabund/actual-utilization-kubernetes-metrics

Use actual node resource utilization by consuming kubernetes metrics
k8s-ci-robot authored Nov 20, 2024
2 parents a4c09bf + 6567f01 commit a962cca
Showing 75 changed files with 8,603 additions and 278 deletions.
15 changes: 13 additions & 2 deletions README.md
@@ -124,6 +124,8 @@ These are top level keys in the Descheduler Policy that you can use to configure
| `maxNoOfPodsToEvictPerNode` |`int`| `nil` | maximum number of pods evicted from each node (summed through all strategies) |
| `maxNoOfPodsToEvictPerNamespace` |`int`| `nil` | maximum number of pods evicted from each namespace (summed through all strategies) |
| `maxNoOfPodsToEvictTotal` |`int`| `nil` | maximum number of pods evicted per rescheduling cycle (summed through all strategies) |
+| `metricsCollector` |`object`| `nil` | configures collection of metrics for actual resource utilization |
+| `metricsCollector.enabled` |`bool`| `false` | enables kubernetes [metrics server](https://kubernetes-sigs.github.io/metrics-server/) collection |

### Evictor Plugin configuration (Default Evictor)

@@ -158,6 +160,8 @@ nodeSelector: "node=node1" # you don't need to set this, if not set all will be
maxNoOfPodsToEvictPerNode: 5000 # you don't need to set this, unlimited if not set
maxNoOfPodsToEvictPerNamespace: 5000 # you don't need to set this, unlimited if not set
maxNoOfPodsToEvictTotal: 5000 # you don't need to set this, unlimited if not set
+metricsCollector:
+  enabled: true # you don't need to set this, metrics are not collected if not set
profiles:
- name: ProfileName
pluginConfig:
@@ -277,11 +281,13 @@ If that parameter is set to `true`, the thresholds are considered as percentage
`thresholds` will be deducted from the mean among all nodes and `targetThresholds` will be added to the mean.
A resource consumption above (resp. below) this window is considered as overutilization (resp. underutilization).
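
The deviation window described above can be sketched for a single resource as follows. This is a minimal illustration, not the plugin's code: the names `window`, `deviationWindow`, and `classify` are hypothetical, and clamping the bounds to the 0..100 range is an assumption of the sketch.

```go
package main

import "fmt"

// window holds the lower and upper utilization bounds, in percent.
// Hypothetical type, used only in this sketch.
type window struct {
	Low, High float64
}

// clamp keeps a percentage inside [0, 100] (an assumption of this sketch).
func clamp(v float64) float64 {
	if v < 0 {
		return 0
	}
	if v > 100 {
		return 100
	}
	return v
}

// deviationWindow computes the mean utilization across nodes, subtracts
// threshold from it for the lower bound, and adds targetThreshold to it
// for the upper bound.
func deviationWindow(utilization []float64, threshold, targetThreshold float64) window {
	if len(utilization) == 0 {
		return window{}
	}
	var sum float64
	for _, u := range utilization {
		sum += u
	}
	mean := sum / float64(len(utilization))
	return window{Low: clamp(mean - threshold), High: clamp(mean + targetThreshold)}
}

// classify reports where a single node's utilization falls relative to the window.
func classify(u float64, w window) string {
	switch {
	case u < w.Low:
		return "underutilized"
	case u > w.High:
		return "overutilized"
	default:
		return "within window"
	}
}

func main() {
	cpu := []float64{20, 50, 80} // per-node CPU utilization in percent
	w := deviationWindow(cpu, 10, 10)
	for _, u := range cpu {
		fmt.Printf("%.0f%% -> %s\n", u, classify(u, w))
	}
}
```

With a mean of 50 and both deviations set to 10, the window is 40 to 60, so the three nodes fall below, inside, and above it respectively.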

-**NOTE:** Node resource consumption is determined by the requests and limits of pods, not actual usage.
+**NOTE:** By default node resource consumption is determined by the requests and limits of pods, not actual usage.
This approach is chosen in order to maintain consistency with the kube-scheduler, which follows the same
design for scheduling pods onto nodes. This means that resource usage as reported by Kubelet (or commands
like `kubectl top`) may differ from the calculated consumption, due to these components reporting
-actual usage metrics. Implementing metrics-based descheduling is currently TODO for the project.
+actual usage metrics. Metrics-based descheduling can be enabled by setting `metricsUtilization.metricsServer` field.
+In order to have the plugin consume the metrics the metric collector needs to be configured as well.
+See `metricsCollector` field at [Top Level configuration](#top-level-configuration) for available options.

**Parameters:**

@@ -292,6 +298,9 @@ actual usage metrics. Implementing metrics-based descheduling is currently TODO
|`targetThresholds`|map(string:int)|
|`numberOfNodes`|int|
|`evictableNamespaces`|(see [namespace filtering](#namespace-filtering))|
+|`metricsUtilization`|object|
+|`metricsUtilization.metricsServer`|bool|


**Example:**

@@ -311,6 +320,8 @@ profiles:
"cpu" : 50
"memory": 50
"pods": 50
+metricsUtilization:
+  metricsServer: true
plugins:
balance:
enabled:
3 changes: 3 additions & 0 deletions cmd/descheduler/app/options/options.go
@@ -22,6 +22,7 @@ import (
"time"

"github.com/spf13/pflag"

metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
apiserver "k8s.io/apiserver/pkg/server"
apiserveroptions "k8s.io/apiserver/pkg/server/options"
@@ -33,6 +34,7 @@ import (
componentbaseoptions "k8s.io/component-base/config/options"
"k8s.io/component-base/featuregate"
"k8s.io/klog/v2"
+metricsclient "k8s.io/metrics/pkg/client/clientset/versioned"

"sigs.k8s.io/descheduler/pkg/apis/componentconfig"
"sigs.k8s.io/descheduler/pkg/apis/componentconfig/v1alpha1"
@@ -51,6 +53,7 @@ type DeschedulerServer struct {

Client clientset.Interface
EventClient clientset.Interface
+MetricsClient metricsclient.Interface
SecureServing *apiserveroptions.SecureServingOptionsWithLoopback
SecureServingInfo *apiserver.SecureServingInfo
DisableMetrics bool
9 changes: 5 additions & 4 deletions go.mod
@@ -13,14 +13,15 @@ require (
go.opentelemetry.io/otel/sdk v1.28.0
go.opentelemetry.io/otel/trace v1.28.0
google.golang.org/grpc v1.65.0
-k8s.io/api v0.31.0
-k8s.io/apimachinery v0.31.0
+k8s.io/api v0.31.2
+k8s.io/apimachinery v0.31.2
k8s.io/apiserver v0.31.0
-k8s.io/client-go v0.31.0
-k8s.io/code-generator v0.31.0
+k8s.io/client-go v0.31.2
+k8s.io/code-generator v0.31.2
k8s.io/component-base v0.31.0
k8s.io/component-helpers v0.31.0
k8s.io/klog/v2 v2.130.1
+k8s.io/metrics v0.31.2
k8s.io/utils v0.0.0-20240711033017-18e509b52bc8
kubevirt.io/api v1.3.0
kubevirt.io/client-go v1.3.0
18 changes: 10 additions & 8 deletions go.sum
@@ -639,20 +639,20 @@ gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
honnef.co/go/tools v0.0.0-20190102054323-c2f93a96b099/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4=
honnef.co/go/tools v0.0.0-20190523083050-ea95bdfd59fc/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4=
k8s.io/api v0.23.3/go.mod h1:w258XdGyvCmnBj/vGzQMj6kzdufJZVUwEM1U2fRJwSQ=
-k8s.io/api v0.31.0 h1:b9LiSjR2ym/SzTOlfMHm1tr7/21aD7fSkqgD/CVJBCo=
-k8s.io/api v0.31.0/go.mod h1:0YiFF+JfFxMM6+1hQei8FY8M7s1Mth+z/q7eF1aJkTE=
+k8s.io/api v0.31.2 h1:3wLBbL5Uom/8Zy98GRPXpJ254nEFpl+hwndmk9RwmL0=
+k8s.io/api v0.31.2/go.mod h1:bWmGvrGPssSK1ljmLzd3pwCQ9MgoTsRCuK35u6SygUk=
k8s.io/apiextensions-apiserver v0.30.0 h1:jcZFKMqnICJfRxTgnC4E+Hpcq8UEhT8B2lhBcQ+6uAs=
k8s.io/apiextensions-apiserver v0.30.0/go.mod h1:N9ogQFGcrbWqAY9p2mUAL5mGxsLqwgtUce127VtRX5Y=
k8s.io/apimachinery v0.23.3/go.mod h1:BEuFMMBaIbcOqVIJqNZJXGFTP4W6AycEpb5+m/97hrM=
-k8s.io/apimachinery v0.31.0 h1:m9jOiSr3FoSSL5WO9bjm1n6B9KROYYgNZOb4tyZ1lBc=
-k8s.io/apimachinery v0.31.0/go.mod h1:rsPdaZJfTfLsNJSQzNHQvYoTmxhoOEofxtOsF3rtsMo=
+k8s.io/apimachinery v0.31.2 h1:i4vUt2hPK56W6mlT7Ry+AO8eEsyxMD1U44NR22CLTYw=
+k8s.io/apimachinery v0.31.2/go.mod h1:rsPdaZJfTfLsNJSQzNHQvYoTmxhoOEofxtOsF3rtsMo=
k8s.io/apiserver v0.31.0 h1:p+2dgJjy+bk+B1Csz+mc2wl5gHwvNkC9QJV+w55LVrY=
k8s.io/apiserver v0.31.0/go.mod h1:KI9ox5Yu902iBnnyMmy7ajonhKnkeZYJhTZ/YI+WEMk=
-k8s.io/client-go v0.31.0 h1:QqEJzNjbN2Yv1H79SsS+SWnXkBgVu4Pj3CJQgbx0gI8=
-k8s.io/client-go v0.31.0/go.mod h1:Y9wvC76g4fLjmU0BA+rV+h2cncoadjvjjkkIGoTLcGU=
+k8s.io/client-go v0.31.2 h1:Y2F4dxU5d3AQj+ybwSMqQnpZH9F30//1ObxOKlTI9yc=
+k8s.io/client-go v0.31.2/go.mod h1:NPa74jSVR/+eez2dFsEIHNa+3o09vtNaWwWwb1qSxSs=
k8s.io/code-generator v0.23.3/go.mod h1:S0Q1JVA+kSzTI1oUvbKAxZY/DYbA/ZUb4Uknog12ETk=
-k8s.io/code-generator v0.31.0 h1:w607nrMi1KeDKB3/F/J4lIoOgAwc+gV9ZKew4XRfMp8=
-k8s.io/code-generator v0.31.0/go.mod h1:84y4w3es8rOJOUUP1rLsIiGlO1JuEaPFXQPA9e/K6U0=
+k8s.io/code-generator v0.31.2 h1:xLWxG0HEpMSHfcM//3u3Ro2Hmc6AyyLINQS//Z2GEOI=
+k8s.io/code-generator v0.31.2/go.mod h1:eEQHXgBU/m7LDaToDoiz3t97dUUVyOblQdwOr8rivqc=
k8s.io/component-base v0.31.0 h1:/KIzGM5EvPNQcYgwq5NwoQBaOlVFrghoVGr8lG6vNRs=
k8s.io/component-base v0.31.0/go.mod h1:TYVuzI1QmN4L5ItVdMSXKvH7/DtvIuas5/mm8YT3rTo=
k8s.io/component-helpers v0.31.0 h1:jyRUKA+GX+q19o81k4x94imjNICn+e6Gzi6T89va1/A=
@@ -673,6 +673,8 @@ k8s.io/kms v0.31.0 h1:KchILPfB1ZE+ka7223mpU5zeFNkmb45jl7RHnlImUaI=
k8s.io/kms v0.31.0/go.mod h1:OZKwl1fan3n3N5FFxnW5C4V3ygrah/3YXeJWS3O6+94=
k8s.io/kube-openapi v0.0.0-20240430033511-f0e62f92d13f h1:0LQagt0gDpKqvIkAMPaRGcXawNMouPECM1+F9BVxEaM=
k8s.io/kube-openapi v0.0.0-20240430033511-f0e62f92d13f/go.mod h1:S9tOR0FxgyusSNR+MboCuiDpVWkAifZvaYI1Q2ubgro=
+k8s.io/metrics v0.31.2 h1:sQhujR9m3HN/Nu/0fTfTscjnswQl0qkQAodEdGBS0N4=
+k8s.io/metrics v0.31.2/go.mod h1:QqqyReApEWO1UEgXOSXiHCQod6yTxYctbAAQBWZkboU=
k8s.io/utils v0.0.0-20211116205334-6203023598ed/go.mod h1:jPW/WVKK9YHAvNhRxK0md/EJ228hCsBRufyofKtW8HA=
k8s.io/utils v0.0.0-20230726121419-3b25d923346b/go.mod h1:OLgZIPagt7ERELqWJFomSt595RzquPNLL48iOWgYOg0=
k8s.io/utils v0.0.0-20240711033017-18e509b52bc8 h1:pUdcCO1Lk/tbT5ztQWOBi5HBgbBP1J8+AsQnQCKsi8A=
3 changes: 3 additions & 0 deletions kubernetes/base/rbac.yaml
@@ -32,6 +32,9 @@ rules:
resources: ["leases"]
resourceNames: ["descheduler"]
verbs: ["get", "patch", "delete"]
+- apiGroups: ["metrics.k8s.io"]
+  resources: ["nodes", "pods"]
+  verbs: ["get", "list"]
---
apiVersion: v1
kind: ServiceAccount
10 changes: 10 additions & 0 deletions pkg/api/types.go
@@ -41,6 +41,9 @@ type DeschedulerPolicy struct {

// MaxNoOfPodsToTotal restricts maximum of pods to be evicted total.
MaxNoOfPodsToEvictTotal *uint

+// MetricsCollector configures collection of metrics about actual resource utilization
+MetricsCollector MetricsCollector
}

// Namespaces carries a list of included/excluded namespaces
@@ -84,3 +87,10 @@ type PluginSet struct {
Enabled []string
Disabled []string
}

+// MetricsCollector configures collection of metrics about actual resource utilization
+type MetricsCollector struct {
+	// Enabled metrics collection from kubernetes metrics.
+	// Later, the collection can be extended to other providers.
+	Enabled bool
+}
10 changes: 10 additions & 0 deletions pkg/api/v1alpha2/types.go
@@ -40,6 +40,9 @@ type DeschedulerPolicy struct {

// MaxNoOfPodsToTotal restricts maximum of pods to be evicted total.
MaxNoOfPodsToEvictTotal *uint `json:"maxNoOfPodsToEvictTotal,omitempty"`

+// MetricsCollector configures collection of metrics for actual resource utilization
+MetricsCollector MetricsCollector `json:"metricsCollector,omitempty"`
}

type DeschedulerProfile struct {
@@ -66,3 +69,10 @@ type PluginSet struct {
Enabled []string `json:"enabled"`
Disabled []string `json:"disabled"`
}

+// MetricsCollector configures collection of metrics about actual resource utilization
+type MetricsCollector struct {
+	// Enabled metrics collection from kubernetes metrics.
+	// Later, the collection can be extended to other providers.
+	Enabled bool `json:"enabled,omitempty"`
+}
36 changes: 36 additions & 0 deletions pkg/api/v1alpha2/zz_generated.conversion.go

Some generated files are not rendered by default.

17 changes: 17 additions & 0 deletions pkg/api/v1alpha2/zz_generated.deepcopy.go

Some generated files are not rendered by default.

17 changes: 17 additions & 0 deletions pkg/api/zz_generated.deepcopy.go

Some generated files are not rendered by default.

27 changes: 23 additions & 4 deletions pkg/descheduler/client/client.go
@@ -19,16 +19,16 @@ package client
import (
"fmt"

-clientset "k8s.io/client-go/kubernetes"
-componentbaseconfig "k8s.io/component-base/config"
-
// Ensure to load all auth plugins.
+clientset "k8s.io/client-go/kubernetes"
_ "k8s.io/client-go/plugin/pkg/client/auth"
"k8s.io/client-go/rest"
"k8s.io/client-go/tools/clientcmd"
+componentbaseconfig "k8s.io/component-base/config"
+metricsclient "k8s.io/metrics/pkg/client/clientset/versioned"
)

-func CreateClient(clientConnection componentbaseconfig.ClientConnectionConfiguration, userAgt string) (clientset.Interface, error) {
+func createConfig(clientConnection componentbaseconfig.ClientConnectionConfiguration, userAgt string) (*rest.Config, error) {
var cfg *rest.Config
if len(clientConnection.Kubeconfig) != 0 {
master, err := GetMasterFromKubeconfig(clientConnection.Kubeconfig)
@@ -56,9 +56,28 @@ func CreateClient(clientConnection componentbaseconfig.ClientConnectionConfigura
cfg = rest.AddUserAgent(cfg, userAgt)
}

+	return cfg, nil
+}
+
+func CreateClient(clientConnection componentbaseconfig.ClientConnectionConfiguration, userAgt string) (clientset.Interface, error) {
+	cfg, err := createConfig(clientConnection, userAgt)
+	if err != nil {
+		return nil, fmt.Errorf("unable to create config: %v", err)
+	}
+
	return clientset.NewForConfig(cfg)
}
+
+func CreateMetricsClient(clientConnection componentbaseconfig.ClientConnectionConfiguration, userAgt string) (metricsclient.Interface, error) {
+	cfg, err := createConfig(clientConnection, userAgt)
+	if err != nil {
+		return nil, fmt.Errorf("unable to create config: %v", err)
+	}
+
+	// Create the metrics clientset to access the metrics.k8s.io API
+	return metricsclient.NewForConfig(cfg)
+}

func GetMasterFromKubeconfig(filename string) (string, error) {
config, err := clientcmd.LoadFromFile(filename)
if err != nil {