From 5e8b990e3ca25bda813e031975e9e079d6b7faaa Mon Sep 17 00:00:00 2001 From: Richa Banker Date: Fri, 26 May 2023 18:20:52 -0700 Subject: [PATCH] Update KEP for Kubelet Resource Metrics endpoint GA release Co-authored-by: David Ashpole cleanup --- keps/prod-readiness/sig-node/727.yaml | 3 + .../727-resource-metrics-endpoint/README.md | 344 +++++++++++++++--- .../727-resource-metrics-endpoint/kep.yaml | 23 +- 3 files changed, 312 insertions(+), 58 deletions(-) create mode 100644 keps/prod-readiness/sig-node/727.yaml diff --git a/keps/prod-readiness/sig-node/727.yaml b/keps/prod-readiness/sig-node/727.yaml new file mode 100644 index 000000000000..8fc1b1360b98 --- /dev/null +++ b/keps/prod-readiness/sig-node/727.yaml @@ -0,0 +1,3 @@ +kep-number: 727 +stable: + approver: "wojtek-t" \ No newline at end of file diff --git a/keps/sig-node/727-resource-metrics-endpoint/README.md b/keps/sig-node/727-resource-metrics-endpoint/README.md index e533b06016a6..0847917efc2b 100644 --- a/keps/sig-node/727-resource-metrics-endpoint/README.md +++ b/keps/sig-node/727-resource-metrics-endpoint/README.md @@ -3,33 +3,99 @@ ## Table of Contents +- [Release Signoff Checklist](#release-signoff-checklist) - [Summary](#summary) -- [Background](#background) - [Motivation](#motivation) + - [Background](#background) - [Goals](#goals) - [Non-Goals](#non-goals) - [Proposal](#proposal) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) - [API](#api) -- [Future Improvements](#future-improvements) -- [Benchmarking](#benchmarking) - - [Round 1](#round-1) - - [Methods](#methods) - - [Results](#results) - - [Round 2](#round-2) - - [Methods](#methods-1) - - [Results](#results-1) -- [Alternatives Considered](#alternatives-considered) - - [gRPC API](#grpc-api) + - [Future Improvements](#future-improvements) + - [Benchmarking](#benchmarking) + - [Round 1](#round-1) + - [Methods](#methods) + - [Results](#results) + - [Round 2](#round-2) + - [Methods](#methods-1) + - 
[Results](#results-1) - [Test Plan](#test-plan) -- [Graduation Criteria](#graduation-criteria) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) - [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives Considered](#alternatives-considered) + - [gRPC API](#grpc-api) +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. 
+ +- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) +- [x] (R) KEP approvers have approved the KEP status as `implementable` +- [x] (R) Design details are appropriately documented +- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) + - [x] e2e Tests for all Beta API Operations (endpoints) + - [x] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) + - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free +- [x] (R) Graduation criteria is in place + - [x] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) +- [x] (R) Production readiness review completed +- [ ] (R) Production readiness review approved +- [x] "Implementation History" section is up-to-date for milestone +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes + + + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website + ## Summary The Kubelet Resource Metrics Endpoint is a new kubelet metrics endpoint which serves metrics required by the cluster-level [Resource Metrics API](https://github.com/kubernetes/metrics#resource-metrics-api). 
The proposed design uses the prometheus text format, and provides the minimum required metrics for serving the [Resource Metrics API](https://github.com/kubernetes/metrics#resource-metrics-api). -## Background +## Motivation + +The Kubelet Summary API is a source of both Resource and Monitoring Metrics. Because of its dual purpose, it does a poor job of both. It provides much more information than required by the Metrics Server, as demonstrated by [kubernetes/kubernetes#68841](https://github.com/kubernetes/kubernetes/pull/68841). Additionally, we have pushed back on adding metrics to the Summary API for monitoring, such as DiskIO or tcp/udp metrics, because they are expensive to collect, and not required by all users. + +This proposal deals with the first problem, which is that the Summary API is a poor provider of Resource Metrics. It proposes a purpose-built API for supplying Resource Metrics. + +### Background The [Monitoring Architecture](https://github.com/kubernetes/design-proposals-archive/blob/master/instrumentation/monitoring_architecture.md) proposal established separate pipelines for Resource Metrics, and for Monitoring Metrics. The [Core Metrics](https://github.com/kubernetes/design-proposals-archive/blob/master/instrumentation/core-metrics-pipeline.md#core-metrics-in-kubelet) proposal describes the set of metrics that we consider core, and their uses. Note that the term “core” is overloaded, and this document will refer to these as Resource Metrics, since they are for first class kubernetes resources and are served by the [Resource Metrics API](https://github.com/kubernetes/metrics#resource-metrics-api) at the cluster-level. 
@@ -49,12 +115,6 @@ The Kubelet’s [JSON Summary API](https://github.com/kubernetes/kubernetes/blob [GRPC](https://grpc.io/) is commonly used for interfaces between components in kubernetes, such as the [Container Runtime Interface](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/cri/runtime/v1alpha2/api.proto). GRPC uses [protocol-buffers](https://developers.google.com/protocol-buffers/docs/overview) (protobuf) for serialization and deserialization, which is more performant than other formats. -## Motivation - -The Kubelet Summary API is a source of both Resource and Monitoring Metrics. Because of it’s dual purpose, it does a poor job of both. It provides much more information than required by the Metrics Server, as demonstrated by [kubernetes/kubernetes#68841](https://github.com/kubernetes/kubernetes/pull/68841). Additionally, we have pushed back on adding metrics to the Summary API for monitoring, such as DiskIO or tcp/udp metrics, because they are expensive to collect, and not required by all users. - -This proposal deals with the first problem, which is that the Summary API is a poor provider of Resource Metrics. It proposes a purpose-built API for supplying Resource Metrics. - ### Goals * [Primary] Provide the minimum set of metrics required to serve the Resource Metrics API @@ -75,6 +135,9 @@ The kubelet will expose an endpoint at `/metrics/resource` in prometheus text ex The metrics in this endpoint will make use of the [Kubernetes Metrics Stability framework](https://github.com/kubernetes/enhancements/blob/master/keps/sig-instrumentation/1209-metrics-stability/kubernetes-control-plane-metrics-stability.md) for stability and deprecation policies. 
+### Risks and Mitigations + +## Design Details ### API @@ -106,21 +169,21 @@ Labels are named in accordance with the [kubernetes instrumentation guidelines]( Example implementation: https://github.com/kubernetes/kubernetes/compare/master...dashpole:prometheus_core_metrics -## Future Improvements +### Future Improvements [OpenMetrics](https://openmetrics.io/) is an upcoming prometheus-based standard which has support for protocol buffers. By using this format when it becomes available, we can further improve the efficiency of the Resource Metrics Pipeline, while maintaining compatibility with other monitoring pipelines. -## Benchmarking +### Benchmarking -### Round 1 +#### Round 1 This experiment compares the current JSON Summary API to prometheus and GRPC at 1s and 30s scrape intervals. Prometheus uses basic text parsing, and grpc uses a basic `Get()` API. -#### Methods +##### Methods The setup has 10 nodes, 500 pods, and 6500 containers (running pause). Nodes have 1 CPU core, and 3.75Gb memory. The same cluster was used for all benchmarks for consistency, with a different Metrics Server running. The values below are the maximum values reported during a 10 minute period. -#### Results +##### Results We can see that GRPC has the lowest CPU usage of all formats tested, and is an order-of-magnitude improvement over the current JSON Summary API. Memory Usage for both GRPC and Prometheus are similarly lower than the JSON Summary API. @@ -128,19 +191,19 @@ We can see that GRPC has the lowest CPU usage of all formats tested, and is an o -### Round 2 +#### Round 2 After learning that the prometheus server achieves better performance with caching, I performed an additional round of tests. These used a metrics-server which caches metric descriptors it has parsed before, and tested with larger numbers of container metrics. This experiment compares basic prometheus, optimized prometheus parsing and GRPC at 1s scrape intervals with higher numbers of container metrics. 
"Unoptimized Prometheus" uses basic text parsing, "Prometheus w/ Caching" borrows [caching logic from the prometheus server](https://github.com/prometheus/prometheus/blob/master/scrape/scrape.go#L991) to avoid re-parsing metric descriptors it has already parsed and grpc uses a basic `Get()` API. -#### Methods +##### Methods The setup has 10 nodes, and up to 40,000 containers (running pause). Nodes have 2 CPU core, and 7.5Gb memory. The same cluster was used for all benchmarks for consistency, with a different Metrics Server running. The values below are the maximum values reported during a 10 minute period. This experiment "fakes" large numbers of containers by having the kubelet return 100 container metrics for each actual container run on the node. -#### Results +##### Results Both gRPC and the optimized prometheus were able to scale to 40k containers. The gRPC implementation was more efficient by a factor of approx. 3. @@ -148,6 +211,201 @@ Both gRPC and the optimized prometheus were able to scale to 40k containers. Th +### Test Plan + +[X] I/we understand the owners of the involved components may require updates to +existing tests to make this code solid enough prior to committing the changes necessary +to implement this enhancement. + +##### Prerequisite testing updates + + +##### Unit tests + +- ``: `` - `` + +##### Integration tests + +- : + +##### e2e tests + +Test the new endpoint with a node-e2e test similar to the current summary API test. 
+Testgrid: https://k8s-testgrid.appspot.com/sig-node-kubelet#node-kubelet-features-master&include-filter-by-regex=ResourceMetricsAPI + +### Graduation Criteria + +Alpha: + +- [X] Implement the kubelet resource metrics endpoint as described above + +Beta: + +- [X] Modify the metrics server to consume the kubelet resource metrics endpoint 3 releases after it is added to the kubelet + +GA: + +- [X] Add [node-e2e test](https://github.com/kubernetes/kubernetes/pull/116897/files#diff-3859a7587ac4b3d1e162a2360b1fd2d3e88d4589be9b0bf19029fa7489294796R59-R70) + +### Upgrade / Downgrade Strategy + +The kubelet can be upgraded or downgraded normally with respect to this feature. Users of the metrics endpoint, such as the metrics server, should use other kubelet metrics endpoints (such as the summary api) before downgrading. + +### Version Skew Strategy + +This feature affects only the kubelet - in that it will expose the resource metrics for kubelet in a new endpoint, so there is no issue with version skew with other components. + +## Production Readiness Review Questionnaire + +### Feature Enablement and Rollback + +###### How can this feature be enabled / disabled in a live cluster? + +- [ ] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: + - Components depending on the feature gate: +- [x] Other + - Describe the mechanism: This feature exposes the /metrics/resource endpoint for kubelet, with all metrics annotated as STABLE. **Note:** Because this feature was built before the PRR process was established, it unfortunately does not adhere to the best practices of feature enablement/disablement + - Will enabling / disabling the feature require downtime of the control + plane? No + - Will enabling / disabling the feature require downtime or reprovisioning + of a node? No + +###### Does enabling the feature change any default behavior? 
+ +It will expose the /metrics/resource endpoint for kubelet by default + +###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? + +No, this feature cannot be disabled once it has been enabled, since there is no feature flag for it. To roll back, one will have to downgrade the kubernetes version. Note: This feature was added in v1.14, so to disable it, one would need to switch back to a version older than v1.14 + +###### What happens if we reenable the feature if it was previously rolled back? + +/metrics/resource endpoint for kubelet will become available + +###### Are there any tests for feature enablement/disablement? + +Since there is no feature gate involved for this, there are no feature enablement/disablement tests + +### Rollout, Upgrade and Rollback Planning + +###### How can a rollout or rollback fail? Can it impact already running workloads? + +A rollback can impact running workloads if clients, such as the metrics server, are relying on metrics provided by the endpoint. The rollback could break cluster functions, such as HPA, if the metrics were no longer available. + +###### What specific metrics should inform a rollback? + +The following metrics exposed by the /metrics/resource endpoint could be used: +- node_memory_working_set_bytes +- pod_memory_working_set_bytes + +We could compute node_memory_working_set_bytes - sum(pod_memory_working_set_bytes) to know if there's a memory leak. + +###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? + +No, because the feature has been enabled (with no way to disable it) since v1.14. + +###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? + +No + +### Monitoring Requirements + +###### How can an operator determine if the feature is in use by workloads? 
+ +By checking kubelet's /metrics/resource endpoint + +###### How can someone using this feature know that it is working for their instance? + +- [ ] Events + - Event Reason: +- [ ] API .status + - Condition name: + - Other field: +- [X] Other (treat as last resort) + - Details: /metrics/resource endpoint for kubelet should show resource metrics + +###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + +This feature introduces a metrics endpoint that can be used to establish SLOs + +###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + +This feature introduces a metrics endpoint that can be used to determine the health of the kubelet. + +###### Are there any missing metrics that would be useful to have to improve observability of this feature? + +No + +### Dependencies + +###### Does this feature depend on any specific services running in the cluster? + +Kubelet + +### Scalability + +###### Will enabling / using this feature result in any new API calls? + +No + +###### Will enabling / using this feature result in introducing new API types? + +No + +###### Will enabling / using this feature result in any new calls to the cloud provider? + +No + +###### Will enabling / using this feature result in increasing size or count of the existing API objects? + +No + +###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? + +No + +###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? + +No. In fact, CPU usage is reduced compared to the Summary API, which was previously used by the metrics server. + +###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)? + +No + +### Troubleshooting + +###### How does this feature react if the API server and/or etcd is unavailable? 
+ +No impact + +###### What are other known failure modes? + +/metrics/resource endpoint is not available + +###### What steps should be taken if SLOs are not being met to determine the problem? + +Memory leaks should be checked by looking at node_memory_working_set_bytes - sum(pod_memory_working_set_bytes) +If the problem is severe, kubernetes version should be downgraded so that the /metrics/resource endpoint is not exposed for kubelet. Keep in mind, users of these metrics should use other metrics endpoints (such as the summary api) before downgrading. + +## Implementation History + +- 2019-01-24: Initial KEP published. +- 2019-01-29: Presentation to Sig-Node +- 2019-02-04: KEP gets LGTM and Approval +- 2019-02-07: Presentation to Sig-Instrumentation +- 2020-01-14: [1.18] Endpoint copied from /metrics/resource/v1alpha1 to /metrics/resource, and adopting the metrics stability framework: https://github.com/kubernetes/kubernetes/pull/86282 +- 2020-09-01: [1.20] /metrics/resource/v1alpha1 removed: https://github.com/kubernetes/kubernetes/pull/94272 +- 2021-06-28: Use kubelet's /metrics/resource endpoint in metrics-server: https://github.com/kubernetes-sigs/metrics-server/pull/777 +- 2023-08-23: [1.29] GA graduation, non conformance test added https://github.com/kubernetes/kubernetes/pull/116897 +- 2023-09-08: [1.29] Promoted test to conformance test https://github.com/kubernetes/kubernetes/pull/120473 + +## Drawbacks + + + ## Alternatives Considered ### gRPC API @@ -189,30 +447,10 @@ service ResourceMetrics { } ``` -### Test Plan - -Test the new endpoint with a node-e2e test similar to the current summary API test. 
-Testgrid: https://k8s-testgrid.appspot.com/sig-node-kubelet#node-kubelet-features-master&include-filter-by-regex=ResourceMetricsAPI - -## Graduation Criteria - -Alpha: - -- [X] Implement the kubelet resource metrics endpoint as described above - -Beta: - -- [ ] Modify the metrics server to consume the kubelet resource metrics endpoint 3 releases after it is added to the kubelet - -GA: - -- [ ] Add node-e2e test to the node conformance tests +## Infrastructure Needed (Optional) -## Implementation History - -- 2019-01-24: Initial KEP published. -- 2019-01-29: Presentation to Sig-Node -- 2019-02-04: KEP gets LGTM and Approval -- 2019-02-07: Presentation to Sig-Instrumentation -- 2020-01-14: [1.18] Endpoint copied from /metrics/resource/v1alpha1 to /metrics/resource, and adopting the metrics stability framework: https://github.com/kubernetes/kubernetes/pull/86282 -- 2020-09-01: [1.20] /metrics/resource/v1alpha1 removed: https://github.com/kubernetes/kubernetes/pull/94272 + diff --git a/keps/sig-node/727-resource-metrics-endpoint/kep.yaml b/keps/sig-node/727-resource-metrics-endpoint/kep.yaml index d98715e30426..ba75a12edc71 100644 --- a/keps/sig-node/727-resource-metrics-endpoint/kep.yaml +++ b/keps/sig-node/727-resource-metrics-endpoint/kep.yaml @@ -2,18 +2,31 @@ title: Kubelet Resource Metrics Endpoint kep-number: 727 authors: - "@dashpole" + - "@richabanker" owning-sig: sig-node participating-sigs: - sig-instrumentation +status: implementable +creation-date: 2019-01-24 +last-updated: 2023-09-08 reviewers: - DirectXMan12 - tallclair approvers: - dchen1107 - brancz -creation-date: 2019-01-24 -last-updated: 2019-02-21 -status: implementable +replaces: + - none + +# The target maturity stage in the current dev cycle for this KEP. +stage: "stable" + +# The most recent milestone for which work toward delivery of this KEP has been +# done. This can be the current (upcoming) milestone, if it is being actively +# worked on. 
+latest-milestone: "1.29" -latest-milestone: "0.0" -stage: "alpha" +# The milestone at which this feature was, or is targeted to be, at each stage. +milestone: + alpha: "v1.14" + stable: "v1.29"
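
Reviewer note: the PRR answers in this patch suggest computing `node_memory_working_set_bytes - sum(pod_memory_working_set_bytes)` as a rollback and troubleshooting signal. Below is a minimal sketch of that arithmetic over a scrape of `/metrics/resource` in prometheus text format. The function names (`parse_samples`, `unaccounted_node_memory`) and the sample payload, including its values, labels, and timestamps, are hypothetical and only for illustration. The parser is deliberately simplified and is not a full implementation of the exposition format.

```python
# Hypothetical scrape output; values, labels and timestamps are made up.
SAMPLE = """\
# HELP node_memory_working_set_bytes Current working set of the node in bytes
# TYPE node_memory_working_set_bytes gauge
node_memory_working_set_bytes 1000 1700000000000
# TYPE pod_memory_working_set_bytes gauge
pod_memory_working_set_bytes{namespace="default",pod="a"} 300 1700000000000
pod_memory_working_set_bytes{namespace="kube-system",pod="b"} 200 1700000000000
"""


def parse_samples(text):
    """Return (metric_name, value) pairs from prometheus text format.

    Simplified on purpose: comment lines are skipped, labels are discarded,
    and an optional trailing timestamp is ignored.
    """
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            # Metric name precedes the label set; the value follows the last "}".
            name = line[: line.index("{")]
            rest = line[line.rindex("}") + 1 :].split()
            value = float(rest[0])
        else:
            parts = line.split()
            name, value = parts[0], float(parts[1])
        samples.append((name, value))
    return samples


def unaccounted_node_memory(text):
    """Node working set minus the sum of all pod working sets, in bytes."""
    node_bytes = 0.0
    pod_bytes = 0.0
    for name, value in parse_samples(text):
        if name == "node_memory_working_set_bytes":
            node_bytes = value
        elif name == "pod_memory_working_set_bytes":
            pod_bytes += value
    return node_bytes - pod_bytes


if __name__ == "__main__":
    print(unaccounted_node_memory(SAMPLE))
```

A single reading of this difference is not very informative, because system daemons and the kubelet itself also consume node memory. A value that keeps growing across scrapes while pod usage stays flat is the pattern the KEP text appears to have in mind.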