From 920a68b536190b88f7969ce8ef6bafbe9fd190c5 Mon Sep 17 00:00:00 2001 From: pegasas Date: Tue, 24 Oct 2023 14:54:07 +0800 Subject: [PATCH 01/10] Document snag with stringData and server-side apply --- .../tasks/configmap-secret/managing-secret-using-kubectl.md | 4 ++++ .../tasks/configmap-secret/managing-secret-using-kustomize.md | 4 ++++ 2 files changed, 8 insertions(+) diff --git a/content/en/docs/tasks/configmap-secret/managing-secret-using-kubectl.md b/content/en/docs/tasks/configmap-secret/managing-secret-using-kubectl.md index 51f66d44be347..36ce3b5875c4c 100644 --- a/content/en/docs/tasks/configmap-secret/managing-secret-using-kubectl.md +++ b/content/en/docs/tasks/configmap-secret/managing-secret-using-kubectl.md @@ -40,6 +40,10 @@ You must use single quotes `''` to escape special characters such as `$`, `\`, `*`, `=`, and `!` in your strings. If you don't, your shell will interpret these characters. +{{< note >}} +The `stringData` field for a Secret does not work well with server-side apply. +{{< /note >}} + ### Use source files 1. Store the credentials in files: diff --git a/content/en/docs/tasks/configmap-secret/managing-secret-using-kustomize.md b/content/en/docs/tasks/configmap-secret/managing-secret-using-kustomize.md index ae2109f803d9d..364b461614381 100644 --- a/content/en/docs/tasks/configmap-secret/managing-secret-using-kustomize.md +++ b/content/en/docs/tasks/configmap-secret/managing-secret-using-kustomize.md @@ -24,6 +24,10 @@ You can generate a Secret by defining a `secretGenerator` in a literal values. For example, the following instructions create a Kustomization file for the username `admin` and the password `1f2d1e2e67df`. +{{< note >}} +The `stringData` field for a Secret does not work well with server-side apply. 
+{{< /note >}} + ### Create the Kustomization file {{< tabs name="Secret data" >}} From 0004c99d6f4cc069b79cdfbc1e87396c59e2d248 Mon Sep 17 00:00:00 2001 From: Matt Grasberger Date: Tue, 7 Nov 2023 20:32:06 +0100 Subject: [PATCH 02/10] Update garbage-collection.md --- content/en/docs/concepts/architecture/garbage-collection.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/concepts/architecture/garbage-collection.md b/content/en/docs/concepts/architecture/garbage-collection.md index 947ad515fc90b..b7173405535fc 100644 --- a/content/en/docs/concepts/architecture/garbage-collection.md +++ b/content/en/docs/concepts/architecture/garbage-collection.md @@ -111,7 +111,7 @@ to override this behaviour, see [Delete owner objects and orphan dependents](/do ## Garbage collection of unused containers and images {#containers-images} The {{}} performs garbage -collection on unused images every five minutes and on unused containers every +collection on unused images every two minutes and on unused containers every minute. You should avoid using external garbage collection tools, as these can break the kubelet behavior and remove containers that should exist. From db0d363874795379f44e43bcfe73a407fedfb229 Mon Sep 17 00:00:00 2001 From: Syafiq Kamarul Azman Date: Wed, 8 Nov 2023 06:56:44 +0400 Subject: [PATCH 03/10] Fix minor formatting issue in _index.md In the operations table, the row that describes the "top" command was displayed incorrectly in context of the other rows in the table. This updates the docs to fix that display issue. 
Revert changes for [ja] [ko] and [zh-cn] --- content/en/docs/reference/kubectl/_index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/reference/kubectl/_index.md b/content/en/docs/reference/kubectl/_index.md index 1f3c6b73f17de..f04d5edcf9d98 100644 --- a/content/en/docs/reference/kubectl/_index.md +++ b/content/en/docs/reference/kubectl/_index.md @@ -171,7 +171,7 @@ Operation | Syntax | Description `scale` | kubectl scale (-f FILENAME | TYPE NAME | TYPE/NAME) --replicas=COUNT [--resource-version=version] [--current-replicas=count] [flags] | Update the size of the specified replication controller. `set` | `kubectl set SUBCOMMAND [options]` | Configure application resources. `taint` | `kubectl taint NODE NAME KEY_1=VAL_1:TAINT_EFFECT_1 ... KEY_N=VAL_N:TAINT_EFFECT_N [options]` | Update the taints on one or more nodes. -`top` | `kubectl top (POD | NODE) [flags] [options]` | Display Resource (CPU/Memory/Storage) usage of pod or node. +`top` | kubectl top (POD | NODE) [flags] [options] | Display Resource (CPU/Memory/Storage) usage of pod or node. `uncordon` | `kubectl uncordon NODE [options]` | Mark node as schedulable. `version` | `kubectl version [--client] [flags]` | Display the Kubernetes version running on the client and server. `wait` | kubectl wait ([-f FILENAME] | resource.group/resource.name | resource.group [(-l label | --all)]) [--for=delete|--for condition=available] [options] | Experimental: Wait for a specific condition on one or many resources. From 13039d826c52b9ab974a6169c38aa4840709a7df Mon Sep 17 00:00:00 2001 From: Tim Bannister Date: Fri, 17 Nov 2023 19:00:24 +0000 Subject: [PATCH 04/10] Switch StatefulSet tutorial to HTTP manifest source kubectl can apply manifests direct from the web; let's teach that. 
--- .../stateful-application/basic-stateful-set.md | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/content/en/docs/tutorials/stateful-application/basic-stateful-set.md b/content/en/docs/tutorials/stateful-application/basic-stateful-set.md index 10b2c35b4403a..905f384fd0059 100644 --- a/content/en/docs/tutorials/stateful-application/basic-stateful-set.md +++ b/content/en/docs/tutorials/stateful-application/basic-stateful-set.md @@ -76,9 +76,7 @@ It creates a [headless Service](/docs/concepts/services-networking/service/#head {{% code_sample file="application/web/web.yaml" %}} -Download the example above, and save it to a file named `web.yaml` - -You will need to use two terminal windows. In the first terminal, use +You will need to use at least two terminal windows. In the first terminal, use [`kubectl get`](/docs/reference/generated/kubectl/kubectl-commands/#get) to watch the creation of the StatefulSet's Pods. @@ -88,10 +86,10 @@ kubectl get pods -w -l app=nginx In the second terminal, use [`kubectl apply`](/docs/reference/generated/kubectl/kubectl-commands/#apply) to create the -headless Service and StatefulSet defined in `web.yaml`. +headless Service and StatefulSet: ```shell -kubectl apply -f web.yaml +kubectl apply -f https://k8s.io/examples/application/web/web.yaml ``` ``` service/nginx created @@ -919,7 +917,7 @@ you deleted the `nginx` Service (which you should not have), you will see an error indicating that the Service already exists. ```shell -kubectl apply -f web.yaml +kubectl apply -f https://k8s.io/examples/application/web/web.yaml ``` ``` statefulset.apps/web created @@ -1038,7 +1036,7 @@ service "nginx" deleted Recreate the StatefulSet and headless Service one more time: ```shell -kubectl apply -f web.yaml +kubectl apply -f https://k8s.io/examples/application/web/web.yaml ``` ``` @@ -1104,8 +1102,6 @@ Pod. This option only affects the behavior for scaling operations. 
Updates are n {{% code_sample file="application/web/web-parallel.yaml" %}} -Download the example above, and save it to a file named `web-parallel.yaml` - This manifest is identical to the one you downloaded above except that the `.spec.podManagementPolicy` of the `web` StatefulSet is set to `Parallel`. @@ -1118,7 +1114,7 @@ kubectl get pod -l app=nginx -w In another terminal, create the StatefulSet and Service in the manifest: ```shell -kubectl apply -f web-parallel.yaml +kubectl apply -f https://k8s.io/examples/application/web/web-parallel.yaml ``` ``` service/nginx created From 73161087f46efc196f94fce476aa5ae39144ab92 Mon Sep 17 00:00:00 2001 From: xin gu <418294249@qq.com> Date: Thu, 7 Dec 2023 21:54:24 +0800 Subject: [PATCH 05/10] sync change-package-repository /tasks/debug/debug-cluster/_index verify-kubectl --- .../kubeadm/change-package-repository.md | 2 +- .../zh-cn/docs/tasks/debug/debug-cluster/_index.md | 6 +++--- .../docs/tasks/tools/included/verify-kubectl.md | 14 ++++++++------ 3 files changed, 12 insertions(+), 10 deletions(-) diff --git a/content/zh-cn/docs/tasks/administer-cluster/kubeadm/change-package-repository.md b/content/zh-cn/docs/tasks/administer-cluster/kubeadm/change-package-repository.md index 62e7f5238bf38..e1e7cbd421ce6 100644 --- a/content/zh-cn/docs/tasks/administer-cluster/kubeadm/change-package-repository.md +++ b/content/zh-cn/docs/tasks/administer-cluster/kubeadm/change-package-repository.md @@ -223,7 +223,7 @@ version. 
在从一个 Kubernetes 小版本升级到另一个版本时,应执行此步骤以获取所需 Kubernetes 小版本的软件包访问权限。 -{{< tabs name="k8s_install_versions" >}} +{{< tabs name="k8s_upgrade_versions" >}} {{% tab name="Ubuntu、Debian 或 HypriotOS" %}} ### 故障原因 {#contributing-causes} @@ -329,7 +329,7 @@ This is an incomplete list of things that could go wrong, and how to adjust your - API server VM shutdown or apiserver crashing - Results - unable to stop, update, or start new pods, services, replication controller - - existing pods and services should continue to work normally, unless they depend on the Kubernetes API + - existing pods and services should continue to work normally unless they depend on the Kubernetes API - API server backing storage lost - Results - the kube-apiserver component fails to start successfully and become healthy @@ -401,7 +401,7 @@ This is an incomplete list of things that could go wrong, and how to adjust your 如果返回一个 URL,则意味着 kubectl 成功地访问到了你的集群。 @@ -55,9 +56,11 @@ The connection to the server was refused - did you specify th ``` 例如,如果你想在自己的笔记本上(本地)运行 Kubernetes 集群,你需要先安装一个 Minikube 这样的工具,然后再重新运行上面的命令。 @@ -72,9 +75,8 @@ kubectl cluster-info dump ### Troubleshooting the 'No Auth Provider Found' error message {#no-auth-provider-found} In Kubernetes 1.26, kubectl removed the built-in authentication for the following cloud -providers' managed Kubernetes offerings. -These providers have released kubectl plugins to provide the cloud-specific authentication. -For instructions, refer to the following provider documentation: +providers' managed Kubernetes offerings. These providers have released kubectl plugins +to provide the cloud-specific authentication. 
For instructions, refer to the following provider documentation: --> ### 排查"找不到身份验证提供商"的错误信息 {#no-auth-provider-found} From 3b306af95f9785f5eeec40d313707043d46a834d Mon Sep 17 00:00:00 2001 From: Paul Renault Date: Thu, 7 Dec 2023 16:12:15 +0100 Subject: [PATCH 06/10] Fix typo in TLS Secrets part --- content/en/docs/concepts/configuration/secret.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/concepts/configuration/secret.md b/content/en/docs/concepts/configuration/secret.md index 36aea75968120..19389f6c6fd18 100644 --- a/content/en/docs/concepts/configuration/secret.md +++ b/content/en/docs/concepts/configuration/secret.md @@ -381,7 +381,7 @@ The following YAML contains an example config for a TLS Secret: The TLS Secret type is provided only for convenience. You can create an `Opaque` type for credentials used for TLS authentication. -However, using the defined and public Secret type (`kubernetes.io/ssh-auth`) +However, using the defined and public Secret type (`kubernetes.io/tls`) helps ensure the consistency of Secret format in your project. The API server verifies if the required keys are set for a Secret of this type. From 0c8bde770b687cd64501429557099747add1668c Mon Sep 17 00:00:00 2001 From: Christophe Gasmi Date: Thu, 7 Dec 2023 19:26:31 +0100 Subject: [PATCH 07/10] update date kubecon doc fr --- content/fr/_index.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/fr/_index.html b/content/fr/_index.html index 87923c286881b..c6529535bbc4d 100644 --- a/content/fr/_index.html +++ b/content/fr/_index.html @@ -43,12 +43,12 @@

Les défis de la migration de plus de 150 microservices vers Kubernetes



- Venez au KubeCon Detroit, Michigan, USA du 24 au 28 Octobre 2022 + Venez au KubeCon Salt Lake City, UTAH, USA du 12 au 15 Novembre 2024



- Venez au KubeCon EU Amsterdam, Pays-Bas du 17 au 21 Avril 2023 + Venez au KubeCon EU Paris, France du 19 au 22 Mars 2024
From 488af6b5be805b3e71547f66e5f8309d3f7c2e1d Mon Sep 17 00:00:00 2001 From: Arhell Date: Fri, 8 Dec 2023 14:26:24 +0200 Subject: [PATCH 08/10] [ru] Update KubeCon dates for 2024 --- content/ru/_index.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/ru/_index.html b/content/ru/_index.html index 305f652aea368..3054abc629044 100644 --- a/content/ru/_index.html +++ b/content/ru/_index.html @@ -43,12 +43,12 @@

О сложности миграции 150+ микросервисов в Ku

- Посетите KubeCon + CloudNativeCon в Европе, 18-21 апреля 2023 года + Посетите KubeCon + CloudNativeCon в Европе, 19-22 марта 2024 года



- Посетите KubeCon + CloudNativeCon в Северной Америке, 6-9 ноября 2023 года + Посетите KubeCon + CloudNativeCon в Северной Америке, 12-15 ноября 2024 года

From ec9f733857c9b8e7d4c831c16b36fc18da4c86d7 Mon Sep 17 00:00:00 2001 From: Arhell Date: Sat, 9 Dec 2023 02:07:25 +0200 Subject: [PATCH 09/10] [uk] Update KubeCon dates for 2024 --- content/uk/_index.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/uk/_index.html b/content/uk/_index.html index 5467762143594..ff0b5312422af 100644 --- a/content/uk/_index.html +++ b/content/uk/_index.html @@ -64,12 +64,12 @@

Проблеми міграції 150+ мікросервісів у Kuberne

- Відвідайте KubeCon + CloudNativeCon у Північній Америці, 6-9 листопада 2023 року + Відвідайте KubeCon + CloudNativeCon в Європі, 19-22 березня 2024 року



- Відвідайте KubeCon + CloudNativeCon в Європі, 19-22 березня 2024 року + Відвідайте KubeCon + CloudNativeCon у Північній Америці, 12-15 листопада 2024 року

From 7f8d4ac66f343c2ba4a6ba43ee86aa963e300fbb Mon Sep 17 00:00:00 2001 From: Suruchi Kumari Date: Sat, 9 Dec 2023 17:11:48 +0530 Subject: [PATCH 10/10] Revert "Implemented a single columned list for the kubernetes metrics reference page (#42823)" This reverts commit 71cd6ca2038e762d810cf9b961db336ffb81ff9c. --- assets/scss/_custom.scss | 77 +- .../docs/reference/instrumentation/metrics.md | 6042 ++++++++--------- 2 files changed, 3054 insertions(+), 3065 deletions(-) diff --git a/assets/scss/_custom.scss b/assets/scss/_custom.scss index 6ec7f28f17b34..a06da2448c3f8 100644 --- a/assets/scss/_custom.scss +++ b/assets/scss/_custom.scss @@ -392,63 +392,52 @@ footer { } main { - -/* SCSS Related to the Metrics list */ - - div.metric:nth-of-type(odd) { // Look & Feel , Aesthetics - background-color: $light-grey; + .td-content table code, + .td-content>table td { + word-break: break-word; } - div.metrics { +/* SCSS Related to the Metrics Table */ + + @media (max-width: 767px) { // for mobile devices, Display the names, Stability levels & types - .metric { - div:empty{ + table.metrics { + th:nth-child(n + 4), + td:nth-child(n + 4) { display: none; } - display: flex; - flex-direction: column; - flex-wrap: wrap; - gap: .75em; - padding:.75em .75em .75em .75em; - - .metric_name{ - font-size: large; - font-weight: bold; - word-break: break-word; + td.metric_type{ + min-width: 7em; } - - label{ - font-weight: bold; - margin-right: .5em; + td.metric_stability_level{ + min-width: 6em; } - ul { - li:empty{ - display: none; + } + } + + table.metrics tbody{ // Tested dimensions to improve overall aesthetic of the table + tr { + td { + font-size: smaller; } - display: flex; - flex-direction: column; - gap: .75em; - flex-wrap: wrap; - li.metric_labels_varying{ - span{ - display: inline-block; - background-color: rgb(240, 239, 239); - padding: 0 0.5em; - margin-right: .35em; - font-family: monospace; - border: 1px solid rgb(230 , 230 , 230); - border-radius: 5%; - margin-bottom: .35em; 
- } + td.metric_labels_varying{ + min-width: 9em; } - + td.metric_type{ + min-width: 9em; + } + td.metric_description{ + min-width: 10em; + } + } - } - - } + table.no-word-break td, + table.no-word-break code { + word-break: normal; +} } // blockquotes and callouts diff --git a/content/en/docs/reference/instrumentation/metrics.md b/content/en/docs/reference/instrumentation/metrics.md index a94953a5655e8..643191e5fdab0 100644 --- a/content/en/docs/reference/instrumentation/metrics.md +++ b/content/en/docs/reference/instrumentation/metrics.md @@ -6,10 +6,10 @@ description: >- Details of the metric data that Kubernetes components export. --- -## Metrics (v1.29) +## Metrics (v1.28) - - + + This page details the metrics that different Kubernetes components export. You can query the metrics endpoint for these components using an HTTP scrape, and fetch the current metrics data in Prometheus format. @@ -17,3029 +17,3029 @@ components using an HTTP scrape, and fetch the current metrics data in Prometheu Stable metrics observe strict API contracts and no labels can be added or removed from stable metrics during their lifetime. -
-
apiserver_admission_controller_admission_duration_seconds
-
Admission controller latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit).
-
    -
  • STABLE
  • -
  • Histogram
  • -
  • name, operation, rejected, type
-
-
apiserver_admission_step_admission_duration_seconds
-
Admission sub-step latency histogram in seconds, broken out for each operation and API resource and step type (validate or admit).
-
    -
  • STABLE
  • -
  • Histogram
  • -
  • operation, rejected, type
-
-
apiserver_admission_webhook_admission_duration_seconds
-
Admission webhook latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit).
-
    -
  • STABLE
  • -
  • Histogram
  • -
  • name, operation, rejected, type
-
-
apiserver_current_inflight_requests
-
Maximal number of currently used inflight request limit of this apiserver per request kind in last second.
-
    -
  • STABLE
  • -
  • Gauge
  • -
  • request_kind
-
-
apiserver_longrunning_requests
-
Gauge of all active long-running apiserver requests broken out by verb, group, version, resource, scope and component. Not all requests are tracked this way.
-
    -
  • STABLE
  • -
  • Gauge
  • -
  • component, group, resource, scope, subresource, verb, version
-
-
apiserver_request_duration_seconds
-
Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component.
-
    -
  • STABLE
  • -
  • Histogram
  • -
  • component, dry_run, group, resource, scope, subresource, verb, version
-
-
apiserver_request_total
-
Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, and HTTP response code.
-
    -
  • STABLE
  • -
  • Counter
  • -
  • code, component, dry_run, group, resource, scope, subresource, verb, version
-
-
apiserver_requested_deprecated_apis
-
Gauge of deprecated APIs that have been requested, broken out by API group, version, resource, subresource, and removed_release.
-
    -
  • STABLE
  • -
  • Gauge
  • -
  • group, removed_release, resource, subresource, version
-
-
apiserver_response_sizes
-
Response size distribution in bytes for each group, version, verb, resource, subresource, scope and component.
-
    -
  • STABLE
  • -
  • Histogram
  • -
  • component, group, resource, scope, subresource, verb, version
-
-
apiserver_storage_objects
-
Number of stored objects at the time of last check split by kind.
-
    -
  • STABLE
  • -
  • Gauge
  • -
  • resource
-
-
container_cpu_usage_seconds_total
-
Cumulative cpu time consumed by the container in core-seconds
-
    -
  • STABLE
  • -
  • Custom
  • -
  • container, pod, namespace
-
-
container_memory_working_set_bytes
-
Current working set of the container in bytes
-
    -
  • STABLE
  • -
  • Custom
  • -
  • container, pod, namespace
-
-
container_start_time_seconds
-
Start time of the container since unix epoch in seconds
-
    -
  • STABLE
  • -
  • Custom
  • -
  • container, pod, namespace
-
-
cronjob_controller_job_creation_skew_duration_seconds
-
Time between when a cronjob is scheduled to be run, and when the corresponding job is created
-
    -
  • STABLE
  • -
  • Histogram
  • -
-
-
job_controller_job_pods_finished_total
-
The number of finished Pods that are fully tracked
-
    -
  • STABLE
  • -
  • Counter
  • -
  • completion_mode, result
-
-
job_controller_job_sync_duration_seconds
-
The time it took to sync a job
-
    -
  • STABLE
  • -
  • Histogram
  • -
  • action, completion_mode, result
-
-
job_controller_job_syncs_total
-
The number of job syncs
-
    -
  • STABLE
  • -
  • Counter
  • -
  • action, completion_mode, result
-
-
job_controller_jobs_finished_total
-
The number of finished jobs
-
    -
  • STABLE
  • -
  • Counter
  • -
  • completion_mode, reason, result
-
-
kube_pod_resource_limit
-
Resources limit for workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any.
-
    -
  • STABLE
  • -
  • Custom
  • -
  • namespace, pod, node, scheduler, priority, resource, unit
-
-
kube_pod_resource_request
-
Resources requested by workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any.
-
    -
  • STABLE
  • -
  • Custom
  • -
  • namespace, pod, node, scheduler, priority, resource, unit
-
-
node_collector_evictions_total
-
Number of Node evictions that happened since current instance of NodeController started.
-
    -
  • STABLE
  • -
  • Counter
  • -
  • zone
-
-
node_cpu_usage_seconds_total
-
Cumulative cpu time consumed by the node in core-seconds
-
    -
  • STABLE
  • -
  • Custom
  • -
-
-
node_memory_working_set_bytes
-
Current working set of the node in bytes
-
    -
  • STABLE
  • -
  • Custom
  • -
-
-
pod_cpu_usage_seconds_total
-
Cumulative cpu time consumed by the pod in core-seconds
-
    -
  • STABLE
  • -
  • Custom
  • -
  • pod, namespace
-
-
pod_memory_working_set_bytes
-
Current working set of the pod in bytes
-
    -
  • STABLE
  • -
  • Custom
  • -
  • pod, namespace
-
-
resource_scrape_error
-
1 if there was an error while getting container metrics, 0 otherwise
-
    -
  • STABLE
  • -
  • Custom
  • -
-
-
scheduler_framework_extension_point_duration_seconds
-
Latency for running all plugins of a specific extension point.
-
    -
  • STABLE
  • -
  • Histogram
  • -
  • extension_point, profile, status
-
-
scheduler_pending_pods
-
Number of pending pods, by the queue type. 'active' means number of pods in activeQ; 'backoff' means number of pods in backoffQ; 'unschedulable' means number of pods in unschedulablePods that the scheduler attempted to schedule and failed; 'gated' is the number of unschedulable pods that the scheduler never attempted to schedule because they are gated.
-
    -
  • STABLE
  • -
  • Gauge
  • -
  • queue
-
-
scheduler_pod_scheduling_attempts
-
Number of attempts to successfully schedule a pod.
-
    -
  • STABLE
  • -
  • Histogram
  • -
-
-
scheduler_pod_scheduling_duration_seconds
-
E2e latency for a pod being scheduled which may include multiple scheduling attempts.
-
    -
  • STABLE
  • -
  • Histogram
  • -
  • attempts
  • 1.28.0
-
-
scheduler_preemption_attempts_total
-
Total preemption attempts in the cluster till now
-
    -
  • STABLE
  • -
  • Counter
  • -
-
-
scheduler_preemption_victims
-
Number of selected preemption victims
-
    -
  • STABLE
  • -
  • Histogram
  • -
-
-
scheduler_queue_incoming_pods_total
-
Number of pods added to scheduling queues by event and queue type.
-
    -
  • STABLE
  • -
  • Counter
  • -
  • event, queue
-
-
scheduler_schedule_attempts_total
-
Number of attempts to schedule pods, by the result. 'unschedulable' means a pod could not be scheduled, while 'error' means an internal scheduler problem.
-
    -
  • STABLE
  • -
  • Counter
  • -
  • profile, result
-
-
scheduler_scheduling_attempt_duration_seconds
-
Scheduling attempt latency in seconds (scheduling algorithm + binding)
-
    -
  • STABLE
  • -
  • Histogram
  • -
  • profile, result
-
-
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Name | Stability Level | Type | Help | Labels | Const Labels | Deprecated Version |
|------|-----------------|------|------|--------|--------------|--------------------|
| apiserver_admission_controller_admission_duration_seconds | STABLE | Histogram | Admission controller latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit). | name, operation, rejected, type | | |
| apiserver_admission_step_admission_duration_seconds | STABLE | Histogram | Admission sub-step latency histogram in seconds, broken out for each operation and API resource and step type (validate or admit). | operation, rejected, type | | |
| apiserver_admission_webhook_admission_duration_seconds | STABLE | Histogram | Admission webhook latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit). | name, operation, rejected, type | | |
| apiserver_current_inflight_requests | STABLE | Gauge | Maximal number of currently used inflight request limit of this apiserver per request kind in last second. | request_kind | | |
| apiserver_longrunning_requests | STABLE | Gauge | Gauge of all active long-running apiserver requests broken out by verb, group, version, resource, scope and component. Not all requests are tracked this way. | component, group, resource, scope, subresource, verb, version | | |
| apiserver_request_duration_seconds | STABLE | Histogram | Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component. | component, dry_run, group, resource, scope, subresource, verb, version | | |
| apiserver_request_total | STABLE | Counter | Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, and HTTP response code. | code, component, dry_run, group, resource, scope, subresource, verb, version | | |
| apiserver_requested_deprecated_apis | STABLE | Gauge | Gauge of deprecated APIs that have been requested, broken out by API group, version, resource, subresource, and removed_release. | group, removed_release, resource, subresource, version | | |
| apiserver_response_sizes | STABLE | Histogram | Response size distribution in bytes for each group, version, verb, resource, subresource, scope and component. | component, group, resource, scope, subresource, verb, version | | |
| apiserver_storage_objects | STABLE | Gauge | Number of stored objects at the time of last check split by kind. | resource | | |
| cronjob_controller_job_creation_skew_duration_seconds | STABLE | Histogram | Time between when a cronjob is scheduled to be run, and when the corresponding job is created | | | |
| job_controller_job_pods_finished_total | STABLE | Counter | The number of finished Pods that are fully tracked | completion_mode, result | | |
| job_controller_job_sync_duration_seconds | STABLE | Histogram | The time it took to sync a job | action, completion_mode, result | | |
| job_controller_job_syncs_total | STABLE | Counter | The number of job syncs | action, completion_mode, result | | |
| job_controller_jobs_finished_total | STABLE | Counter | The number of finished jobs | completion_mode, reason, result | | |
| kube_pod_resource_limit | STABLE | Custom | Resources limit for workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any. | namespace, pod, node, scheduler, priority, resource, unit | | |
| kube_pod_resource_request | STABLE | Custom | Resources requested by workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any. | namespace, pod, node, scheduler, priority, resource, unit | | |
| node_collector_evictions_total | STABLE | Counter | Number of Node evictions that happened since current instance of NodeController started. | zone | | |
| scheduler_framework_extension_point_duration_seconds | STABLE | Histogram | Latency for running all plugins of a specific extension point. | extension_point, profile, status | | |
| scheduler_pending_pods | STABLE | Gauge | Number of pending pods, by the queue type. 'active' means number of pods in activeQ; 'backoff' means number of pods in backoffQ; 'unschedulable' means number of pods in unschedulablePods that the scheduler attempted to schedule and failed; 'gated' is the number of unschedulable pods that the scheduler never attempted to schedule because they are gated. | queue | | |
| scheduler_pod_scheduling_attempts | STABLE | Histogram | Number of attempts to successfully schedule a pod. | | | |
| scheduler_pod_scheduling_duration_seconds | STABLE | Histogram | E2e latency for a pod being scheduled which may include multiple scheduling attempts. | attempts | | |
| scheduler_preemption_attempts_total | STABLE | Counter | Total preemption attempts in the cluster till now | | | |
| scheduler_preemption_victims | STABLE | Histogram | Number of selected preemption victims | | | |
| scheduler_queue_incoming_pods_total | STABLE | Counter | Number of pods added to scheduling queues by event and queue type. | event, queue | | |
| scheduler_schedule_attempts_total | STABLE | Counter | Number of attempts to schedule pods, by the result. 'unschedulable' means a pod could not be scheduled, while 'error' means an internal scheduler problem. | profile, result | | |
| scheduler_scheduling_attempt_duration_seconds | STABLE | Histogram | Scheduling attempt latency in seconds (scheduling algorithm + binding) | profile, result | | |
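
As a rough illustration of what an HTTP scrape in Prometheus format yields, the sketch below sums the samples of one metric family from a small text sample. The sample lines and their values are made up for illustration and are not real cluster output, and `sum_metric` is a hypothetical helper name. The sketch also assumes that label values contain no spaces and that samples carry no trailing timestamp.

```shell
# Hypothetical sample in Prometheus text exposition format; values are made up.
sample='# HELP apiserver_request_total Counter of apiserver requests.
# TYPE apiserver_request_total counter
apiserver_request_total{code="200",verb="GET",resource="pods"} 12
apiserver_request_total{code="404",verb="GET",resource="pods"} 3
apiserver_storage_objects{resource="pods"} 42'

# sum_metric NAME: read exposition text on stdin and print the sum of all
# samples whose metric name equals NAME (hypothetical helper).
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                       # skip HELP and TYPE comment lines
    {
      metric = $1
      sub(/\{.*/, "", metric)           # strip the label set, keep the name
      if (metric == name) total += $NF  # last field is the sample value
    }
    END { print total + 0 }
  '
}

total=$(printf '%s\n' "$sample" | sum_metric apiserver_request_total)
echo "$total"
```

On a live cluster, the same helper could be fed from `kubectl get --raw /metrics` in place of the sample text, provided your credentials are allowed to read that endpoint.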
### List of Beta Kubernetes Metrics -Beta metrics observe a looser API contract than its stable counterparts. No labels can be removed from beta metrics during their lifetime, however, labels can be added while the metric is in the beta stage. This offers the assurance that beta metrics will honor existing dashboards and alerts, while allowing for amendments in the future. +Beta metrics observe a looser API contract than its stable counterparts. No labels can be removed from beta metrics during their lifetime, however, labels can be added while the metric is in the beta stage. This offers the assurance that beta metrics will honor existing dashboards and alerts, while allowing for amendments in the future. + + + + + + + + + + + + + + -
-
apiserver_flowcontrol_current_executing_requests
-
Number of requests in initial (for a WATCH) or any (for a non-WATCH) execution stage in the API Priority and Fairness subsystem
-
    -
  • BETA
  • -
  • Gauge
  • -
  • flow_schema, priority_level
-
-
apiserver_flowcontrol_current_executing_seats
-
Concurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem
-
    -
  • BETA
  • -
  • Gauge
  • -
  • flow_schema, priority_level
-
-
apiserver_flowcontrol_current_inqueue_requests
-
Number of requests currently pending in queues of the API Priority and Fairness subsystem
-
    -
  • BETA
  • -
  • Gauge
  • -
  • flow_schema, priority_level
-
-
apiserver_flowcontrol_dispatched_requests_total
-
Number of requests executed by API Priority and Fairness subsystem
-
    -
  • BETA
  • -
  • Counter
  • -
  • flow_schema, priority_level
-
-
apiserver_flowcontrol_nominal_limit_seats
-
Nominal number of execution seats configured for each priority level
-
    -
  • BETA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_rejected_requests_total
-
Number of requests rejected by API Priority and Fairness subsystem
-
    -
  • BETA
  • -
  • Counter
  • -
  • flow_schema, priority_level, reason
-
-
apiserver_flowcontrol_request_wait_duration_seconds
-
Length of time a request spent waiting in its queue
-
    -
  • BETA
  • -
  • Histogram
  • -
  • execute, flow_schema, priority_level
-
-
disabled_metrics_total
-
The count of disabled metrics.
-
    -
  • BETA
  • -
  • Counter
  • -
-
-
hidden_metrics_total
-
The count of hidden metrics.
-
    -
  • BETA
  • -
  • Counter
  • -
-
-
kubernetes_feature_enabled
-
This metric records the data about the stage and enablement of a k8s feature.
-
    -
  • BETA
  • -
  • Gauge
  • -
  • name, stage
-
-
kubernetes_healthcheck
-
This metric records the result of a single healthcheck.
-
    -
  • BETA
  • -
  • Gauge
  • -
  • name, type
-
-
kubernetes_healthchecks_total
-
This metric records the results of all healthcheck.
-
    -
  • BETA
  • -
  • Counter
  • -
  • name, status, type
-
-
registered_metrics_total
-
The count of registered metrics broken by stability level and deprecation version.
-
    -
  • BETA
  • -
  • Counter
  • -
  • deprecated_version, stability_level
-
-
scheduler_pod_scheduling_sli_duration_seconds
-
E2e latency for a pod being scheduled, from the time the pod enters the scheduling queue and might involve multiple scheduling attempts.
-
    -
  • BETA
  • -
  • Histogram
  • -
  • attempts
-
-
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Name | Stability Level | Type | Help | Labels | Const Labels | Deprecated Version |
|---|---|---|---|---|---|---|
| apiserver_flowcontrol_current_executing_requests | BETA | Gauge | Number of requests in initial (for a WATCH) or any (for a non-WATCH) execution stage in the API Priority and Fairness subsystem | flow_schema, priority_level | | |
| apiserver_flowcontrol_current_executing_seats | BETA | Gauge | Concurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem | flow_schema, priority_level | | |
| apiserver_flowcontrol_current_inqueue_requests | BETA | Gauge | Number of requests currently pending in queues of the API Priority and Fairness subsystem | flow_schema, priority_level | | |
| apiserver_flowcontrol_dispatched_requests_total | BETA | Counter | Number of requests executed by API Priority and Fairness subsystem | flow_schema, priority_level | | |
| apiserver_flowcontrol_nominal_limit_seats | BETA | Gauge | Nominal number of execution seats configured for each priority level | priority_level | | |
| apiserver_flowcontrol_rejected_requests_total | BETA | Counter | Number of requests rejected by API Priority and Fairness subsystem | flow_schema, priority_level, reason | | |
| apiserver_flowcontrol_request_wait_duration_seconds | BETA | Histogram | Length of time a request spent waiting in its queue | execute, flow_schema, priority_level | | |
| disabled_metrics_total | BETA | Counter | The count of disabled metrics. | | | |
| hidden_metrics_total | BETA | Counter | The count of hidden metrics. | | | |
| kubernetes_feature_enabled | BETA | Gauge | This metric records the data about the stage and enablement of a k8s feature. | name, stage | | |
| kubernetes_healthcheck | BETA | Gauge | This metric records the result of a single healthcheck. | name, type | | |
| kubernetes_healthchecks_total | BETA | Counter | This metric records the results of all healthcheck. | name, status, type | | |
| registered_metrics_total | BETA | Counter | The count of registered metrics broken by stability level and deprecation version. | deprecated_version, stability_level | | |
### List of Alpha Kubernetes Metrics

-Alpha metrics do not have any API guarantees. These metrics must be used at your own risk, subsequent versions of Kubernetes may remove these metrics altogether, or mutate the API in such a way that breaks existing dashboards and alerts.
+Alpha metrics do not have any API guarantees. These metrics must be used at your own risk, subsequent versions of Kubernetes may remove these metrics altogether, or mutate the API in such a way that breaks existing dashboards and alerts.
-
aggregator_discovery_aggregation_count_total
-
Counter of number of times discovery was aggregated
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
aggregator_openapi_v2_regeneration_count
-
Counter of OpenAPI v2 spec regeneration count broken down by causing APIService name and reason.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • apiservice, reason
-
-
aggregator_openapi_v2_regeneration_duration
-
Gauge of OpenAPI v2 spec regeneration duration in seconds.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • reason
-
-
aggregator_unavailable_apiservice
-
Gauge of APIServices which are marked as unavailable broken down by APIService name.
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • name
-
-
aggregator_unavailable_apiservice_total
-
Counter of APIServices which are marked as unavailable broken down by APIService name and reason.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • name, reason
-
-
apiextensions_openapi_v2_regeneration_count
-
Counter of OpenAPI v2 spec regeneration count broken down by causing CRD name and reason.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • crd, reason
-
-
apiextensions_openapi_v3_regeneration_count
-
Counter of OpenAPI v3 spec regeneration count broken down by group, version, causing CRD and reason.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • crd, group, reason, version
-
-
apiserver_admission_match_condition_evaluation_errors_total
-
Admission match condition evaluation errors count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • kind, name, operation, type
-
-
apiserver_admission_match_condition_evaluation_seconds
-
Admission match condition evaluation time in seconds, identified by name and broken out for each kind containing matchConditions (webhook or policy), operation and type (validate or admit).
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • kind, name, operation, type
-
-
apiserver_admission_match_condition_exclusions_total
-
Admission match condition evaluation exclusions count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • kind, name, operation, type
-
-
apiserver_admission_step_admission_duration_seconds_summary
-
Admission sub-step latency summary in seconds, broken out for each operation and API resource and step type (validate or admit).
-
    -
  • ALPHA
  • -
  • Summary
  • -
  • operation, rejected, type
-
-
apiserver_admission_webhook_fail_open_count
-
Admission webhook fail open count, identified by name and broken out for each admission type (validating or mutating).
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • name, type
-
-
apiserver_admission_webhook_rejection_count
-
Admission webhook rejection count, identified by name and broken out for each admission type (validating or admit) and operation. Additional labels specify an error type (calling_webhook_error or apiserver_internal_error if an error occurred; no_error otherwise) and optionally a non-zero rejection code if the webhook rejects the request with an HTTP status code (honored by the apiserver when the code is greater or equal to 400). Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • error_type, name, operation, rejection_code, type
-
-
apiserver_admission_webhook_request_total
-
Admission webhook request total, identified by name and broken out for each admission type (validating or mutating) and operation. Additional labels specify whether the request was rejected or not and an HTTP status code. Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code, name, operation, rejected, type
-
-
apiserver_audit_error_total
-
Counter of audit events that failed to be audited properly. Plugin identifies the plugin affected by the error.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • plugin
-
-
apiserver_audit_event_total
-
Counter of audit events generated and sent to the audit backend.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
apiserver_audit_level_total
-
Counter of policy levels for audit events (1 per request).
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • level
-
-
apiserver_audit_requests_rejected_total
-
Counter of apiserver requests rejected due to an error in audit logging backend.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
apiserver_cache_list_fetched_objects_total
-
Number of objects read from watch cache in the course of serving a LIST request
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • index, resource_prefix
-
-
apiserver_cache_list_returned_objects_total
-
Number of objects returned for a LIST request from watch cache
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource_prefix
-
-
apiserver_cache_list_total
-
Number of LIST requests served from watch cache
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • index, resource_prefix
-
-
apiserver_cel_compilation_duration_seconds
-
CEL compilation time in seconds.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
apiserver_cel_evaluation_duration_seconds
-
CEL evaluation time in seconds.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
apiserver_certificates_registry_csr_honored_duration_total
-
Total number of issued CSRs with a requested duration that was honored, sliced by signer (only kubernetes.io signer names are specifically identified)
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • signerName
-
-
apiserver_certificates_registry_csr_requested_duration_total
-
Total number of issued CSRs with a requested duration, sliced by signer (only kubernetes.io signer names are specifically identified)
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • signerName
-
-
apiserver_client_certificate_expiration_seconds
-
Distribution of the remaining lifetime on the certificate used to authenticate a request.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
apiserver_conversion_webhook_duration_seconds
-
Conversion webhook request latency
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • failure_type, result
-
-
apiserver_conversion_webhook_request_total
-
Counter for conversion webhook requests with success/failure and failure error type
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • failure_type, result
-
-
apiserver_crd_conversion_webhook_duration_seconds
-
CRD webhook conversion duration in seconds
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • crd_name, from_version, succeeded, to_version
-
-
apiserver_current_inqueue_requests
-
Maximal number of queued requests in this apiserver per request kind in last second.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • request_kind
-
-
apiserver_delegated_authn_request_duration_seconds
-
Request latency in seconds. Broken down by status code.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • code
-
-
apiserver_delegated_authn_request_total
-
Number of HTTP requests partitioned by status code.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code
-
-
apiserver_delegated_authz_request_duration_seconds
-
Request latency in seconds. Broken down by status code.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • code
-
-
apiserver_delegated_authz_request_total
-
Number of HTTP requests partitioned by status code.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code
-
-
apiserver_egress_dialer_dial_duration_seconds
-
Dial latency histogram in seconds, labeled by the protocol (http-connect or grpc), transport (tcp or uds)
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • protocol, transport
-
-
apiserver_egress_dialer_dial_failure_count
-
Dial failure count, labeled by the protocol (http-connect or grpc), transport (tcp or uds), and stage (connect or proxy). The stage indicates at which stage the dial failed
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • protocol, stage, transport
-
-
apiserver_egress_dialer_dial_start_total
-
Dial starts, labeled by the protocol (http-connect or grpc) and transport (tcp or uds).
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • protocol, transport
-
-
apiserver_encryption_config_controller_automatic_reload_failures_total
-
Total number of failed automatic reloads of encryption configuration.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
apiserver_encryption_config_controller_automatic_reload_last_timestamp_seconds
-
Timestamp of the last successful or failed automatic reload of encryption configuration.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • status
-
-
apiserver_encryption_config_controller_automatic_reload_success_total
-
Total number of successful automatic reloads of encryption configuration.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
apiserver_envelope_encryption_dek_cache_fill_percent
-
Percent of the cache slots currently occupied by cached DEKs.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
apiserver_envelope_encryption_dek_cache_inter_arrival_time_seconds
-
Time (in seconds) of inter arrival of transformation requests.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • transformation_type
-
-
apiserver_envelope_encryption_dek_source_cache_size
-
Number of records in data encryption key (DEK) source cache. On a restart, this value is an approximation of the number of decrypt RPC calls the server will make to the KMS plugin.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • provider_name
-
-
apiserver_envelope_encryption_invalid_key_id_from_status_total
-
Number of times an invalid keyID is returned by the Status RPC call split by error.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • error, provider_name
-
-
apiserver_envelope_encryption_key_id_hash_last_timestamp_seconds
-
The last time in seconds when a keyID was used.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • key_id_hash, provider_name, transformation_type
-
-
apiserver_envelope_encryption_key_id_hash_status_last_timestamp_seconds
-
The last time in seconds when a keyID was returned by the Status RPC call.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • key_id_hash, provider_name
-
-
apiserver_envelope_encryption_key_id_hash_total
-
Number of times a keyID is used split by transformation type and provider.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • key_id_hash, provider_name, transformation_type
-
-
apiserver_envelope_encryption_kms_operations_latency_seconds
-
KMS operation duration with gRPC error code status total.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • grpc_status_code, method_name, provider_name
-
-
apiserver_flowcontrol_current_inqueue_seats
-
Number of seats currently pending in queues of the API Priority and Fairness subsystem
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • flow_schema, priority_level
-
-
apiserver_flowcontrol_current_limit_seats
-
current derived number of execution seats available to each priority level
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_current_r
-
R(time of last change)
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_demand_seats
-
Observations, at the end of every nanosecond, of (the number of seats each priority level could use) / (nominal number of seats for that level)
-
    -
  • ALPHA
  • -
  • TimingRatioHistogram
  • -
  • priority_level
-
-
apiserver_flowcontrol_demand_seats_average
-
Time-weighted average, over last adjustment period, of demand_seats
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_demand_seats_high_watermark
-
High watermark, over last adjustment period, of demand_seats
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_demand_seats_smoothed
-
Smoothed seat demands
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_demand_seats_stdev
-
Time-weighted standard deviation, over last adjustment period, of demand_seats
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_dispatch_r
-
R(time of last dispatch)
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_epoch_advance_total
-
Number of times the queueset's progress meter jumped backward
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • priority_level, success
-
-
apiserver_flowcontrol_latest_s
-
S(most recently dispatched request)
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_lower_limit_seats
-
Configured lower bound on number of execution seats available to each priority level
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_next_discounted_s_bounds
-
min and max, over queues, of S(oldest waiting request in queue) - estimated work in progress
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • bound, priority_level
-
-
apiserver_flowcontrol_next_s_bounds
-
min and max, over queues, of S(oldest waiting request in queue)
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • bound, priority_level
-
-
apiserver_flowcontrol_priority_level_request_utilization
-
Observations, at the end of every nanosecond, of number of requests (as a fraction of the relevant limit) waiting or in any stage of execution (but only initial stage for WATCHes)
-
    -
  • ALPHA
  • -
  • TimingRatioHistogram
  • -
  • phase, priority_level
-
-
apiserver_flowcontrol_priority_level_seat_utilization
-
Observations, at the end of every nanosecond, of utilization of seats for any stage of execution (but only initial stage for WATCHes)
-
    -
  • ALPHA
  • -
  • TimingRatioHistogram
  • -
  • priority_level
  • phase:executing
-
-
apiserver_flowcontrol_read_vs_write_current_requests
-
Observations, at the end of every nanosecond, of the number of requests (as a fraction of the relevant limit) waiting or in regular stage of execution
-
    -
  • ALPHA
  • -
  • TimingRatioHistogram
  • -
  • phase, request_kind
-
-
apiserver_flowcontrol_request_concurrency_in_use
-
Concurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • flow_schema, priority_level
  • 1.31.0
-
-
apiserver_flowcontrol_request_concurrency_limit
-
Nominal number of execution seats configured for each priority level
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
  • 1.30.0
-
-
apiserver_flowcontrol_request_dispatch_no_accommodation_total
-
Number of times a dispatch attempt resulted in a non accommodation due to lack of available seats
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • flow_schema, priority_level
-
-
apiserver_flowcontrol_request_execution_seconds
-
Duration of initial stage (for a WATCH) or any (for a non-WATCH) stage of request execution in the API Priority and Fairness subsystem
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • flow_schema, priority_level, type
-
-
apiserver_flowcontrol_request_queue_length_after_enqueue
-
Length of queue in the API Priority and Fairness subsystem, as seen by each request after it is enqueued
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • flow_schema, priority_level
-
-
apiserver_flowcontrol_seat_fair_frac
-
Fair fraction of server's concurrency to allocate to each priority level that can use it
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
apiserver_flowcontrol_target_seats
-
Seat allocation targets
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_upper_limit_seats
-
Configured upper bound on number of execution seats available to each priority level
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • priority_level
-
-
apiserver_flowcontrol_watch_count_samples
-
count of watchers for mutating requests in API Priority and Fairness
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • flow_schema, priority_level
-
-
apiserver_flowcontrol_work_estimated_seats
-
Number of estimated seats (maximum of initial and final seats) associated with requests in API Priority and Fairness
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • flow_schema, priority_level
-
-
apiserver_init_events_total
-
Counter of init events processed in watch cache broken by resource type.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
apiserver_kube_aggregator_x509_insecure_sha1_total
-
Counts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
apiserver_kube_aggregator_x509_missing_san_total
-
Counts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the lack of x509 certificate SAN extension missing (either/or, based on the runtime environment)
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
apiserver_request_aborts_total
-
Number of requests which apiserver aborted possibly due to a timeout, for each group, version, verb, resource, subresource and scope
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • group, resource, scope, subresource, verb, version
-
-
apiserver_request_body_sizes
-
Apiserver request body sizes broken out by size.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • resource, verb
-
-
apiserver_request_filter_duration_seconds
-
Request filter latency distribution in seconds, for each filter type
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • filter
-
-
apiserver_request_post_timeout_total
-
Tracks the activity of the request handlers after the associated requests have been timed out by the apiserver
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • source, status
-
-
apiserver_request_sli_duration_seconds
-
Response latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • component, group, resource, scope, subresource, verb, version
-
-
apiserver_request_slo_duration_seconds
-
Response latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • component, group, resource, scope, subresource, verb, version
  • 1.27.0
-
-
apiserver_request_terminations_total
-
Number of requests which apiserver terminated in self-defense.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code, component, group, resource, scope, subresource, verb, version
-
-
apiserver_request_timestamp_comparison_time
-
Time taken for comparison of old vs new objects in UPDATE or PATCH requests
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • code_path
-
-
apiserver_rerouted_request_total
-
Total number of requests that were proxied to a peer kube apiserver because the local apiserver was not capable of serving it
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code
-
-
apiserver_selfrequest_total
-
Counter of apiserver self-requests broken out for each verb, API resource and subresource.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource, subresource, verb
-
-
apiserver_storage_data_key_generation_duration_seconds
-
Latencies in seconds of data encryption key(DEK) generation operations.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
apiserver_storage_data_key_generation_failures_total
-
Total number of failed data encryption key(DEK) generation operations.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
apiserver_storage_db_total_size_in_bytes
-
Total size of the storage database file physically allocated in bytes.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • endpoint
  • 1.28.0
-
-
apiserver_storage_decode_errors_total
-
Number of stored object decode errors split by object type
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
apiserver_storage_envelope_transformation_cache_misses_total
-
Total number of cache misses while accessing key decryption key(KEK).
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
apiserver_storage_events_received_total
-
Number of etcd events received split by kind.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
apiserver_storage_list_evaluated_objects_total
-
Number of objects tested in the course of serving a LIST request from storage
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
apiserver_storage_list_fetched_objects_total
-
Number of objects read from storage in the course of serving a LIST request
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
apiserver_storage_list_returned_objects_total
-
Number of objects returned for a LIST request from storage
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
apiserver_storage_list_total
-
Number of LIST requests served from storage
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
apiserver_storage_size_bytes
-
Size of the storage database file physically allocated in bytes.
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • cluster
-
-
apiserver_storage_transformation_duration_seconds
-
Latencies in seconds of value transformation operations.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • transformation_type, transformer_prefix
-
-
apiserver_storage_transformation_operations_total
-
Total number of transformations. Successful transformation will have a status 'OK' and a varied status string when the transformation fails. This status and transformation_type fields may be used for alerting on encryption/decryption failure using transformation_type from_storage for decryption and to_storage for encryption
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • status, transformation_type, transformer_prefix
-
-
apiserver_terminated_watchers_total
-
Counter of watchers closed due to unresponsiveness broken by resource type.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
apiserver_tls_handshake_errors_total
-
Number of requests dropped with 'TLS handshake error from' error
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
apiserver_validating_admission_policy_check_duration_seconds
-
Validation admission latency for individual validation expressions in seconds, labeled by policy and further including binding, state and enforcement action taken.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • enforcement_action, policy, policy_binding, state
-
-
apiserver_validating_admission_policy_check_total
-
Validation admission policy check total, labeled by policy and further identified by binding, enforcement action taken, and state.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • enforcement_action, policy, policy_binding, state
-
-
apiserver_validating_admission_policy_definition_total
-
Validation admission policy count total, labeled by state and enforcement action.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • enforcement_action, state
-
-
apiserver_watch_cache_events_dispatched_total
-
Counter of events dispatched in watch cache broken by resource type.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
apiserver_watch_cache_events_received_total
-
Counter of events received in watch cache broken by resource type.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
apiserver_watch_cache_initializations_total
-
Counter of watch cache initializations broken by resource type.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
apiserver_watch_events_sizes
-
Watch event size distribution in bytes
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • group, kind, version
-
-
apiserver_watch_events_total
-
Number of events sent in watch clients
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • group, kind, version
-
-
apiserver_webhooks_x509_insecure_sha1_total
-
Counts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
apiserver_webhooks_x509_missing_san_total
-
Counts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the lack of x509 certificate SAN extension missing (either/or, based on the runtime environment)
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
attach_detach_controller_attachdetach_controller_forced_detaches
-
Number of times the A/D Controller performed a forced detach
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • reason
-
-
attachdetach_controller_total_volumes
-
Number of volumes in A/D Controller
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • plugin_name, state
-
-
authenticated_user_requests
-
Counter of authenticated requests broken out by username.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • username
-
-
authentication_attempts
-
Counter of authenticated attempts.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • result
-
-
authentication_duration_seconds
-
Authentication duration in seconds broken out by result.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • result
-
-
authentication_token_cache_active_fetch_count
-
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • status
-
-
authentication_token_cache_fetch_total
-
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • status
-
-
authentication_token_cache_request_duration_seconds
-
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • status
-
-
authentication_token_cache_request_total
-
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • status
-
-
authorization_attempts_total
-
Counter of authorization attempts broken down by result. It can be either 'allowed', 'denied', 'no-opinion' or 'error'.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • result
-
-
authorization_duration_seconds
-
Authorization duration in seconds broken out by result.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • result
-
-
cloud_provider_webhook_request_duration_seconds
-
Request latency in seconds. Broken down by status code.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • code, webhook
-
-
cloud_provider_webhook_request_total
-
Number of HTTP requests partitioned by status code.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code, webhook
-
-
cloudprovider_azure_api_request_duration_seconds
-
Latency of an Azure API call
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • request, resource_group, source, subscription_id
-
-
cloudprovider_azure_api_request_errors
-
Number of errors for an Azure API call
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • request, resource_group, source, subscription_id
-
-
cloudprovider_azure_api_request_ratelimited_count
-
Number of rate limited Azure API calls
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • request, resource_group, source, subscription_id
-
-
cloudprovider_azure_api_request_throttled_count
-
Number of throttled Azure API calls
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • request, resource_group, source, subscription_id
-
-
cloudprovider_azure_op_duration_seconds
-
Latency of an Azure service operation
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • request, resource_group, source, subscription_id
-
-
cloudprovider_azure_op_failure_count
-
Number of failed Azure service operations
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • request, resource_group, source, subscription_id
-
-
cloudprovider_gce_api_request_duration_seconds
-
Latency of a GCE API call
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • region, request, version, zone
-
-
cloudprovider_gce_api_request_errors
-
Number of errors for an API call
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • region, request, version, zone
-
-
cloudprovider_vsphere_api_request_duration_seconds
-
Latency of vsphere api call
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • request
-
-
cloudprovider_vsphere_api_request_errors
-
vsphere Api errors
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • request
-
-
cloudprovider_vsphere_operation_duration_seconds
-
Latency of vsphere operation call
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • operation
-
-
cloudprovider_vsphere_operation_errors
-
vsphere operation errors
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • operation
-
-
cloudprovider_vsphere_vcenter_versions
-
Versions for connected vSphere vCenters
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • hostname, version, build
-
-
container_swap_usage_bytes
-
Current amount of the container swap usage in bytes. Reported only on non-windows systems
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • container, pod, namespace
-
-
csi_operations_seconds
-
Container Storage Interface operation duration with gRPC error code status total
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • driver_name, grpc_status_code, method_name, migrated
-
-
endpoint_slice_controller_changes
-
Number of EndpointSlice changes
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • operation
-
-
endpoint_slice_controller_desired_endpoint_slices
-
Number of EndpointSlices that would exist with perfect endpoint allocation
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
endpoint_slice_controller_endpoints_added_per_sync
-
Number of endpoints added on each Service sync
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
endpoint_slice_controller_endpoints_desired
-
Number of endpoints desired
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
endpoint_slice_controller_endpoints_removed_per_sync
-
Number of endpoints removed on each Service sync
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
endpoint_slice_controller_endpointslices_changed_per_sync
-
Number of EndpointSlices changed on each Service sync
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • topology
-
-
endpoint_slice_controller_num_endpoint_slices
-
Number of EndpointSlices
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
endpoint_slice_controller_syncs
-
Number of EndpointSlice syncs
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • result
-
-
endpoint_slice_mirroring_controller_addresses_skipped_per_sync
-
Number of addresses skipped on each Endpoints sync due to being invalid or exceeding MaxEndpointsPerSubset
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
endpoint_slice_mirroring_controller_changes
-
Number of EndpointSlice changes
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • operation
-
-
endpoint_slice_mirroring_controller_desired_endpoint_slices
-
Number of EndpointSlices that would exist with perfect endpoint allocation
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
endpoint_slice_mirroring_controller_endpoints_added_per_sync
-
Number of endpoints added on each Endpoints sync
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
endpoint_slice_mirroring_controller_endpoints_desired
-
Number of endpoints desired
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
endpoint_slice_mirroring_controller_endpoints_removed_per_sync
-
Number of endpoints removed on each Endpoints sync
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
endpoint_slice_mirroring_controller_endpoints_sync_duration
-
Duration of syncEndpoints() in seconds
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
endpoint_slice_mirroring_controller_endpoints_updated_per_sync
-
Number of endpoints updated on each Endpoints sync
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
endpoint_slice_mirroring_controller_num_endpoint_slices
-
Number of EndpointSlices
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
ephemeral_volume_controller_create_failures_total
-
Number of PersistentVolumeClaims creation requests
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
ephemeral_volume_controller_create_total
-
Number of PersistentVolumeClaims creation requests
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
etcd_bookmark_counts
-
Number of etcd bookmarks (progress notify events) split by kind.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • resource
-
-
etcd_lease_object_counts
-
Number of objects attached to a single etcd lease.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
etcd_request_duration_seconds
-
Etcd request latency in seconds for each operation and object type.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • operation type
-
-
etcd_request_errors_total
-
Etcd failed request counts for each operation and object type.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • operation type
-
-
etcd_requests_total
-
Etcd request counts for each operation and object type.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • operation type
-
-
etcd_version_info
-
Etcd server's binary version
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • binary_version
-
-
field_validation_request_duration_seconds
-
Response latency distribution in seconds for each field validation value
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • field_validation
-
-
force_cleaned_failed_volume_operation_errors_total
-
The number of volumes that failed force cleanup after their reconstruction failed during kubelet startup.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
force_cleaned_failed_volume_operations_total
-
The number of volumes that were force cleaned after their reconstruction failed during kubelet startup. This includes both successful and failed cleanups.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
garbagecollector_controller_resources_sync_error_total
-
Number of garbage collector resources sync errors
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
get_token_count
-
Counter of total Token() requests to the alternate token source
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
get_token_fail_count
-
Counter of failed Token() requests to the alternate token source
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
horizontal_pod_autoscaler_controller_metric_computation_duration_seconds
-
The time(seconds) that the HPA controller takes to calculate one metric. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. The label 'error' should be either 'spec', 'internal', or 'none'. The label 'metric_type' corresponds to HPA.spec.metrics[*].type
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • action error metric_type
-
-
horizontal_pod_autoscaler_controller_metric_computation_total
-
Number of metric computations. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. The label 'metric_type' corresponds to HPA.spec.metrics[*].type
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • action error metric_type
-
-
horizontal_pod_autoscaler_controller_reconciliation_duration_seconds
-
The time(seconds) that the HPA controller takes to reconcile once. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. Note that if both spec and internal errors happen during a reconciliation, the first one to occur is reported in `error` label.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • action error
-
-
horizontal_pod_autoscaler_controller_reconciliations_total
-
Number of reconciliations of HPA controller. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. Note that if both spec and internal errors happen during a reconciliation, the first one to occur is reported in `error` label.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • action error
-
-
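The HPA controller metric descriptions above list the values the `action` and `error` labels should take. A minimal sketch of checking scraped label sets against those documented values follows; the helper name `unexpected_labels` and the dictionary layout are illustrative assumptions, not part of Kubernetes.

```python
# Allowed values as stated in the HPA metric descriptions.
ALLOWED_LABEL_VALUES = {
    "action": {"scale_down", "scale_up", "none"},
    "error": {"spec", "internal", "none"},
}


def unexpected_labels(labels):
    """Return the sorted names of labels whose values fall outside the documented sets.

    `labels` is a plain dict of label name to value (hypothetical input shape).
    Labels that are absent, or that have no documented value set, are ignored.
    """
    return sorted(
        name
        for name, allowed in ALLOWED_LABEL_VALUES.items()
        if name in labels and labels[name] not in allowed
    )
```

A check like this could run over parsed samples to flag label values that do not match the documentation for the version in use.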
job_controller_pod_failures_handled_by_failure_policy_total
-
The number of failed Pods handled by failure policy with respect to the failure policy action applied based on the matched rule. Possible values of the action label correspond to the possible values for the failure policy rule action, which are: "FailJob", "Ignore" and "Count".
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • action
-
-
job_controller_terminated_pods_tracking_finalizer_total
-
The number of terminated pods (phase=Failed|Succeeded) that have the finalizer batch.kubernetes.io/job-tracking. The event label can be "add" or "delete".
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • event
-
-
kube_apiserver_clusterip_allocator_allocated_ips
-
Gauge measuring the number of allocated IPs for Services
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • cidr
-
-
kube_apiserver_clusterip_allocator_allocation_errors_total
-
Number of errors trying to allocate Cluster IPs
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • cidr scope
-
-
kube_apiserver_clusterip_allocator_allocation_total
-
Number of Cluster IPs allocations
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • cidr scope
-
-
kube_apiserver_clusterip_allocator_available_ips
-
Gauge measuring the number of available IPs for Services
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • cidr
-
-
kube_apiserver_nodeport_allocator_allocated_ports
-
Gauge measuring the number of allocated NodePorts for Services
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kube_apiserver_nodeport_allocator_available_ports
-
Gauge measuring the number of available NodePorts for Services
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kube_apiserver_pod_logs_backend_tls_failure_total
-
Total number of requests for pods/logs that failed due to kubelet server TLS verification
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kube_apiserver_pod_logs_insecure_backend_total
-
Total number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • usage
-
-
kube_apiserver_pod_logs_pods_logs_backend_tls_failure_total
-
Total number of requests for pods/logs that failed due to kubelet server TLS verification
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • 1.27.0
-
-
kube_apiserver_pod_logs_pods_logs_insecure_backend_total
-
Total number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • usage
  • 1.27.0
-
-
kubelet_active_pods
-
The number of pods the kubelet considers active and which are being considered when admitting new pods. static is true if the pod is not from the apiserver.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • static
-
-
kubelet_certificate_manager_client_expiration_renew_errors
-
Counter of certificate renewal errors.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_certificate_manager_client_ttl_seconds
-
Gauge of the TTL (time-to-live) of the Kubelet's client certificate. The value is in seconds until certificate expiry (negative if already expired). If client certificate is invalid or unused, the value will be +INF.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubelet_certificate_manager_server_rotation_seconds
-
Histogram of the number of seconds the previous certificate lived before being rotated.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubelet_certificate_manager_server_ttl_seconds
-
Gauge of the shortest TTL (time-to-live) of the Kubelet's serving certificate. The value is in seconds until certificate expiry (negative if already expired). If serving certificate is invalid or unused, the value will be +INF.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
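The certificate TTL gauges described above encode three situations in one number: a positive value is the time left until expiry, a negative value means the certificate has already expired, and +INF means the certificate is invalid or unused. A minimal sketch of interpreting a reading follows; the function name and the returned strings are illustrative assumptions.

```python
import math


def describe_cert_ttl(ttl_seconds):
    """Classify a certificate TTL gauge reading (hypothetical helper)."""
    # +INF is documented as "certificate is invalid or unused".
    if math.isinf(ttl_seconds) and ttl_seconds > 0:
        return "invalid-or-unused"
    # Negative values are documented as "already expired".
    if ttl_seconds < 0:
        return "expired"
    # Anything else is the number of seconds remaining until expiry.
    return "valid"
```

Checking for +INF first matters because an infinite reading would otherwise look like a certificate with a very long lifetime.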
-
kubelet_cgroup_manager_duration_seconds
-
Duration in seconds for cgroup manager operations. Broken down by method.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • operation_type
-
-
kubelet_container_log_filesystem_used_bytes
-
Bytes used by the container's logs on the filesystem.
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • uid namespace pod container
-
-
kubelet_containers_per_pod_count
-
The number of containers per pod.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubelet_cpu_manager_pinning_errors_total
-
The number of CPU core allocations that required pinning and failed.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_cpu_manager_pinning_requests_total
-
The number of cpu core allocations which required pinning.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_credential_provider_plugin_duration
-
Duration of execution in seconds for credential provider plugin
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • plugin_name
-
-
kubelet_credential_provider_plugin_errors
-
Number of errors from credential provider plugin
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • plugin_name
-
-
kubelet_desired_pods
-
The number of pods the kubelet is being instructed to run. static is true if the pod is not from the apiserver.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • static
-
-
kubelet_device_plugin_alloc_duration_seconds
-
Duration in seconds to serve a device plugin Allocation request. Broken down by resource name.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • resource_name
-
-
kubelet_device_plugin_registration_total
-
Cumulative number of device plugin registrations. Broken down by resource name.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource_name
-
-
kubelet_evented_pleg_connection_error_count
-
The number of errors encountered during the establishment of streaming connection with the CRI runtime.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_evented_pleg_connection_latency_seconds
-
The latency of streaming connection with the CRI runtime, measured in seconds.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubelet_evented_pleg_connection_success_count
-
The number of times a streaming client was obtained to receive CRI Events.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_eviction_stats_age_seconds
-
Time between when stats are collected, and when pod is evicted based on those stats by eviction signal
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • eviction_signal
-
-
kubelet_evictions
-
Cumulative number of pod evictions by eviction signal
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • eviction_signal
-
-
kubelet_graceful_shutdown_end_time_seconds
-
Last graceful shutdown end time since unix epoch in seconds
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubelet_graceful_shutdown_start_time_seconds
-
Last graceful shutdown start time since unix epoch in seconds
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubelet_http_inflight_requests
-
Number of the inflight http requests
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • long_running method path server_type
-
-
kubelet_http_requests_duration_seconds
-
Duration in seconds to serve http requests
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • long_running method path server_type
-
-
kubelet_http_requests_total
-
Number of the http requests received since the server started
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • long_running method path server_type
-
-
kubelet_lifecycle_handler_http_fallbacks_total
-
The number of times lifecycle handlers successfully fell back to http from https.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_managed_ephemeral_containers
-
Current number of ephemeral containers in pods managed by this kubelet.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubelet_mirror_pods
-
The number of mirror pods the kubelet will try to create (one per admitted static pod)
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubelet_node_name
-
The node's name. The count is always 1.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • node
-
-
kubelet_orphan_pod_cleaned_volumes
-
The total number of orphaned Pods whose volumes were cleaned in the last periodic sweep.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubelet_orphan_pod_cleaned_volumes_errors
-
The number of orphaned Pods whose volumes failed to be cleaned in the last periodic sweep.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubelet_orphaned_runtime_pods_total
-
Number of pods that have been detected in the container runtime without being already known to the pod worker. This typically indicates the kubelet was restarted while a pod was force deleted in the API or in the local configuration, which is unusual.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_pleg_discard_events
-
The number of discard events in PLEG.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_pleg_last_seen_seconds
-
Timestamp in seconds when PLEG was last seen active.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubelet_pleg_relist_duration_seconds
-
Duration in seconds for relisting pods in PLEG.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubelet_pleg_relist_interval_seconds
-
Interval in seconds between relisting in PLEG.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubelet_pod_resources_endpoint_errors_get
-
Number of requests to the PodResource Get endpoint which returned error. Broken down by server api version.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • server_api_version
-
-
kubelet_pod_resources_endpoint_errors_get_allocatable
-
Number of requests to the PodResource GetAllocatableResources endpoint which returned error. Broken down by server api version.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • server_api_version
-
-
kubelet_pod_resources_endpoint_errors_list
-
Number of requests to the PodResource List endpoint which returned error. Broken down by server api version.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • server_api_version
-
-
kubelet_pod_resources_endpoint_requests_get
-
Number of requests to the PodResource Get endpoint. Broken down by server api version.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • server_api_version
-
-
kubelet_pod_resources_endpoint_requests_get_allocatable
-
Number of requests to the PodResource GetAllocatableResources endpoint. Broken down by server api version.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • server_api_version
-
-
kubelet_pod_resources_endpoint_requests_list
-
Number of requests to the PodResource List endpoint. Broken down by server api version.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • server_api_version
-
-
kubelet_pod_resources_endpoint_requests_total
-
Cumulative number of requests to the PodResource endpoint. Broken down by server api version.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • server_api_version
-
-
kubelet_pod_start_duration_seconds
-
Duration in seconds from kubelet seeing a pod for the first time to the pod starting to run
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubelet_pod_start_sli_duration_seconds
-
Duration in seconds to start a pod, excluding time to pull images and run init containers, measured from pod creation timestamp to when all its containers are reported as started and observed via watch
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubelet_pod_status_sync_duration_seconds
-
Duration in seconds to sync a pod status update. Measures time from detection of a change to pod status until the API is successfully updated for that pod, even if multiple intervening changes to pod status occur.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubelet_pod_worker_duration_seconds
-
Duration in seconds to sync a single pod. Broken down by operation type: create, update, or sync
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • operation_type
-
-
kubelet_pod_worker_start_duration_seconds
-
Duration in seconds from kubelet seeing a pod to starting a worker.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubelet_preemptions
-
Cumulative number of pod preemptions by preemption resource
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • preemption_signal
-
-
kubelet_restarted_pods_total
-
Number of pods that have been restarted because they were deleted and recreated with the same UID while the kubelet was watching them (common for static pods, extremely uncommon for API pods)
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • static
-
-
kubelet_run_podsandbox_duration_seconds
-
Duration in seconds of the run_podsandbox operations. Broken down by RuntimeClass.Handler.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • runtime_handler
-
-
kubelet_run_podsandbox_errors_total
-
Cumulative number of the run_podsandbox operation errors by RuntimeClass.Handler.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • runtime_handler
-
-
kubelet_running_containers
-
Number of containers currently running
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • container_state
-
-
kubelet_running_pods
-
Number of pods that have a running pod sandbox
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubelet_runtime_operations_duration_seconds
-
Duration in seconds of runtime operations. Broken down by operation type.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • operation_type
-
-
kubelet_runtime_operations_errors_total
-
Cumulative number of runtime operation errors by operation type.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • operation_type
-
-
kubelet_runtime_operations_total
-
Cumulative number of runtime operations by operation type.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • operation_type
-
-
kubelet_server_expiration_renew_errors
-
Counter of certificate renewal errors.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_started_containers_errors_total
-
Cumulative number of errors when starting containers
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code container_type
-
-
kubelet_started_containers_total
-
Cumulative number of containers started
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • container_type
-
-
kubelet_started_host_process_containers_errors_total
-
Cumulative number of errors when starting hostprocess containers. This metric will only be collected on Windows.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code container_type
-
-
kubelet_started_host_process_containers_total
-
Cumulative number of hostprocess containers started. This metric will only be collected on Windows.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • container_type
-
-
kubelet_started_pods_errors_total
-
Cumulative number of errors when starting pods
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_started_pods_total
-
Cumulative number of pods started
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_topology_manager_admission_duration_ms
-
Duration in milliseconds to serve a pod admission request.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubelet_topology_manager_admission_errors_total
-
The number of admission request failures where resources could not be aligned.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_topology_manager_admission_requests_total
-
The number of admission requests where resources have to be aligned.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubelet_volume_metric_collection_duration_seconds
-
Duration in seconds to calculate volume stats
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • metric_source
-
-
kubelet_volume_stats_available_bytes
-
Number of available bytes in the volume
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • namespace persistentvolumeclaim
-
-
kubelet_volume_stats_capacity_bytes
-
Capacity in bytes of the volume
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • namespace persistentvolumeclaim
-
-
kubelet_volume_stats_health_status_abnormal
-
Abnormal volume health status. The count is either 1 or 0. 1 indicates the volume is unhealthy, 0 indicates volume is healthy
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • namespace persistentvolumeclaim
-
-
kubelet_volume_stats_inodes
-
Maximum number of inodes in the volume
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • namespace persistentvolumeclaim
-
-
kubelet_volume_stats_inodes_free
-
Number of free inodes in the volume
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • namespace persistentvolumeclaim
-
-
kubelet_volume_stats_inodes_used
-
Number of used inodes in the volume
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • namespace persistentvolumeclaim
-
-
kubelet_volume_stats_used_bytes
-
Number of used bytes in the volume
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • namespace persistentvolumeclaim
-
-
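The `kubelet_volume_stats_*` gauges above report raw byte and inode counts, so a fill level has to be derived by the consumer. A minimal sketch, assuming the caller has already read `kubelet_volume_stats_used_bytes` and `kubelet_volume_stats_capacity_bytes` for the same `namespace` and `persistentvolumeclaim` labels; the function name is hypothetical.

```python
def volume_usage_ratio(used_bytes, capacity_bytes):
    """Return the used fraction of a volume in the range 0.0 to 1.0 (hypothetical helper).

    Raises ValueError when capacity is not positive, since the ratio
    would be undefined in that case.
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return used_bytes / capacity_bytes
```

The same division applies to `kubelet_volume_stats_inodes_used` and `kubelet_volume_stats_inodes` for inode pressure.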
kubelet_working_pods
-
Number of pods the kubelet is actually running, broken down by lifecycle phase, whether the pod is desired, orphaned, or runtime only (also orphaned), and whether the pod is static. An orphaned pod has been removed from local configuration or force deleted in the API and consumes resources that are not otherwise visible.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • config lifecycle static
-
-
kubeproxy_network_programming_duration_seconds
-
In Cluster Network Programming Latency in seconds
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubeproxy_proxy_healthz_total
-
Cumulative proxy healthz HTTP status
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code
-
-
kubeproxy_proxy_livez_total
-
Cumulative proxy livez HTTP status
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code
-
-
kubeproxy_sync_full_proxy_rules_duration_seconds
-
SyncProxyRules latency in seconds for full resyncs
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubeproxy_sync_partial_proxy_rules_duration_seconds
-
SyncProxyRules latency in seconds for partial resyncs
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubeproxy_sync_proxy_rules_duration_seconds
-
SyncProxyRules latency in seconds
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
kubeproxy_sync_proxy_rules_endpoint_changes_pending
-
Pending proxy rules Endpoint changes
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubeproxy_sync_proxy_rules_endpoint_changes_total
-
Cumulative proxy rules Endpoint changes
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubeproxy_sync_proxy_rules_iptables_last
-
Number of iptables rules written by kube-proxy in last sync
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • table
-
-
kubeproxy_sync_proxy_rules_iptables_partial_restore_failures_total
-
Cumulative proxy iptables partial restore failures
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubeproxy_sync_proxy_rules_iptables_restore_failures_total
-
Cumulative proxy iptables restore failures
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubeproxy_sync_proxy_rules_iptables_total
-
Total number of iptables rules owned by kube-proxy
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • table
-
-
kubeproxy_sync_proxy_rules_last_queued_timestamp_seconds
-
The last time a sync of proxy rules was queued
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubeproxy_sync_proxy_rules_last_timestamp_seconds
-
The last time proxy rules were successfully synced
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubeproxy_sync_proxy_rules_no_local_endpoints_total
-
Number of services with a Local traffic policy and no endpoints
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • traffic_policy
-
-
kubeproxy_sync_proxy_rules_service_changes_pending
-
Pending proxy rules Service changes
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
kubeproxy_sync_proxy_rules_service_changes_total
-
Cumulative proxy rules Service changes
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
kubernetes_build_info
-
A metric with a constant '1' value labeled by major, minor, git version, git commit, git tree state, build date, Go version, and compiler from which Kubernetes was built, and platform on which it is running.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • build_date compiler git_commit git_tree_state git_version go_version major minor platform
-
-
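Because `kubernetes_build_info` always has the value 1, the useful information is in its labels. A minimal sketch of pulling the labels out of one line of Prometheus text exposition output follows. The sample line and its label values are made up for illustration, and the parser is deliberately simple: it leaves backslash escapes inside label values undecoded.

```python
import re

# Matches name="value" pairs; a value may contain backslash-escaped characters.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_labels(sample_line):
    """Return the labels of one exposition-format sample line as a dict."""
    start = sample_line.index("{")
    end = sample_line.rindex("}")
    return dict(LABEL_RE.findall(sample_line[start + 1:end]))


# Illustrative sample only; real values depend on the cluster.
line = (
    'kubernetes_build_info{build_date="2023-01-01T00:00:00Z",compiler="gc",'
    'git_version="v1.28.0",major="1",minor="28",platform="linux/amd64"} 1'
)
labels = parse_labels(line)
```

For production use, a maintained exposition-format parser is a safer choice than a regular expression.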
leader_election_master_status
-
Gauge indicating whether the reporting system is master of the relevant lease: 0 indicates backup, 1 indicates master. 'name' is the string used to identify the lease. Please make sure to group by name.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • name
-
-
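The description of `leader_election_master_status` asks consumers to group by `name`. A minimal sketch of that grouping follows, assuming each sample has already been reduced to a `(lease_name, instance, value)` tuple. The `instance` field stands for whatever identifies the reporting process in your scrape setup and is an assumption here.

```python
from collections import defaultdict


def masters_by_lease(samples):
    """Map each lease name to the list of instances reporting value 1 (master)."""
    masters = defaultdict(list)
    for lease_name, instance, value in samples:
        if value == 1:
            masters[lease_name].append(instance)
    return dict(masters)


# Hypothetical samples: two schedulers, one controller manager.
samples = [
    ("kube-scheduler", "node-a", 1),
    ("kube-scheduler", "node-b", 0),
    ("kube-controller-manager", "node-b", 1),
]
result = masters_by_lease(samples)
```

A lease name that maps to more than one instance, or that is missing from the result, would be worth investigating.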
node_authorizer_graph_actions_duration_seconds
-
Histogram of duration of graph actions in node authorizer.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • operation
-
-
node_collector_unhealthy_nodes_in_zone
-
Gauge measuring number of not Ready Nodes per zone.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • zone
-
-
node_collector_update_all_nodes_health_duration_seconds
-
Duration in seconds for NodeController to update the health of all nodes.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
node_collector_update_node_health_duration_seconds
-
Duration in seconds for NodeController to update the health of a single node.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
node_collector_zone_health
-
Gauge measuring percentage of healthy nodes per zone.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • zone
-
-
node_collector_zone_size
-
Gauge measuring number of registered Nodes per zone.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • zone
-
-
node_controller_cloud_provider_taint_removal_delay_seconds
-
Number of seconds after node creation when NodeController removed the cloud-provider taint of a single node.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
node_controller_initial_node_sync_delay_seconds
-
Number of seconds after node creation when NodeController finished the initial synchronization of a single node.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
node_ipam_controller_cidrset_allocation_tries_per_request
-
Histogram measuring CIDR allocation tries per request.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • clusterCIDR
-
-
node_ipam_controller_cidrset_cidrs_allocations_total
-
Counter measuring total number of CIDR allocations.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • clusterCIDR
-
-
node_ipam_controller_cidrset_cidrs_releases_total
-
Counter measuring total number of CIDR releases.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • clusterCIDR
-
-
node_ipam_controller_cidrset_usage_cidrs
-
Gauge measuring percentage of allocated CIDRs.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • clusterCIDR
-
-
node_ipam_controller_cirdset_max_cidrs
-
Maximum number of CIDRs that can be allocated.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • clusterCIDR
-
-
node_ipam_controller_multicidrset_allocation_tries_per_request
-
Histogram measuring CIDR allocation tries per request.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • clusterCIDR
-
-
node_ipam_controller_multicidrset_cidrs_allocations_total
-
Counter measuring total number of CIDR allocations.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • clusterCIDR
-
-
node_ipam_controller_multicidrset_cidrs_releases_total
-
Counter measuring total number of CIDR releases.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • clusterCIDR
-
-
node_ipam_controller_multicidrset_usage_cidrs
-
Gauge measuring percentage of allocated CIDRs.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • clusterCIDR
-
-
node_ipam_controller_multicirdset_max_cidrs
-
Maximum number of CIDRs that can be allocated.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • clusterCIDR
-
-
node_swap_usage_bytes
-
Current swap usage of the node in bytes. Reported only on non-Windows systems
-
    -
  • ALPHA
  • -
  • Custom
  • -
-
-
number_of_l4_ilbs
-
Number of L4 ILBs
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • feature
-
-
plugin_manager_total_plugins
-
Number of plugins in Plugin Manager
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • socket_path state
-
-
pod_gc_collector_force_delete_pod_errors_total
-
Number of errors encountered when forcefully deleting the pods since the Pod GC Controller started.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • namespace reason
-
-
pod_gc_collector_force_delete_pods_total
-
Number of pods that are being forcefully deleted since the Pod GC Controller started.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • namespace reason
-
-
pod_security_errors_total
-
Number of errors preventing normal evaluation. Non-fatal errors may result in the latest restricted profile being used for evaluation.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • fatal request_operation resource subresource
-
-
pod_security_evaluations_total
-
Number of policy evaluations that occurred, not counting ignored or exempt requests.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • decision mode policy_level policy_version request_operation resource subresource
-
-
pod_security_exemptions_total
-
Number of exempt requests, not counting ignored or out of scope requests.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • request_operation resource subresource
-
-
pod_swap_usage_bytes
-
Current amount of the pod swap usage in bytes. Reported only on non-Windows systems
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • pod namespace
-
-
prober_probe_duration_seconds
-
Duration in seconds for a probe response.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • container namespace pod probe_type
-
-
prober_probe_total
-
Cumulative number of liveness, readiness or startup probes for a container, by result.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • container namespace pod pod_uid probe_type result
-
-
pv_collector_bound_pv_count
-
Gauge measuring number of persistent volumes currently bound
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • storage_class
-
-
pv_collector_bound_pvc_count
-
Gauge measuring number of persistent volume claims currently bound
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • namespace
-
-
pv_collector_total_pv_count
-
Gauge measuring total number of persistent volumes
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • plugin_namevolume_mode
-
-
pv_collector_unbound_pv_count
-
Gauge measuring number of persistent volumes currently unbound
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • storage_class
-
-
pv_collector_unbound_pvc_count
-
Gauge measuring number of persistent volume claims currently unbound
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • namespace
-
-
reconstruct_volume_operations_errors_total
-
The number of volumes that failed reconstruction from the operating system during kubelet startup.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
reconstruct_volume_operations_total
-
The number of volumes that were attempted to be reconstructed from the operating system during kubelet startup. This includes both successful and failed reconstruction.
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
replicaset_controller_sorting_deletion_age_ratio
-
The ratio of the chosen deleted pod's age to the current youngest pod's age (at the time). Should be <2. The intent of this metric is to measure the rough efficacy of the LogarithmicScaleDown feature gate's effect on the sorting (and deletion) of pods when a replicaset scales down. This only considers Ready pods when calculating and reporting.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
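As a rough illustration of the ratio described for `replicaset_controller_sorting_deletion_age_ratio`, the sketch below divides the age of the pod chosen for deletion by the age of the youngest pod. The function name and the timestamps are hypothetical, and the exact computation inside the controller is an assumption based only on the help text.

```python
from datetime import datetime, timedelta


def deletion_age_ratio(deleted_pod_start, youngest_pod_start, now):
    """Age of the pod chosen for deletion divided by the age of the youngest pod.

    Hypothetical helper: it mirrors the help text, not the controller's code.
    """
    deleted_age = (now - deleted_pod_start).total_seconds()
    youngest_age = (now - youngest_pod_start).total_seconds()
    if youngest_age <= 0:
        raise ValueError("youngest pod must have a positive age")
    return deleted_age / youngest_age


now = datetime(2023, 11, 8, 12, 0, 0)
# The deleted pod is 15 minutes old, the youngest pod is 10 minutes old.
ratio = deletion_age_ratio(now - timedelta(minutes=15), now - timedelta(minutes=10), now)
print(ratio)  # 1.5, which is within the "<2" range the help text expects
```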
resourceclaim_controller_create_attempts_total
-
Number of ResourceClaims creation requests
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
resourceclaim_controller_create_failures_total
-
Number of ResourceClaims creation request failures
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
rest_client_dns_resolution_duration_seconds
-
DNS resolver latency in seconds. Broken down by host.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • host
-
-
rest_client_exec_plugin_call_total
-
Number of calls to an exec plugin, partitioned by the type of event encountered (no_error, plugin_execution_error, plugin_not_found_error, client_internal_error) and an optional exit code. The exit code will be set to 0 if and only if the plugin call was successful.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • call_status, code
-
-
rest_client_exec_plugin_certificate_rotation_age
-
Histogram of the number of seconds the last auth exec plugin client certificate lived before being rotated. If auth exec plugin client certificates are unused, histogram will contain no data.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
rest_client_exec_plugin_ttl_seconds
-
Gauge of the shortest TTL (time-to-live) of the client certificate(s) managed by the auth exec plugin. The value is in seconds until certificate expiry (negative if already expired). If auth exec plugins are unused or manage no TLS certificates, the value will be +INF.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
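The value semantics described for `rest_client_exec_plugin_ttl_seconds` can be sketched as follows: take the shortest time until expiry across the managed certificates, allow it to go negative once a certificate has expired, and report positive infinity when there are no certificates. The function name and inputs are hypothetical and only illustrate the help text.

```python
import math
from datetime import datetime, timedelta, timezone


def shortest_ttl_seconds(expiries, now):
    """Return the shortest time-to-live in seconds across certificate expiry times.

    Hypothetical helper: returns +inf when no certificates are managed and a
    negative number when the soonest-expiring certificate is already expired.
    """
    if not expiries:
        return math.inf
    return min((expiry - now).total_seconds() for expiry in expiries)


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(shortest_ttl_seconds([now + timedelta(hours=1), now + timedelta(minutes=5)], now))  # 300.0
print(shortest_ttl_seconds([now - timedelta(seconds=30)], now))  # -30.0
print(shortest_ttl_seconds([], now))  # inf
```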
rest_client_rate_limiter_duration_seconds
-
Client side rate limiter latency in seconds. Broken down by verb, and host.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • host, verb
-
-
rest_client_request_duration_seconds
-
Request latency in seconds. Broken down by verb, and host.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • host, verb
-
-
rest_client_request_retries_total
-
Number of request retries, partitioned by status code, verb, and host.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code, host, verb
-
-
rest_client_request_size_bytes
-
Request size in bytes. Broken down by verb and host.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • host, verb
-
-
rest_client_requests_total
-
Number of HTTP requests, partitioned by status code, method, and host.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code, host, method
-
-
rest_client_response_size_bytes
-
Response size in bytes. Broken down by verb and host.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • host, verb
-
-
rest_client_transport_cache_entries
-
Number of transport entries in the internal cache.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
rest_client_transport_create_calls_total
-
Number of calls to get a new transport, partitioned by the result of the operation: hit (obtained from the cache), miss (created and added to the cache), or uncacheable (created and not cached).
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • result
-
-
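The three `result` values described for `rest_client_transport_create_calls_total` can be pictured with a small cache sketch. The `TransportCache` class, its `get` method, and the `cacheable` flag are hypothetical names used for illustration; they are not the client-go implementation.

```python
from collections import Counter


class TransportCache:
    """Toy cache that counts lookups by result: hit, miss, or uncacheable."""

    def __init__(self):
        self._entries = {}
        self.create_calls = Counter()

    def get(self, config_key, factory, cacheable=True):
        if not cacheable:
            # Created on every call and never stored.
            self.create_calls["uncacheable"] += 1
            return factory()
        if config_key in self._entries:
            # Obtained from the cache.
            self.create_calls["hit"] += 1
            return self._entries[config_key]
        # Created and added to the cache.
        self.create_calls["miss"] += 1
        transport = factory()
        self._entries[config_key] = transport
        return transport


cache = TransportCache()
first = cache.get("cluster-a", object)    # miss
second = cache.get("cluster-a", object)   # hit
third = cache.get("cluster-a", object, cacheable=False)  # uncacheable
print(dict(cache.create_calls))
```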
retroactive_storageclass_errors_total
-
Total number of failed retroactive StorageClass assignments to persistent volume claim
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
retroactive_storageclass_total
-
Total number of retroactive StorageClass assignments to persistent volume claim
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
root_ca_cert_publisher_sync_duration_seconds
-
Number of namespace syncs happened in root ca cert publisher.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • code
-
-
root_ca_cert_publisher_sync_total
-
Number of namespace syncs happened in root ca cert publisher.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • code
-
-
running_managed_controllers
-
Indicates where instances of a controller are currently running
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • manager, name
-
-
scheduler_goroutines
-
Number of running goroutines split by the work they do such as binding.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • operation
-
-
scheduler_permit_wait_duration_seconds
-
Duration of waiting on permit.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • result
-
-
scheduler_plugin_evaluation_total
-
Number of attempts to schedule pods by each plugin and the extension point (available only in PreFilter and Filter).
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • extension_point, plugin, profile
-
-
scheduler_plugin_execution_duration_seconds
-
Duration for running a plugin at a specific extension point.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • extension_point, plugin, status
-
-
scheduler_scheduler_cache_size
-
Number of nodes, pods, and assumed (bound) pods in the scheduler cache.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • type
-
-
scheduler_scheduling_algorithm_duration_seconds
-
Scheduling algorithm latency in seconds
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
scheduler_unschedulable_pods
-
The number of unschedulable pods broken down by plugin name. A pod will increment the gauge for all plugins that caused it to not schedule, so this metric has meaning only when broken down by plugin.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • plugin, profile
-
-
scheduler_volume_binder_cache_requests_total
-
Total number of requests to the volume binding cache
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • operation
-
-
scheduler_volume_scheduling_stage_error_total
-
Volume scheduling stage error count
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • operation
-
-
scrape_error
-
1 if there was an error while getting container metrics, 0 otherwise
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • 1.29.0
-
-
service_controller_loadbalancer_sync_total
-
A metric counting the amount of times any load balancer has been configured, as an effect of service/node changes on the cluster
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
service_controller_nodesync_error_total
-
A metric counting the amount of times any load balancer has been configured and errored, as an effect of node changes on the cluster
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
service_controller_nodesync_latency_seconds
-
A metric measuring the latency for nodesync which updates loadbalancer hosts on cluster node updates.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
service_controller_update_loadbalancer_host_latency_seconds
-
A metric measuring the latency for updating each load balancer's hosts.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
serviceaccount_legacy_auto_token_uses_total
-
Cumulative auto-generated legacy tokens used
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
serviceaccount_legacy_manual_token_uses_total
-
Cumulative manually created legacy tokens used
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
serviceaccount_legacy_tokens_total
-
Cumulative legacy service account tokens used
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
serviceaccount_stale_tokens_total
-
Cumulative stale projected service account tokens used
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
serviceaccount_valid_tokens_total
-
Cumulative valid projected service account tokens used
-
    -
  • ALPHA
  • -
  • Counter
  • -
-
-
storage_count_attachable_volumes_in_use
-
Measure number of volumes in use
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • node, volume_plugin
-
-
storage_operation_duration_seconds
-
Storage operation duration
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • migrated, operation_name, status, volume_plugin
-
-
ttl_after_finished_controller_job_deletion_duration_seconds
-
The time it took to delete the job since it became eligible for deletion
-
    -
  • ALPHA
  • -
  • Histogram
  • -
-
-
volume_manager_selinux_container_errors_total
-
Number of errors when the kubelet cannot compute the SELinux context for a container. The kubelet can't start such a Pod and will retry, so the value of this metric may not represent the actual number of containers.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
volume_manager_selinux_container_warnings_total
-
Number of errors when kubelet cannot compute SELinux context for a container that are ignored. They will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
volume_manager_selinux_pod_context_mismatch_errors_total
-
Number of errors when a Pod defines different SELinux contexts for its containers that use the same volume. The kubelet can't start such a Pod and will retry, so the value of this metric may not represent the actual number of Pods.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
volume_manager_selinux_pod_context_mismatch_warnings_total
-
Number of errors when a Pod defines different SELinux contexts for its containers that use the same volume. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
volume_manager_selinux_volume_context_mismatch_errors_total
-
Number of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. The kubelet can't start such a Pod and will retry, so the value of this metric may not represent the actual number of Pods.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
volume_manager_selinux_volume_context_mismatch_warnings_total
-
Number of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
volume_manager_selinux_volumes_admitted_total
-
Number of volumes whose SELinux context was fine and will be mounted with mount -o context option.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
-
-
volume_manager_total_volumes
-
Number of volumes in Volume Manager
-
    -
  • ALPHA
  • -
  • Custom
  • -
  • plugin_name, state
-
-
volume_operation_total_errors
-
Total volume operation errors
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • operation_name, plugin_name
-
-
volume_operation_total_seconds
-
Storage operation end to end duration in seconds
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • operation_name, plugin_name
-
-
watch_cache_capacity
-
Total capacity of watch cache broken down by resource type.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • resource
-
-
watch_cache_capacity_decrease_total
-
Total number of watch cache capacity decrease events broken down by resource type.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
watch_cache_capacity_increase_total
-
Total number of watch cache capacity increase events broken down by resource type.
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • resource
-
-
workqueue_adds_total
-
Total number of adds handled by workqueue
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • name
-
-
workqueue_depth
-
Current depth of workqueue
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • name
-
-
workqueue_longest_running_processor_seconds
-
How many seconds has the longest running processor for workqueue been running.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • name
-
-
workqueue_queue_duration_seconds
-
How long in seconds an item stays in workqueue before being requested.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • name
-
-
workqueue_retries_total
-
Total number of retries handled by workqueue
-
    -
  • ALPHA
  • -
  • Counter
  • -
  • name
-
-
workqueue_unfinished_work_seconds
-
How many seconds of work has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
-
    -
  • ALPHA
  • -
  • Gauge
  • -
  • name
-
-
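The help text for `workqueue_unfinished_work_seconds` says the number of stuck threads can be deduced from the rate at which the gauge increases. A minimal sketch of that arithmetic follows, assuming each stuck worker adds about one second of unfinished work per second of wall-clock time. The function name and the sample values are hypothetical.

```python
def estimate_stuck_threads(samples):
    """Estimate stuck workers from (timestamp_seconds, gauge_value) samples.

    Assumption: each stuck worker contributes roughly one second of unfinished
    work per elapsed second, so the slope of the gauge approximates the count.
    """
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    elapsed = t_last - t_first
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return (v_last - v_first) / elapsed


# Three scrapes taken 30 seconds apart; the gauge grows by 60 each time.
samples = [(0, 120.0), (30, 180.0), (60, 240.0)]
estimate = estimate_stuck_threads(samples)
print(estimate)  # 2.0, suggesting about two stuck workers
```

A slope near zero suggests work items are completing normally, since finished items stop contributing to the gauge.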
workqueue_work_duration_seconds
-
How long in seconds processing an item from workqueue takes.
-
    -
  • ALPHA
  • -
  • Histogram
  • -
  • name
-
-

Name | Stability Level | Type | Help | Labels | Const Labels | Deprecated Version
aggregator_discovery_aggregation_count_totalALPHACounterCounter of number of times discovery was aggregated
aggregator_openapi_v2_regeneration_countALPHACounterCounter of OpenAPI v2 spec regeneration count broken down by causing APIService name and reason.
apiservice
reason
aggregator_openapi_v2_regeneration_durationALPHAGaugeGauge of OpenAPI v2 spec regeneration duration in seconds.
reason
aggregator_unavailable_apiserviceALPHACustomGauge of APIServices which are marked as unavailable broken down by APIService name.
name
aggregator_unavailable_apiservice_totalALPHACounterCounter of APIServices which are marked as unavailable broken down by APIService name and reason.
name
reason
apiextensions_openapi_v2_regeneration_countALPHACounterCounter of OpenAPI v2 spec regeneration count broken down by causing CRD name and reason.
crd
reason
apiextensions_openapi_v3_regeneration_countALPHACounterCounter of OpenAPI v3 spec regeneration count broken down by group, version, causing CRD and reason.
crd
group
reason
version
apiserver_admission_match_condition_evaluation_errors_totalALPHACounterAdmission match condition evaluation errors count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).
kind
name
operation
type
apiserver_admission_match_condition_evaluation_secondsALPHAHistogramAdmission match condition evaluation time in seconds, identified by name and broken out for each kind containing matchConditions (webhook or policy), operation and type (validate or admit).
kind
name
operation
type
apiserver_admission_match_condition_exclusions_totalALPHACounterAdmission match condition evaluation exclusions count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).
kind
name
operation
type
apiserver_admission_step_admission_duration_seconds_summaryALPHASummaryAdmission sub-step latency summary in seconds, broken out for each operation and API resource and step type (validate or admit).
operation
rejected
type
apiserver_admission_webhook_fail_open_countALPHACounterAdmission webhook fail open count, identified by name and broken out for each admission type (validating or mutating).
name
type
apiserver_admission_webhook_rejection_countALPHACounterAdmission webhook rejection count, identified by name and broken out for each admission type (validating or admit) and operation. Additional labels specify an error type (calling_webhook_error or apiserver_internal_error if an error occurred; no_error otherwise) and optionally a non-zero rejection code if the webhook rejects the request with an HTTP status code (honored by the apiserver when the code is greater or equal to 400). Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
error_type
name
operation
rejection_code
type
apiserver_admission_webhook_request_totalALPHACounterAdmission webhook request total, identified by name and broken out for each admission type (validating or mutating) and operation. Additional labels specify whether the request was rejected or not and an HTTP status code. Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
code
name
operation
rejected
type
apiserver_audit_error_totalALPHACounterCounter of audit events that failed to be audited properly. Plugin identifies the plugin affected by the error.
plugin
apiserver_audit_event_totalALPHACounterCounter of audit events generated and sent to the audit backend.
apiserver_audit_level_totalALPHACounterCounter of policy levels for audit events (1 per request).
level
apiserver_audit_requests_rejected_totalALPHACounterCounter of apiserver requests rejected due to an error in audit logging backend.
apiserver_cache_list_fetched_objects_totalALPHACounterNumber of objects read from watch cache in the course of serving a LIST request
index
resource_prefix
apiserver_cache_list_returned_objects_totalALPHACounterNumber of objects returned for a LIST request from watch cache
resource_prefix
apiserver_cache_list_totalALPHACounterNumber of LIST requests served from watch cache
index
resource_prefix
apiserver_cel_compilation_duration_secondsALPHAHistogramCEL compilation time in seconds.
apiserver_cel_evaluation_duration_secondsALPHAHistogramCEL evaluation time in seconds.
apiserver_certificates_registry_csr_honored_duration_totalALPHACounterTotal number of issued CSRs with a requested duration that was honored, sliced by signer (only kubernetes.io signer names are specifically identified)
signerName
apiserver_certificates_registry_csr_requested_duration_totalALPHACounterTotal number of issued CSRs with a requested duration, sliced by signer (only kubernetes.io signer names are specifically identified)
signerName
apiserver_client_certificate_expiration_secondsALPHAHistogramDistribution of the remaining lifetime on the certificate used to authenticate a request.
apiserver_conversion_webhook_duration_secondsALPHAHistogramConversion webhook request latency
failure_type
result
apiserver_conversion_webhook_request_totalALPHACounterCounter for conversion webhook requests with success/failure and failure error type
failure_type
result
apiserver_crd_conversion_webhook_duration_secondsALPHAHistogramCRD webhook conversion duration in seconds
crd_name
from_version
succeeded
to_version
apiserver_current_inqueue_requestsALPHAGaugeMaximal number of queued requests in this apiserver per request kind in last second.
request_kind
apiserver_delegated_authn_request_duration_secondsALPHAHistogramRequest latency in seconds. Broken down by status code.
code
apiserver_delegated_authn_request_totalALPHACounterNumber of HTTP requests partitioned by status code.
code
apiserver_delegated_authz_request_duration_secondsALPHAHistogramRequest latency in seconds. Broken down by status code.
code
apiserver_delegated_authz_request_totalALPHACounterNumber of HTTP requests partitioned by status code.
code
apiserver_egress_dialer_dial_duration_secondsALPHAHistogramDial latency histogram in seconds, labeled by the protocol (http-connect or grpc), transport (tcp or uds)
protocol
transport
apiserver_egress_dialer_dial_failure_countALPHACounterDial failure count, labeled by the protocol (http-connect or grpc), transport (tcp or uds), and stage (connect or proxy). The stage indicates at which stage the dial failed
protocol
stage
transport
apiserver_egress_dialer_dial_start_totalALPHACounterDial starts, labeled by the protocol (http-connect or grpc) and transport (tcp or uds).
protocol
transport
apiserver_encryption_config_controller_automatic_reload_failures_totalALPHACounterTotal number of failed automatic reloads of encryption configuration.
apiserver_encryption_config_controller_automatic_reload_last_timestamp_secondsALPHAGaugeTimestamp of the last successful or failed automatic reload of encryption configuration.
status
apiserver_encryption_config_controller_automatic_reload_success_totalALPHACounterTotal number of successful automatic reloads of encryption configuration.
apiserver_envelope_encryption_dek_cache_fill_percentALPHAGaugePercent of the cache slots currently occupied by cached DEKs.
apiserver_envelope_encryption_dek_cache_inter_arrival_time_secondsALPHAHistogramTime (in seconds) of inter arrival of transformation requests.
transformation_type
apiserver_envelope_encryption_invalid_key_id_from_status_totalALPHACounterNumber of times an invalid keyID is returned by the Status RPC call split by error.
error
provider_name
apiserver_envelope_encryption_key_id_hash_last_timestamp_secondsALPHAGaugeThe last time in seconds when a keyID was used.
key_id_hash
provider_name
transformation_type
apiserver_envelope_encryption_key_id_hash_status_last_timestamp_secondsALPHAGaugeThe last time in seconds when a keyID was returned by the Status RPC call.
key_id_hash
provider_name
apiserver_envelope_encryption_key_id_hash_totalALPHACounterNumber of times a keyID is used split by transformation type and provider.
key_id_hash
provider_name
transformation_type
apiserver_envelope_encryption_kms_operations_latency_secondsALPHAHistogramKMS operation duration with gRPC error code status total.
grpc_status_code
method_name
provider_name
apiserver_flowcontrol_current_limit_seatsALPHAGaugecurrent derived number of execution seats available to each priority level
priority_level
apiserver_flowcontrol_current_rALPHAGaugeR(time of last change)
priority_level
apiserver_flowcontrol_demand_seatsALPHATimingRatioHistogramObservations, at the end of every nanosecond, of (the number of seats each priority level could use) / (nominal number of seats for that level)
priority_level
apiserver_flowcontrol_demand_seats_averageALPHAGaugeTime-weighted average, over last adjustment period, of demand_seats
priority_level
apiserver_flowcontrol_demand_seats_high_watermarkALPHAGaugeHigh watermark, over last adjustment period, of demand_seats
priority_level
apiserver_flowcontrol_demand_seats_smoothedALPHAGaugeSmoothed seat demands
priority_level
apiserver_flowcontrol_demand_seats_stdevALPHAGaugeTime-weighted standard deviation, over last adjustment period, of demand_seats
priority_level
apiserver_flowcontrol_dispatch_rALPHAGaugeR(time of last dispatch)
priority_level
apiserver_flowcontrol_epoch_advance_totalALPHACounterNumber of times the queueset's progress meter jumped backward
priority_level
success
apiserver_flowcontrol_latest_sALPHAGaugeS(most recently dispatched request)
priority_level
apiserver_flowcontrol_lower_limit_seatsALPHAGaugeConfigured lower bound on number of execution seats available to each priority level
priority_level
apiserver_flowcontrol_next_discounted_s_boundsALPHAGaugemin and max, over queues, of S(oldest waiting request in queue) - estimated work in progress
bound
priority_level
apiserver_flowcontrol_next_s_boundsALPHAGaugemin and max, over queues, of S(oldest waiting request in queue)
bound
priority_level
apiserver_flowcontrol_priority_level_request_utilizationALPHATimingRatioHistogramObservations, at the end of every nanosecond, of number of requests (as a fraction of the relevant limit) waiting or in any stage of execution (but only initial stage for WATCHes)
phase
priority_level
apiserver_flowcontrol_priority_level_seat_utilizationALPHATimingRatioHistogramObservations, at the end of every nanosecond, of utilization of seats for any stage of execution (but only initial stage for WATCHes)
priority_level
phase:executing
apiserver_flowcontrol_read_vs_write_current_requestsALPHATimingRatioHistogramObservations, at the end of every nanosecond, of the number of requests (as a fraction of the relevant limit) waiting or in regular stage of execution
phase
request_kind
apiserver_flowcontrol_request_concurrency_in_useALPHAGaugeConcurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem
flow_schema
priority_level
1.31.0
apiserver_flowcontrol_request_concurrency_limitALPHAGaugeNominal number of execution seats configured for each priority level
priority_level
1.30.0
apiserver_flowcontrol_request_dispatch_no_accommodation_totalALPHACounterNumber of times a dispatch attempt resulted in a non accommodation due to lack of available seats
flow_schema
priority_level
apiserver_flowcontrol_request_execution_secondsALPHAHistogramDuration of initial stage (for a WATCH) or any (for a non-WATCH) stage of request execution in the API Priority and Fairness subsystem
flow_schema
priority_level
type
apiserver_flowcontrol_request_queue_length_after_enqueueALPHAHistogramLength of queue in the API Priority and Fairness subsystem, as seen by each request after it is enqueued
flow_schema
priority_level
apiserver_flowcontrol_seat_fair_fracALPHAGaugeFair fraction of server's concurrency to allocate to each priority level that can use it
apiserver_flowcontrol_target_seatsALPHAGaugeSeat allocation targets
priority_level
apiserver_flowcontrol_upper_limit_seatsALPHAGaugeConfigured upper bound on number of execution seats available to each priority level
priority_level
apiserver_flowcontrol_watch_count_samplesALPHAHistogramcount of watchers for mutating requests in API Priority and Fairness
flow_schema
priority_level
apiserver_flowcontrol_work_estimated_seatsALPHAHistogramNumber of estimated seats (maximum of initial and final seats) associated with requests in API Priority and Fairness
flow_schema
priority_level
apiserver_init_events_totalALPHACounterCounter of init events processed in watch cache broken by resource type.
resource
apiserver_kube_aggregator_x509_insecure_sha1_totalALPHACounterCounts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)
apiserver_kube_aggregator_x509_missing_san_totalALPHACounterCounts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the lack of x509 certificate SAN extension missing (either/or, based on the runtime environment)
apiserver_request_aborts_totalALPHACounterNumber of requests which apiserver aborted possibly due to a timeout, for each group, version, verb, resource, subresource and scope
group
resource
scope
subresource
verb
version
apiserver_request_body_sizesALPHAHistogramApiserver request body sizes broken out by size.
resource
verb
apiserver_request_filter_duration_secondsALPHAHistogramRequest filter latency distribution in seconds, for each filter type
filter
apiserver_request_post_timeout_totalALPHACounterTracks the activity of the request handlers after the associated requests have been timed out by the apiserver
source
status
apiserver_request_sli_duration_secondsALPHAHistogramResponse latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.
component
group
resource
scope
subresource
verb
version
apiserver_request_slo_duration_secondsALPHAHistogramResponse latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.
component
group
resource
scope
subresource
verb
version
1.27.0
apiserver_request_terminations_totalALPHACounterNumber of requests which apiserver terminated in self-defense.
code
component
group
resource
scope
subresource
verb
version
apiserver_request_timestamp_comparison_timeALPHAHistogramTime taken for comparison of old vs new objects in UPDATE or PATCH requests
code_path
apiserver_rerouted_request_totalALPHACounterTotal number of requests that were proxied to a peer kube apiserver because the local apiserver was not capable of serving it
code
apiserver_selfrequest_totalALPHACounterCounter of apiserver self-requests broken out for each verb, API resource and subresource.
resource
subresource
verb
apiserver_storage_data_key_generation_duration_secondsALPHAHistogramLatencies in seconds of data encryption key(DEK) generation operations.
apiserver_storage_data_key_generation_failures_totalALPHACounterTotal number of failed data encryption key(DEK) generation operations.
apiserver_storage_db_total_size_in_bytesALPHAGaugeTotal size of the storage database file physically allocated in bytes.
endpoint
1.28.0
apiserver_storage_decode_errors_totalALPHACounterNumber of stored object decode errors split by object type
resource
apiserver_storage_envelope_transformation_cache_misses_totalALPHACounterTotal number of cache misses while accessing key decryption key(KEK).
apiserver_storage_events_received_totalALPHACounterNumber of etcd events received split by kind.
resource
apiserver_storage_list_evaluated_objects_totalALPHACounterNumber of objects tested in the course of serving a LIST request from storage
resource
apiserver_storage_list_fetched_objects_totalALPHACounterNumber of objects read from storage in the course of serving a LIST request
resource
apiserver_storage_list_returned_objects_totalALPHACounterNumber of objects returned for a LIST request from storage
resource
apiserver_storage_list_totalALPHACounterNumber of LIST requests served from storage
resource
apiserver_storage_size_bytesALPHACustomSize of the storage database file physically allocated in bytes.
cluster
apiserver_storage_transformation_duration_secondsALPHAHistogramLatencies in seconds of value transformation operations.
transformation_type
transformer_prefix
apiserver_storage_transformation_operations_totalALPHACounterTotal number of transformations. Successful transformation will have a status 'OK' and a varied status string when the transformation fails. This status and transformation_type fields may be used for alerting on encryption/decryption failure using transformation_type from_storage for decryption and to_storage for encryption
status
transformation_type
transformer_prefix
apiserver_terminated_watchers_totalALPHACounterCounter of watchers closed due to unresponsiveness broken by resource type.
resource
apiserver_tls_handshake_errors_totalALPHACounterNumber of requests dropped with 'TLS handshake error from' error
apiserver_validating_admission_policy_check_duration_secondsALPHAHistogramValidation admission latency for individual validation expressions in seconds, labeled by policy and further including binding, state and enforcement action taken.
enforcement_action
policy
policy_binding
state
apiserver_validating_admission_policy_check_totalALPHACounterValidation admission policy check total, labeled by policy and further identified by binding, enforcement action taken, and state.
enforcement_action
policy
policy_binding
state
apiserver_validating_admission_policy_definition_totalALPHACounterValidation admission policy count total, labeled by state and enforcement action.
enforcement_action
state
apiserver_watch_cache_events_dispatched_totalALPHACounterCounter of events dispatched in watch cache broken by resource type.
resource
apiserver_watch_cache_events_received_totalALPHACounterCounter of events received in watch cache broken by resource type.
resource
apiserver_watch_cache_initializations_totalALPHACounterCounter of watch cache initializations broken by resource type.
resource
apiserver_watch_events_sizesALPHAHistogramWatch event size distribution in bytes
group
kind
version
apiserver_watch_events_totalALPHACounterNumber of events sent in watch clients
group
kind
version
apiserver_webhooks_x509_insecure_sha1_totalALPHACounterCounts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)
apiserver_webhooks_x509_missing_san_totalALPHACounterCounts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the lack of x509 certificate SAN extension missing (either/or, based on the runtime environment)
attach_detach_controller_attachdetach_controller_forced_detachesALPHACounterNumber of times the A/D Controller performed a forced detach
reason
attachdetach_controller_total_volumesALPHACustomNumber of volumes in A/D Controller
plugin_name
state
authenticated_user_requestsALPHACounterCounter of authenticated requests broken out by username.
username
authentication_attemptsALPHACounterCounter of authenticated attempts.
result
authentication_duration_secondsALPHAHistogramAuthentication duration in seconds broken out by result.
result
authentication_token_cache_active_fetch_countALPHAGauge
status
authentication_token_cache_fetch_totalALPHACounter
status
authentication_token_cache_request_duration_secondsALPHAHistogram
status
authentication_token_cache_request_totalALPHACounter
status
authorization_attempts_totalALPHACounterCounter of authorization attempts broken down by result. It can be either 'allowed', 'denied', 'no-opinion' or 'error'.
result
authorization_duration_secondsALPHAHistogramAuthorization duration in seconds broken out by result.
result
cloud_provider_webhook_request_duration_secondsALPHAHistogramRequest latency in seconds. Broken down by status code.
code
webhook
cloud_provider_webhook_request_totalALPHACounterNumber of HTTP requests partitioned by status code.
code
webhook
cloudprovider_azure_api_request_duration_secondsALPHAHistogramLatency of an Azure API call
request
resource_group
source
subscription_id
cloudprovider_azure_api_request_errorsALPHACounterNumber of errors for an Azure API call
request
resource_group
source
subscription_id
cloudprovider_azure_api_request_ratelimited_countALPHACounterNumber of rate limited Azure API calls
request
resource_group
source
subscription_id
cloudprovider_azure_api_request_throttled_countALPHACounterNumber of throttled Azure API calls
request
resource_group
source
subscription_id
cloudprovider_azure_op_duration_secondsALPHAHistogramLatency of an Azure service operation
request
resource_group
source
subscription_id
cloudprovider_azure_op_failure_countALPHACounterNumber of failed Azure service operations
request
resource_group
source
subscription_id
cloudprovider_gce_api_request_duration_secondsALPHAHistogramLatency of a GCE API call
region
request
version
zone
cloudprovider_gce_api_request_errorsALPHACounterNumber of errors for an API call
region
request
version
zone
cloudprovider_vsphere_api_request_duration_secondsALPHAHistogramLatency of vSphere API call
request
cloudprovider_vsphere_api_request_errorsALPHACountervSphere API errors
request
cloudprovider_vsphere_operation_duration_secondsALPHAHistogramLatency of vSphere operation call
operation
cloudprovider_vsphere_operation_errorsALPHACountervSphere operation errors
operation
cloudprovider_vsphere_vcenter_versionsALPHACustomVersions for connected vSphere vCenters
hostname
version
build
container_cpu_usage_seconds_totalALPHACustomCumulative cpu time consumed by the container in core-seconds
container
pod
namespace
container_memory_working_set_bytesALPHACustomCurrent working set of the container in bytes
container
pod
namespace
container_start_time_secondsALPHACustomStart time of the container since unix epoch in seconds
container
pod
namespace
container_swap_usage_bytesALPHACustomCurrent amount of the container swap usage in bytes. Reported only on non-windows systems
container
pod
namespace
csi_operations_secondsALPHAHistogramContainer Storage Interface operation duration with gRPC error code status total
driver_name
grpc_status_code
method_name
migrated
endpoint_slice_controller_changesALPHACounterNumber of EndpointSlice changes
operation
endpoint_slice_controller_desired_endpoint_slicesALPHAGaugeNumber of EndpointSlices that would exist with perfect endpoint allocation
endpoint_slice_controller_endpoints_added_per_syncALPHAHistogramNumber of endpoints added on each Service sync
endpoint_slice_controller_endpoints_desiredALPHAGaugeNumber of endpoints desired
endpoint_slice_controller_endpoints_removed_per_syncALPHAHistogramNumber of endpoints removed on each Service sync
endpoint_slice_controller_endpointslices_changed_per_syncALPHAHistogramNumber of EndpointSlices changed on each Service sync
topology
endpoint_slice_controller_num_endpoint_slicesALPHAGaugeNumber of EndpointSlices
endpoint_slice_controller_syncsALPHACounterNumber of EndpointSlice syncs
result
endpoint_slice_mirroring_controller_addresses_skipped_per_syncALPHAHistogramNumber of addresses skipped on each Endpoints sync due to being invalid or exceeding MaxEndpointsPerSubset
endpoint_slice_mirroring_controller_changesALPHACounterNumber of EndpointSlice changes
operation
endpoint_slice_mirroring_controller_desired_endpoint_slicesALPHAGaugeNumber of EndpointSlices that would exist with perfect endpoint allocation
endpoint_slice_mirroring_controller_endpoints_added_per_syncALPHAHistogramNumber of endpoints added on each Endpoints sync
endpoint_slice_mirroring_controller_endpoints_desiredALPHAGaugeNumber of endpoints desired
endpoint_slice_mirroring_controller_endpoints_removed_per_syncALPHAHistogramNumber of endpoints removed on each Endpoints sync
endpoint_slice_mirroring_controller_endpoints_sync_durationALPHAHistogramDuration of syncEndpoints() in seconds
endpoint_slice_mirroring_controller_endpoints_updated_per_syncALPHAHistogramNumber of endpoints updated on each Endpoints sync
endpoint_slice_mirroring_controller_num_endpoint_slicesALPHAGaugeNumber of EndpointSlices
ephemeral_volume_controller_create_failures_totalALPHACounterNumber of PersistentVolumeClaims creation request failures
ephemeral_volume_controller_create_totalALPHACounterNumber of PersistentVolumeClaims creation requests
etcd_bookmark_countsALPHAGaugeNumber of etcd bookmarks (progress notify events) split by kind.
resource
etcd_lease_object_countsALPHAHistogramNumber of objects attached to a single etcd lease.
etcd_request_duration_secondsALPHAHistogramEtcd request latency in seconds for each operation and object type.
operation
type
etcd_request_errors_totalALPHACounterEtcd failed request counts for each operation and object type.
operation
type
etcd_requests_totalALPHACounterEtcd request counts for each operation and object type.
operation
type
etcd_version_infoALPHAGaugeEtcd server's binary version
binary_version
field_validation_request_duration_secondsALPHAHistogramResponse latency distribution in seconds for each field validation value
field_validation
force_cleaned_failed_volume_operation_errors_totalALPHACounterThe number of volumes that failed force cleanup after their reconstruction failed during kubelet startup.
force_cleaned_failed_volume_operations_totalALPHACounterThe number of volumes that were force cleaned after their reconstruction failed during kubelet startup. This includes both successful and failed cleanups.
garbagecollector_controller_resources_sync_error_totalALPHACounterNumber of garbage collector resources sync errors
get_token_countALPHACounterCounter of total Token() requests to the alternate token source
get_token_fail_countALPHACounterCounter of failed Token() requests to the alternate token source
horizontal_pod_autoscaler_controller_metric_computation_duration_secondsALPHAHistogramThe time (in seconds) that the HPA controller takes to calculate one metric. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. The label 'error' should be either 'spec', 'internal', or 'none'. The label 'metric_type' corresponds to HPA.spec.metrics[*].type
action
error
metric_type
horizontal_pod_autoscaler_controller_metric_computation_totalALPHACounterNumber of metric computations. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. The label 'metric_type' corresponds to HPA.spec.metrics[*].type
action
error
metric_type
horizontal_pod_autoscaler_controller_reconciliation_duration_secondsALPHAHistogramThe time (in seconds) that the HPA controller takes to reconcile once. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. Note that if both spec and internal errors happen during a reconciliation, the first one to occur is reported in `error` label.
action
error
horizontal_pod_autoscaler_controller_reconciliations_totalALPHACounterNumber of reconciliations of HPA controller. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. Note that if both spec and internal errors happen during a reconciliation, the first one to occur is reported in `error` label.
action
error
job_controller_pod_failures_handled_by_failure_policy_totalALPHACounterThe number of failed Pods handled by failure policy with respect to the failure policy action applied based on the matched rule. Possible values of the action label correspond to the possible values for the failure policy rule action, which are: "FailJob", "Ignore" and "Count".
action
job_controller_terminated_pods_tracking_finalizer_totalALPHACounterThe number of terminated pods (phase=Failed|Succeeded) that have the finalizer batch.kubernetes.io/job-tracking. The event label can be "add" or "delete".
event
kube_apiserver_clusterip_allocator_allocated_ipsALPHAGaugeGauge measuring the number of allocated IPs for Services
cidr
kube_apiserver_clusterip_allocator_allocation_errors_totalALPHACounterNumber of errors trying to allocate Cluster IPs
cidr
scope
kube_apiserver_clusterip_allocator_allocation_totalALPHACounterNumber of Cluster IPs allocations
cidr
scope
kube_apiserver_clusterip_allocator_available_ipsALPHAGaugeGauge measuring the number of available IPs for Services
cidr
kube_apiserver_nodeport_allocator_allocated_portsALPHAGaugeGauge measuring the number of allocated NodePorts for Services
kube_apiserver_nodeport_allocator_available_portsALPHAGaugeGauge measuring the number of available NodePorts for Services
kube_apiserver_pod_logs_backend_tls_failure_totalALPHACounterTotal number of requests for pods/logs that failed due to kubelet server TLS verification
kube_apiserver_pod_logs_insecure_backend_totalALPHACounterTotal number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied
usage
kube_apiserver_pod_logs_pods_logs_backend_tls_failure_totalALPHACounterTotal number of requests for pods/logs that failed due to kubelet server TLS verification
1.27.0
kube_apiserver_pod_logs_pods_logs_insecure_backend_totalALPHACounterTotal number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied
usage
1.27.0
kubelet_active_podsALPHAGaugeThe number of pods the kubelet considers active and which are being considered when admitting new pods. static is true if the pod is not from the apiserver.
static
kubelet_certificate_manager_client_expiration_renew_errorsALPHACounterCounter of certificate renewal errors.
kubelet_certificate_manager_client_ttl_secondsALPHAGaugeGauge of the TTL (time-to-live) of the Kubelet's client certificate. The value is in seconds until certificate expiry (negative if already expired). If client certificate is invalid or unused, the value will be +INF.
kubelet_certificate_manager_server_rotation_secondsALPHAHistogramHistogram of the number of seconds the previous certificate lived before being rotated.
kubelet_certificate_manager_server_ttl_secondsALPHAGaugeGauge of the shortest TTL (time-to-live) of the Kubelet's serving certificate. The value is in seconds until certificate expiry (negative if already expired). If serving certificate is invalid or unused, the value will be +INF.
kubelet_cgroup_manager_duration_secondsALPHAHistogramDuration in seconds for cgroup manager operations. Broken down by method.
operation_type
kubelet_container_log_filesystem_used_bytesALPHACustomBytes used by the container's logs on the filesystem.
uid
namespace
pod
container
kubelet_containers_per_pod_countALPHAHistogramThe number of containers per pod.
kubelet_cpu_manager_pinning_errors_totalALPHACounterThe number of cpu core allocations which required pinning and failed.
kubelet_cpu_manager_pinning_requests_totalALPHACounterThe number of cpu core allocations which required pinning.
kubelet_credential_provider_plugin_durationALPHAHistogramDuration of execution in seconds for credential provider plugin
plugin_name
kubelet_credential_provider_plugin_errorsALPHACounterNumber of errors from credential provider plugin
plugin_name
kubelet_desired_podsALPHAGaugeThe number of pods the kubelet is being instructed to run. static is true if the pod is not from the apiserver.
static
kubelet_device_plugin_alloc_duration_secondsALPHAHistogramDuration in seconds to serve a device plugin Allocation request. Broken down by resource name.
resource_name
kubelet_device_plugin_registration_totalALPHACounterCumulative number of device plugin registrations. Broken down by resource name.
resource_name
kubelet_evented_pleg_connection_error_countALPHACounterThe number of errors encountered during the establishment of streaming connection with the CRI runtime.
kubelet_evented_pleg_connection_latency_secondsALPHAHistogramThe latency of streaming connection with the CRI runtime, measured in seconds.
kubelet_evented_pleg_connection_success_countALPHACounterThe number of times a streaming client was obtained to receive CRI Events.
kubelet_eviction_stats_age_secondsALPHAHistogramTime between when stats are collected, and when pod is evicted based on those stats by eviction signal
eviction_signal
kubelet_evictionsALPHACounterCumulative number of pod evictions by eviction signal
eviction_signal
kubelet_graceful_shutdown_end_time_secondsALPHAGaugeLast graceful shutdown end time since unix epoch in seconds
kubelet_graceful_shutdown_start_time_secondsALPHAGaugeLast graceful shutdown start time since unix epoch in seconds
kubelet_http_inflight_requestsALPHAGaugeNumber of the inflight http requests
long_running
method
path
server_type
kubelet_http_requests_duration_secondsALPHAHistogramDuration in seconds to serve http requests
long_running
method
path
server_type
kubelet_http_requests_totalALPHACounterNumber of the http requests received since the server started
long_running
method
path
server_type
kubelet_lifecycle_handler_http_fallbacks_totalALPHACounterThe number of times lifecycle handlers successfully fell back to http from https.
kubelet_managed_ephemeral_containersALPHAGaugeCurrent number of ephemeral containers in pods managed by this kubelet.
kubelet_mirror_podsALPHAGaugeThe number of mirror pods the kubelet will try to create (one per admitted static pod)
kubelet_node_nameALPHAGaugeThe node's name. The count is always 1.
node
kubelet_orphan_pod_cleaned_volumesALPHAGaugeThe total number of orphaned Pods whose volumes were cleaned in the last periodic sweep.
kubelet_orphan_pod_cleaned_volumes_errorsALPHAGaugeThe number of orphaned Pods whose volumes failed to be cleaned in the last periodic sweep.
kubelet_orphaned_runtime_pods_totalALPHACounterNumber of pods that have been detected in the container runtime without being already known to the pod worker. This typically indicates the kubelet was restarted while a pod was force deleted in the API or in the local configuration, which is unusual.
kubelet_pleg_discard_eventsALPHACounterThe number of discard events in PLEG.
kubelet_pleg_last_seen_secondsALPHAGaugeTimestamp in seconds when PLEG was last seen active.
kubelet_pleg_relist_duration_secondsALPHAHistogramDuration in seconds for relisting pods in PLEG.
kubelet_pleg_relist_interval_secondsALPHAHistogramInterval in seconds between relisting in PLEG.
kubelet_pod_resources_endpoint_errors_getALPHACounterNumber of requests to the PodResource Get endpoint which returned error. Broken down by server api version.
server_api_version
kubelet_pod_resources_endpoint_errors_get_allocatableALPHACounterNumber of requests to the PodResource GetAllocatableResources endpoint which returned error. Broken down by server api version.
server_api_version
kubelet_pod_resources_endpoint_errors_listALPHACounterNumber of requests to the PodResource List endpoint which returned error. Broken down by server api version.
server_api_version
kubelet_pod_resources_endpoint_requests_getALPHACounterNumber of requests to the PodResource Get endpoint. Broken down by server api version.
server_api_version
kubelet_pod_resources_endpoint_requests_get_allocatableALPHACounterNumber of requests to the PodResource GetAllocatableResources endpoint. Broken down by server api version.
server_api_version
kubelet_pod_resources_endpoint_requests_listALPHACounterNumber of requests to the PodResource List endpoint. Broken down by server api version.
server_api_version
kubelet_pod_resources_endpoint_requests_totalALPHACounterCumulative number of requests to the PodResource endpoint. Broken down by server api version.
server_api_version
kubelet_pod_start_duration_secondsALPHAHistogramDuration in seconds from kubelet seeing a pod for the first time to the pod starting to run
kubelet_pod_start_sli_duration_secondsALPHAHistogramDuration in seconds to start a pod, excluding time to pull images and run init containers, measured from pod creation timestamp to when all its containers are reported as started and observed via watch
kubelet_pod_status_sync_duration_secondsALPHAHistogramDuration in seconds to sync a pod status update. Measures time from detection of a change to pod status until the API is successfully updated for that pod, even if multiple intervening changes to pod status occur.
kubelet_pod_worker_duration_secondsALPHAHistogramDuration in seconds to sync a single pod. Broken down by operation type: create, update, or sync
operation_type
kubelet_pod_worker_start_duration_secondsALPHAHistogramDuration in seconds from kubelet seeing a pod to starting a worker.
kubelet_preemptionsALPHACounterCumulative number of pod preemptions by preemption resource
preemption_signal
kubelet_restarted_pods_totalALPHACounterNumber of pods that have been restarted because they were deleted and recreated with the same UID while the kubelet was watching them (common for static pods, extremely uncommon for API pods)
static
kubelet_run_podsandbox_duration_secondsALPHAHistogramDuration in seconds of the run_podsandbox operations. Broken down by RuntimeClass.Handler.
runtime_handler
kubelet_run_podsandbox_errors_totalALPHACounterCumulative number of the run_podsandbox operation errors by RuntimeClass.Handler.
runtime_handler
kubelet_running_containersALPHAGaugeNumber of containers currently running
container_state
kubelet_running_podsALPHAGaugeNumber of pods that have a running pod sandbox
kubelet_runtime_operations_duration_secondsALPHAHistogramDuration in seconds of runtime operations. Broken down by operation type.
operation_type
kubelet_runtime_operations_errors_totalALPHACounterCumulative number of runtime operation errors by operation type.
operation_type
kubelet_runtime_operations_totalALPHACounterCumulative number of runtime operations by operation type.
operation_type
kubelet_server_expiration_renew_errorsALPHACounterCounter of certificate renewal errors.
kubelet_started_containers_errors_totalALPHACounterCumulative number of errors when starting containers
code
container_type
kubelet_started_containers_totalALPHACounterCumulative number of containers started
container_type
kubelet_started_host_process_containers_errors_totalALPHACounterCumulative number of errors when starting hostprocess containers. This metric will only be collected on Windows.
code
container_type
kubelet_started_host_process_containers_totalALPHACounterCumulative number of hostprocess containers started. This metric will only be collected on Windows.
container_type
kubelet_started_pods_errors_totalALPHACounterCumulative number of errors when starting pods
kubelet_started_pods_totalALPHACounterCumulative number of pods started
kubelet_topology_manager_admission_duration_msALPHAHistogramDuration in milliseconds to serve a pod admission request.
kubelet_topology_manager_admission_errors_totalALPHACounterThe number of admission request failures where resources could not be aligned.
kubelet_topology_manager_admission_requests_totalALPHACounterThe number of admission requests where resources have to be aligned.
kubelet_volume_metric_collection_duration_secondsALPHAHistogramDuration in seconds to calculate volume stats
metric_source
kubelet_volume_stats_available_bytesALPHACustomNumber of available bytes in the volume
namespace
persistentvolumeclaim
kubelet_volume_stats_capacity_bytesALPHACustomCapacity in bytes of the volume
namespace
persistentvolumeclaim
kubelet_volume_stats_health_status_abnormalALPHACustomAbnormal volume health status. The count is either 1 or 0. 1 indicates the volume is unhealthy, 0 indicates volume is healthy
namespace
persistentvolumeclaim
kubelet_volume_stats_inodesALPHACustomMaximum number of inodes in the volume
namespace
persistentvolumeclaim
kubelet_volume_stats_inodes_freeALPHACustomNumber of free inodes in the volume
namespace
persistentvolumeclaim
kubelet_volume_stats_inodes_usedALPHACustomNumber of used inodes in the volume
namespace
persistentvolumeclaim
kubelet_volume_stats_used_bytesALPHACustomNumber of used bytes in the volume
namespace
persistentvolumeclaim
kubelet_working_podsALPHAGaugeNumber of pods the kubelet is actually running, broken down by lifecycle phase, whether the pod is desired, orphaned, or runtime only (also orphaned), and whether the pod is static. An orphaned pod has been removed from local configuration or force deleted in the API and consumes resources that are not otherwise visible.
config
lifecycle
static
kubeproxy_network_programming_duration_secondsALPHAHistogramIn Cluster Network Programming Latency in seconds
kubeproxy_proxy_healthz_totalALPHACounterCumulative proxy healthz HTTP status
code
kubeproxy_proxy_livez_totalALPHACounterCumulative proxy livez HTTP status
code
kubeproxy_sync_full_proxy_rules_duration_secondsALPHAHistogramSyncProxyRules latency in seconds for full resyncs
kubeproxy_sync_partial_proxy_rules_duration_secondsALPHAHistogramSyncProxyRules latency in seconds for partial resyncs
kubeproxy_sync_proxy_rules_duration_secondsALPHAHistogramSyncProxyRules latency in seconds
kubeproxy_sync_proxy_rules_endpoint_changes_pendingALPHAGaugePending proxy rules Endpoint changes
kubeproxy_sync_proxy_rules_endpoint_changes_totalALPHACounterCumulative proxy rules Endpoint changes
kubeproxy_sync_proxy_rules_iptables_lastALPHAGaugeNumber of iptables rules written by kube-proxy in last sync
table
kubeproxy_sync_proxy_rules_iptables_partial_restore_failures_totalALPHACounterCumulative proxy iptables partial restore failures
kubeproxy_sync_proxy_rules_iptables_restore_failures_totalALPHACounterCumulative proxy iptables restore failures
kubeproxy_sync_proxy_rules_iptables_totalALPHAGaugeTotal number of iptables rules owned by kube-proxy
table
kubeproxy_sync_proxy_rules_last_queued_timestamp_secondsALPHAGaugeThe last time a sync of proxy rules was queued
kubeproxy_sync_proxy_rules_last_timestamp_secondsALPHAGaugeThe last time proxy rules were successfully synced
kubeproxy_sync_proxy_rules_no_local_endpoints_totalALPHAGaugeNumber of services with a Local traffic policy and no endpoints
traffic_policy
kubeproxy_sync_proxy_rules_service_changes_pendingALPHAGaugePending proxy rules Service changes
kubeproxy_sync_proxy_rules_service_changes_totalALPHACounterCumulative proxy rules Service changes
kubernetes_build_infoALPHAGaugeA metric with a constant '1' value labeled by major, minor, git version, git commit, git tree state, build date, Go version, and compiler from which Kubernetes was built, and platform on which it is running.
build_date
compiler
git_commit
git_tree_state
git_version
go_version
major
minor
platform
leader_election_master_statusALPHAGaugeGauge of if the reporting system is master of the relevant lease, 0 indicates backup, 1 indicates master. 'name' is the string used to identify the lease. Please make sure to group by name.
name
node_authorizer_graph_actions_duration_secondsALPHAHistogramHistogram of duration of graph actions in node authorizer.
operation
node_collector_unhealthy_nodes_in_zoneALPHAGaugeGauge measuring number of not Ready Nodes per zone.
zone
node_collector_update_all_nodes_health_duration_secondsALPHAHistogramDuration in seconds for NodeController to update the health of all nodes.
node_collector_update_node_health_duration_secondsALPHAHistogramDuration in seconds for NodeController to update the health of a single node.
node_collector_zone_healthALPHAGaugeGauge measuring percentage of healthy nodes per zone.
zone
node_collector_zone_sizeALPHAGaugeGauge measuring number of registered Nodes per zone.
zone
node_controller_cloud_provider_taint_removal_delay_secondsALPHAHistogramNumber of seconds after node creation when NodeController removed the cloud-provider taint of a single node.
node_controller_initial_node_sync_delay_secondsALPHAHistogramNumber of seconds after node creation when NodeController finished the initial synchronization of a single node.
node_cpu_usage_seconds_totalALPHACustomCumulative cpu time consumed by the node in core-seconds
node_ipam_controller_cidrset_allocation_tries_per_requestALPHAHistogramHistogram measuring CIDR allocation tries per request.
clusterCIDR
node_ipam_controller_cidrset_cidrs_allocations_totalALPHACounterCounter measuring total number of CIDR allocations.
clusterCIDR
node_ipam_controller_cidrset_cidrs_releases_totalALPHACounterCounter measuring total number of CIDR releases.
clusterCIDR
node_ipam_controller_cidrset_usage_cidrsALPHAGaugeGauge measuring percentage of allocated CIDRs.
clusterCIDR
node_ipam_controller_cirdset_max_cidrsALPHAGaugeMaximum number of CIDRs that can be allocated.
clusterCIDR
node_ipam_controller_multicidrset_allocation_tries_per_requestALPHAHistogramHistogram measuring CIDR allocation tries per request.
clusterCIDR
node_ipam_controller_multicidrset_cidrs_allocations_totalALPHACounterCounter measuring total number of CIDR allocations.
clusterCIDR
node_ipam_controller_multicidrset_cidrs_releases_totalALPHACounterCounter measuring total number of CIDR releases.
clusterCIDR
node_ipam_controller_multicidrset_usage_cidrsALPHAGaugeGauge measuring percentage of allocated CIDRs.
clusterCIDR
node_ipam_controller_multicirdset_max_cidrsALPHAGaugeMaximum number of CIDRs that can be allocated.
clusterCIDR
node_memory_working_set_bytesALPHACustomCurrent working set of the node in bytes
node_swap_usage_bytesALPHACustomCurrent swap usage of the node in bytes. Reported only on non-windows systems
number_of_l4_ilbsALPHAGaugeNumber of L4 ILBs
feature
plugin_manager_total_pluginsALPHACustomNumber of plugins in Plugin Manager
socket_path
state
pod_cpu_usage_seconds_totalALPHACustomCumulative cpu time consumed by the pod in core-seconds
pod
namespace
pod_gc_collector_force_delete_pod_errors_totalALPHACounterNumber of errors encountered when forcefully deleting the pods since the Pod GC Controller started.
namespace
reason
pod_gc_collector_force_delete_pods_totalALPHACounterNumber of pods that are being forcefully deleted since the Pod GC Controller started.
namespace
reason
pod_memory_working_set_bytesALPHACustomCurrent working set of the pod in bytes
pod
namespace
pod_security_errors_totalALPHACounterNumber of errors preventing normal evaluation. Non-fatal errors may result in the latest restricted profile being used for enforcement.
fatal
request_operation
resource
subresource
pod_security_evaluations_totalALPHACounterNumber of policy evaluations that occurred, not counting ignored or exempt requests.
decision
mode
policy_level
policy_version
request_operation
resource
subresource
pod_security_exemptions_totalALPHACounterNumber of exempt requests, not counting ignored or out of scope requests.
request_operation
resource
subresource
pod_swap_usage_bytesALPHACustomCurrent amount of the pod swap usage in bytes. Reported only on non-windows systems
pod
namespace
prober_probe_duration_secondsALPHAHistogramDuration in seconds for a probe response.
container
namespace
pod
probe_type
prober_probe_totalALPHACounterCumulative number of a liveness, readiness or startup probe for a container by result.
container
namespace
pod
pod_uid
probe_type
result
pv_collector_bound_pv_countALPHACustomGauge measuring number of persistent volumes currently bound
storage_class
pv_collector_bound_pvc_countALPHACustomGauge measuring number of persistent volume claims currently bound
namespace
pv_collector_total_pv_countALPHACustomGauge measuring total number of persistent volumes
plugin_name
volume_mode
pv_collector_unbound_pv_countALPHACustomGauge measuring number of persistent volumes currently unbound
storage_class
pv_collector_unbound_pvc_countALPHACustomGauge measuring number of persistent volume claims currently unbound
namespace
reconstruct_volume_operations_errors_totalALPHACounterThe number of volumes that failed reconstruction from the operating system during kubelet startup.
reconstruct_volume_operations_totalALPHACounterThe number of volumes that were attempted to be reconstructed from the operating system during kubelet startup. This includes both successful and failed reconstruction.
replicaset_controller_sorting_deletion_age_ratioALPHAHistogramThe ratio of chosen deleted pod's ages to the current youngest pod's age (at the time). Should be <2. The intent of this metric is to measure the rough efficacy of the LogarithmicScaleDown feature gate's effect on the sorting (and deletion) of pods when a replicaset scales down. This only considers Ready pods when calculating and reporting.
resourceclaim_controller_create_attempts_totalALPHACounterNumber of ResourceClaims creation requests
resourceclaim_controller_create_failures_totalALPHACounterNumber of ResourceClaims creation request failures
rest_client_dns_resolution_duration_secondsALPHAHistogramDNS resolver latency in seconds. Broken down by host.
host
rest_client_exec_plugin_call_totalALPHACounterNumber of calls to an exec plugin, partitioned by the type of event encountered (no_error, plugin_execution_error, plugin_not_found_error, client_internal_error) and an optional exit code. The exit code will be set to 0 if and only if the plugin call was successful.
call_status
code
rest_client_exec_plugin_certificate_rotation_ageALPHAHistogramHistogram of the number of seconds the last auth exec plugin client certificate lived before being rotated. If auth exec plugin client certificates are unused, histogram will contain no data.
rest_client_exec_plugin_ttl_secondsALPHAGaugeGauge of the shortest TTL (time-to-live) of the client certificate(s) managed by the auth exec plugin. The value is in seconds until certificate expiry (negative if already expired). If auth exec plugins are unused or manage no TLS certificates, the value will be +INF.
rest_client_rate_limiter_duration_secondsALPHAHistogramClient side rate limiter latency in seconds. Broken down by verb, and host.
host
verb
rest_client_request_duration_secondsALPHAHistogramRequest latency in seconds. Broken down by verb, and host.
host
verb
rest_client_request_retries_totalALPHACounterNumber of request retries, partitioned by status code, verb, and host.
code
host
verb
rest_client_request_size_bytesALPHAHistogramRequest size in bytes. Broken down by verb and host.
host
verb
rest_client_requests_totalALPHACounterNumber of HTTP requests, partitioned by status code, method, and host.
code
host
method
rest_client_response_size_bytesALPHAHistogramResponse size in bytes. Broken down by verb and host.
host
verb
rest_client_transport_cache_entriesALPHAGaugeNumber of transport entries in the internal cache.
rest_client_transport_create_calls_totalALPHACounterNumber of calls to get a new transport, partitioned by the result of the operation: hit (obtained from the cache), miss (created and added to the cache), uncacheable (created and not cached)
result
retroactive_storageclass_errors_totalALPHACounterTotal number of failed retroactive StorageClass assignments to persistent volume claim
retroactive_storageclass_totalALPHACounterTotal number of retroactive StorageClass assignments to persistent volume claim
root_ca_cert_publisher_sync_duration_secondsALPHAHistogramDuration in seconds of namespace syncs in root ca cert publisher.
code
root_ca_cert_publisher_sync_totalALPHACounterNumber of namespace syncs that happened in root ca cert publisher.
code
running_managed_controllersALPHAGaugeIndicates where instances of a controller are currently running
manager
name
scheduler_goroutinesALPHAGaugeNumber of running goroutines split by the work they do such as binding.
operation
scheduler_permit_wait_duration_secondsALPHAHistogramDuration of waiting on permit.
result
scheduler_plugin_evaluation_totalALPHACounterNumber of attempts to schedule pods by each plugin and the extension point (available only in PreFilter and Filter.).
extension_point
plugin
profile
scheduler_plugin_execution_duration_secondsALPHAHistogramDuration for running a plugin at a specific extension point.
extension_point
plugin
status
scheduler_scheduler_cache_sizeALPHAGaugeNumber of nodes, pods, and assumed (bound) pods in the scheduler cache.
type
scheduler_scheduling_algorithm_duration_secondsALPHAHistogramScheduling algorithm latency in seconds
scheduler_unschedulable_podsALPHAGaugeThe number of unschedulable pods broken down by plugin name. A pod will increment the gauge for all plugins that caused it to not schedule and so this metric have meaning only when broken down by plugin.
plugin
profile
scheduler_volume_binder_cache_requests_totalALPHACounterTotal number for request volume binding cache
operation
scheduler_volume_scheduling_stage_error_totalALPHACounterVolume scheduling stage error count
operation
scrape_errorALPHACustom1 if there was an error while getting container metrics, 0 otherwise
service_controller_loadbalancer_sync_totalALPHACounterA metric counting the amount of times any load balancer has been configured, as an effect of service/node changes on the cluster
service_controller_nodesync_error_totalALPHACounterA metric counting the amount of times any load balancer has been configured and errored, as an effect of node changes on the cluster
service_controller_nodesync_latency_secondsALPHAHistogramA metric measuring the latency for nodesync which updates loadbalancer hosts on cluster node updates.
service_controller_update_loadbalancer_host_latency_secondsALPHAHistogramA metric measuring the latency for updating each load balancer hosts.
serviceaccount_legacy_tokens_totalALPHACounterCumulative legacy service account tokens used
serviceaccount_stale_tokens_totalALPHACounterCumulative stale projected service account tokens used
serviceaccount_valid_tokens_totalALPHACounterCumulative valid projected service account tokens used
storage_count_attachable_volumes_in_useALPHACustomMeasure number of volumes in use
node
volume_plugin
storage_operation_duration_secondsALPHAHistogramStorage operation duration
migrated
operation_name
status
volume_plugin
ttl_after_finished_controller_job_deletion_duration_secondsALPHAHistogramThe time it took to delete the job since it became eligible for deletion
volume_manager_selinux_container_errors_totalALPHAGaugeNumber of errors when kubelet cannot compute SELinux context for a container. Kubelet can't start such a Pod then and it will retry, therefore value of this metric may not represent the actual nr. of containers.
volume_manager_selinux_container_warnings_totalALPHAGaugeNumber of errors when kubelet cannot compute SELinux context for a container that are ignored. They will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
volume_manager_selinux_pod_context_mismatch_errors_totalALPHAGaugeNumber of errors when a Pod defines different SELinux contexts for its containers that use the same volume. Kubelet can't start such a Pod then and it will retry, therefore value of this metric may not represent the actual nr. of Pods.
volume_manager_selinux_pod_context_mismatch_warnings_totalALPHAGaugeNumber of errors when a Pod defines different SELinux contexts for its containers that use the same volume. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
volume_manager_selinux_volume_context_mismatch_errors_totalALPHAGaugeNumber of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. Kubelet can't start such a Pod then and it will retry, therefore value of this metric may not represent the actual nr. of Pods.
volume_manager_selinux_volume_context_mismatch_warnings_totalALPHAGaugeNumber of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
volume_manager_selinux_volumes_admitted_totalALPHAGaugeNumber of volumes whose SELinux context was fine and will be mounted with mount -o context option.
volume_manager_total_volumesALPHACustomNumber of volumes in Volume Manager
plugin_name
state
volume_operation_total_errorsALPHACounterTotal volume operation errors
operation_name
plugin_name
volume_operation_total_secondsALPHAHistogramStorage operation end to end duration in seconds
operation_name
plugin_name
watch_cache_capacityALPHAGaugeTotal capacity of watch cache broken by resource type.
resource
watch_cache_capacity_decrease_totalALPHACounterTotal number of watch cache capacity decrease events broken by resource type.
resource
watch_cache_capacity_increase_totalALPHACounterTotal number of watch cache capacity increase events broken by resource type.
resource
workqueue_adds_totalALPHACounterTotal number of adds handled by workqueue
name
workqueue_depthALPHAGaugeCurrent depth of workqueue
name
workqueue_longest_running_processor_secondsALPHAGaugeHow many seconds has the longest running processor for workqueue been running.
name
workqueue_queue_duration_secondsALPHAHistogramHow long in seconds an item stays in workqueue before being requested.
name
workqueue_retries_totalALPHACounterTotal number of retries handled by workqueue
name
workqueue_unfinished_work_secondsALPHAGaugeHow many seconds of work has done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
name
workqueue_work_duration_secondsALPHAHistogramHow long in seconds processing an item from workqueue takes.
name