Commit
Add Monitoring of control plane components
Showing 2 changed files with 98 additions and 0 deletions.

97 changes: 97 additions & 0 deletions
content/en/docs/concepts/cluster-administration/monitoring.md
@@ -0,0 +1,97 @@
---
title: Monitoring Control Plane Components
reviewers:
- brancz
- logicalhan
- RainbowMango
content_template: templates/concept
weight: 60
---

{{% capture overview %}}

Metrics from system components give you a better view of what is happening inside them. Metrics are particularly useful for building dashboards and alerts.

{{% /capture %}}

{{% capture body %}}

## Metrics in Kubernetes

Kubernetes control plane components use the Prometheus client by default to expose metrics. In most cases, those metrics are available on the `/metrics` endpoint of the component's HTTP server. For components that do not expose the endpoint by default, it can be enabled using the `--bind-address` flag.

Examples of those components:
* kube-controller-manager
* kube-proxy
* kube-apiserver
* kube-scheduler
* kubelet (metrics exposed on `/metrics/cadvisor` and `/metrics/resource` do not have the same lifecycle guarantees)

In clusters with RBAC enabled, accessing metrics requires authorization through a service account bound to a `ClusterRole` that allows access to the `/metrics` URL.

```
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get
```

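As a rough illustration of how a client might use such a role, the sketch below builds an authenticated request for a component's `/metrics` endpoint. It assumes the client runs in a pod whose service account is bound to the `ClusterRole` above through a `ClusterRoleBinding`, and that the token is mounted at the conventional in-cluster path. The helper names `read_token` and `build_metrics_request` are hypothetical, and TLS verification against the cluster CA is left out.

```python
import urllib.request

# Conventional mount path of a pod's service account token (assumed to be present).
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def read_token(path: str = TOKEN_PATH) -> str:
    """Read the service account token from disk (hypothetical helper)."""
    with open(path) as token_file:
        return token_file.read()


def build_metrics_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for <base_url>/metrics that carries a bearer token."""
    url = base_url.rstrip("/") + "/metrics"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token.strip()}"},
        method="GET",
    )
```

Passing the returned request to `urllib.request.urlopen` together with an SSL context that trusts the cluster CA would then return the metrics as text.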

## Metric Lifecycle

Alpha metric -> Stable metric -> Deprecated metric -> Hidden metric -> Deletion

Alpha metrics have no stability guarantees; as such, they can be modified or deleted at any time.

Stable metrics are guaranteed not to change, except that a metric may be marked deprecated in a future Kubernetes version. "Not change" means three things:

* the metric itself will not be deleted (or renamed)
* the type of metric will not be modified
* no labels can be added to or removed from the metric

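The three guarantees above can be read as a comparison between two snapshots of metric definitions. The following is a minimal sketch under that reading; `MetricDef` and `stability_violations` are hypothetical names for illustration and are not part of any Kubernetes tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDef:
    name: str
    type: str          # for example "counter", "gauge", or "histogram"
    labels: frozenset  # set of label names


def stability_violations(old: dict, new: dict) -> list:
    """Compare two {name: MetricDef} snapshots and list the broken guarantees."""
    problems = []
    for name, before in sorted(old.items()):
        after = new.get(name)
        if after is None:
            # Guarantee 1: the metric is neither deleted nor renamed.
            problems.append(f"{name}: deleted or renamed")
            continue
        if after.type != before.type:
            # Guarantee 2: the metric type does not change.
            problems.append(f"{name}: type changed {before.type} -> {after.type}")
        if after.labels != before.labels:
            # Guarantee 3: no labels are added or removed.
            added = sorted(after.labels - before.labels)
            removed = sorted(before.labels - after.labels)
            problems.append(f"{name}: labels added={added} removed={removed}")
    return problems
```

An empty result means that every metric in the old snapshot is still present with the same type and label set.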

The list of currently supported stable metrics can be found [here](https://github.com/kubernetes/kubernetes/blob/master/test/instrumentation/testdata/stable-metrics-list.yaml).

Deprecated metrics are annotated with the Kubernetes version from which the metric is considered deprecated. A stable metric entering the deprecation process signals that the metric will eventually be deleted.

Before deprecation:

```
# HELP some_counter this counts things
# TYPE some_counter counter
some_counter 0
```

After deprecation:

```
# HELP some_counter (Deprecated from 1.15) this counts things
# TYPE some_counter counter
some_counter 0
```

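Because the deprecation notice lives in the `# HELP` text, a scraper can detect deprecated metrics by inspecting those lines. The sketch below assumes the marker has exactly the `(Deprecated from x.y)` form shown in the example above; real components may format the notice differently, and `deprecated_metrics` is a hypothetical helper.

```python
import re

# Matches HELP lines in the form shown above, for example:
# "# HELP some_counter (Deprecated from 1.15) this counts things"
_HELP_RE = re.compile(
    r"^# HELP (?P<name>\S+) "
    r"(?:\(Deprecated from (?P<version>\d+(?:\.\d+)+)\) )?"
    r"(?P<help>.*)$"
)


def deprecated_metrics(exposition: str) -> dict:
    """Return {metric_name: version} for metrics whose HELP text is marked deprecated."""
    found = {}
    for line in exposition.splitlines():
        match = _HELP_RE.match(line.strip())
        if match and match.group("version"):
            found[match.group("name")] = match.group("version")
    return found
```

Feeding it the "After deprecation" output maps `some_counter` to `1.15`, while the "Before deprecation" output yields an empty result.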

Hidden metrics are no longer exposed by default; cluster administrators must configure the component manually to expose them.

Deleted metrics are no longer available.

## Show Hidden Metrics

As described above, admins can enable hidden metrics through a command-line flag on a specific binary. This is intended as an escape hatch for admins who missed migrating off the metrics deprecated in the last release.

The flag `--show-hidden-metrics-for-version` takes the version whose deprecated metrics you want to show. The version is expressed as x.y, where x is the major version and y is the minor version. The patch version is not needed, even though a metric can be deprecated in a patch release, because the metrics deprecation policy operates on minor releases.

The flag accepts only the previous minor version as its value. All metrics hidden in the previous version are emitted when admins set `--show-hidden-metrics-for-version` to that version. Older versions are not allowed because that would violate the metrics deprecation policy.

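The rule for acceptable flag values can be sketched as a small check. This is only an illustration of the rule as described here, not how Kubernetes components implement it; `parse_minor_version` and `is_allowed_flag_value` are hypothetical names.

```python
def parse_minor_version(text: str) -> tuple:
    """Parse "x.y" into (major, minor); reject other forms, including patch versions."""
    parts = text.split(".")
    if len(parts) != 2 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected a version in x.y form, got {text!r}")
    return int(parts[0]), int(parts[1])


def is_allowed_flag_value(flag_value: str, current_version: str) -> bool:
    """True only if flag_value is the minor release immediately before current_version."""
    flag_major, flag_minor = parse_minor_version(flag_value)
    cur_major, cur_minor = parse_minor_version(current_version)
    return flag_major == cur_major and flag_minor == cur_minor - 1
```

For a component at version 1.17, only `1.16` passes this check; `1.15` is too old and `1.17` is the current release.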

Take metric `A` as an example, and assume that `A` is deprecated in `1.n`. According to the metrics deprecation policy:

* In release `1.n`, the metric is deprecated and is still emitted by default.
* In release `1.n+1`, the metric is hidden by default and can be emitted with the command-line flag `--show-hidden-metrics-for-version=1.n`.
* In release `1.n+2`, the metric should be removed from the codebase. There is no escape hatch anymore.

So, if admins want to enable metric `A` in release `1.n+1`, they should pass `1.n` to the command-line flag, that is, `--show-hidden-metrics-for-version=1.n`.

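The timeline for metric `A` can be written as a small function of the release in which the metric was deprecated, the current release, and the flag value. This is a sketch of the policy as stated above, tracking minor versions only; `metric_state` and its return strings are hypothetical.

```python
def metric_state(deprecated_minor: int, current_minor: int, show_hidden_for_minor=None) -> str:
    """State of a metric deprecated in 1.<deprecated_minor>, as seen in 1.<current_minor>."""
    age = current_minor - deprecated_minor
    if age < 0:
        return "not yet deprecated"
    if age == 0:
        # Release 1.n: deprecated, still emitted by default.
        return "deprecated, emitted by default"
    if age == 1:
        # Release 1.n+1: hidden unless the flag names release 1.n.
        if show_hidden_for_minor == deprecated_minor:
            return "hidden, emitted via flag"
        return "hidden, not emitted"
    # Release 1.n+2 and later: removed, and the flag no longer helps.
    return "removed"
```

With `n = 15`, calling `metric_state(15, 16, show_hidden_for_minor=15)` corresponds to running a 1.16 component with the flag set to `1.15`.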

{{% /capture %}}