Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add documentation for kubernetes_state_core core check #12552

Merged
merged 10 commits into from
Oct 27, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .github/CODEOWNERS
Validating CODEOWNERS rules …
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,9 @@ assets/ @DataDog/agent-integrations
/kubernetes_state/ @DataDog/container-integrations @DataDog/agent-integrations
/kubernetes_state/*.md @DataDog/container-integrations @DataDog/agent-integrations @DataDog/documentation
/kubernetes_state/manifest.json @DataDog/container-integrations @DataDog/agent-integrations @DataDog/documentation
/kubernetes_state_core/ @DataDog/container-integrations @DataDog/agent-integrations
/kubernetes_state_core/*.md @DataDog/container-integrations @DataDog/agent-integrations @DataDog/documentation
/kubernetes_state_core/manifest.json @DataDog/container-integrations @DataDog/agent-integrations @DataDog/documentation
/nginx_ingress_controller/ @DataDog/container-integrations @DataDog/agent-integrations
/nginx_ingress_controller/*.md @DataDog/container-integrations @DataDog/agent-integrations @DataDog/documentation
/nginx_ingress_controller/manifest.json @DataDog/container-integrations @DataDog/agent-integrations @DataDog/documentation
Expand Down
2 changes: 2 additions & 0 deletions .github/workflows/config/labeler.yml
Original file line number Diff line number Diff line change
Expand Up @@ -234,6 +234,8 @@ integration/kubernetes:
- kubernetes/**/*
integration/kubernetes_state:
- kubernetes_state/**/*
integration/kubernetes_state_core:
- kubernetes_state_core/**/*
integration/kyototycoon:
- kyototycoon/**/*
integration/lighttpd:
Expand Down
4 changes: 2 additions & 2 deletions kubernetes_state/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ The Kubernetes-state check does not include any events.

### Service Checks

See [../kubernetes/service_checks.json][6] for a list of service checks provided by this integration.
See [../kubernetes/assets/service_checks.json][6] for a list of service checks provided by this integration.

## Troubleshooting

Expand All @@ -45,5 +45,5 @@ Need help? Contact [Datadog support][7].
[3]: https://github.com/DataDog/integrations-core/blob/master/kubernetes_state/datadog_checks/kubernetes_state/data/conf.yaml.example
[4]: https://docs.datadoghq.com/agent/guide/agent-commands/#agent-status-and-information
[5]: https://github.com/DataDog/integrations-core/blob/master/kubernetes_state/metadata.csv
[6]: https://github.com/DataDog/integrations-core/blob/master/kubernetes_state/assets/service_checks.json
[6]: https://github.com/DataDog/integrations-core/blob/master/kubernetes/assets/service_checks.json
[7]: https://docs.datadoghq.com/help/
2 changes: 2 additions & 0 deletions kubernetes_state_core/CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
# CHANGELOG - Kubernetes State Core

173 changes: 173 additions & 0 deletions kubernetes_state_core/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,173 @@
# Agent Check: Kubernetes State Core

## Overview

Get metrics from Kubernetes service in real-time to:

- Visualize and monitor Kubernetes states.
- Be notified about Kubernetes failovers and events.

The Kubernetes State Metrics Core check leverages [kube-state-metrics version 2+][1] and includes major performance and tagging improvements compared to the legacy `kubernetes_state` check.

As opposed to the legacy check, with the Kubernetes State Metrics Core check, you no longer need to deploy `kube-state-metrics` in your cluster.

Kubernetes State Metrics Core provides a better alternative to the legacy `kubernetes_state` check as it offers more granular metrics and tags. See the [Major Changes][2] and [Data Collected][3] for more details.

## Setup

### Installation

The Kubernetes State Metrics Core check is included in the [Datadog Cluster Agent][4] image, so you don't need to install anything else on your Kubernetes servers.

### Requirements

- Datadog Cluster Agent v1.12+

### Configuration

<!-- xxx tabs xxx -->
<!-- xxx tab "Helm" xxx -->

In your Helm `values.yaml`, add the following:

```yaml
datadog:
# (...)
kubeStateMetricsCore:
enabled: true
```
L3n41c marked this conversation as resolved.
Show resolved Hide resolved

<!-- xxz tab xxx -->
<!-- xxx tab "Operator" xxx -->

To enable the `kubernetes_state_core` check, the setting `spec.features.kubeStateMetricsCore.enabled` must be set to `true` in the DatadogAgent resource:

```yaml
apiVersion: datadoghq.com/v1alpha1
kind: DatadogAgent
metadata:
name: datadog
spec:
credentials:
apiKey: <DATADOG_API_KEY>
appKey: <DATADOG_APP_KEY>
features:
kubeStateMetricsCore:
enabled: true
# (...)
```

Note: Datadog Operator v0.7.0 or greater is required.

<!-- xxz tab xxx -->
<!-- xxz tabs xxx -->

## Migration from kubernetes_state to kubernetes_state_core

### Tags removal

In the original `kubernetes_state` check, several tags have been flagged as deprecated and replaced by new tags. To determine your migration path, check which tags are submitted with your metrics.

In the `kubernetes_state_core` check, only the non-deprecated tags are submitted. Before migrating from `kubernetes_state` to `kubernetes_state_core`, verify that only official tags are used in monitors and dashboards.

Here is the mapping between deprecated tags and the official tags that have replaced them:

| deprecated tag | official tag |
|-----------------------|-----------------------------|
| cluster_name | kube_cluster_name |
| container | kube_container_name |
| cronjob | kube_cronjob |
| daemonset | kube_daemon_set |
| deployment | kube_deployment |
| image | image_name |
| job | kube_job |
| job_name | kube_job |
| namespace | kube_namespace |
| phase | pod_phase |
| pod | pod_name |
| replicaset | kube_replica_set |
| replicationcontroller | kube_replication_controller |
| statefulset | kube_stateful_set |

### Backward incompatibility changes

The Kubernetes State Metrics Core check is not backward compatible, be sure to read the changes carefully before migrating from the legacy `kubernetes_state` check.

`kubernetes_state.node.by_condition`
: A new metric with node name granularity. It replaces `kubernetes_state.nodes.by_condition`.

`kubernetes_state.persistentvolume.by_phase`
: A new metric with persistentvolume name granularity. It replaces `kubernetes_state.persistentvolumes.by_phase`.

`kubernetes_state.pod.status_phase`
: The metric is tagged with pod level tags, like `pod_name`.

`kubernetes_state.node.count`
: The metric is not tagged with `host` anymore. It aggregates the nodes count by `kernel_version` `os_image` `container_runtime_version` `kubelet_version`.

`kubernetes_state.container.waiting` and `kubernetes_state.container.status_report.count.waiting`
: These metrics no longer emit a 0 value if no pods are waiting. They only report non-zero values.

`kube_job`
: In `kubernetes_state`, the `kube_job` tag value is the `CronJob` name if the `Job` had `CronJob` as an owner, otherwise it is the `Job` name. In `kubernetes_state_core`, the `kube_job` tag value is always the `Job` name, and a new `kube_cronjob` tag key is added with the `CronJob` name as the tag value. When migrating to `kubernetes_state_core`, it's recommended to use the new tag or `kube_job:foo*`, where `foo` is the `CronJob` name, for query filters.

<!-- xxx tabs xxx -->
<!-- xxx tab "Helm" xxx -->

Enabling `kubeStateMetricsCore` in your Helm `values.yaml` configures the Agent to ignore the auto configuration file for legacy `kubernetes_state` check. The goal is to avoid running both checks simultaneously.

If you still want to enable both checks simultaneously for the migration phase, disable the `ignoreLegacyKSMCheck` field in your `values.yaml`.

**Note**: `ignoreLegacyKSMCheck` makes the Agent only ignore the auto configuration for the legacy `kubernetes_state` check. Custom `kubernetes_state` configurations need to be removed manually.

The Kubernetes State Metrics Core check does not require deploying `kube-state-metrics` in your cluster anymore, you can disable deploying `kube-state-metrics` as part of the Datadog Helm Chart. To do this, add the following in your Helm `values.yaml`:

```yaml
datadog:
# (...)
kubeStateMetricsEnabled: false
```

<!-- xxz tab xxx -->
<!-- xxz tabs xxx -->

**Important Note:** The Kubernetes State Metrics Core check is an alternative to the legacy `kubernetes_state` check. Datadog recommends not enabling both checks simultaneously to guarantee consistent metrics.

## Data Collected

### Metrics

See [metadata.csv][5] for a list of metrics provided by this integration.

**Note:** You can configure [Datadog Standard labels][6] on your Kubernetes objects to get the `env` `service` `version` tags.

### Events

The Kubernetes State Metrics Core check does not include any events.

### Service Checks

See [../kubernetes/assets/service_checks.json][7] for a list of service checks provided by this integration.

### Validation

[Run the Cluster Agent's `status` subcommand][8] inside your Cluster Agent container and look for `kubernetes_state_core` under the Checks section.

## Troubleshooting

Need help? Contact [Datadog support][9].

## Further Reading

{{< partial name="whats-next/whats-next.html" >}}


[1]: https://kubernetes.io/blog/2021/04/13/kube-state-metrics-v-2-0/
[2]: /integrations/kubernetes_state_core/#migration-from-kubernetes_state-to-kubernetes_state_core
[3]: /integrations/kubernetes_state_core/#data-collected
[4]: /agent/cluster_agent/
[5]: https://github.com/DataDog/integrations-core/blob/master/kubernetes_state_core/metadata.csv
[6]: /getting_started/tagging/unified_service_tagging/?tab=kubernetes#configuration-1
[7]: https://github.com/DataDog/integrations-core/blob/master/kubernetes/assets/service_checks.json
[8]: /agent/guide/agent-commands/?tab=clusteragent#agent-status-and-information
[9]: https://docs.datadoghq.com/help/
115 changes: 115 additions & 0 deletions kubernetes_state_core/assets/configuration/spec.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,115 @@
name: Kubernetes State
files:
- name: conf.yaml.example
options:
- template: init_config
options:
template: init_config/openmetrics_legacy
- template: instances
options:
- name: collectors
required: false
description: |
A list of resource types to collect

Example to enable pods and nodes collectors:
value:
type: array
items:
type: string
example:
- nodes
- pods
- name: label_joins
required: false
description: |
label_joins allows adding the tags to join from other KSM metrics.

Example: Joining for deployment metrics. Based on:
kube_deployment_labels{deployment="kube-dns",label_addonmanager_kubernetes_io_mode="Reconcile"}
Use the following config to add the value of label_addonmanager_kubernetes_io_mode as a tag to your KSM
deployment metrics.
value:
type: object
example:
kube_deployment_labels:
labels_to_match:
- deployment
labels_to_get:
- label_addonmanager_kubernetes_io_mode
- name: labels_as_tags
required: false
value:
type: object
example:
labels_as_tags:
pod:
- "app_*"
node:
- "zone"
- "team_*"
- name: labels_mapper
required: false
description: |
LabelsMapper can be used to translate kube-state-metrics labels to other tags.
Example: Adding kube_namespace tag instead of namespace
value:
type: object
example:
namespace: kube_namespace
- name: tags
required: false
description: |
A list of tags to attach to every metric and service check emitted by this instance.

Learn more about tagging at https://docs.datadoghq.com/tagging
value:
type: array
items:
type: string
example:
- <KEY_1>:<VALUE_1>
- <KEY_2>:<VALUE_2>
- name: disable_global_tags
required: false
description: |
disable_global_tags disables adding the global host tags defined via tags/DD_TAG in the Agent config, default false.
value:
type: bool
example: false
- name: namespaces
required: false
description: |
namespaces contains the namespaces from which we collect metrics

Example: Enable metric collection for objects in prod and kube-system namespaces.
value:
type: list
items:
type: string
example:
- prod
- kube-system
- name: resync_period
required: false
description: |
resync_period is the frequency of resync'ing the metrics cache in seconds, default 5 minutes (kubernetes_informers_resync_period).
value:
type: integer
example: 300
- name: telemetry
required: false
description: |
enables telemetry check's metrics, default false.
Metrics can be found under kubernetes_state.telemetry
value:
type: bool
example: false
- name: skip_leader_election
required: false
description: |
skip_leader_election forces ignoring the leader election when running the check
Can be useful when running the check as cluster check
value:
type: bool
example: false
1 change: 1 addition & 0 deletions kubernetes_state_core/assets/service_checks.json
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
[]
Loading