Update docs for the removal of the receiver.k8sclusterreceiver.reportCpuMetricsAsDouble feature gate
jvoravong committed Jul 13, 2022
1 parent f06e4cd commit 0222007
Showing 3 changed files with 56 additions and 63 deletions.
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,11 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## Unreleased

### Changed

- The receiver.k8sclusterreceiver.reportCpuMetricsAsDouble feature gate has been removed (#487)
  - If you are using this feature gate, then see the [upgrade guidelines](https://github.com/signalfx/splunk-otel-collector-chart/blob/main/UPGRADING.md#0540-to-0550).

### Fixed

- Make sure that logs are enabled to send k8s events (#481)
@@ -95,6 +100,12 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
and affected receivers in a custom manner should be reviewed. See
[upgrade guidelines](https://github.com/signalfx/splunk-otel-collector-chart/blob/main/UPGRADING.md#0480-to-0490)

- The receiver.k8sclusterreceiver.reportCpuMetricsAsDouble feature gate is now enabled by default (#487)
  - [BREAKING CHANGE] The Splunk Otel Collector has a feature gate to enable a
    bug fix that makes the k8sclusterreceiver emit a few Kubernetes cpu
    metrics differently to properly adhere to OpenTelemetry specifications. See
    [upgrade guidelines](https://github.com/signalfx/splunk-otel-collector-chart/blob/main/UPGRADING.md#0480-to-0490)

- Upgrade splunk-otel-collector image to 0.49.0 (#442)

## [0.48.0] - 2022-04-13
63 changes: 45 additions & 18 deletions UPGRADING.md
@@ -1,5 +1,14 @@
# Upgrade guidelines

## 0.54.0 to 0.55.0

[[receiver/k8sclusterreceiver] The receiver.k8sclusterreceiver.reportCpuMetricsAsDouble feature gate has been removed](https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/10838)

If you are disabling this feature gate to keep previous functionality, you will
have to complete the steps in
[upgrade guidelines 0.47.0 to 0.47.1](https://github.com/signalfx/splunk-otel-collector-chart/blob/main/UPGRADING.md#0470-to-0471)
to upgrade since the feature gate no longer exists.
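
Before upgrading, it may help to check whether an existing `clusterReceiver.featureGates` value still references the removed gate. The following is a minimal Python sketch, not part of the chart. It assumes the value is a comma-separated string of gate names, each with an optional `+` or `-` prefix, and the helper name `strip_removed_gate` is hypothetical.

```python
REMOVED_GATE = "receiver.k8sclusterreceiver.reportCpuMetricsAsDouble"


def strip_removed_gate(feature_gates: str) -> str:
    """Drop every entry for the removed gate from a featureGates string.

    Assumption: the string is a comma-separated list in which each entry
    may carry a leading "+" (enable) or "-" (disable) prefix.
    """
    kept = []
    for entry in feature_gates.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        if entry.lstrip("+-") == REMOVED_GATE:
            continue  # the gate no longer exists in 0.55.0 and later
        kept.append(entry)
    return ",".join(kept)
```

An empty result suggests the `featureGates` setting can be dropped from the custom values.yaml entirely.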

## 0.53.2 to 0.54.0

[OTel Kubernetes receiver is now used for events collection instead of Signalfx events receiver](https://github.com/signalfx/splunk-otel-collector-chart/pull/478)
@@ -76,8 +85,13 @@ monitoring setup, you can stop here.
custom log monitoring, update your log monitoring to accommodate the breaking
changes.

## 0.47.0 to 0.47.1
[[receiver/k8sclusterreceiver] The receiver.k8sclusterreceiver.reportCpuMetricsAsDouble feature gate is now enabled by default](https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/9367)

If you haven't already completed the steps in
[upgrade guidelines 0.47.0 to 0.47.1](https://github.com/signalfx/splunk-otel-collector-chart/blob/main/UPGRADING.md#0470-to-0471),
then complete them.

## 0.47.0 to 0.47.1
[[receiver/k8sclusterreceiver] Fix k8s node and container cpu metrics not being reported properly](https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/8245)

The Splunk Otel Collector added a feature gate to enable a bug fix for three
@@ -88,23 +102,36 @@ pairs (current, legacy) below.
- `k8s.container.cpu_request`, `kubernetes.container_cpu_request`
- `k8s.container.cpu_limit`, `kubernetes.container_cpu_limit`
- `k8s.node.allocatable_cpu`, `kubernetes.node_allocatable_cpu`

1. Check to see if any of your custom monitoring uses the affected metrics.
Check for the current and legacy names of the affected metrics. If you don't
use the affected metrics in your custom monitoring, you can stop here.
2. Read the documentation for the
[receiver.k8sclusterreceiver.reportCpuMetricsAsDouble](https://github.com/signalfx/splunk-otel-collector-chart/blob/main/docs/advanced-configuration.md#highlighted-feature-gates)
feature gate and the bug fix it applies.
3. If the bug fix will break any of your custom monitoring for the affected
metrics, update your monitoring to accommodate the bug fix.
4. Use the `--set clusterReceiver.featureGates=receiver.k8sclusterreceiver.reportCpuMetricsAsDouble`
argument with the helm install/upgrade command, or add the following line to
your custom values.yaml:

```yaml
clusterReceiver:
  featureGates: receiver.k8sclusterreceiver.reportCpuMetricsAsDouble
```
- Upgrade Steps
  1. Check to see if any of your custom monitoring uses the affected metrics.
     Check for the current and legacy names of the affected metrics. If you don't
     use the affected metrics in your custom monitoring, you can stop here.
  2. Read the documentation for the
     [receiver.k8sclusterreceiver.reportCpuMetricsAsDouble](https://github.com/signalfx/splunk-otel-collector-chart/tree/splunk-otel-collector-0.54.0/docs/advanced-configuration.md#highlighted-feature-gates)
     feature gate and the bug fix it applies.
  3. If the bug fix will break any of your custom monitoring for the affected
     metrics, update your monitoring to accommodate the bug fix.
- Feature Gate Stages and Versions
  - Alpha (versions 0.47.1-0.48.0):
    - The feature gate is disabled by default. Use the `--set clusterReceiver.featureGates=receiver.k8sclusterreceiver.reportCpuMetricsAsDouble`
      argument with the helm install/upgrade command, or add the following line to
      your custom values.yaml to enable the feature gate:
      ```yaml
      clusterReceiver:
        featureGates: receiver.k8sclusterreceiver.reportCpuMetricsAsDouble
      ```
  - Beta (versions 0.49.0-0.54.0):
    - The feature gate is enabled by default. Use the `--set clusterReceiver.featureGates=-receiver.k8sclusterreceiver.reportCpuMetricsAsDouble`
      argument with the helm install/upgrade command, or add the following line to
      your custom values.yaml to disable the feature gate:
      ```yaml
      clusterReceiver:
        featureGates: -receiver.k8sclusterreceiver.reportCpuMetricsAsDouble
      ```
  - Generally Available (versions +0.55.0):
    - The receiver.k8sclusterreceiver.reportCpuMetricsAsDouble feature gate
      functionality is permanently enabled and the feature gate is no longer available
      for anyone.
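
The stage boundaries listed above can be sketched as a small lookup. This is an illustrative Python helper, not part of the chart. The function names are hypothetical, and treating versions that fall between the listed ranges as belonging to the earlier stage is an assumption.

```python
from typing import Optional

GATE = "receiver.k8sclusterreceiver.reportCpuMetricsAsDouble"


def parse_version(version: str) -> tuple:
    """Turn '0.49.0' into (0, 49, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def gate_stage(chart_version: str) -> str:
    """Return the feature gate stage for a chart version.

    Boundaries follow the list above: alpha starts at 0.47.1, beta at
    0.49.0, and general availability at 0.55.0.
    """
    v = parse_version(chart_version)
    if v < (0, 47, 1):
        return "not available"
    if v < (0, 49, 0):
        return "alpha"
    if v < (0, 55, 0):
        return "beta"
    return "generally available"


def helm_set_argument(chart_version: str, enable: bool) -> Optional[str]:
    """Build the --set argument that toggles the gate, or None when the
    gate cannot be toggled for this chart version."""
    if gate_stage(chart_version) not in ("alpha", "beta"):
        return None
    prefix = "" if enable else "-"
    return f"--set clusterReceiver.featureGates={prefix}{GATE}"
```

For example, `helm_set_argument("0.54.0", enable=False)` yields the disable argument shown for the beta stage, while any version from 0.55.0 onward returns `None` because the gate no longer exists.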

## 0.44.1 to 0.45.0

45 changes: 0 additions & 45 deletions docs/advanced-configuration.md
@@ -529,51 +529,6 @@ helm install {name} --set agent.featureGates=+feature1 --set clusterReceiver.fea
Would result in the agent having feature1 enabled, the clusterReceiver having feature2 enabled, and the gateway having
feature2 disabled.

### Highlighted feature gates
- receiver.k8sclusterreceiver.reportCpuMetricsAsDouble
  - Description
    - A bug was reported where three Kubernetes cpu metrics emitted by the k8sclusterreceiver do not follow
      OpenTelemetry cpu metric specifications. To address this issue, we are slowly transitioning the affected metrics
      to follow the proper specifications. The k8sclusterreceiver will transition emitting the affected metrics from
      integer millicpu units to double cpu units.
    - Example Dashboard: [k8s.container.cpu_request: 300 (millicpu units) -> k8s.container.cpu_request: 0.3 (cpu units)](https://drive.google.com/file/d/1GkrhAonJZG7aDNGAx7vggfbOn_lqtY8f/view)
    - From a user's perspective, this change will cause the affected metrics to be double (instead of integer) values
      as well as the metric values will be scaled down by 1000x. This can be a breaking change for current monitoring
      involving the affected metrics, users may have to update alerts and dashboards to accommodate this change.
    - To help mitigate user friction during this transition, we are doing a couple of things.
      - We are rolling out the bug fix slowly behind a feature gate. The feature gate will have 3 stages that will be released with specific versions of the collector (more info below).
      - We have included
        [warning messages](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/2324192480395cf0c193e3c9048b02969c38003e/receiver/k8sclusterreceiver/internal/collection/collector.go#L103)
        about this change in the k8sclusterreceiver startup logs.
  - Affected Metrics
    - Note: These metrics have a current and a legacy name, we list both as pairs (current, legacy) below.
    - `k8s.container.cpu_request`, `kubernetes.container_cpu_request`
    - `k8s.container.cpu_limit`, `kubernetes.container_cpu_limit`
    - `k8s.node.allocatable_cpu`, `kubernetes.node_allocatable_cpu`
  - Stages and Timeline
    - Alpha (current stage)
      - In this stage the feature gate is disabled by default and must be enabled by the user. This allows users to preemptively opt in and start using the bug fix by enabling the feature gate.
      - Collector version: v0.47.0
      - Release Date: Late March 2022
    - Beta
      - In this stage the feature gate is enabled by default and can be disabled by the user.
      - Users could experience some friction in this stage, they may need to update monitoring for the affected metrics or opt out of using the bug fix by disabling the feature gate.
      - Target Collector version: v0.50.0
      - Target Release Date: Early May 2022
    - Generally Available
      - In this stage the feature gate is permanently enabled and the feature gate is no longer available for anyone.
      - Users could experience some friction in this stage, they may have to update monitoring for the affected metrics or be blocked from upgrading the collector to versions v0.53.0 and newer.
      - Target Collector version: v0.53.0
      - Target Release Date: Mid June 2022
  - Applying The Bug Fix With The Feature Gate
    - Install with the feature gate enabled:
      - helm install {name} --set clusterReceiver.featureGates=receiver.k8sclusterreceiver.reportCpuMetricsAsDouble {other_flags}
    - Install with the feature gate disabled:
      - helm install {name} --set clusterReceiver.featureGates=-receiver.k8sclusterreceiver.reportCpuMetricsAsDouble {other_flags}
  - More Information
    - [receiver.k8sclusterreceiver.reportCpuMetricsAsDouble feature gate documentation](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/8c0ebe6f08d09f2c70a3d52f62b6203b6706ebe1/receiver/k8sclusterreceiver/README.md?plain=1#L56)
    - [OpenTelemetry CPU Metric Specifications](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/semantic_conventions/system-metrics.md#systemcpu---processor-metrics)
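
The unit change described above (integer millicpu to double cpu, a 1000x scale-down) can be illustrated with a short Python sketch. It shows one hypothetical way to rescale an existing alert threshold for the affected metrics. The function names are illustrative only and do not come from the chart or the collector.

```python
# Current and legacy names of the three affected metrics, as listed above.
AFFECTED_METRICS = {
    "k8s.container.cpu_request",
    "kubernetes.container_cpu_request",
    "k8s.container.cpu_limit",
    "kubernetes.container_cpu_limit",
    "k8s.node.allocatable_cpu",
    "kubernetes.node_allocatable_cpu",
}


def millicpu_to_cpu(millicpu: int) -> float:
    """Convert an integer millicpu value to double cpu units."""
    return millicpu / 1000


def rescale_threshold(metric: str, threshold: float) -> float:
    """Scale a millicpu-based threshold down by 1000 for affected metrics.

    Thresholds on any other metric are returned unchanged.
    """
    if metric in AFFECTED_METRICS:
        return threshold / 1000
    return threshold
```

With this sketch, a threshold of 300 on `k8s.container.cpu_request` becomes 0.3, matching the example dashboard value above.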

## Override underlying OpenTelemetry agent configuration

If you want to use your own OpenTelemetry Agent configuration, you can override it by providing a custom configuration in the `agent.config` parameter in the values.yaml, which will be merged into the default agent configuration, list parts of the configuration (for example, `service.pipelines.logs.processors`) to be fully re-defined.
