Collector internal telemetry updates (#4867)
Co-authored-by: Tiffany Hrabusa <[email protected]>
Co-authored-by: Alex Boten <[email protected]>
Co-authored-by: opentelemetrybot <[email protected]>
Co-authored-by: Phillip Carter <[email protected]>
5 people authored Sep 7, 2024
1 parent 1fe2d78 commit e4f6838
Showing 1 changed file with 13 additions and 9 deletions.
22 changes: 13 additions & 9 deletions content/en/docs/collector/internal-telemetry.md
@@ -283,7 +283,8 @@ own telemetry.

#### Data loss

-Use the rate of `otelcol_processor_dropped_spans > 0` and
+Use the rate of `otelcol_processor_dropped_log_records > 0`,
+`otelcol_processor_dropped_spans > 0`, and
`otelcol_processor_dropped_metric_points > 0` to detect data loss. Depending on
your project's requirements, select a narrow time window before alerting begins
to avoid notifications for small losses that are within the desired reliability
@@ -317,19 +318,22 @@ logs for messages such as `Dropping data because sending_queue is full`.

#### Receive failures

-Sustained rates of `otelcol_receiver_refused_spans` and
-`otelcol_receiver_refused_metric_points` indicate that too many errors were
-returned to clients. Depending on the deployment and the clients' resilience,
-this might indicate clients' data loss.
+Sustained rates of `otelcol_receiver_refused_log_records`,
+`otelcol_receiver_refused_spans`, and `otelcol_receiver_refused_metric_points`
+indicate that too many errors were returned to clients. Depending on the
+deployment and the clients' resilience, this might indicate clients' data loss.

-Sustained rates of `otelcol_exporter_send_failed_spans` and
+Sustained rates of `otelcol_exporter_send_failed_log_records`,
+`otelcol_exporter_send_failed_spans`, and
`otelcol_exporter_send_failed_metric_points` indicate that the Collector is not
able to export data as expected. These metrics do not inherently imply data loss
since there could be retries. But a high rate of failures could indicate issues
with the network or backend receiving the data.
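
As a rough illustration of what a "sustained rate" could mean in practice, the following Python sketch computes the per-second rate of a monotonic counter between scrapes and reports a problem only when several consecutive intervals are above a threshold. The function names, the threshold, the window size, and the sample values are all hypothetical; in a real deployment this check would normally be expressed in the monitoring backend's query language rather than in application code.

```python
from typing import List, Tuple

# (unix_timestamp_seconds, counter_value); hypothetical scrape results
Sample = Tuple[float, float]


def counter_rates(samples: List[Sample]) -> List[float]:
    """Per-second increase between consecutive samples of a monotonic counter."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t1 <= t0:
            continue  # skip duplicate or out-of-order timestamps
        # A decrease means the counter was reset (for example, after a restart);
        # treat the new value as the increase since the reset.
        delta = v1 - v0 if v1 >= v0 else v1
        rates.append(delta / (t1 - t0))
    return rates


def is_sustained(samples: List[Sample], threshold: float = 0.0,
                 min_intervals: int = 3) -> bool:
    """True if the last `min_intervals` rates are all above `threshold`."""
    recent = counter_rates(samples)[-min_intervals:]
    return len(recent) == min_intervals and all(r > threshold for r in recent)


# Invented values for a counter such as otelcol_exporter_send_failed_spans.
ongoing = [(0, 0), (60, 0), (120, 30), (180, 90), (240, 120)]
one_blip = [(0, 0), (60, 30), (120, 30), (180, 30), (240, 30)]

print(is_sustained(ongoing))   # failures in each of the last three intervals
print(is_sustained(one_blip))  # a single burst followed by recovery
```

Requiring several consecutive non-zero intervals is one way to separate a short burst that retries may absorb from a persistent export problem; the right window depends on the deployment.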

#### Data flow

-You can monitor data ingress with the `otelcol_receiver_accepted_spans` and
-`otelcol_receiver_accepted_metric_points` metrics and data egress with the
-`otelcol_exporter_sent_spans` and `otelcol_exporter_sent_metric_points` metrics.
+You can monitor data ingress with the `otelcol_receiver_accepted_log_records`,
+`otelcol_receiver_accepted_spans`, and `otelcol_receiver_accepted_metric_points`
+metrics and data egress with the `otelcol_exporter_sent_log_records`,
+`otelcol_exporter_sent_spans`, and `otelcol_exporter_sent_metric_points`
+metrics.
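
To make the ingress and egress comparison concrete, here is a minimal Python sketch that sums the accepted and sent counters per signal from a snippet of Prometheus-style text. The sample text, its label names, and its values are invented for illustration, and the parser is deliberately simplistic (it assumes one `name{labels} value` sample per line with no trailing timestamp). The exact metric names exposed by a given Collector version may also differ from the ones used here.

```python
from collections import defaultdict
from typing import Dict

SIGNALS = ("log_records", "spans", "metric_points")


def sum_by_metric(exposition_text: str) -> Dict[str, float]:
    """Sum sample values per metric name, across all label sets."""
    totals: Dict[str, float] = defaultdict(float)
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]
        totals[name] += float(value)
    return dict(totals)


def flow_summary(totals: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Accepted (ingress) and sent (egress) totals for each signal type."""
    return {
        signal: {
            "accepted": totals.get(f"otelcol_receiver_accepted_{signal}", 0.0),
            "sent": totals.get(f"otelcol_exporter_sent_{signal}", 0.0),
        }
        for signal in SIGNALS
    }


# Hypothetical scrape output; labels and values are made up.
SAMPLE = """\
# TYPE otelcol_receiver_accepted_spans counter
otelcol_receiver_accepted_spans{receiver="otlp",transport="grpc"} 1200
otelcol_receiver_accepted_spans{receiver="otlp",transport="http"} 300
otelcol_exporter_sent_spans{exporter="otlp"} 1450
otelcol_receiver_accepted_log_records{receiver="otlp",transport="grpc"} 80
otelcol_exporter_sent_log_records{exporter="otlp"} 80
"""

for signal, counts in flow_summary(sum_by_metric(SAMPLE)).items():
    print(signal, counts)
```

A gap between accepted and sent totals is not by itself proof of loss, since data may still be queued, filtered by a processor, or fanned out to more than one exporter; the comparison is mainly useful as a trend over time.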
