Supportability: print own metrics in logs
Resolves open-telemetry#2098

The Collector's own metrics are an important source of information for troubleshooting. Typically, own metrics are scraped using a Prometheus receiver and sent to a backend where they can be examined. However, this only works if the Collector's metric pipeline works and the backend is available. When that is not the case, which often happens when the Collector is misconfigured and cannot send metrics or when the backend is unavailable, these metrics cannot be seen anywhere.

To improve supportability of the Collector in such situations, we want to output own metrics to a log file. In difficult situations the local Collector log is an important source of troubleshooting information.

We periodically log the metric values in a human-readable form. When the --log-level=debug command line option is passed, the metrics are logged as often as they are exported by stats (currently every 10 seconds). If --log-level is not set to debug, the metrics are logged once every 5 minutes.
Sample output:
```
2020-11-10T09:59:34.884-0500 INFO service/telemetry_log.go:234 Internal Metrics:
Metric                                            | Value
--------------------------------------------------|--------------------------------
exporter/send_failed_log_records                  |
exporter/sent_log_records                         |
fluent_closed_connections                         |
fluent_events_parsed                              |
fluent_opened_connections                         |
fluent_parse_failures                             |
fluent_records_generated                          |
grpc.io/client/completed_rpcs                     |
grpc.io/client/received_bytes_per_rpc             |
grpc.io/client/received_messages_per_rpc          |
grpc.io/client/roundtrip_latency                  |
grpc.io/client/sent_bytes_per_rpc                 |
grpc.io/client/sent_messages_per_rpc              |
grpc.io/server/completed_rpcs                     |
grpc.io/server/received_bytes_per_rpc             |
grpc.io/server/received_messages_per_rpc          |
grpc.io/server/sent_bytes_per_rpc                 |
grpc.io/server/sent_messages_per_rpc              |
grpc.io/server/server_latency                     |
kafka_receiver_current_offset                     |
kafka_receiver_messages                           |
kafka_receiver_offset_lag                         |
kafka_receiver_partition_close                    |
kafka_receiver_partition_start                    |
process/cpu_seconds                               | 0
process/memory/rss                                | 44,625,920
process/runtime/heap_alloc_bytes                  | 13,168,120 By
process/runtime/total_alloc_bytes                 | 28,170,760 By
process/runtime/total_sys_memory_bytes            | 76,366,848 By
process/uptime                                    | 55.006789 s
processor/accepted_log_records                    |
processor/accepted_metric_points                  |
processor/accepted_spans                          |
processor/batch/batch_send_size_bytes             |
processor/batch/batch_size_trigger_send           |
processor/batches_received                        |
processor/dropped_log_records                     |
processor/dropped_metric_points                   |
processor/dropped_spans                           |
processor/queued_retry/fail_send                  |
processor/queued_retry/queue_latency              |
processor/queued_retry/queue_length               |
processor/queued_retry/send_latency               |
processor/queued_retry/success_send               |
processor/refused_log_records                     |
processor/refused_metric_points                   |
processor/refused_spans                           |
processor/spans_dropped                           |
processor/spans_received                          |
processor/trace_batches_dropped                   |
receiver/accepted_log_records                     |
receiver/refused_log_records                      |
scraper/errored_metric_points                     |
scraper/scraped_metric_points                     |
--------------------------------------------------|--------------------------------
Component/Dimensions                              | Metric                                  | Value
--------------------------------------------------|-----------------------------------------|--------------------------------
exporter=otlphttp                                 | exporter/send_failed_metric_points      | 57
exporter=otlphttp                                 | exporter/send_failed_spans              | 1,085
exporter=otlphttp                                 | exporter/sent_metric_points             | 0
exporter=otlphttp                                 | exporter/sent_spans                     | 0
processor=batch                                   | processor/batch/batch_send_size         | 2/33.439024/80 (min/mean/max) Occurrences=41
processor=batch                                   | processor/batch/timeout_trigger_send    | 41
receiver=jaeger, transport=collector_http         | receiver/accepted_spans                 | 1,306
receiver=jaeger, transport=collector_http         | receiver/refused_spans                  | 0
receiver=prometheus, transport=http               | receiver/accepted_metric_points         | 77
receiver=prometheus, transport=http               | receiver/refused_metric_points          | 0
--------------------------------------------------|-----------------------------------------|--------------------------------
```