[Prometheus Receiver] Histogram and Summary metric count value outputs minInt64 for non-numerical input values #6376

PaurushGarg · 2021-11-18T19:19:20Z

Describe the bug

We want to ensure OpenMetrics / Prometheus compatibility in the OpenTelemetry Collector. We have been building compatibility tests to verify the OpenMetrics spec is fully supported on both the OpenTelemetry Collector Prometheus receiver and PRW exporter as well as in Prometheus itself.

The Prometheus receiver should assign a staleNaN value to a metric that is missing from the current scrape but was present in the previous scrape. However, the metric builder currently does not assign staleNaN values to the histogram and summary values that the Prometheus scrape loop passes for failed scrapes.
Histogram and Summary count values are int64, and casting non-numerical float64 values (staleNaN, normalNaN, and +-Inf) to int64 assigns minInt64 (-9223372036854775808) to the count.

Steps to reproduce

Run func TestEndToEnd(t *testing.T).
Currently, the validate loop is skipped for the tests. Re-enable it by removing or commenting out the following lines in func testEndToEnd(...) (lines 1442-1445):

if true {
   t.Log(`Skipping the "up" metric checks as they seem to be spuriously failing after staleness marker insertions`)
   return
}

Note: the test fails in getValidScrapes due to staleness. Inspect the metrics and find the first failed scrape. The Histogram and Summary count values in the failed scrapes are minInt64 (-9223372036854775808) instead of non-numerical values (staleNaN, normalNaN, and +-Inf).

What did you see instead?

Scraping endpoints that contain histogram/summary metrics, with a failed scrape in between, produces the following graph in the Prometheus Web UI. The histogram/summary count value is plotted as the peak in the graph below:

However, if the same data is passed directly to the Prometheus server, the Prometheus Web UI produces the following graph:

Possible Solution

Since the count value (int64) for histogram/summary cannot be assigned float64 values, one possible solution is to use the OTLP format directly in the Prometheus receiver metric builder, and to set the data point flag (MetricDataPointFlagNoRecordedValue) on the metric as the staleness marker. See linked issue: #6400

What version did you use?

Collector-Contrib: v0.37.1

Additional context
Related to open-telemetry/prometheus-interoperability-spec#57
Linked Issue: #6400 #6000 #6087

cc @alolita @Aneurysm9


bogdandrutu · 2021-11-19T16:45:58Z

@PaurushGarg I agree, the prometheus receiver should not set that "NaN" value but instead should use the OTLP native no-value present for that (for all metrics not just for histograms).

PaurushGarg · 2021-11-19T16:55:39Z

> @PaurushGarg I agree, the prometheus receiver should not set that "NaN" value but instead should use the OTLP native no-value present for that (for all metrics not just for histograms).

@bogdandrutu thanks. Is there a tracking issue for Prometheus Receiver to directly use OTLP format in metric builder? If not, do we need to create one?

PaurushGarg mentioned this issue Nov 18, 2021

[Prometheus Receiver] Incorrect start_timestamp of Summary and Histogram metrics after a failed scrape #6360

Closed

mx-psi added the comp:prometheus (Prometheus related issues) and comp: receiver (Receiver) labels Nov 30, 2021

JamesJHPark mentioned this issue Dec 7, 2021

[Prometheus Remote Write Exporter] Ensure Prometheus Remote Write Exporter respond to OTLP data format from Prometheus Receiver #6620

Closed

hyunuk mentioned this issue Dec 14, 2021

[Prometheus Exporter] Ensure Prometheus Exporter respond to OTLP data format from Prometheus Receiver #6805

Closed

PaurushGarg mentioned this issue Dec 15, 2021

REQUEST: New membership for @PaurushGarg open-telemetry/community#934

Closed

6 tasks

Aneurysm9 mentioned this issue Dec 28, 2021

[prometheus exporter] Metric value overflow on metrics_expiration #6935

Closed

Aneurysm9 mentioned this issue Jan 5, 2022

[receiver/prometheus] set pdata stale marker #7043

Merged

codeboten closed this as completed in #7043 Jan 6, 2022
