
Prometheus histograms #1165

Merged 9 commits into elastic:master on Jul 19, 2021

Conversation

@beniwohli (Contributor)

What does this pull request do?

The spec for histograms has been merged in
elastic/apm-server#5360. This implements the agent-side
histogram, as well as a Prometheus bridge.

beniwohli added 2 commits June 2, 2021 10:55
Instead of letting the thread die, we log the exception with
a stack trace to ease debugging.
@beniwohli beniwohli added this to the 7.14 milestone Jun 14, 2021
@beniwohli beniwohli requested review from basepi and axw June 14, 2021 12:15
@apmmachine (Contributor) commented Jun 14, 2021

💚 Build Succeeded


Build stats

  • Start Time: 2021-07-19T09:55:26.732+0000

  • Duration: 25 min 18 sec

  • Commit: ff607b1

Test stats 🧪

Test Results: Failed 0 · Passed 9521 · Skipped 8604 · Total 18125

💚 Flaky test report

Tests succeeded.


@basepi (Contributor) left a comment

This looks great. 👍

It seems like some users will likely want to modify the defaults on these histogram metrics, specifically the buckets. Are we planning to make that configurable, or are we going to just point users to the prometheus bridge?

@beniwohli
Copy link
Contributor Author

@basepi it already is configurable in the API call when creating a histogram, see e.g. the tests: https://github.com/elastic/apm-agent-python/pull/1165/files#diff-478c95bc12f712e1c8be230780ea39fe00e2213dd8162c81656d4199c0881e35R91

Generally, the metrics API isn't considered public, but I think we should reconsider that at some point, as it covers most use cases now. Forcing people to use prometheus_client when they aren't migrating away from Prometheus in the first place feels like a crutch.

@basepi (Contributor) commented Jun 14, 2021

Yeah we should probably document that and make some public API entry-points (since we're largely trying to move away from interacting with the Client object directly). I created an issue: #1166

@axw (Member) left a comment

I've only really looked closely at the Prometheus bits and how things are being sent to the server. I think that we should consolidate on one approach for converting bucket ranges to single values. Please take a look at the comments in the apm-server code I linked and let me know if you think that's a reasonable approach.

Comment on lines +89 to +115:

```python
sample_pos = 0
prev_val = 0
counts = []
values = []
name = self._registry.client.config.prometheus_metrics_prefix + name
while sample_pos < len(samples):
    sample = samples[sample_pos]
    if "le" in sample.labels:
        values.append(float(sample.labels["le"]))
        counts.append(sample.value - prev_val)
        prev_val = sample.value
        sample_pos += 1
    else:
        # we reached the end of one set of buckets/values; this is the "count" sample
        self.histogram(name, unit=unit, buckets=values, **sample.labels).val = counts
        prev_val = 0
        counts = []
        values = []
        sample_pos += 3  # skip sum/created samples

METRIC_MAP = {
    "counter": _prom_counter_handler,
    "gauge": _prom_gauge_handler,
    "summary": _prom_summary_handler,
    "histogram": _prom_histogram_handler,
}
```
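The handler above relies on prometheus_client's sample ordering: for each label set, one `_bucket` sample per `le` bound carrying cumulative counts (ending with `+Inf`), followed by `_count`, `_sum`, and `_created`. A dependency-free sketch of that walk, de-cumulating the counts per bucket (the `Sample` tuple and `walk` name are stand-ins, not the agent's types):

```python
from collections import namedtuple

Sample = namedtuple("Sample", "name labels value")

# cumulative samples for one histogram: 2 observations <= 0.1, 1 more <= 0.5
samples = [
    Sample("req_bucket", {"le": "0.1"}, 2.0),
    Sample("req_bucket", {"le": "0.5"}, 3.0),
    Sample("req_bucket", {"le": "+Inf"}, 3.0),
    Sample("req_count", {}, 3.0),
    Sample("req_sum", {}, 0.7),
    Sample("req_created", {}, 1.6e9),
]

def walk(samples):
    pos, prev, counts, values, out = 0, 0.0, [], [], []
    while pos < len(samples):
        s = samples[pos]
        if "le" in s.labels:
            values.append(float(s.labels["le"]))
            counts.append(s.value - prev)  # cumulative -> per-bucket count
            prev = s.value
            pos += 1
        else:
            # hit the "count" sample: one bucket set is complete
            out.append((values, counts))
            prev, counts, values = 0.0, [], []
            pos += 3  # skip count/sum/created samples
    return out
```

Running `walk(samples)` yields one `(bucket_bounds, per_bucket_counts)` pair per label set, mirroring what the real handler feeds into `self.histogram(...)`.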
@axw (Member) commented Jun 15, 2021

In APM Server we have some code for handling OpenTelemetry Metrics, following Prometheus's histogram_quantile method: https://github.com/elastic/apm-server/blob/f7c7b0c6873d40f533c072b3d9751c8fe285e63e/processor/otel/metrics.go#L191-L252

I think we should probably follow the same approach here:

  • for +Inf "le", use the preceding bucket's value
  • for the first bucket only: if it has a negative "le", use the value as-is; otherwise use half its value (midpoint to zero)
  • for all other buckets, use the midpoint from that bucket's value to the preceding bucket's
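The rules above could be sketched as follows (function name hypothetical; this is an illustration of the proposal, not the apm-server Go code):

```python
def bucket_midpoints(les):
    """Convert cumulative Prometheus bucket upper bounds ("le" values,
    ending with +Inf) into one representative value per bucket."""
    values = []
    for i, le in enumerate(les):
        if le == float("inf"):
            # +Inf bucket: reuse the preceding bucket's upper bound
            values.append(les[i - 1])
        elif i == 0:
            # first bucket: negative bound as-is, otherwise midpoint to zero
            values.append(le if le < 0 else le / 2.0)
        else:
            # all other buckets: midpoint between this bound and the previous one
            values.append((le + les[i - 1]) / 2.0)
    return values
```

Note that with these rules the representative values stay in the same order as the bucket bounds, even for negative bounds.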

(Member) commented:

Is that the same strategy that metricbeat is using to convert prometheus histograms?
Is that storage format optimizing for tdigest or hdrhistogram-based percentile aggregations?

@axw (Member) commented Jun 15, 2021

> Is that the same strategy that metricbeat is using to convert prometheus histograms?

Good question. Looking at https://github.com/elastic/beats/blob/master/x-pack/metricbeat/module/prometheus/collector/histogram.go, no - there's some similarity, but it's not the same. Also I miswrote the rules above, now corrected.

Metricbeat is doing the following:

  • for +Inf "le", interpolate by adding half the difference between the previous two buckets to the second last bucket.
  • for the first bucket, use the midpoint from the bucket's value to zero. I think that's probably a bug, as it can lead to incorrect ordering of histogram values: e.g. with buckets le=-100 and then le=-99, the first one will report as -50, and the second as -99.5
  • for all other buckets, use the midpoint from that bucket's value to the preceding bucket's
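The ordering concern in the first-bucket rule can be shown with two lines of arithmetic (a sketch of the rules as described above, not Metricbeat's actual Go code):

```python
# Metricbeat's first-bucket rule: midpoint from the bound to zero
first_bucket = -100.0 / 2.0              # le=-100 reports as -50.0

# the general midpoint rule, applied to the next bucket
second_bucket = (-100.0 + -99.0) / 2.0   # le=-99 reports as -99.5

# the first bucket's representative value now sorts ABOVE the second's
assert first_bucket > second_bucket
```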

@exekias WDYT about changing Metricbeat's implementation to what I outlined in #1165 (comment)? Or otherwise, can you elaborate on the rationale behind the current implementation? We should probably all do the same thing. In retrospect, it would be helpful if Elasticsearch accepted bucket ranges 😅

> Is that storage format optimizing for tdigest or hdrhistogram-based percentile aggregations?

t-digest. Prometheus histograms are general and may have negative values, which won't work with HDRHistogram.

(Member) commented:

At some point, we've discussed whether to encode in _meta whether the histogram is optimized for tdigest or hdrhistogram. Do you remember where we landed?
Seems relevant, as the default transaction duration metrics are stored with hdrhistogram in mind, whereas we need to use tdigest for Prometheus to account for negative values.

(Member) commented:

That was in an email thread with Lens folks. I think everyone agreed that it was a reasonable idea, but I haven't taken the idea any further yet. Lens will default to (like Elasticsearch) tdigest, and there's an issue open to support hdrhistogram: elastic/kibana#98499.

I've thought a little more about that since, and I think we would only ever end up setting it for known fields like transaction.duration.histogram. I'm not aware of any metrics APIs that restrict to positive-value histograms, which would be necessary to default to hdrhistogram instead. If there are any such metrics APIs, then it would be a reasonable extension, but for now I think we should just assume tdigest is going to be used for application metrics.

Comment:

I never thought about negative le values; is that a possibility at all? I think we can adjust the Metricbeat implementation if so. FYI @ChrsMark

@beniwohli (Contributor, Author) commented:

If the underlying measurement can be negative (energy flow, temperature, monetary values etc), the bucket limits also have to be negative, right?
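If measurements can go negative, bucket placement still works the same way: an observation lands in the first bucket whose upper bound is greater than or equal to the value. A stdlib sketch (not the agent's code):

```python
import bisect

# sorted upper bounds, including negative ones, ending with +Inf
buckets = [-20.0, -10.0, 0.0, 10.0, 20.0, float("inf")]

def bucket_for(value):
    # first bucket whose upper bound is >= value ("le" semantics)
    return buckets[bisect.bisect_left(buckets, value)]

# a negative measurement needs a negative bound to land in
assert bucket_for(-12.5) == -10.0
```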

Comment:

I did a bit more research, and negative buckets are indeed a possibility, although so badly supported that I don't think they can be used for real: https://github.com/prometheus/client_golang/blob/cb5c8ff250fc376cc4088d824a262565fde83612/prometheus/histogram.go#L50-L56

Anyway, I'm happy to adjust the implementation, as it won't change anything for positive buckets.

Any thoughts on why to put +Inf on the last bucket value? My intent with interpolating a higher value was to move the needle as much as the previous midpoints do. Again, if many values end up in the +Inf bucket, that's considered a bad config.

(Member) commented:

> Any thoughts on why putting +Inf on the last bucket value? My intent with interpolating a higher value so it moves the needle as much as previous midpoints. Again... if many values are ending in the +Inf bucket that's considered a bad config.

I assume this question is for me, based on the description in my initial comment. It's not +Inf on the last bucket, but the second-last bucket's upper bound. I mainly chose that following what Prometheus's histogram_quantile function does:

> If a quantile is located in the highest bucket, the upper bound of the second highest bucket is returned.

My thinking is that since we don't have any insight into that last bucket, reporting anything greater than the second-last bucket's upper limit risks overstating the value. Reporting the second-last bucket's upper limit is a bit more conservative, ensuring that we report a valid lower bound for percentile queries that fall beyond the configured buckets.

Comment:

Thanks for the detailed explanation, I've created elastic/beats#26903 to adjust Beats implementation 👍

```python
if counter is not noop_metric:
    val = counter.val
    if val or not counter.reset_on_collect:
        samples[labels].update({name: {"value": val, "type": "counter"}})
```
(Member) commented:

I think we should omit "type": "counter" if the counter is resetting; just omit "type" altogether for now. Counter metrics are expected to be monotonically increasing.
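The suggestion amounts to only tagging non-resetting counters, since those are the ones that are actually monotonic. A sketch with a hypothetical helper (not the agent's code):

```python
def counter_sample(name, val, reset_on_collect):
    """Build a metric sample dict, tagging "type": "counter" only when
    the counter never resets and is therefore monotonically increasing."""
    sample = {"value": val}
    if not reset_on_collect:
        sample["type"] = "counter"
    return {name: sample}
```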

```diff
@@ -375,6 +396,46 @@ def val(self, value):
         self._val, self._count = value
 
 
+class Histogram(BaseMetric):
+    DEFAULT_BUCKETS = (0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10, float("inf"))
```
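A minimal sketch of how an explicit-bucket histogram could record a value against these bounds (class and method names are hypothetical, not the PR's implementation):

```python
import bisect
import threading

class SketchHistogram:
    """Minimal explicit-bucket histogram sketch (not the agent's class)."""

    DEFAULT_BUCKETS = (0.005, 0.01, 0.025, 0.05, 0.075, 0.1,
                       0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10, float("inf"))

    def __init__(self, buckets=DEFAULT_BUCKETS):
        self.buckets = buckets
        self.counts = [0] * len(buckets)
        self._lock = threading.Lock()

    def update(self, value):
        # the count goes to the first bucket whose upper bound >= value;
        # the trailing +Inf bound guarantees a bucket always exists
        with self._lock:
            self.counts[bisect.bisect_left(self.buckets, value)] += 1
```

For example, `update(0.3)` increments the le=0.5 bucket, and any value above 10 falls into the +Inf bucket.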
(Member) commented:

@axw what are the default buckets that APM Server creates for transaction duration metrics?

(Member) commented:

Too many to list! APM Server uses HDRHistogram with 2 significant figures by default.

@beniwohli beniwohli merged commit dec91b4 into elastic:master Jul 19, 2021
@axw (Member) left a comment

changes LGTM FWIW :) thanks

beniwohli added a commit to beniwohli/apm-agent-python that referenced this pull request Sep 14, 2021
* catch and log exceptions in the interval timer function

Instead of letting the thread die, we log the exception with
a stack trace to ease debugging.

* implement histogram metrics and a prometheus histogram bridge

The spec for histograms has been merged in
elastic/apm-server#5360

* trying to debug failure that only happens on CI...

* adapt prometheus histograms to use centroids instead of upper limit buckets

* move midpoint calculation into base metrics