Append unit to prometheus metric names #5400
The unit conversion function that converts OTLP units to Prometheus units now takes in the rawUnitName and PrometheusType instead of the OTLP metric data. This leads to a more accurate conversion of OTLP units to Prometheus units, since the type passed in is representative of the types recognized by Prometheus.
Codecov Report
Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
## main #5400 +/- ##
============================================
+ Coverage 91.31% 91.36% +0.05%
- Complexity 4892 4952 +60
============================================
Files 547 549 +2
Lines 14412 14503 +91
Branches 1354 1359 +5
============================================
+ Hits 13160 13251 +91
Misses 866 866
Partials 386 386
...ers/prometheus/src/main/java/io/opentelemetry/exporter/prometheus/PrometheusUnitsHelper.java
A static Map would be more memory intensive than a switch-case and does not give any performance boost by comparison.
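As a rough illustration of the switch-case approach mentioned above, here is a minimal sketch. The class name, method name, and the handful of units shown are assumptions for illustration only; this is not the actual PrometheusUnitsHelper from the PR.

```java
// Hypothetical sketch; not the PrometheusUnitsHelper implementation from this PR.
final class UnitSwitchSketch {
  private UnitSwitchSketch() {}

  /** Maps a few common unit abbreviations to Prometheus-style unit names. */
  static String toPrometheusUnit(String rawUnit) {
    switch (rawUnit) {
      case "ms":
        return "milliseconds";
      case "s":
        return "seconds";
      case "By":
        return "bytes";
      case "KiBy":
        return "kibibytes";
      case "%":
        return "percent";
      default:
        // In this sketch, unknown units are passed through unchanged.
        return rawUnit;
    }
  }
}
```

A switch over string constants needs no extra map or entry objects at runtime, which is the memory argument made in the commit message.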
Down to minor comments.
Overall LGTM.
High level - we should probably optimise the string manipulation to involve fewer allocations over time. We can check w/ @jack-berg on whether or not we have a prometheus-exporter benchmark where we could measure allocations, etc. That can be follow-on work here.
private static String metricName(String rawMetricName, PrometheusType type) {
  String name = NameSanitizer.INSTANCE.apply(rawMetricName);
private static String metricName(MetricData rawMetric, PrometheusType type) {
  String name = NameSanitizer.INSTANCE.apply(rawMetric.getName());
If you move this line below the next `if`, you can probably get rid of the `cleanUpString` method in `PrometheusUnitsHelper`; there is no need to perform sanitization twice, after all. (And the performance will benefit from the cache in `NameSanitizer` as well.)
prometheusCompliant = prometheusCompliant.replaceAll("_+$", ""); // remove trailing underscores
prometheusCompliant = prometheusCompliant.replaceAll("^_+", ""); // remove leading underscores
What's the reason for these replacements? Prometheus does not seem to forbid double or trailing underscores in its spec; and it does not seem that the previous operations on the unit are likely to produce trailing or duplicated underscores.
The spec linked in the PR description states:
Multiple consecutive `_` characters MUST be replaced with a single `_` character
Based on that I removed the leading `_`, because the serializer was already adding one:
name = name + "_" + prometheusEquivalentUnit;
But looking at this again, I think this would be confusing to do in the `PrometheusUnitsHelper`, so I will remove it. Also, the spec does not explicitly mention that the final unit name cannot have a trailing `_`, so I think a leading `_` is fine. I will make changes.
Also, the current name sanitizer does not seem to remove consecutive `_`, so based on the spec this needs to be added as well.
TL;DR
- The `PrometheusUnitsHelper` will just return the computed unit (with illegal characters replaced with `_`) without any additional cleaning up.
- The sanitizing will be done by the serializer right before returning. The sanitizing will only take care of replacing any remaining illegal characters with `_` and then replacing consecutive `_` with a single underscore.
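The two sanitizing steps described in the TL;DR could look roughly like the sketch below. The class name, pattern names, and the exact set of characters treated as legal are assumptions made for illustration; the PR's serializer may differ.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the two-step sanitizing described above.
final class UnderscoreSanitizerSketch {
  // Assumption: letters, digits, ':' and '_' are legal; everything else is replaced.
  private static final Pattern INVALID_CHARACTERS = Pattern.compile("[^a-zA-Z0-9:_]");
  private static final Pattern CONSECUTIVE_UNDERSCORES = Pattern.compile("_{2,}");

  private UnderscoreSanitizerSketch() {}

  static String sanitize(String input) {
    // Step 1: replace any remaining illegal character with '_'.
    String replaced = INVALID_CHARACTERS.matcher(input).replaceAll("_");
    // Step 2: collapse runs of '_' into a single '_'.
    return CONSECUTIVE_UNDERSCORES.matcher(replaced).replaceAll("_");
  }
}
```

Precompiling the patterns as constants avoids recompiling a regex on every call, which `String.replaceAll` would otherwise do.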
I would like to mention here that, while not explicitly stated in the spec, this is how the opentelemetry-collector works. Should we try to keep this in sync with how the collector works?
return SANITIZE_LEADING_UNDERSCORES
    .matcher(
        SANITIZE_TRAILING_UNDERSCORES
            .matcher(
                SANITIZE_CONSECUTIVE_UNDERSCORES
                    .matcher(INVALID_CHARACTERS_PATTERN.matcher(string).replaceAll("_"))
Nit: as a future performance improvement we could probably replace all these regexes (and those in `NameSanitizer` too) with a single loop that removes all the unwanted characters.
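One possible shape for the single-loop idea suggested in this nit is sketched below. It assumes the same four steps as the regex chain quoted above (replace invalid characters, collapse consecutive underscores, strip leading and trailing underscores); the class name and the set of valid characters are hypothetical.

```java
// Hypothetical single-pass replacement for the chained regexes; not code from the PR.
final class SingleLoopSanitizerSketch {
  private SingleLoopSanitizerSketch() {}

  static String sanitize(String input) {
    StringBuilder sb = new StringBuilder(input.length());
    for (int i = 0; i < input.length(); i++) {
      char c = input.charAt(i);
      // Assumption: letters, digits, ':' and '_' are the valid characters.
      boolean valid =
          (c >= 'a' && c <= 'z')
              || (c >= 'A' && c <= 'Z')
              || (c >= '0' && c <= '9')
              || c == ':'
              || c == '_';
      char out = valid ? c : '_';
      // Skip an underscore that directly follows one already emitted.
      if (out == '_' && sb.length() > 0 && sb.charAt(sb.length() - 1) == '_') {
        continue;
      }
      sb.append(out);
    }
    // After collapsing, at most one leading and one trailing underscore remain.
    int start = 0;
    int end = sb.length();
    while (start < end && sb.charAt(start) == '_') {
      start++;
    }
    while (end > start && sb.charAt(end - 1) == '_') {
      end--;
    }
    return sb.substring(start, end);
  }
}
```

This version allocates one `StringBuilder` and one result string per call, instead of one intermediate string per regex pass.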
Couple of small comments but overall looks good. Test cases really help make it clear what's happening.
Sorry it took so long to get eyes on this!
static String metricName(MetricData rawMetric, PrometheusType type) {
  String name = NameSanitizer.INSTANCE.apply(rawMetric.getName());
  String prometheusEquivalentUnit =
      PrometheusUnitsHelper.getEquivalentPrometheusUnit(rawMetric.getUnit());
Seeing how this is used, we probably want to cache the unit conversion the same way `NameSanitizer` caches its responses. Maybe even cache the combination of the metric name, unit, and prometheus type to avoid repeatedly performing the concatenation / suffix logic below.
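A minimal sketch of the caching idea follows. It wraps an arbitrary conversion function in a `ConcurrentHashMap`; the class name and constructor are assumptions for illustration, and whether `NameSanitizer` caches in exactly this way is not confirmed here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical caching wrapper; the delegate stands in for the real unit conversion.
final class CachingConverterSketch {
  // Note: unbounded cache, acceptable only if the set of distinct units is small.
  private final Map<String, String> cache = new ConcurrentHashMap<>();
  private final Function<String, String> delegate;

  CachingConverterSketch(Function<String, String> delegate) {
    this.delegate = delegate;
  }

  String convert(String rawUnit) {
    // The delegate runs at most once per distinct unit; later calls hit the cache.
    return cache.computeIfAbsent(rawUnit, delegate);
  }
}
```

Because the number of distinct units in a process is typically small, an unbounded map may be acceptable here; caching full metric names would deserve more thought about cardinality.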
exporters/prometheus/src/main/java/io/opentelemetry/exporter/prometheus/Serializer.java
exporters/prometheus/src/test/java/io/opentelemetry/exporter/prometheus/SerializerTest.java
@jack-berg Thanks for the review!
...rometheus/src/main/java/io/opentelemetry/exporter/prometheus/PrometheusMetricNameMapper.java
A dedicated class to represent cache mapping keys prevents certain edge cases where String concatenation could yield the same result even when the individual Strings being concatenated are different.
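To illustrate the edge case, the sketch below shows a key class with field-wise equality. The class and field names are hypothetical, and the Prometheus type is modelled as a plain String to keep the example self-contained; the key class in the PR may look different.

```java
import java.util.Objects;

// Hypothetical cache key; compares each component separately instead of a joined String.
final class NameCacheKeySketch {
  private final String metricName;
  private final String unit;
  private final String type; // assumption: stands in for the PrometheusType enum

  NameCacheKeySketch(String metricName, String unit, String type) {
    this.metricName = metricName;
    this.unit = unit;
    this.type = type;
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof NameCacheKeySketch)) {
      return false;
    }
    NameCacheKeySketch that = (NameCacheKeySketch) other;
    return metricName.equals(that.metricName)
        && unit.equals(that.unit)
        && type.equals(that.type);
  }

  @Override
  public int hashCode() {
    return Objects.hash(metricName, unit, type);
  }
}
```

With plain concatenation, the pairs ("ab", "c") and ("a", "bc") both produce "abc" and would share a cache entry; with the key class they remain distinct.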
Fixes #4390
This PR appends the unit name to the Prometheus metric name generated by the exporter. The PR follows the guidelines from this portion of the spec and closely follows the implementation details of the OpenTelemetry Collector.
Difference between collector generated names and prometheus exporter
- The collector will drop the entire unit if it encounters `{` or `}`. In contrast, this PR will drop only the portions between `{}` and process the remaining string. This is done because the spec suggests dropping only the portions in `{}`.
- If the metric name starts with a number, the Collector simply prepends an `_` to it, whereas the current implementation (existing, not affected by this PR) replaces the number with a `_`. This is not changed by this PR.
- The Collector tokenizes both the input metric name and its unit. Each unit token is individually processed, and is appended to the final name if the name does not contain that token. To clarify, for a given metric and unit: with metric name `metric_total_hertz` and metric unit `hertz_total`, the collector will produce the final name `metric_total_hertz`. This PR will instead produce `metric_total_hertz_hertz_total`, because the spec suggests to avoid appending the unit if the `metric name already contains the unit` (here the name does not contain the whole unit `hertz_total`).
Notes:
- Removing leading and trailing `_` in the unit is not mandated by the spec, but the implementation of unit conversion in the collector does not allow for any leading or trailing `_`.
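The brace handling and the "append only if not already contained" rule described above can be sketched as follows. The class and method names are hypothetical, and the treatment of unbalanced braces is an assumption of this sketch rather than something stated in the PR.

```java
// Hypothetical sketch of the naming rules described in the PR description.
final class UnitSuffixSketch {
  private UnitSuffixSketch() {}

  /** Drops only the portions enclosed in braces and keeps the rest of the unit. */
  static String dropBracedPortions(String unit) {
    StringBuilder sb = new StringBuilder(unit.length());
    int depth = 0;
    for (int i = 0; i < unit.length(); i++) {
      char c = unit.charAt(i);
      if (c == '{') {
        depth++;
      } else if (c == '}') {
        // Assumption: a stray '}' without a matching '{' is simply dropped.
        if (depth > 0) {
          depth--;
        }
      } else if (depth == 0) {
        sb.append(c);
      }
    }
    return sb.toString();
  }

  /** Appends the unit unless it is empty or the name already contains the whole unit. */
  static String appendUnit(String name, String unit) {
    if (unit.isEmpty() || name.contains(unit)) {
      return name;
    }
    return name + "_" + unit;
  }
}
```

For the example from the description, `appendUnit("metric_total_hertz", "hertz_total")` yields `metric_total_hertz_hertz_total`, since the name does not contain the full unit string.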