Initial draft of Prometheus <-> OTLP datamodel specification. #2017
@@ -988,6 +988,125 @@ For comparison, see the simple logic used in
[statsd sums](https://github.com/statsd/statsd/blob/master/stats.js#L281)
where all points are added, and lost points are ignored.

## Prometheus Compatibility

**Status**: [Experimental](../document-status.md)

This section describes how to convert Prometheus scraped metrics to the
OpenTelemetry metric data model and how to create Prometheus metrics from
OpenTelemetry metric data.

### Label Mapping

Prometheus metric labels are split in OpenTelemetry across Resource attributes
and Metric data stream attributes. Some labels are used within metric
families to denote semantics which OpenTelemetry captures within the structure
of a data point. When mapping from Prometheus to OpenTelemetry, any label
that is not explicitly called out as being handled specially will be included
in the set of attributes for a metric data stream.

Intentional

no, good catch!

Here is a table of the set of Prometheus labels that are lifted into Resource
attributes when converting into OpenTelemetry.

| Prometheus Label     | OTLP Resource Attribute | Description |
| -------------------- | ----------------------- | ----------- |
| `job`                | `service.name`          | [Semantic convention](../resource/semantic_conventions/README.md#Service) |
| `host`               | `host.name`             | [Semantic convention](../resource/semantic_conventions/host.md) |
| `instance`           | `instance`              | ... |
| `port`               | `port`                  | ... |
| `__scheme__`         | `scheme`                | ... |

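As a rough illustration of the label lifting in the table above, the following Python sketch splits a scraped label set into Resource attributes and data stream attributes. The names `RESOURCE_LABELS` and `split_labels` are hypothetical and exist only for this example; only the label pairs themselves come from the table. Labels that receive other special handling are not covered here.

```python
# Hypothetical helper: the mapping mirrors the table above, nothing more.
RESOURCE_LABELS = {
    "job": "service.name",
    "host": "host.name",
    "instance": "instance",
    "port": "port",
    "__scheme__": "scheme",
}


def split_labels(labels):
    """Split Prometheus labels into (resource_attributes, stream_attributes)."""
    resource, stream = {}, {}
    for key, value in labels.items():
        if key in RESOURCE_LABELS:
            # Lifted into the Resource under its OTLP attribute name.
            resource[RESOURCE_LABELS[key]] = value
        else:
            # Everything else stays on the metric data stream.
            stream[key] = value
    return resource, stream
```

For instance, `split_labels({"job": "api", "code": "200"})` would place `service.name` on the Resource and leave `code` on the data stream.
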
Next, the following labels are "special" and are used when converting from a
metric family to a specific OTLP metric data point:

- The `le` label is used to identify histogram bucket boundaries and counts.
- The `quantile` label is used to identify quantile points in summary metrics.
- `__name__` is used to identify the metric name of the data point.
- `__metrics_path__` is ignored in OTLP.

Except for

The collector seems to use

Additionally, Prometheus metric labels must match the following regex:
`[a-zA-Z_:]([a-zA-Z0-9_:])*`. Metrics from OpenTelemetry with unsupported
Attribute names should replace invalid characters with the `_` character. This
may cause ambiguity in scenarios where multiple similar-named attributes share
invalid characters at the same location. This is considered an unsupported
case, and is highly unlikely.

I would suggest that we take a special case here: if the first character of the label key is a digit ( For example:

Rather than:

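The replacement rule can be sketched in Python as follows. This is a minimal sketch that applies the regex quoted in the text literally; the function name `sanitize_label_name` is hypothetical. Replacing a leading digit with `_` is one literal reading of "replace invalid characters" and is an assumption, since the text does not settle how a leading digit should be treated. Note also that Prometheus's own data model documentation reserves colons for metric names, so a stricter pattern may be appropriate for label names.

```python
import re

# Any character outside the set allowed by the regex quoted in the text.
_INVALID_CHAR = re.compile(r"[^a-zA-Z0-9_:]")


def sanitize_label_name(name):
    """Replace characters that the quoted label regex does not allow with '_'."""
    sanitized = _INVALID_CHAR.sub("_", name)
    # Assumption: a digit is valid later in the name but not as the first
    # character, so it is treated like any other invalid character.
    if sanitized and sanitized[0].isdigit():
        sanitized = "_" + sanitized[1:]
    return sanitized
```

The ambiguity mentioned above is easy to reproduce: `http.method` and `http-method` both sanitize to `http_method`.
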
### Prometheus Metric points to OTLP

Prometheus allows metrics to be reported in "metric family" groups. While
OpenTelemetry can assume some shape/structure to metric family groups, any
metric belonging to a family that is not treated specially should be exported
as its own metric stream. For example, while histograms are expected to
live in a metric family with metrics `{name}_sum`, `{name}_count` and `{name}_bucket`,
any other metric also reported in the family should be exported as an
independent metric in OpenTelemetry.

According to the Prometheus best practices for Histogram and Summaries, we should include unit:

TODO - A bit about detecting/using start_time.

Prometheus Counter becomes an OTLP Sum.

Prometheus Gauge becomes an OTLP Gauge.

Prometheus Unknown becomes an OTLP Gauge.

Prometheus Histogram becomes an OTLP Histogram.

Prometheus Summary becomes an OTLP Summary.

Prometheus Gauge Histogram is dropped (TBD).

Prometheus Stateset is dropped (TBD).

Prometheus Info is dropped (TBD).

Considering Prometheus Info metrics are typically represented with a constant 1, I could see this easily mapping into an OTel Gauge. If I remember correctly, they're already exported as Gauges. (So maybe it doesn't need to be specified.)

OK, looking at the Java client, it used to be part of the Gauge metric family, but now it's been moved to Unknown. So based on our reference, either way it should be a gauge.

We should specify it in some way. I think coming in as a gauge works, we just need a way to do OpenMetrics -> OTLP -> OpenMetrics here.

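The type mapping listed above can be written as a small lookup table. The sketch below assumes the Prometheus metric type arrives as a lowercase string such as `counter` or `gaugehistogram`; those string keys, the `PROM_TO_OTLP` constant and the `otlp_kind` function are hypothetical names for illustration. Types currently marked as dropped (TBD) map to `None`.

```python
# Hypothetical lookup table; keys are assumed lowercase type names.
PROM_TO_OTLP = {
    "counter": "Sum",
    "gauge": "Gauge",
    "unknown": "Gauge",
    "histogram": "Histogram",
    "summary": "Summary",
    "gaugehistogram": None,  # dropped (TBD)
    "stateset": None,        # dropped (TBD)
    "info": None,            # dropped (TBD)
}


def otlp_kind(prom_type):
    """Return the OTLP point kind for a Prometheus type, or None if dropped."""
    # Raises KeyError for a type that is not listed in the text.
    return PROM_TO_OTLP[prom_type.lower()]
```
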
### OTLP Metric points to Prometheus

OpenTelemetry Gauge becomes a Prometheus Gauge.

TODO: Example Gauge Conversions

OpenTelemetry Sum follows this logic:

- If the aggregation temporality is cumulative and the sum is monotonic,
  then it becomes a Prometheus Counter.
- Otherwise the Sum becomes a Prometheus Gauge.

Should we specify this on the reverse, that Prometheus Counter becomes a cumulative monotonic OTLP Sum?

Yep

TODO: Example Sum Conversions

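The Sum rule above reduces to a two-branch decision, sketched here in Python. The function name and the string values used for temporality (`"cumulative"`, `"delta"`) and for the resulting Prometheus type are assumptions made for this example, not identifiers taken from any library.

```python
def prometheus_type_for_sum(is_monotonic, temporality):
    """Pick the Prometheus metric type for an OTLP Sum point."""
    # Only a cumulative, monotonic Sum behaves like a Prometheus counter.
    if temporality == "cumulative" and is_monotonic:
        return "counter"
    # Delta sums and non-monotonic sums fall back to a gauge.
    return "gauge"
```
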
OpenTelemetry Histogram becomes a metric family with the following:

- A single `{name}_count` metric denoting the count field of the histogram.
  All attributes of the histogram point are converted to Prometheus labels.
- `{name}_sum` metric denoting the sum field of the histogram, reported
  only if the sum is positive and monotonic. All attributes of the histogram
  point are converted to Prometheus labels.
- A series of `{name}_bucket` metric points that contain all attributes of the
  histogram point recorded as labels. Additionally, a label, denoted as `le`,
  is added denoting a bucket boundary, and having its value be the stringified
  floating point value of bucket boundaries, starting from lowest to highest,
  and all being non-negative. The value of each point is the sum of the count
  of all histogram buckets up to the boundary reported in the `le` label.
  Each of these points will include a single exemplar that falls within its
  `le` label and no other `le` labelled point.

Prometheus doesn't have a separate notion of "unit". Based on the Prometheus Metric and Label naming convention, it seems the general guidance is to append the unit as a suffix. For example:

Given that Prometheus is moving in the direction of OpenMetrics, which does support units as a concept (https://github.com/OpenObservability/OpenMetrics), we may want to suggest that path: not changing the metric name, but instead recording the unit as metadata on the metric.

_Note: OpenTelemetry DELTA histograms are not exported to Prometheus._

TODO: Example Histogram conversion

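A minimal Python sketch of the histogram conversion follows. It takes the explicit bounds and per-bucket counts of one OTLP histogram point and returns `(series_name, labels, value)` tuples. Several details are assumptions made for this example: the function and parameter names are hypothetical, the point's count is taken to equal the sum of its bucket counts, "positive" is read as non-negative, monotonicity of the sum is supplied by the caller, and a final `le="+Inf"` bucket is emitted for the overflow bucket. The bucket series use the `_bucket` suffix that Prometheus's exposition format expects for histograms. Exemplars are not handled.

```python
def histogram_to_prometheus(name, attributes, explicit_bounds, bucket_counts,
                            total_sum, sum_monotonic=True,
                            temporality="cumulative"):
    """Convert one OTLP histogram point into Prometheus-style samples."""
    if temporality != "cumulative":
        return []  # delta histograms are not exported
    if len(bucket_counts) != len(explicit_bounds) + 1:
        raise ValueError("expected one more bucket count than bounds")

    # Assumption: the point's count equals the sum of its bucket counts.
    samples = [(f"{name}_count", dict(attributes), sum(bucket_counts))]
    # Assumption: "positive" is interpreted as non-negative here.
    if sum_monotonic and total_sum >= 0:
        samples.append((f"{name}_sum", dict(attributes), total_sum))

    # Assumption: the overflow bucket is reported with le="+Inf".
    bounds = [str(float(b)) for b in explicit_bounds] + ["+Inf"]
    running = 0
    for le, count in zip(bounds, bucket_counts):
        running += count  # Prometheus buckets are cumulative
        samples.append((f"{name}_bucket", {**attributes, "le": le}, running))
    return samples
```

With bounds `[0.5, 1.0]` and bucket counts `[2, 3, 1]`, the bucket samples carry the running totals 2, 5 and 6.
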
OpenTelemetry Summary becomes a metric family with the following:

- A single `{name}_count` metric denoting the count field of the summary.
  All attributes of the summary point are converted to Prometheus labels.
- `{name}_sum` metric denoting the sum field of the summary, reported
  only if the sum is positive and monotonic. All attributes of the summary
  point are converted to Prometheus labels.
- A series of `{name}` metric points that contain all attributes of the
  summary point recorded as labels. Additionally, a label, denoted as
  `quantile`, is added denoting a reported quantile point, and having its value
  be the stringified floating point value of quantiles (between 0.0 and 1.0),
  starting from lowest to highest, and all being non-negative. The value of
  each point is the computed value of the quantile point.

TODO: Example Summary conversion

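The summary conversion can be sketched the same way. In this Python sketch the function and parameter names are hypothetical, quantiles are passed as `(quantile, value)` pairs, "positive" is read as non-negative, and monotonicity of the sum is supplied by the caller; all of these are assumptions made for illustration.

```python
def summary_to_prometheus(name, attributes, count, total_sum,
                          quantile_values, sum_monotonic=True):
    """Convert one OTLP summary point into Prometheus-style samples."""
    samples = [(f"{name}_count", dict(attributes), count)]
    # Assumption: "positive" is interpreted as non-negative here.
    if sum_monotonic and total_sum >= 0:
        samples.append((f"{name}_sum", dict(attributes), total_sum))

    # Emit quantile points from lowest to highest quantile.
    for quantile, value in sorted(quantile_values):
        if not 0.0 <= quantile <= 1.0:
            raise ValueError("quantile must be between 0.0 and 1.0")
        labels = {**attributes, "quantile": str(float(quantile))}
        samples.append((name, labels, value))
    return samples
```
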
## Footnotes

<a name="otlpdatapointfn">[1]</a>: OTLP supports data point kinds that do not

In general, shouldn't Prometheus be uppercase throughout?
Yeah, this was hastily written, will go through and fix.