Proposal for flattening attributes from OTLP messages #21

Merged
merged 11 commits into from
Aug 17, 2022
4 changes: 4 additions & 0 deletions specification/common/README.md
@@ -59,6 +59,10 @@ See [Requirement Level](attribute-requirement-level.md) for requirement levels g
See [this document](attribute-type-mapping.md) to find out how to map values obtained
outside OpenTelemetry into OpenTelemetry attribute values.

See [Attribute precedence for non-OTLP exporters](attribute-precedence.md) to
find out how to transform a structured representation like OTLP to a flat set of
unique attributes.

### Attribute Limits

Execution of erroneous code can result in unintended attributes. If there are no
175 changes: 175 additions & 0 deletions specification/common/attribute-precedence.md
@@ -0,0 +1,175 @@
# Attribute Precedence on transformation to non-OTLP formats

**Status**: [Experimental](../document-status.md)

<details>
<summary>Table of Contents</summary>

<!-- toc -->

- [Overview](#overview)
- [Attribute hierarchy in OTLP messages](#attribute-hierarchy-in-otlp-messages)
- [Precedence per Signal](#precedence-per-signal)
  * [Traces](#traces)
  * [Metrics](#metrics)
  * [Logs](#logs)
- [Considerations](#considerations)
- [Example](#example)
- [Useful links](#useful-links)

<!-- tocstop -->

</details>

## Overview

This document provides supplementary guidelines for the attribute precedence
that exporters should follow when translating from the hierarchical OTLP format
to non-hierarchical formats.

A mapping is required when flattening out attributes from the structured OTLP
format, which has attributes at different levels (e.g., Resource attributes,
InstrumentationScope attributes, attributes on Spans/Metrics/Logs) to a
non-hierarchical representation (e.g., OpenMetrics labels).
In the case of OpenMetrics, the set of labels is completely flat and label names
must be unique within it (see the
[OpenMetrics LabelSet definition](https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#labelset)).
Since OpenTelemetry allows attributes on several levels, the same attribute key
can appear multiple times on different levels.

> **Review comment (Collaborator):** Reviewers upstream might point out that the current semantic conventions always define attributes to either be on the resource OR the telemetry signal. It might be worth pointing out in the PR description that this also covers custom attributes, which might appear anywhere.


This document aims to provide guidance on how OpenTelemetry attributes can be
consistently mapped to flat sets.

## Attribute hierarchy in OTLP messages

Since the OTLP format is a hierarchical format, there is an inherent order in
the attributes.
In this document,
[Resource](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/sdk.md)
attributes are considered to be at the top of the hierarchy, since they are the
most general attributes.
Attributes on individual Spans/Metric data points/Logs are at the bottom of the
hierarchy, as they are most specialized and only apply to a subset of all data.

**A more specialized attribute that shares an attribute key with a more general
attribute takes precedence.**

In some cases, overwriting a more general attribute in this way is the desired
behavior.
<!-- TODO example -->

When de-normalizing an OTLP message to a flat set of key-value pairs,
attributes that are present on the Resource and InstrumentationScope levels will
be duplicated for each Span/Metric data point/Log.
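
The precedence rule and the duplication described above can be sketched in a few lines of Python. This is only an illustration, not part of the proposal: the function name `flatten_attributes` and the attribute keys and values are made up, and attribute maps are assumed to be plain dictionaries rather than OTLP protobuf messages.

```python
from typing import Any, Dict, Mapping


def flatten_attributes(*levels: Mapping[str, Any]) -> Dict[str, Any]:
    """Merge attribute maps ordered from most general to most specialized.

    Hypothetical helper: later (more specialized) levels overwrite
    earlier (more general) ones when keys clash.
    """
    flat: Dict[str, Any] = {}
    for level in levels:
        flat.update(level)
    return flat


# Illustrative values only; none of these keys are mandated by the proposal.
resource_attrs = {"service.name": "checkout", "team": "platform"}
scope_attrs = {"team": "payments"}
span_attrs = {"http.method": "GET"}

flat = flatten_attributes(resource_attrs, scope_attrs, span_attrs)
```

Because the merge creates a new dictionary on every call, the Resource and InstrumentationScope attributes end up copied into each flattened Span, Metric data point or Log, which is the duplication mentioned above.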

## Precedence per Signal

Below, the precedence for each of the signals is spelled out explicitly.
Only Spans, Metric data points and LogRecords are considered.
Span Links, Span Events and Metric Exemplars need to be considered differently,
as conflicting entries there can lead to problematic data loss.
Consider a `http.host` attribute on a Span Link, which identifies the host of a
linked Span.
Following the "more specialized overwrites more general" suggestion would
overwrite the `http.host` attribute of the Span itself, which is information
that is likely worth keeping.
Consider transferring attributes on Span Links, Span Events and Metric Exemplars
separately from the parent Span/Metric data point.
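
One possible way to keep such attributes separate is sketched below. The output shape (an `attributes` map plus a `links` list) and the name `flatten_span_with_links` are assumptions made for this illustration; the document does not prescribe how an exporter should transfer link attributes.

```python
from typing import Any, Dict, Mapping, Sequence


def flatten_span_with_links(
    resource_attrs: Mapping[str, Any],
    scope_attrs: Mapping[str, Any],
    span_attrs: Mapping[str, Any],
    link_attrs: Sequence[Mapping[str, Any]],
) -> Dict[str, Any]:
    """Flatten Span attributes, but keep Span Link attributes apart.

    Hypothetical helper: link attributes are never merged into the
    Span's own attribute set, so they cannot overwrite it.
    """
    return {
        "attributes": {**resource_attrs, **scope_attrs, **span_attrs},
        "links": [dict(attrs) for attrs in link_attrs],
    }


record = flatten_span_with_links(
    resource_attrs={"service.name": "gateway"},
    scope_attrs={},
    span_attrs={"http.host": "gateway.example.com"},
    link_attrs=[{"http.host": "billing.example.com"}],
)
```

Here both `http.host` values survive: the Span keeps its own host, and the host of the linked Span stays attached to the link.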


`A > B` denotes that the attribute on `A` will overwrite the attribute on `B`
if the keys clash.

### Traces

```
Span.attributes > ScopeSpans.scope.attributes > ResourceSpans.resource.attributes
```
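
As a rough illustration of this ordering, Python's `collections.ChainMap` resolves a key in the first mapping that contains it, so listing the maps from most specialized to most general mirrors the `>` notation. The attribute keys and values below are invented for the sketch, and plain dictionaries stand in for the OTLP messages.

```python
from collections import ChainMap

# Illustrative attribute maps; plain dicts stand in for OTLP messages.
resource_attrs = {"service.name": "frontend", "region": "resource-region"}
scope_attrs = {"region": "scope-region", "flavor": "scope-flavor"}
span_attrs = {"region": "span-region"}

# Lookup order: Span first, then scope, then resource.
view = ChainMap(span_attrs, scope_attrs, resource_attrs)

# Materialize the flat, unique set of attributes for the Span.
flat_span_attributes = dict(view)
```

The `ChainMap` itself is only a view over the three maps; converting it to a `dict` produces the flat set that a non-OTLP exporter would send.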

### Metrics

Metrics are different from Spans and LogRecords, as each Metric has a data field
which can contain one or more data points.
Each data point has a set of attributes, which need to be considered
independently.

```
Metric.data.data_points.attributes > ScopeMetrics.scope.attributes > ResourceMetrics.resource.attributes
```
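
Treating each data point independently could look like the sketch below. The `Metric` dataclass is a simplified stand-in invented for this example (the real OTLP `Metric` message has more fields and several data types), and `flatten_metric` is a hypothetical helper name.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Mapping


@dataclass
class Metric:
    """Simplified stand-in for an OTLP Metric: a name plus one
    attribute map per data point."""
    name: str
    data_points: List[Dict[str, Any]] = field(default_factory=list)


def flatten_metric(
    resource_attrs: Mapping[str, Any],
    scope_attrs: Mapping[str, Any],
    metric: Metric,
) -> Iterator[Dict[str, Any]]:
    """Yield one flat attribute set per data point."""
    for point_attrs in metric.data_points:
        # Data point attributes win over scope, scope wins over resource.
        yield {**resource_attrs, **scope_attrs, **point_attrs}


metric = Metric(
    name="requests",
    data_points=[{"status": "ok"}, {"status": "error", "zone": "b"}],
)
flat_points = list(flatten_metric({"zone": "a"}, {"lib": "demo"}, metric))
```

Each data point gets its own copy of the resource and scope attributes, so an overwrite on one data point does not affect the others.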

### Logs

```
LogRecord.attributes > ScopeLogs.scope.attributes > ResourceLogs.resource.attributes
```

## Considerations

Note that this precedence is a strong suggestion, not a requirement.
Code that transforms attributes should follow this mode of flattening, but may
diverge if there is a reason to do so.

## Example

The following is a theoretical YAML-like representation of an OTLP message which
has attributes with attribute names that clash on multiple levels.

```yaml
ResourceMetrics:
  resource:
    attributes:
      # key-value pairs (attributes) on the resource
      attribute1: resource-attribute-1
      attribute2: resource-attribute-2
      attribute3: resource-attribute-3
      service.name: my-service

  scope_metrics:
    scope:
      attributes:
        attribute1: scope-attribute-1
        attribute2: scope-attribute-2

    metrics:
      # there can be multiple data entries here.
      data/0:
        data_points:
          # each data can have multiple data points:
          data_point/1:
            attributes:
              # will overwrite scope and resource attribute
              attribute1: data-point-1-attribute-1

          data_point/2:
            attributes:
              # will overwrite
              attribute1: data-point-2-attribute-1
```
```

The structure above contains two data points, thus there will be two data points
in the output.
Their attributes will be:

```yaml
# data point 1
service.name: my-service # from the resource
attribute1: data-point-1-attribute-1 # overwrites attribute1 on resource & scope
attribute2: scope-attribute-2 # overwrites attribute2 on resource
attribute3: resource-attribute-3 # from the resource, not overwritten

# data point 2
service.name: my-service # from the resource
attribute1: data-point-2-attribute-1 # overwrites attribute1 on resource & scope
attribute2: scope-attribute-2 # overwrites attribute2 on resource
attribute3: resource-attribute-3 # from the resource, not overwritten
```
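
The result above can be reproduced with a short Python sketch that also records which levels were overwritten, matching the comments in the output. The function `flatten_with_provenance` and the level names `resource`, `scope` and `data_point` are hypothetical and exist only for this illustration.

```python
from typing import Any, Dict, List, Mapping, Sequence, Tuple

Level = Tuple[str, Mapping[str, Any]]


def flatten_with_provenance(
    levels: Sequence[Level],
) -> Tuple[Dict[str, Any], Dict[str, List[str]]]:
    """Flatten levels ordered from general to specialized.

    Returns the flat attributes and, per key, the names of the
    levels whose value was overwritten.
    """
    flat: Dict[str, Any] = {}
    source: Dict[str, str] = {}
    overwritten: Dict[str, List[str]] = {}
    for level_name, attrs in levels:
        for key, value in attrs.items():
            if key in flat:
                overwritten.setdefault(key, []).append(source[key])
            flat[key] = value
            source[key] = level_name
    return flat, overwritten


# Attribute values taken from the example message above.
resource = {
    "attribute1": "resource-attribute-1",
    "attribute2": "resource-attribute-2",
    "attribute3": "resource-attribute-3",
    "service.name": "my-service",
}
scope = {"attribute1": "scope-attribute-1", "attribute2": "scope-attribute-2"}
data_point_1 = {"attribute1": "data-point-1-attribute-1"}

flat_1, overwritten_1 = flatten_with_provenance(
    [("resource", resource), ("scope", scope), ("data_point", data_point_1)]
)
```

For data point 1, `overwritten_1` lists `resource` and `scope` for `attribute1` and only `resource` for `attribute2`, in line with the annotated output. An exporter that chooses to diverge from the suggested precedence could use this kind of bookkeeping to log or handle clashes explicitly.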

## Useful links

* [Trace Proto](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/trace/v1/trace.proto)
* [Metrics Proto](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/metrics/v1/metrics.proto)
* [Logs Proto](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/logs/v1/logs.proto)
* [Resource Proto](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/resource/v1/resource.proto)