Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Common] Spec inconsistency with proto definition of Attributes #2581

Open
MSNev opened this issue May 25, 2022 · 8 comments
Open

[Common] Spec inconsistency with proto definition of Attributes #2581

MSNev opened this issue May 25, 2022 · 8 comments
Assignees
Labels
spec:trace Related to the specification/trace directory

Comments

@MSNev
Copy link
Contributor

MSNev commented May 25, 2022

The proto definition used for attributes uses "KeyValue" which can have a value of AnyValue, however, the spec definition of "Attribute" only allows for primitives or an Array of primitives, this is inconsistent.

What did you expect to see?

  • They should be defined in the same way or have their own distinct definitions

Additional context.

The Spec Attribute definition

Trace proto attribute definition

  // attributes is a collection of key/value pairs. Note, global attributes
  // like server name can be set using the resource API. Examples of attributes:
  //
  //     "/http/user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
  //     "/http/server_latency": 300
  //     "abc.com/myattribute": true
  //     "abc.com/score": 10.239
  //
  // The OpenTelemetry API specification further restricts the allowed value types:
  // https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/common/common.md#attributes
  // Attribute keys MUST be unique (it is not allowed to have more than one
  // attribute with the same key).
  repeated opentelemetry.proto.common.v1.KeyValue attributes = 9;

And the Logs spec also uses Attributes

Logs proto attribute definition

  // Additional attributes that describe the specific event occurrence. [Optional].
  // Attribute keys MUST be unique (it is not allowed to have more than one
  // attribute with the same key).
  repeated opentelemetry.proto.common.v1.KeyValue attributes = 6;

With Proto KeyValue defined as taking AnyValue

// KeyValue is a key-value pair that is used to store Span attributes, Link
// attributes, etc.
message KeyValue {
  string key = 1;
  AnyValue value = 2;
}

And Proto AnyValue being

// AnyValue is used to represent any type of attribute value. AnyValue may contain a
// primitive value such as a string or integer or it may contain an arbitrary nested
// object containing arrays, key-value lists and primitives.
message AnyValue {
  // The value is one of the listed fields. It is valid for all values to be unspecified
  // in which case this AnyValue is considered to be "empty".
  oneof value {
    string string_value = 1;
    bool bool_value = 2;
    int64 int_value = 3;
    double double_value = 4;
    ArrayValue array_value = 5;
    KeyValueList kvlist_value = 6;
    bytes bytes_value = 7;
  }
}

And Proto KeyValueList being

// KeyValueList is a list of KeyValue messages. We need KeyValueList as a message
// since `oneof` in AnyValue does not allow repeated fields. Everywhere else where we need
// a list of KeyValue messages (e.g. in Span) we use `repeated KeyValue` directly to
// avoid unnecessary extra wrapping (which slows down the protocol). The 2 approaches
// are semantically equivalent.
message KeyValueList {
  // A collection of key/value pairs of key-value pairs. The list may be empty (may
  // contain 0 elements).
  // The keys MUST be unique (it is not allowed to have more than one
  // value with the same key).
  repeated KeyValue values = 1;
}
@Oberon00
Copy link
Member

This is intentional. Maybe we should add a note in the spec about this?

@MSNev
Copy link
Contributor Author

MSNev commented May 31, 2022

As the protocol supports nested attributes, what is wrong with expanding "all" Attributes definition to support nested attributes.
And for any backends that don't support nested attributes (thinking exporters) this would allow them to flattern out the attribute names.

This would also allow for unifying attributes usage across the board and usage, especially in relation to the forthcoming EventsApi, which could allow for have a single API for both logs and trace transport mechanisms

@scheler
Copy link
Contributor

scheler commented Jun 5, 2022

@MSNev can you list some pros and cons of JSON.stringify'ing those nested objects as strings?

@MSNev
Copy link
Contributor Author

MSNev commented Jun 7, 2022

JSON.stringify() is not the specific driver for this spec issue. The issue is that the definitions are not consistent which means an Attribute is not always the same Attribute.

We should be treating all definitions of Attribute (supporting nested Attributes) as the same. And as protobuf is defined as supporting nested Attributes then this is "just" as spec issue (on the assumption that SDK's don't have this coded this way, but this should not cause any breaking changes as it's expanding the options.

The only potential issues will be for exporters that don't support nesting and that can be resolved by the specific exporter to flatten the attribute keys.

JSON.stringify() Side Note (for completeness)
Pros:
for JSON exporter is that using the built-in native serialization rather than creating coded versions is more efficient for CPU and having nested Attributes is more efficient for memory both pre and post serialization.
Cons:
JSON can be verbose, but this depends on the object(s) being serialized

@tigrannajaryan
Copy link
Member

The proto definition used for attributes uses "KeyValue" which can have a value of AnyValue, however, the spec definition of "Attribute" only allows for primitives or an Array of primitives, this is inconsistent.

As @Oberon00 mentioned, this is intentional. OTLP is at least as expressive as the Otel spec, but necessarily exactly as expressive as the spec. We allow OTLP to be more expressive because Otel API/SDK is not the only source of data that we want to represent in OTLP. Other data sources may require richer capabilities that OTLP does offer, in particular the ability to represent arbitrarily nested objects.

the spec definition of "Attribute" only allows for primitives or an Array of primitives, this is inconsistent.

This definition of "Attributes" currently applies to traces and metrics only, i.e. to the Stable parts of the spec. It does NOT apply to experimental logs, which has its own definition of what an attributes are and which allows nested objects.

As the protocol supports nested attributes, what is wrong with expanding "all" Attributes definition to support nested attributes.

This can be discussed separately. In the past there were objections to allowing this for traces and metrics.

So, in short the existing differences between traces/metrics and logs and between the spec and OTLP are known and intentional.

@scheler
Copy link
Contributor

scheler commented Jun 7, 2022

if LogRecord allows objects/key-value pairs for attribute values, then that is sufficient to address the requirements of RUM. @MSNev?

@MSNev
Copy link
Contributor Author

MSNev commented Jun 7, 2022

@tigrannajaryan That is where the inconsistencies are as the Spec also states here

"Resources, Metrics data points, Spans, Span Events, Span Links and Log Records may contain a collection of attributes.

From offline discussions, this section is also unclear as I read "Collection" as nested Attributes Map<> (coming from the protobuf def) vs as a single flat Array/List of Attributes.

@scheler While I found this as part of investigating the possibilities for the EventsAPI, this inconsistency is not specific to the RUM requirements. For Example the JS Sdk, supports nested objects as per the "Attribute Collections" definition.

@Oberon00
Copy link
Member

Oberon00 commented Jun 14, 2022

This was discussed numerous times. Nested attributes make it harder for backends to process (search, store, ...) them. Issue #376 is still open in fact, I think this is where this discussion belongs. (The issue also has two associated PRs with more discussion)

Leaving this open to track the inconsistency mentioned by @MSNev. But please continue discussion about whether nested attributes should be allowed for traces/metrics/resources (they are not now) to #376.

MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Jun 16, 2022
MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Jun 16, 2022
MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Jun 16, 2022
MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Jun 16, 2022
MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Jul 7, 2022
MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Aug 2, 2022
MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Aug 24, 2022
MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Aug 29, 2022
MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Sep 20, 2022
MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Sep 23, 2022
MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Oct 18, 2022
MSNev added a commit to MSNev/opentelemetry-specification that referenced this issue Nov 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spec:trace Related to the specification/trace directory
Projects
None yet
Development

No branches or pull requests

5 participants