(which is a general representation of a JSON object), with some embedded metadata -

```protobuf
message Value {
```
can you not use the existing AnyValue?
There are a few important differences - supporting `NullValue` and the 'nested' metadata fields (and not supporting `bytes`)
How is `NullValue` different from the "empty" value that `AnyValue` supports? What 'nested' metadata fields are you referring to? Do you mean `original_encoding` and others?
- That's right about "empty" value (missed it)
- You are correct about the metadata fields

```python
# Added method to `Span` class
def add_payload_attribute(
```
an alternative would be to have a special SDK-provided wrapper type passed to regular set_attribute, e.g.
p := otel.Payload(...)
span.set_attribute(key, p)
In this form you have better extensibility because new fields could be added to Payload type in the future, which you cannot do with function arguments without overloading.
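A minimal sketch of what the wrapper-type idea could look like in Python. The `Payload` and `Span` classes below are hypothetical stand-ins written for illustration, not an existing OpenTelemetry API; the metadata field names are borrowed from the proposed `Value` message.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Payload:
    """Hypothetical SDK-provided wrapper; not part of any released OTel API."""

    value: Any
    original_encoding: Optional[str] = None
    encoded_size: Optional[int] = None
    # Further optional fields could be added here later without touching
    # the signature of set_attribute().


class Span:
    """Stand-in for an SDK span, reduced to attribute storage."""

    def __init__(self) -> None:
        self.attributes: Dict[str, Any] = {}
        self.payloads: Dict[str, Payload] = {}

    def set_attribute(self, key: str, value: Any) -> None:
        # Payload wrappers are routed to separate storage so they can be
        # disabled or exported differently from regular attributes.
        if isinstance(value, Payload):
            self.payloads[key] = value
        else:
            self.attributes[key] = value


span = Span()
span.set_attribute("http.method", "POST")
span.set_attribute(
    "http.request.body",
    Payload({"user": {"id": 42}}, original_encoding="json", encoded_size=20),
)
```

With this shape the call site stays a plain `set_attribute`, and the SDK decides by type how to handle the value.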
I think it's a great idea to explore
For example, this will be required by a simple processor that filters specific keys in a map.
Also, it requires backends (and potentially processors) to unnecessarily attempt deserialization of every string attribute.

### Supporting nested map values in Span attributes
I think this alternative is not sufficiently explored. IMO, instead of adding a new set of attributes with a new type that mostly overlaps with `AnyValue`, this alternative can be more viable.
I would rather advocate that open-telemetry/opentelemetry-specification#376 is accepted and come up with a semantic convention to record the payload and associated metadata as a regular Span attribute.
This OTEP proposes an approach that requires much bigger changes, but I do not feel that the arguments in favour of it are strong enough.
I agree that there is room to explore this further.
Some ideas (which I can also add to the OTEP):
- We can consider having a 'core' type (like `AnyValue`) which could be extended here
- Even if "Support map values and nested values for attributes" (opentelemetry-specification#376) is approved, I think that this proposal introduces essential advantages -
  - Adding the metadata as semantic conventions instead is quite cumbersome. We need 4 extra tags to include the proposed metadata (`original_encoding`, `encoded_size`, `original_length` and `dropped_keys`). The last two should support nested values, so some keys will get duplicated. We may have more in the future.
  - Specifying attributes as payloads is important. For example, to disable all payload collection in Span SDK implementation, or in processors and exporters. Doing that by semantic conventions is less robust. And in general, naming tags like 'http.payload.request.body' isn't ideal.
- On the other hand, there are some drawbacks to supporting nested maps in general attributes (as discussed in the other issue). So this proposal could be another alternative to that.
To sum up, I think that uniting both attribute types is possible, but we have to make sure they will be generic enough.
Doing that could over-complicate the most common use of 'simple' attributes and require even bigger changes than adding payload attributes.
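To make the "4 extra tags" point concrete, here is a rough sketch of recording one payload and its metadata as regular flat attributes. The function name and the key-naming scheme (`<prefix>.original_encoding` and so on) are invented for illustration and are not an agreed semantic convention.

```python
from typing import Any, Dict, List


def payload_as_convention_attributes(
    prefix: str,
    value: str,
    original_encoding: str,
    encoded_size: int,
    original_length: int,
    dropped_keys: List[str],
) -> Dict[str, Any]:
    """Spread one payload and its metadata over flat attributes.

    The key names are hypothetical; they only show how many tags are needed.
    """
    return {
        prefix: value,
        f"{prefix}.original_encoding": original_encoding,
        f"{prefix}.encoded_size": encoded_size,
        f"{prefix}.original_length": original_length,
        f"{prefix}.dropped_keys": dropped_keys,
    }


attrs = payload_as_convention_attributes(
    "http.request.body",
    value='{"user": "alice"}',
    original_encoding="json",
    encoded_size=2048,
    original_length=12,
    dropped_keys=["password", "token"],
)
```

This only covers metadata for the top level of the payload. Describing shortened nested maps would need further keys per nested path, which is where the duplication comes from.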
// Optional - the original bytes encoding type of this value (e.g. json/yaml/avro/csv)
string original_encoding = 2;
// Optional - the size of the value as bytes encoded (including dropped data)
int64 encoded_size = 3;
What is the purpose of these fields given that the payload is deserialized and is recorded in a structured form in the `Value` message?
The original size could be useful for troubleshooting and for generating aggregations (the deserialized value might not include all the data, and recovering the original size from it would require re-encoding).
Also for example, sometimes we will not collect the entire payload (and use the `dropped_keys` field or `original_length`) - in these cases it's useful to know the original size and still be able to generate aggregations over it
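A small sketch of the partial-collection case, assuming a JSON payload that is shortened to a fixed number of keys. The `shorten_map` helper is hypothetical, and re-encoding with `json.dumps` only approximates the size of the bytes that were originally on the wire.

```python
import json
from typing import Any, Dict, Tuple


def shorten_map(
    payload: Dict[str, Any], max_keys: int
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Keep the first max_keys keys and describe what was dropped."""
    keys = list(payload)
    kept = {k: payload[k] for k in keys[:max_keys]}
    metadata = {
        "original_encoding": "json",  # assumption: the payload arrived as JSON
        # Size of the full payload as encoded bytes, including dropped data.
        "encoded_size": len(json.dumps(payload).encode("utf-8")),
        # Number of keys before shortening.
        "original_length": len(keys),
        "dropped_keys": keys[max_keys:],
    }
    return kept, metadata


kept, meta = shorten_map({"id": 1, "name": "a", "blob": "x" * 1000}, max_keys=2)
```

Even though `blob` is not collected, `encoded_size` still reflects it, so a backend could aggregate over payload sizes without having the full data.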
int64 original_length = 7;

// Set only for MapValue, in case some of the keys were dropped
repeated string dropped_keys = 8;
Unclear why this field is here instead of in the MapValue message.
Moved.
The original reason for putting it here is that `dropped_keys` should be set together with `original_length` for shortened maps. (`original_length` is here because it's also used for the `string` type.)
IMO this is a great capability to add to OTel. From our experience at Epsagon (and looking at other vendors in the space), this kind of data gave huge value to our customers. I do agree with @tigrannajaryan that the next step should probably be discussing and choosing the best approach to deliver this value. I also agree with @tigrannajaryan that the viable options from where I stand are either what is described in this OTEP or the smaller change described in open-telemetry/opentelemetry-specification#376, and not the other alternatives discussed here.
This OTEP proposes to add support for collecting payload data in spans, by adding a non-breaking
functionality to trace API, and to OTLP. As we show in this proposal, adding such data using
I think this is crucial and good that you called it out. Opt-in is definitely the way to go here, and non-breaking is important for that
Resolves open-telemetry#376

Use cases where this is necessary or useful:

1. Specify more than one resource in the telemetry: open-telemetry#579
2. Data coming from external source, e.g. AWS Metadata: open-telemetry#596 (comment) or open-telemetry#376 (comment)
3. Capturing HTTP headers: open-telemetry#376 (comment)
4. Structured stack traces: open-telemetry#2841
5. Payloads as attributes: open-telemetry/oteps#219 (comment)

This is a draft PR to see what the change looks like.

If this PR is merged it will be nice to follow it up with:

- A standard way of flattening maps and nested objects when converting from OTLP to formats that don't support maps/nested objects.
- Recommendations for semantic conventions to use/not use complex objects.

## Alternatives

There are other possible ways to encode payload data, using current Span attributes.
An additional alternative would be to define these as "payload" events and use the Log event as the transport. A Log event can have the current span associated with the log record. We are also expanding the definition of an event to include the generic `event.data`, which will include the "payload" of the event as defined by the schema (semantic conventions) of the `event.domain` / `event.name` combination.
The introduction of `event.data` is currently included in this PR #2926
I also think it could be a good alternative.
A possible advantage is collecting payloads asynchronously from the span, which is more efficient in many cases (for example, a span representing an HTTP request could be closed before the response is read).
On the other hand, this approach will make it harder to do searches based on span tags + payload tags in the backend.
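A rough sketch of that trade-off, using plain dataclasses rather than any real OpenTelemetry API. `SpanRecord`, `LogEvent` and the helper functions are hypothetical, and the `event.domain` / `event.name` values are made up for the example; only the attribute names come from the comment above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SpanRecord:
    """Hypothetical exported span, reduced to its identifiers and tags."""

    trace_id: str
    span_id: str
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class LogEvent:
    """Hypothetical log record that carries the payload for a span."""

    trace_id: str
    span_id: str
    attributes: Dict[str, Any] = field(default_factory=dict)


def payload_event(span: SpanRecord, name: str, data: Dict[str, Any]) -> LogEvent:
    # The event can be emitted after the span has ended; it is linked to the
    # span only through the trace and span identifiers.
    return LogEvent(
        span.trace_id,
        span.span_id,
        {"event.domain": "payload", "event.name": name, "event.data": data},
    )


def events_for_span(span: SpanRecord, events: List[LogEvent]) -> List[LogEvent]:
    # A backend has to join on the identifiers before it can search on
    # span tags and payload content together.
    return [
        e for e in events
        if e.trace_id == span.trace_id and e.span_id == span.span_id
    ]


span = SpanRecord("t1", "s1", {"http.method": "POST"})
events = [
    payload_event(span, "http.response.body", {"status": "ok"}),
    LogEvent("t1", "s2", {"event.name": "unrelated"}),
]
matched = events_for_span(span, events)
```

The join in `events_for_span` is the extra step that makes combined span-tag and payload searches harder in this model.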
I am concerned that this OTEP significantly expands the scope of the OpenTelemetry project, and would introduce significant complexity in all areas of the ecosystem. And, as mentioned, there are likely ways that already existing portions could be pressed into service for this kind of functionality (even if it isn't a perfect match today).

For me specifically, I would be concerned that this would introduce an additional burden onto SDK maintainers - we already must be extraordinarily careful about the performance implications of the SDKs we produce, but should we accept this OTEP we now must retain that same diligence and care in the face of possibly very large payload values. That is not straightforward and simple, and would likely require redesigns of portions of the SDKs to be even more efficient than they already are. Efficiency is good, of course, but I don't know if this in particular should be the forcing function for such rewrites. 😄

Perhaps it is better to first have broader conversations about what OpenTelemetry is and where the boundaries of its scope lie?
+1 👍
I would suggest:
I think that the scope should be "reasonable" sizes, that are performant enough to collect. Probably a few KBs per span. Definitely not "1 gigabyte of video payload" IMO :).
We can also provide a processor on the OTel Collector that will convert payload attributes into regular attributes as a workaround.
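As an illustration of what such a conversion could do, here is a minimal sketch that flattens a nested payload into dotted attribute keys. It is plain Python rather than Collector code, and the dotted-key scheme (including numeric list indices) is an assumption, not an agreed standard.

```python
from typing import Any, Dict


def flatten(prefix: str, value: Any) -> Dict[str, Any]:
    """Flatten nested dicts and lists into dotted attribute keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Scalars (str, int, bool, None, ...) become a single attribute.
        return {prefix: value}
    flat: Dict[str, Any] = {}
    for key, child in items:
        flat.update(flatten(f"{prefix}.{key}", child))
    return flat


flat = flatten("http.request.body", {"user": {"id": 42, "tags": ["a", "b"]}})
```

Note that in this sketch empty maps and lists produce no attributes at all, so the conversion is lossy for them.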
I just took a second pass at the OTEP, and I see that @tigrannajaryan's and my comments are largely not addressed. The proposal significantly expands the interface surface of OTEL, both in the API and in OTLP, and the justifications are not convincing: the authors admit that the same things could be achieved with the existing mechanisms and data structures. So why not explore that, and maybe propose some semantic conventions?

Even more importantly, capturing payloads is a big data privacy red flag. The OTEP does not discuss this at all. From the implementation perspective, where to stash the data is a much easier problem to solve than how to capture the data in a privacy-respecting way. And while it is possible to decouple these two problems, the actual value of this OTEP would only materialize if it provides a framework for instrumentation to capture payloads, not just an extension of API/SDK on where to record it. And a framework for instrumentation would absolutely have to deal with privacy questions.
Thanks @yurishkuro. About the privacy issue -
Hey @ronyis are you still working on this? If not, wanted to take and complete this as we need it as well :)
Not actively working on it, feel free to jump in
Closing this as the work is continued in open-telemetry/opentelemetry-specification#234
@ronyis - I don't see this work at all in open-telemetry/opentelemetry-specification#234 - I think you mean open-telemetry/opentelemetry-specification#234 .
In this OTEP, we propose a way to extend trace functionality to support payload collection.
We expect this change to also let instrumentations collect payloads in the future.
This is an important capability in our vision, which could help many OpenTelemetry users troubleshoot and debug applications much more effectively. We at Epsagon are also committed to pushing this through specifications and development.
We are glad to receive any feedback and to collaborate on this topic!