Trace Payload Collection #219

ronyis · 2022-09-14T14:07:05Z

In this OTEP, we propose a way for extending trace functionality to support payload collection.
We plan that this change will also enable collecting payload by instrumentations in the future.

This is an important capability in our vision, which could help many OpenTelemetry users to troubleshoot and debug applications much more effectively. We at Epsagon are also committed to pushing this through specifications and development.

We are glad to receive any feedback and to collaborate on this topic!

yurishkuro · 2022-09-14T15:25:32Z

text/trace/0219-payload-collection.md

+(which is a general representation of a JSON object), with some embedded metadata -
+
+```protobuf
+message Value {


can you not use the existing AnyValue?

https://github.com/open-telemetry/opentelemetry-proto/blob/724e427879e3d2bae2edc0218fff06e37b9eb46e/opentelemetry/proto/common/v1/common.proto#L28

There are a few important differences - supporting NullValue and the 'nested' metadata fields (and not supporting bytes)

How is NullValue different from the "empty" value that AnyValue supports? What 'nested' metadata fields are you referring to? Do you mean original_encoding and others?

That's right about "empty" value (missed it)

You are correct about the metadata fields

yurishkuro · 2022-09-14T15:28:58Z

text/trace/0219-payload-collection.md

+
+```python
+# Added method to `Span` class
+def add_payload_attribute(


an alternative would be to have a special SDK-provided wrapper type passed to regular set_attribute, e.g.

p := otel.Payload(...) span.set_attribute(key, p)

In this form you have better extensibility because new fields could be added to Payload type in the future, which you cannot do with function arguments without overloading.

I think it's a great idea to explore

tigrannajaryan · 2022-09-14T16:39:41Z

text/trace/0219-payload-collection.md

+(which is a general representation of a JSON object), with some embedded metadata -
+
+```protobuf
+message Value {


How is NullValue different from the "empty" value that AnyValue supports? What 'nested' metadata fields are you referring to? Do you mean original_encoding and others?

tigrannajaryan · 2022-09-14T16:44:43Z

text/trace/0219-payload-collection.md

+For example, this will be required by a simple processor that filters specific keys in a map.
+Also, it requires backends (and potentially processors) to unnecessarily attempt deserialization of every string attribute.
+
+### Supporting nested map values in Span attributes


I think this alternate is not sufficiently explored. IMO, instead of adding a new set of attributes with a new type that mostly overlaps with AnyValue, this alternate can be more viable.

I would rather advocate that open-telemetry/opentelemetry-specification#376 is accepted and come up with semantic convention to record the payload and associated metadata as a regular Span attribute.

This OTEP proposes and approach that requires much bigger changes, but I do not feel that the arguments in favour of it are strong enough.

I agree that there is room to explore this further.

Some ideas (which I can also add to the OTEP):

We can consider having a 'core' type (like AnyValue) which could be extended here

Even if Support map values and nested values for attributes opentelemetry-specification#376 is approved, I think that this proposal introduces essential advantages -

Adding the metadata as semantic conventions instead is quite cumbersome. We need 4 extra tags to include proposed metadata (original_encoding, encoded_size, original_length and dropped_keys). The last two should support nested values so that some keys will get duplicated. We may have more in the future.

Specifying attributes as payloads is important. For example, to disable all payload collection in Span SDK implementation, or in processors and exporters. Doing that by semantic conventions is less robust. And in general, naming tags like 'http.payload.request.body' isn't ideal.

On the other hand, there are some drawbacks to supporting nested maps in general attributes (as discussed in the other issue). So this proposal could be another alternative to that.

To sum up, I think that uniting both attribute types is possible, but we have to make sure they will be generic enough.
Doing that could over-complicate the most common use of 'simple' attributes and require even bigger changes than adding payload attributes.

tigrannajaryan · 2022-09-14T16:48:54Z

text/trace/0219-payload-collection.md

+    // Optional - the original bytes encoding type of this value (e.g. json/yaml/avro/csv)
+    string original_encoding = 2;
+    // Optional - the size of the value as bytes encoded (including dropped data)
+    int64 encoded_size = 3;


What is the purpose of these fields given that the payload is deserialized and is recorded in a structured form in the Value message?

The original size could be useful to troubleshoot and to generate aggregations (the deserialized might not include all the data, and it requires encoding to calculate the original size).

Also for example, sometimes we will not collect the entire payload (and use the dropped_keys field or original_length) - in these cases it's useful to know the original size and still be able to generate aggregations over it

tigrannajaryan · 2022-09-14T16:50:02Z

text/trace/0219-payload-collection.md

+    int64 original_length = 7;
+
+    // Set only for MapValue, in case some of the keys were dropped
+    repeated string dropped_keys = 8; 


Unclear why this field is here instead of in the MapValue message.

Moved.
The original reason for putting it here is because dropped_keys should be set together with original_length for shortened maps. (original_length is here because it's also used for string type).

osherv · 2022-09-14T18:30:11Z

Related Issues:
@pavolloffay
#1062
#1284
#1317

@MovieStoreGuy
open-telemetry/opentelemetry-specification#176

galbash

IMO this is a great capability to add to OTel. From our experience in Epsagon (and looking at other vendors in the space) this kind of data gave huge value to our customers. I do agree with @tigrannajaryan that the next step should be probably discussing and choosing the best approach to deliver this value. I also agree with @tigrannajaryan that the viable options from where I stand are either what described in this OTEP or the smaller change described in open-telemetry/opentelemetry-specification#376, and not the other alternatives discussed here.

galbash · 2022-09-15T10:48:25Z

text/trace/0219-payload-collection.md

+This OTEP proposes to add support for collecting payload data in spans, by adding a non-breaking
+functionality to trace API, and to OTLP. As we show in this proposal, adding such data using


I think this is crucial and good that you called it out. Opt-in is definitely the way to go here, and non-breaking is important for that

galbash · 2022-09-15T10:52:50Z

text/trace/0219-payload-collection.md

+    // Optional - the original bytes encoding type of this value (e.g. json/yaml/avro/csv)
+    string original_encoding = 2;
+    // Optional - the size of the value as bytes encoded (including dropped data)
+    int64 encoded_size = 3;


Also for example, sometimes we will not collect the entire payload (and use the dropped_keys field or original_length) - in these cases it's useful to know the original size and still be able to generate aggregations over it

Resolves open-telemetry#376 Use cases where this is necessary or useful: 1. Specify more than one resource in the telemetry: open-telemetry#579 2. Data coming from external source, e.g. AWS Metadata: open-telemetry#596 (comment) or open-telemetry#376 (comment) 3. Capturing HTTP headers: open-telemetry#376 (comment) 4. Structured stack traces: open-telemetry#2841 5. Payloads as attributes: open-telemetry/oteps#219 (comment) This is a draft PR to see what the change looks like. If this PR is merged it will be nice to follow it up with: - A standard way of flattening maps and nested objects when converting from OTLP to formats that don't support maps/nested objects. - Recommendations for semantic conventions to use/not use complex objects.

MSNev · 2022-11-08T16:33:24Z

text/trace/0219-payload-collection.md

+
+## Alternatives
+
+There are other possible ways to encode payload data, using current Span attributes.


An additional alternative would be to define this as a "payload" events and use the Log event as the transport. A Log event can have the current span associated with the log record. We are also expanding the definition of an event to include the generic event.data which will include the "payload" of the event as defined by the schema (semantic conventions) of the event.domain / event.name combination.

The introduction of the event.data is currently included in this PR #2926

I also think it could be a good alternative.
A possible advantage is collecting payloads asynchronously from the span, which is more efficient in many cases (for example, a span representing an HTTP request could be closed before the response is read).
On the other hand, this approach will make it harder to do searches based on span tags + payload tags in the backend.

ahayworth · 2022-11-10T15:18:23Z

I am concerned that this OTEP significantly expands the scope of the OpenTelemetry project, and would introduce significant complexity in all areas of the ecosystem. And, as mentioned, there are likely ways that already existing portions could be pressed into service for this kind of functionality (even if it isn't a perfect match today).

For me specifically, I would be concerned that this would introduce an additional burden onto SDK maintainers - we already must be extraordinarily careful about the performance implications of the SDKs we produce, but should we accept this OTEP we now must retain that same diligence and care in the face of possibly very large payload values. That is not straightforward and simple, and would likely require redesigns of portions of the SDKs to be even more efficient than they already are. Efficiency is good, of course, but I don't know if this in particular should be the forcing function for such rewrites. 😄

Perhaps it is better to first have broader conversations about what OpenTelemetry is and where the boundaries of its scope lie?

reyang · 2022-11-10T16:23:27Z

I am concerned that this OTEP significantly expands the scope of the OpenTelemetry project, and would introduce significant complexity in all areas of the ecosystem.

+1 👍

reyang · 2022-11-10T16:27:12Z

I would suggest:

clarify what type of the payload is intended to be in-scope, and what's not (e.g. would we anticipate the eventual drifting to some extreme situation such as "here is 1 gigabyte of video payload"?).
explore how existing backends (e.g. Jaeger) would support it, and see if there is high interest from the community.

ronyis · 2022-11-10T17:57:54Z

clarify what type of the payload is intended to be in-scope, and what's not (e.g. would we anticipate the eventual drifting to some extreme situation such as "here is 1 gigabyte of video payload"?).

I think that the scope should be "reasonable" sizes, that are performant enough to collect. Probably a few KBs per span. Definitely not "1 gigabyte of video payload" IMO :).
And in the end, users could increase preconfigured limits as much as they like.

explore how existing backends (e.g. Jaeger) would support it, and see if there is high interest from the community.
I agree, but not so sure what's the right way to do that.

We can also provide a processor on the OTel Collector that will convert payload attributes into regular attributes as a workaround.
Also, based on my experience, storing JSON payloads in Elasticsearch/OpenSearch is quite straightforward.

yurishkuro · 2022-11-11T15:15:37Z

I just took a second pass at the OTEP, and I see that @tigrannajaryan 's and my comments are largely not addressed. The proposal significantly expands the interface surface of OTEL, both in the API and in OTLP, and the justifications are not convincing, the authors admit that the same things could be achieved with the existing mechanisms and data structures. So why not explore that, maybe propose some semantic conventions.

Even more importantly, capturing payloads is a big data privacy red flag. The OTEP does not discuss this at all. From the implementation perspective, where to stash the data is a much easier problem to solve than how to capture the data in a privacy-respecting way. And while it is possible to decouple these two problems, the actual value of this OTEP would only materialize if it provides a framework for instrumentation to capture payloads, not just an extension of API/SDK on where to record it. And a framework for instrumentation would absolutely have to deal with privacy questions.

ronyis · 2022-11-13T12:44:05Z

Thanks @yurishkuro.
I'm working on compiling the many comments and ideas into the OTEP (also I was offline for a while), representing different alternatives with pros and cons.

About the privacy issue -
I focused on APIs in this OTEP, because I think it's the foremost problem that blocks users from collecting payloads with their own instrumentations (or for companies to develop such 3rd party libs). Generally, I think that a configuration based on allow-lists and/or deny-lists for collecting payload attributes should be sufficient. I'm not aware of other ways to deal with this problem and looking forward to hearing about it. I will reference that in the OTEP as well.

Resolves open-telemetry#376 Use cases where this is necessary or useful: 1. Specify more than one resource in the telemetry: open-telemetry#579 2. Data coming from external source, e.g. AWS Metadata: open-telemetry#596 (comment) or open-telemetry#376 (comment) 3. Capturing HTTP headers: open-telemetry#376 (comment) 4. Structured stack traces: open-telemetry#2841 5. Payloads as attributes: open-telemetry/oteps#219 (comment) This is a draft PR to see what the change looks like. If this PR is merged it will be nice to follow it up with: - A standard way of flattening maps and nested objects when converting from OTLP to formats that don't support maps/nested objects. - Recommendations for semantic conventions to use/not use complex objects.

nirga · 2023-06-04T15:53:39Z

Hey @ronyis are you still working on this? If not, wanted to take and complete this as we need it as well :)

ronyis · 2023-06-06T08:48:32Z

Hey @ronyis are you still working on this? If not, wanted to take and complete this as we need it as well :)

Not actively working on it, feel free to jump in

ronyis · 2023-06-21T08:37:01Z

Closing this as the work is continued in open-telemetry/opentelemetry-specification#234

Resolves open-telemetry#376 Use cases where this is necessary or useful: 1. Specify more than one resource in the telemetry: open-telemetry#579 2. Data coming from external source, e.g. AWS Metadata: open-telemetry#596 (comment) or open-telemetry#376 (comment) 3. Capturing HTTP headers: open-telemetry#376 (comment) 4. Structured stack traces: open-telemetry#2841 5. Payloads as attributes: open-telemetry/oteps#219 (comment) This is a draft PR to see what the change looks like. If this PR is merged it will be nice to follow it up with: - A standard way of flattening maps and nested objects when converting from OTLP to formats that don't support maps/nested objects. - Recommendations for semantic conventions to use/not use complex objects.

johncrim · 2024-07-02T14:23:55Z

@ronyis - I don't see this work at all in open-telemetry/opentelemetry-specification#234 - I think you mean open-telemetry/opentelemetry-specification#234 .

Propose Trace Payload Collection

b4a2ced

ronyis requested a review from a team September 14, 2022 14:07

ronyis added 3 commits September 14, 2022 17:07

Rename 0000-payload-collection.md to 0219-payload-collection.md

a5f0fa1

lint

53bd390

Moved under trace/ dir

98ce3bc

ronyis requested a review from a team September 14, 2022 14:29

lint

1a7fac6

yurishkuro reviewed Sep 14, 2022

View reviewed changes

tigrannajaryan reviewed Sep 14, 2022

View reviewed changes

galbash reviewed Sep 15, 2022

View reviewed changes

tigrannajaryan mentioned this pull request Sep 27, 2022

Support map values and nested values for attributes open-telemetry/opentelemetry-specification#376

Closed

tigrannajaryan mentioned this pull request Oct 17, 2022

Support maps and heterogeneous arrays as attribute values open-telemetry/opentelemetry-specification#2888

Closed

Move dropped_keys into MapValue

a06cb5a

trask mentioned this pull request Oct 28, 2022

Traces & sensitive data: Should trace contain a request or response body? open-telemetry/community#1279

Closed

MSNev reviewed Nov 8, 2022

View reviewed changes

trask mentioned this pull request Jan 18, 2023

Add custom attribute through distro open-telemetry/opentelemetry-java-instrumentation#6209

Closed

naseemkullah mentioned this pull request Jan 20, 2023

Attributes helper function - convert a deep object to an attribute compliant single-depth kv object open-telemetry/opentelemetry-js#1445

Open

2 tasks

tedsuo added the triaged label Jan 30, 2023

trask mentioned this pull request Jan 31, 2023

Capture request and response bodies open-telemetry/semantic-conventions#857

Open

trask mentioned this pull request Apr 17, 2023

BREAKING: Introduce common url.* attributes, and improve use of namespacing under http.* open-telemetry/opentelemetry-specification#3355

Merged

nirga mentioned this pull request Jun 19, 2023

Trace Payload Collection #234

Closed

ronyis closed this Jun 21, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Trace Payload Collection #219

Trace Payload Collection #219

ronyis commented Sep 14, 2022

yurishkuro Sep 14, 2022

ronyis Sep 14, 2022

tigrannajaryan Sep 14, 2022

ronyis Sep 14, 2022

yurishkuro Sep 14, 2022

ronyis Sep 14, 2022

tigrannajaryan Sep 14, 2022

tigrannajaryan Sep 14, 2022

ronyis Sep 15, 2022

tigrannajaryan Sep 14, 2022

ronyis Sep 14, 2022

galbash Sep 15, 2022

tigrannajaryan Sep 14, 2022

ronyis Oct 23, 2022

osherv commented Sep 14, 2022 •

edited

Loading

galbash left a comment •

edited

Loading

galbash Sep 15, 2022

galbash Sep 15, 2022

MSNev Nov 8, 2022 •

edited

Loading

ronyis Nov 10, 2022

ahayworth commented Nov 10, 2022

reyang commented Nov 10, 2022

reyang commented Nov 10, 2022

ronyis commented Nov 10, 2022

yurishkuro commented Nov 11, 2022

ronyis commented Nov 13, 2022

nirga commented Jun 4, 2023

ronyis commented Jun 6, 2023

ronyis commented Jun 21, 2023 •

edited

Loading

johncrim commented Jul 2, 2024

		This OTEP proposes to add support for collecting payload data in spans, by adding a non-breaking
		functionality to trace API, and to OTLP. As we show in this proposal, adding such data using


		## Alternatives

		There are other possible ways to encode payload data, using current Span attributes.

Trace Payload Collection #219

Trace Payload Collection #219

Conversation

ronyis commented Sep 14, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

osherv commented Sep 14, 2022 • edited Loading

galbash left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MSNev Nov 8, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ahayworth commented Nov 10, 2022

reyang commented Nov 10, 2022

reyang commented Nov 10, 2022

ronyis commented Nov 10, 2022

yurishkuro commented Nov 11, 2022

ronyis commented Nov 13, 2022

nirga commented Jun 4, 2023

ronyis commented Jun 6, 2023

ronyis commented Jun 21, 2023 • edited Loading

johncrim commented Jul 2, 2024

osherv commented Sep 14, 2022 •

edited

Loading

galbash left a comment •

edited

Loading

MSNev Nov 8, 2022 •

edited

Loading

ronyis commented Jun 21, 2023 •

edited

Loading