Span processor clarifications #1135

rachelleahklein · 2020-10-23T00:20:02Z

Changes

This PR attempts to clarify information about span processors in the Trace SDK spec. It contains:

- More information about when one might want to use a simple vs. batching span processor
- More information about scenarios in which span processors may be used without exporters
- Minor grammatical fixes

Related issues

The need for clarification in the spec came to my attention when it was mentioned in this ticket: open-telemetry/opentelemetry-ruby#397.

linux-foundation-easycla · 2020-10-23T00:20:06Z

The committers are authorized under a signed CLA.

✅ Rachel Klein (8f6f3c0, 89d36ab, e146850, 12777b7)

SergeyKanzhelev · 2020-10-23T00:32:23Z

specification/trace/sdk.md

+and the export-friendly span data representation to the configured
+`SpanExporter` as soon as they are finished.
+
+Typically, the simple processor will be most suitable for use in testing and/or


it may be used in production for adding attributes like here: https://medium.com/opentelemetry/opentelemetry-beyond-getting-started-5ac43cd0fe26#ffbb:~:text=Custom%20attributes%20in%20code%20scopes

In Java this has become the most common span processor as exporters use async functions to send data instead of relying on a background thread. Not how the spec was designed probably, but if it works.
CC @anuraaga

Thanks for your valuable feedback on this section as well as the batch processor one, @Oberon00. Do you think more should be added to clarify that while these statements may "typically" be true, it also depends on language implementation? I'm not sure how much detail to get into here.

SergeyKanzhelev

Thank you for the PR. I added a comment on one more scenario for the processor. The current edit may mislead.

rachelleahklein · 2020-10-23T00:37:05Z

Thanks for the quick response! I will add that further clarification.

reyang · 2020-10-23T02:49:45Z

specification/trace/sdk.md

@@ -287,16 +287,17 @@ invocations. The span processors are invoked only when
 [`IsRecording`](api.md#isrecording) is true.

 Built-in span processors are responsible for batching and conversion of spans to
-exportable representation and passing batches to exporters.
+exportable representation and passing batches to exporters or alternatives to


Not sure if a GA spec should point to an experimental spec, thoughts?

I think zPages are implemented as a span processor, not something that receives data through a span processor.

Would it be true then to say "and passing batches to exporters or other span processors," or are there still other alternatives?

Also, to @reyang's point, I am not even sure referring to zPages specifically makes sense here. My only goal is to have a clear and universal definition of span processors in this sentence. Reading this section, I was confused by the fact that span processors were defined as "responsible for... passing batches to exporters" but then a few lines down, exporters were described as optional.

I don't see a problem with referring to an experimental feature, as it makes the point that this is an extensibility point in the SDK that can be useful for purposes including experimentation with new features.

Oberon00 · 2020-10-23T07:57:39Z

specification/trace/sdk.md

+and the export-friendly span data representation to the configured
+`SpanExporter` as soon as they are finished.
+
+Typically, the simple processor will be most suitable for use in testing and/or


In Java this has become the most common span processor as exporters use async functions to send data instead of relying on a background thread. Not how the spec was designed probably, but if it works.
CC @anuraaga

Oberon00 · 2020-10-23T07:59:43Z

specification/trace/sdk.md

@@ -287,16 +287,17 @@ invocations. The span processors are invoked only when
 [`IsRecording`](api.md#isrecording) is true.

 Built-in span processors are responsible for batching and conversion of spans to
-exportable representation and passing batches to exporters.
+exportable representation and passing batches to exporters or alternatives to


I think zPages are implemented as a span processor, not something that receives data through a span processor.

Oberon00 · 2020-10-23T08:02:04Z

specification/trace/sdk.md

+Typically, the batching processor will be more suitable for production environments
+than the simple processor.


Depends on the language, e.g. in Java exporters do batching themselves sometimes. But typically, I think that BatchSpanProcessor should be the expected scenario, yes.

I like how it's written. Wdyt @Oberon00?

I think the PR's wording is fine here 👍

Oberon00 · 2020-10-23T08:04:54Z

specification/trace/sdk.md

+Each processor registered on `TracerProvider` is a start of a pipeline that consists
+of one or more span processors and, optionally, one or more exporters. The SDK MUST
+allow ending each pipeline with an individual exporter.



I think we should remove this paragraph. Or do you know what is meant by it? AFAIK the TracerProvider must allow the registration of multiple SpanProcessors, but each built-in span processor only supports a single exporter.

Suggested change

-Each processor registered on `TracerProvider` is a start of a pipeline that consists
-of one or more span processors and, optionally, one or more exporters. The SDK MUST
-allow ending each pipeline with an individual exporter.

I think we definitely need a better description of the possible compositions of SpanProcessors and SpanExporters. E.g., we MUST indicate somewhere that there should be a SpanProcessor before each SpanExporter (purely from a method-signature point of view). We also MUST stress and clearly describe that multiple registered SpanProcessors are "siblings" and do NOT call each other. All of this has already created some confusion in the past.

I support removing this paragraph. We don't need to specify that pipelines end in exporters, just that the two built-in span processors end in exporters. The metric SDK hasn't thus far mentioned support for multiple processors, and why should we? It's possible to define a multi-processor that calls a series of individual processors. Should the default trace SDK include a built-in multi-processor? I suppose it should if it's common to implement multiple trace exporters with different processor settings, but that does not sound common. Wouldn't you prefer to have a single processor call multiple exporters? Why do we need to specify these things?

multiple registered SpanProcessors are "siblings" and do NOT call each other

If you only specify SDK support for a single processor, this problem vanishes. Potentially you can also specify a multi-processor and/or a multi-exporter, but are you sure these are commonly needed?

For what it's worth, as someone relatively new to OpenTelemetry concepts, it was helpful to me to have a textual description of the span processor pipeline in addition to the diagram that also exists in this section. There may be some additional nuance that needs to be added, but having some form of this paragraph seems like a good idea to me.
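To make the two compositions discussed in this thread concrete, here is a minimal sketch in plain Python. It does not use any OpenTelemetry SDK: `ToyTracerProvider`, `RecordingProcessor` and `MultiProcessor` are hypothetical names invented for illustration, and spans are reduced to their names. The first half shows "sibling" processors that the provider calls one by one; the second shows a single registered multi-processor that fans out to its children.

```python
from typing import List


class RecordingProcessor:
    """Hypothetical processor that only remembers the span names it was given."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.ended: List[str] = []

    def on_end(self, span_name: str) -> None:
        self.ended.append(span_name)


class MultiProcessor:
    """Hypothetical fan-out processor: forwards each call to every child."""

    def __init__(self, processors) -> None:
        self.processors = list(processors)

    def on_end(self, span_name: str) -> None:
        for processor in self.processors:
            processor.on_end(span_name)


class ToyTracerProvider:
    """Illustrative stand-in for a provider; not the real SDK class."""

    def __init__(self) -> None:
        self.processors = []

    def add_span_processor(self, processor) -> None:
        self.processors.append(processor)

    def end_span(self, span_name: str) -> None:
        # The provider itself calls every registered processor.
        # Siblings never call each other.
        for processor in self.processors:
            processor.on_end(span_name)


# Composition 1: two sibling processors registered on the provider.
first, second = RecordingProcessor("first"), RecordingProcessor("second")
provider = ToyTracerProvider()
provider.add_span_processor(first)
provider.add_span_processor(second)
provider.end_span("GET /users")

# Composition 2: one registered processor that fans out to two children.
a, b = RecordingProcessor("a"), RecordingProcessor("b")
single = ToyTracerProvider()
single.add_span_processor(MultiProcessor([a, b]))
single.end_span("GET /orders")
```

In both cases every processor sees the span exactly once; the difference is only whether the provider or a wrapping processor owns the fan-out.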

Co-authored-by: Christian Neumüller <[email protected]>

iNikem

This PR does not solve existing problems with the current description, but it is a nice cleanup. Thank you! :)

rachelleahklein · 2020-10-26T17:50:49Z

I have attempted to address @SergeyKanzhelev's requested change. Please let me know if more clarification is needed there.

Thanks to all who have submitted feedback. I think there are a few things that are potentially outside the scope of this PR, but I have tried to incorporate fixes that are straightforward.

One outstanding question I have: does the overall definition of span processors (see discussion on ln. 290) need to be further fixed?

SergeyKanzhelev · 2020-10-26T18:26:57Z

@rachelleahklein thanks again for the PR. The only outstanding feedback from me is to position simple span processor scenarios as dangerous, but valid for production (not as exceptions).

SergeyKanzhelev

This is an improvement to the wording without semantic change. Thank you!

justinfoote

Thanks for the clarifications @rachelleahklein!

cijothomas · 2020-10-30T15:09:41Z

specification/trace/sdk.md

+and the export-friendly span data representation to the configured
+`SpanExporter` as soon as they are finished.
+
+Typically, the simple processor will be most suitable for use in testing; it should be used with


Not sure this should be mentioned here.
We have production systems running with SimpleProcessors. In those systems, the OpenTelemetry SDK uses SimpleProcessor and exports spans as they arrive, without any batching.
The export may be done to some agent, which in turn may do batching.
Or the export could be done to the actual backend, to enable scenarios like "Live telemetry". E.g., in Azure: https://docs.microsoft.com/azure/azure-monitor/app/live-stream
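The difference between exporting each span as it ends and exporting in batches can be sketched as follows. This is a simplified illustration in plain Python, not the SDK's implementation: `ListExporter`, `SimpleProcessor`, `BatchProcessor` and `max_batch_size` are hypothetical names, spans are plain strings, and the batching variant flushes only on size, whereas a real batching processor would typically also flush on a timer from a background worker.

```python
from typing import List


class ListExporter:
    """Hypothetical exporter that records every batch it receives."""

    def __init__(self) -> None:
        self.batches: List[List[str]] = []

    def export(self, spans: List[str]) -> None:
        self.batches.append(list(spans))


class SimpleProcessor:
    """Hands each finished span to the exporter immediately."""

    def __init__(self, exporter: ListExporter) -> None:
        self.exporter = exporter

    def on_end(self, span: str) -> None:
        self.exporter.export([span])


class BatchProcessor:
    """Queues finished spans and exports them in groups (size-based only)."""

    def __init__(self, exporter: ListExporter, max_batch_size: int = 3) -> None:
        self.exporter = exporter
        self.max_batch_size = max_batch_size
        self.queue: List[str] = []

    def on_end(self, span: str) -> None:
        self.queue.append(span)
        if len(self.queue) >= self.max_batch_size:
            self.force_flush()

    def force_flush(self) -> None:
        if self.queue:
            self.exporter.export(self.queue)
            self.queue = []


simple_exporter, batch_exporter = ListExporter(), ListExporter()
simple = SimpleProcessor(simple_exporter)
batch = BatchProcessor(batch_exporter, max_batch_size=3)

for name in ["a", "b", "c", "d"]:
    simple.on_end(name)
    batch.on_end(name)

pending_before_flush = list(batch.queue)  # "d" is still waiting here
batch.force_flush()
```

With the simple variant the exporter is called once per span, which suits live-telemetry scenarios or an agent that batches downstream; with the batching variant the last span waits in the queue until the next flush.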

Added a couple of suggestions. @anuraaga, @cijothomas, do these suggestions look OK to you?

Thank you for adding more suggestions, @SergeyKanzhelev. I have incorporated the first (though happy to revert or refine if there are further comments).

The second is still open and I'm hoping @Oberon00 or others can help clarify a more accurate version.

Thank you all!

github-actions · 2020-11-07T03:19:17Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

SergeyKanzhelev · 2020-11-07T07:00:16Z

specification/trace/sdk.md

+custom attributes should be added to individual spans based on code scopes.



Suggested change

-custom attributes should be added to individual spans based on code scopes.
+custom attributes should be added to individual spans based on code scopes. Simple processors
+might also be used for scenarios where the callback needs to be called before sampling. For
+example, z-pages can be implemented this way.

This is wrong though. The simple processor is also not invoked for DROPped spans.

Hm, I'm referring to the sampled flag as opposed to the recorded flag. Recorded but sampled-out spans should not make it to the batch processor, correct? Recorded spans will be processed by simple processors. Dropped spans will not reach either.
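The distinction between the recorded and sampled flags can be sketched as below. This is one possible reading, not a statement of what the spec or any SDK does: it assumes span processors are gated on the recording flag (as the quoted spec text about `IsRecording` says) and that handing a span to an exporter is additionally gated on the sampled flag. The `Decision` enum and the `route` helper are hypothetical names for illustration.

```python
from enum import Enum
from typing import Tuple


class Decision(Enum):
    """Illustrative sampling outcomes; names are assumptions for this sketch."""

    DROP = "drop"                            # not recorded, not sampled
    RECORD_ONLY = "record_only"              # recorded, but sampled out
    RECORD_AND_SAMPLE = "record_and_sample"  # recorded and sampled


def is_recording(decision: Decision) -> bool:
    return decision is not Decision.DROP


def is_sampled(decision: Decision) -> bool:
    return decision is Decision.RECORD_AND_SAMPLE


def route(decision: Decision) -> Tuple[bool, bool]:
    """Return (processor_invoked, exported) under the assumptions above."""
    processor_invoked = is_recording(decision)
    exported = processor_invoked and is_sampled(decision)
    return processor_invoked, exported


outcomes = {decision.name: route(decision) for decision in Decision}
```

Under these assumptions a dropped span reaches neither processors nor exporters, while a recorded but sampled-out span is visible to processors (for example, one that powers a z-pages view) without being exported.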

Co-authored-by: Sergey Kanzhelev <[email protected]>

github-actions · 2020-11-17T03:19:43Z

Closed as inactive. Feel free to reopen if this PR is still being worked on.

rachelleahklein · 2020-11-17T04:21:16Z

I don't seem to be able to reopen this PR, but I still do hope to work on it and get it approved. It sounds like there are some clarifications that people have found helpful. Please let me know if there is any further feedback I can address. Thank you.

bogdandrutu · 2020-11-17T19:48:29Z

specification/trace/sdk.md

 Built-in span processors are responsible for batching and conversion of spans to
-exportable representation and passing batches to exporters.
+exportable representation and passing batches to exporters or alternatives to


I think we are mixing 2 different things in this sentence:

1. What is a SpanProcessor, and how can it be used to interact with the SDK?

   The SpanProcessor in general does not need to batch or talk to the exporter; it is a simple "hook" that allows custom processors to interact with the span lifetime events (on start and on end). For example, somebody can implement a processor that tracks all the active requests (one of the features that we want to offer in zPages). Others can implement other functionality.

2. What are the "built-in" SpanProcessors?

   We will provide two flavors of SpanProcessor (Simple/Batch) that are able to batch (or not, in the case of simple), transform the spans into an "exportable" format, and call the "exporter".
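As an illustration of the "hook" idea, here is a minimal sketch of a processor that tracks active requests and never touches an exporter. It is written in plain Python without the OpenTelemetry SDK; the `Span` dataclass and `ActiveSpanTracker` class are hypothetical stand-ins, and only the `on_start`/`on_end` pairing is taken from the comment above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Span:
    """Hypothetical stand-in for an SDK span; not the real OpenTelemetry type."""

    span_id: int
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)


class ActiveSpanTracker:
    """A processor used purely as a lifecycle hook: no batching, no exporter."""

    def __init__(self) -> None:
        self.active: Dict[int, Span] = {}

    def on_start(self, span: Span) -> None:
        # Remember the span for as long as it is in flight.
        self.active[span.span_id] = span

    def on_end(self, span: Span) -> None:
        # Forget the span once it has finished.
        self.active.pop(span.span_id, None)


tracker = ActiveSpanTracker()
request = Span(span_id=1, name="GET /users")

tracker.on_start(request)
active_during = [span.name for span in tracker.active.values()]

tracker.on_end(request)
active_after = list(tracker.active)
```

A diagnostics page could read `tracker.active` at any time to list in-flight requests, which is the kind of use the comment has in mind for zPages.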

github-actions · 2020-11-25T03:22:12Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

github-actions · 2020-12-02T03:25:24Z

Closed as inactive. Feel free to reopen if this PR is still being worked on.

Rachel Klein added 4 commits October 15, 2020 10:20

Clarify exporter alternatives in span processor pipelines

8f6f3c0

Minor grammatical fixes

89d36ab

Clarify when to use simple vs. batching span processors

e146850

Multiple span processors/exporters may exist in pipeline

12777b7

rachelleahklein requested review from a team October 23, 2020 00:20

github-actions bot assigned SergeyKanzhelev Oct 23, 2020

SergeyKanzhelev reviewed Oct 23, 2020

SergeyKanzhelev suggested changes Oct 23, 2020

reyang reviewed Oct 23, 2020

Oberon00 reviewed Oct 23, 2020

Rachel Klein and others added 2 commits October 23, 2020 10:22

Reword batching processor definition

c8872e8

Co-authored-by: Christian Neumüller <[email protected]>

Clarify spans above maxQueueSize are skipped, not dropped

698f4fe

Co-authored-by: Christian Neumüller <[email protected]>

iNikem approved these changes Oct 24, 2020

Rachel Klein added 2 commits October 26, 2020 10:44

Add custom attributes to production uses of simple span processor

7bd8771

Remove reference to tagging/filtering in span processor description

3e709c4

SergeyKanzhelev reviewed Oct 26, 2020

Clarify language around production uses of simple span processor

3774664

SergeyKanzhelev approved these changes Oct 26, 2020

justinfoote approved these changes Oct 26, 2020

rachelleahklein mentioned this pull request Oct 29, 2020

Add documentation on usage scenarios for span processors open-telemetry/opentelemetry-ruby#461

Merged

cijothomas reviewed Oct 30, 2020

github-actions bot added the Stale label Nov 7, 2020

SergeyKanzhelev reviewed Nov 7, 2020

SergeyKanzhelev reviewed Nov 7, 2020

Further clarify wording about production use of span processors

5aa24af

Co-authored-by: Sergey Kanzhelev <[email protected]>

github-actions bot closed this Nov 17, 2020

yurishkuro reopened this Nov 17, 2020

SergeyKanzhelev removed the Stale label Nov 17, 2020

bogdandrutu reviewed Nov 17, 2020

github-actions bot added the Stale label Nov 25, 2020

github-actions bot closed this Dec 2, 2020
