-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal to separate context propagation from observability #42
Changes from 25 commits
dff8df9
5ad7d1c
1dc3c7b
58248e6
68cb0ba
3dc6a76
459435e
c9c64f4
c3c7c24
4588096
2d80dae
7a73210
0d8e41b
aad5605
4a930eb
e1ef61f
f949435
1cb155e
7b9e861
07eb397
0ebeb6c
147d6b0
72d4651
1472197
18a37d4
7ea1834
7317747
43ba8fd
d7d6f1c
3a817a2
3381e0f
c15a107
310e8d5
f59fc27
153b9aa
f70855a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,213 @@ | ||||||||||
# Proposal: Separate Layer for Context Propagation | ||||||||||
|
||||||||||
Design OpenTelemetry as a set of separate applications which operate on a shared context propagation mechanism. | ||||||||||
|
||||||||||
|
||||||||||
## Motivation | ||||||||||
|
||||||||||
Based on prior art, we know that fusing the observability system and the context propagation system together creates issues. Observability systems have special rules for propagating information, such as sampling, and may have different requirements from other systems which require non-local information to be sent downstream. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You use "downstream" quite often in this document, but I don't know if it is completely clear to all readers what is meant by this. Maybe add some "definitions" section or explain it in parentheses at the first occurrence. But maybe it is clear enough as-is, I'm not a native speaker. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks, that is good feedback. We use the term "downstream" in a number of places in OTel, maybe we should add a glossary? https://github.com/search?q=org%3Aopen-telemetry+downstream&type=Code |
||||||||||
* Separation of concerns | ||||||||||
* Remove the Tracer dependency from context propagation mechanisms. | ||||||||||
* Separate distributed context into Baggage and Correlations | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
* Extensibility | ||||||||||
* Allow users to create new applications for context propagation. | ||||||||||
* For example: A/B testing, encrypted or authenticated data, and new, experimental forms of observability. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Even though you have two second-level points under "Extensibility", the second one is just examples for the first one. Maybe redistribute these e.g. as "Extensibility: Allow users ..., for example: * A/B testing * encrypted or ..." There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Makes sense, thanks. |
||||||||||
|
||||||||||
## Explanation | ||||||||||
|
||||||||||
# OpenTelemetry Layered Architecture | ||||||||||
|
||||||||||
![drawing](img/context_propagation_explanation.png) | ||||||||||
|
||||||||||
OpenTelemetry is a distributed program, which requires non-local, transaction-level context in order to execute correctly. Transaction-level context can also be used to build other distributed programs, such as security, versioning, and network switching programs. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
|
||||||||||
To allow for this extensibility, OpenTelemetry is separated into an **application layer** and a **context propagation layer**. In this architecture, multiple distributed applications - such as the observability and baggage systems provided by OpenTelemetry - simultaneously share the same underlying context propagation system in order to execute their programs. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What does "this extensibility" refer to here? Maybe better: "the required extensibility [for the aforementioned applications/"programs"]". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
"simultaneously" to what? This is redundant to "share", I guess. |
||||||||||
|
||||||||||
|
||||||||||
# Application Layer | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This RFC doesn't really explain how different applications would work, rather it goes into the implementation details of metrics and tracing "observability systems". It would be nice to have a better definition of what an application is in the application layer. Are all applications required to share a common interface? |
||||||||||
|
||||||||||
## Observability API | ||||||||||
|
||||||||||
OpenTelemetry currently contains two observability systems - Tracing and Metrics – and may be extended over time. These separate systems are bound into a unified Observability API through sharing labels – a mechanism for correlating independent observations – and through sharing propagators. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Are the propagators themselves really shared though? I think it is merely the API that signals propagation points that is shared. That might be more hair-splitting than is good for easy understandability though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am thinking that Observability system does not automatically have a separate propagators for tracing and for metrics, but I see your point that Trace-Context and Correlation-Context are separate, and right now it's not clear if Tracing would use Correlation-Context. |
||||||||||
|
||||||||||
**Observe(context, labels…, observations...) -> context** | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Some introducing sentence like "The following general forms of APIs exist:" would be good here. Although then the HTTP API's don't seem to fit, which really makes me wonder: What does this list of APIs actually enumerate? Or would it be better to reformat this like "The general form for all observability APIs is a function with the following signature: Observe(...). That is, it takes a Context, label keys, ...". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I should definitely remove the "all" here. In this RFC, I'm trying to define the I am trying to show how I suggest that the details about how exactly these changes will affect our Tracing and Metrics APIs should be worked out in specification PRs, not in an RFC. |
||||||||||
The general form for all observability APIs is a function which takes a Context, label keys, and observations as input, and returns an updated Context. | ||||||||||
|
||||||||||
**Correlate(context, label, value, hoplimit) -> context** | ||||||||||
To set the label values used by all observations in the current transaction, the Observability API provides a function which takes a context, a label key, a value, and a hoplimit, and returns an updated context. If the hoplimit is set to NO_PROPAGATION, the label will only be available to observability functions in the same process. If the hoplimit is set to UNLIMITED_PROPAGATION, it will be available to all downstream services. | ||||||||||
|
||||||||||
**GetHTTPExtractor() -> extractor** | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not sure of the value of calling out HTTP explicitly. There are plenty of practical examples where people use non-HTTP transports There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Although the "HTTP"-propagators can really be used to propagate over any protocol that supports ASCII key-value pairs as metadata/headers. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My intent is just to show that I am not proposing a "generic" propagator: HTTP, BINARY, etc, are each handled by a separate function. I didn't call this out before and it confused some people into thinking we were going back to the OpenTracing way of doing things. We could call it There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The notion of extractor/injector is irrelevant to the API for the application (we didn't have them in OpenTracing). Injector/extractor are only needed when registering them. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since these have the same name as their "Baggage" counterparts, it seems they'll have to be namespaced in some way. here is no explicit observability concept today tough, only meters and tracers. Should there be a separate propagator API for each of these? If not, why does Baggage have a separate one? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @Oberon00 I actually assumed that we would not have separate propagators for Metrics and Tracing as something in the API layer. So I was namespacing by splitting things into an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. there can be no separation in the propagation API for any propagators, not just metrics/tracing, but any custom/baggage propagators. The only thing the application needs to know is "I want to inject/extract context in this format". |
||||||||||
To deserialize the state of the system sent from the prior upstream process, the Observability API provides a function which returns a HTTPExtract function. | ||||||||||
|
||||||||||
**GetHTTPInjector() -> injector** | ||||||||||
To serialize the the current state of the observability system and send it to the next downstream process, the Observability API provides a function which returns a HTTPInject function. | ||||||||||
|
||||||||||
|
||||||||||
## Baggage API | ||||||||||
|
||||||||||
In addition to observability, OpenTelemetry provides a simple mechanism for propagating arbitrary data, called Baggage. This allows new distributed applications to be implemented without having to create new propagators. | ||||||||||
|
||||||||||
To manage the state of a distributed application, the Baggage API provides a set of functions which read, write, and remove data. | ||||||||||
|
||||||||||
**SetBaggage(context, key, value) -> context** | ||||||||||
To record the distributed state of an application, the Baggage API provides a function which takes a context, a key, and a value as input, and returns an updated context which contains the new value. | ||||||||||
|
||||||||||
**GetBaggage(context, key) -> value** | ||||||||||
To access the distributed state of an application, the Baggage API provides a function which takes a context and a key as input, and returns a value. | ||||||||||
|
||||||||||
**RemoveBaggage(context, key) -> context** | ||||||||||
To delete distributed state from an application, the Baggage API provides a function which takes a context, a key, and a value as input, and returns an updated context which contains the new value. | ||||||||||
|
||||||||||
**ClearBaggage(context) -> context** | ||||||||||
To avoid sending baggage to an untrusted downstream process, the Baggage API provides a function remove all baggage from a context, | ||||||||||
|
||||||||||
**GetHTTPExtractor() -> extractor** | ||||||||||
To deserialize the state of the system sent from the the prior upstream process, the Baggage API provides a function which returns a HTTPExtract function. | ||||||||||
|
||||||||||
**GetHTTPInjector() -> injector** | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it's misleading to combine injectors/extractors with the API for manipulating the baggage. I am not even sure there should be "get" methods for those - who would call that? The inter-process propagation layer (see my diagram above) sits below specific contexts, so they can register with it, but it shouldn't need to "call up" |
||||||||||
To serialize the the current state of the system and send it to the next downstream process, the Baggage API provides a function which returns a HTTPInject function. | ||||||||||
|
||||||||||
|
||||||||||
## Additional APIs | ||||||||||
|
||||||||||
Because the application and context propagation layers are separated, it is possible to create new distributed applications which do not depend on either the Observability or Baggage APIs. | ||||||||||
|
||||||||||
**GetHTTPExtractor() -> extractor** | ||||||||||
To deserialize the state of the system in the prior upstream process, all additional APIs provide a function which returns a HTTPExtract function. | ||||||||||
|
||||||||||
**GetHTTPInjector() -> injector** | ||||||||||
To serialize the the current state of the system and send it to the next downstream process, all additional APIs provide a function which returns a HTTPInject function. | ||||||||||
|
||||||||||
|
||||||||||
# Context Propagation Layer | ||||||||||
|
||||||||||
## Context API | ||||||||||
|
||||||||||
Distributed applications access data in-process using a shared context object. Each distributed application sets a single key in the context, containing all of the data for that system. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I.e., |
||||||||||
|
||||||||||
**SetValue(context, key, value) -> context** | ||||||||||
To record the local state of an application, the Context API provides a function which takes a context, a key, and a value as input, and returns an updated context which contains the new value. | ||||||||||
|
||||||||||
**GetValue(context, key) -> value** | ||||||||||
To access the local state of an application, the Context API provides a function which takes a context and a key as input, and returns a value. | ||||||||||
|
||||||||||
### Optional: Automated Context Management | ||||||||||
When possible, context should automatically be associated with program execution . Note that some languages do not provide any facility for setting and getting a current context. In these cases, the user is responsible for managing the current context. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. good catch. |
||||||||||
|
||||||||||
**SetCurrent(context)** | ||||||||||
To associate a context with program execution, the Context API provides a function which takes a Context. | ||||||||||
|
||||||||||
**GetCurrent() -> context** | ||||||||||
To access the context associated with program execution, the Context API provides a function which takes no arguments and returns a Context. | ||||||||||
|
||||||||||
|
||||||||||
## Propagation API | ||||||||||
|
||||||||||
Distributed applications propagate their state by data to downstream processes via injectors, functions which read and write application context into RPC requests. Each distributed application creates a set of propagators for every type of supported medium - currently only HTTP requests. | ||||||||||
|
||||||||||
**HTTPInject(context, request)** | ||||||||||
To send the data for all distributed applications downstream to the next process, the Propagation API provides a function which takes a context and an HTTP request, and mutates the HTTP request to include an HTTP Header representation of the context. | ||||||||||
|
||||||||||
**HTTPExtract(context, request) -> context** | ||||||||||
To receive data injected by prior upstream processes, the Propagation API provides a function which takes a context and an HTTP request, and returns context which represents the state of the upstream system. | ||||||||||
|
||||||||||
**ChainHTTPInjector(injector, injector) injector** | ||||||||||
In order for the application layer to function correctly, Propagation choices must be syncronized between all processes in the distributed system, and multiple applications must be able to inject and extract their context into the same request. To meet these requirements, the Propagation API provides a function which registers a set of propagators, which will all be executed in order when the future calls to inject and extract are made. A canonical propagator consists of an inject and an extract function. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "applications" is a bit of a dangerous word choice here as it means not another application program but a application of the distributed context in the same program. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I really struggle what to call these. I added a definition for distributed application at the beginning, to help. It feels natural to refer to this architecture as an "application layer" and a "context propagation layer" but it is easy to see how the term "application" could be confusing. Not sure what is better though - I see Yuri using Aspect Oriented Programming terms like "Cross-cutting concern" but people who know what that is might think we are being literal about leveraging AOP. Ideas?
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: Leave out "the future". |
||||||||||
|
||||||||||
**ChainHTTPExtractor(extractor, extractor) extractor** | ||||||||||
In order for the application layer to function correctly, Propagation choices must be syncronized between all processes in the distributed system, and multiple applications must be able to inject and extract their context into the same request. To meet these requirements, the Propagation API provides a function which registers a set of propagators, which will all be executed in order when the future calls to inject and extract are made. A canonical propagator consists of an inject and an extract function. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Description is identical to ChainHTTPInjector. Perhaps combine them? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmmm, copypasta error, one moment please... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Any news on this? It seems the two signatures should belong under this same description instead of vice versa. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed |
||||||||||
|
||||||||||
### Optional: Global Propagators | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please avoid global state. It can always be added later, but cannot be removed. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess most languages already crossed the Rubicon here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tried to make it optional, but I'm now realizing this has to rethought, given named tracers. |
||||||||||
It is often convenient to create a chain of propagators during program initialization, and then access these combined propagators later in the program. To facilitate this, global injectors and extractors are optionally available. However, there is no requirement to use this feature. | ||||||||||
|
||||||||||
**SetHTTPInjector(injector)** | ||||||||||
To update the global injector, the Propagation API provides a function which takes an injector. | ||||||||||
|
||||||||||
**GetHTTPInjector() -> injector** | ||||||||||
To access the global injector, the Propagation API provides a function which returns an injector. | ||||||||||
|
||||||||||
**SetHTTPExtractor(extractor)** | ||||||||||
To update the global extractor, the Propagation API provides a function which takes an injector. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: s/takes an injector/takes an extractor/ There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks, I'm getting buried by my own boilerplate here. :) |
||||||||||
|
||||||||||
**GetHTTPExtractor() -> extractor** | ||||||||||
To access the global extractor, the Propagation API provides a function which returns an extractor. | ||||||||||
|
||||||||||
# Internal details | ||||||||||
|
||||||||||
![drawing](img/context_propagation_details.png) | ||||||||||
|
||||||||||
## Context details | ||||||||||
OpenTelemetry currently implements three context types of context propagation. | ||||||||||
|
||||||||||
**Span Context -** The serializable portion of a span, which is injected and extracted. The readable attributes are defined to match those found in the W3C **traceparent** header. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To make positively sure I understand this -- when we do sample rate propagation, will it be classified as part of the span context for purposes of injection/extraction, but be output to a hidden Correlation context field? Or will we make it a part of the correlation context, even though it's not a label/value per se, only a value for a predefined key? And how do we make sure that nothing else clobbers our key if we share with correlation context? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ping @tedsuo @iredelmeier There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lizthegrey sorry for the slow update on this! I believe that sampling is intended to be propagated as part of There is also a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
traceparent is only one of two headers in the W3C spec (the other being tracestate). |
||||||||||
|
||||||||||
**Correlation Context -** Correlation Context contains a map of labels and values, to be shared between metrics and traces. This allows observability data to be indexed and dimensionalized in a variety of ways. Note that correlations can quickly add overhead when propagated in-band. But because this data is write-only, it may be possible to optimize how it is transmitted. | ||||||||||
|
||||||||||
**Baggage Context -** Transaction-level application data, meant to be shared with downstream components. This data is readable, and must be propagated in-band. Because of this, Baggage should be used sparingly, to avoid ballooning the size of RPC requests. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
|
||||||||||
Note that when possible, OpenTelemetry APIs calls are given access to the entire context object, and not a specific context type. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Any ideas of scenarios where this might not be possible? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No. I updated the language to be stronger here. |
||||||||||
|
||||||||||
|
||||||||||
## Context Management and in-process propagation | ||||||||||
|
||||||||||
In order for Context to function, it must always remain bound to the execution of code it represents. By default, this means that the programmer must pass a Context down the call stack as a function parameter. However, many languages provide automated context management facilities, such as thread locals. OpenTelemetry should leverage these facilities when available, in order to provide automatic context management. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not for setting the active span in the current context, but when moving work from one thread to another, or context switching in async/nonblocking runtimes, this needs to be addressed. In general, I'm concerned that there will be important details which present themselves better when we attempt to implement this. We've gotten a bit of a pass so far because we did so much work in java at the very beginning of this process, before RFCs, etc. FWIW, in OpenTracing, we required a working implementation before adding an committing a change to the spec. |
||||||||||
|
||||||||||
## Pre-existing Context implementations | ||||||||||
|
||||||||||
In some languages, a single, widely used Context implementation exists. In other languages, there many be too many implementations, or none at all. For example, Go has a the context.Context object, and widespread conventions for how to pass it down the call stack. | ||||||||||
tedsuo marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||
|
||||||||||
In the cases where an extremely clear, pre-existing option is not available, OpenTelemetry should provide its own Context implementation. | ||||||||||
|
||||||||||
## Default Propagators | ||||||||||
|
||||||||||
When available, OpenTelemetry defaults to propagating via HTTP header definitions which have been standardized by the W3C. | ||||||||||
|
||||||||||
|
||||||||||
# Trade-offs and mitigations | ||||||||||
|
||||||||||
## Why separate Baggage from Correlations? | ||||||||||
|
||||||||||
Since Baggage Context and Correlation Context appear very similar, why have two? | ||||||||||
|
||||||||||
First and foremost, the intended uses for Baggage and Correlations are completely different. Secondly, the propagation requirements diverge significantly. | ||||||||||
|
||||||||||
Correlation values are solely to be used as labels for metrics and traces. By making Correlation data write-only, how and when it is transmitted remains undefined. This leaves the door open to optimizations, such as propagating some data out-of-band, and situations where sampling decisions may cease the need to propagate correlation context any further. | ||||||||||
|
||||||||||
Baggage values, on the other hand, are explicitly added in order to be accessed by downstream by other application code. Therefore, Baggage Context must be readable, and reliably propagated in-band in order to accomplish this goal. | ||||||||||
|
||||||||||
There may be cases where a key-value pair is propagated as TagMap for observability and as a Baggage for application specific use. AB testing is one example of such use case. There is potential duplication here at call site where a pair is created and also at propagation. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the only mention of TagMap in this document . Is this intentional? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nope!
tedsuo marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||
|
||||||||||
Solving this issue is not worth having semantic confusion with dual purpose. However, because all observability functions take the complete context as input, it may still be possible to use baggage values as labels. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "it may still be possible to use baggage values as labels": That seems to contradict the separation. Should this sentence be removed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Haha, that is true, it is a contradiction. I'm pointing out that it would be technically possible for the SDK to still do this. Maybe that is not a good idea. :) |
||||||||||
|
||||||||||
|
||||||||||
## What about complex propagation behavior? | ||||||||||
|
||||||||||
Some OpenTelemetry proposals have called for more complex propagation behavior. For example, falling back to extracting B3 headers if W3C Trace-Context headers are not found. Chained propagators and other complex behavior can be modeled as implementation details behind the Propagator interface. Therefore, the propagation system itself does not need to provide chained propagators or other additional facilities. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree with this, and therefore do not understand why earlier section explicitly mentions Chained API functions. Chaining can be achieved through composition of SDK objects without cluttering the API that the end-user is exposed to. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If there is no solution for this at the API-level (or at least SDK-level), then there must be only one dedicated component (e.g. "the application") that sets all propagators as there is no shared protocol for how to coordinate on setting propagators. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, this language should be updated, now that the API has simple chaining. But, at the API level, we are only chaining together propagators for different applications. Complex details within one application – such as checking for Trace-Context and falling back to B3 if it is not present – does not need to be something handled at the API level. I was getting comments and questions about that kind of behavior, so I felt compelled to add this... But, with the addition of named tracers, how a single inject/extract call can leverage multiple independent propagators needs to be thought about a bit more, since propagation behavior now depends on which tracer instance is used - or at least, I think that is what named tracers implies. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. on chaining, I think it doesn't belong here (see earlier comment in the API section) on named tracers - I would propose to ignore those. Propagators, like exporters, are 1-1 with TracerFactory, not with Tracer, i.e. all differently named tracers share the propagator. And the application that ultimately invokes the propagation API does it at the generic propagation layer, not via tracers. |
||||||||||
|
||||||||||
|
||||||||||
## Did you add a context parameter to every API call because Go has infected your brain? | ||||||||||
|
||||||||||
No. The concept of an explicit context is fundamental to a model where independent distributed applications share the same context propagation layer. How this context appears or is expressed is language specific, but it must be present in some form. | ||||||||||
|
||||||||||
|
||||||||||
# Prior art and alternatives | ||||||||||
|
||||||||||
Prior art: | ||||||||||
* OpenTelemetry distributed context | ||||||||||
* OpenCensus propagators | ||||||||||
* OpenTracing spans | ||||||||||
* gRPC context | ||||||||||
Comment on lines
+200
to
+203
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please make these links. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. will do! |
||||||||||
|
||||||||||
# Open questions | ||||||||||
|
||||||||||
Related work on HTTP propagators has not been completed yet. | ||||||||||
tedsuo marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||
|
||||||||||
* [W3C Trace-Context](https://www.w3.org/TR/trace-context/) candidate is not yet accepted | ||||||||||
* Work on [W3C Correlation-Context](https://w3c.github.io/correlation-context/) has begun, but was halted to focus on Trace-Context. | ||||||||||
* No work has begun on a theoretical W3C Baggage-Context. | ||||||||||
|
||||||||||
Given that we must ship with working propagators, and the W3C specifications are not yet complete, how should we move forwards with implementing context propagation? | ||||||||||
|
||||||||||
# Future possibilities | ||||||||||
|
||||||||||
Cleanly splitting OpenTelemetry into an Application and Context Propagation layer may allow us to move the Context Propagation layer into its own, stand-alone project. This may facilitate adoption, by allowing us to share Context Propagation with gRPC and other projects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Reading this sentence as "Observability systems [...] may have different requirements from other systems which require non-local information to be sent downstream." makes me wonder how I should interpret the last part. Obviously, observability systems also need non-local information to be sent downstream so that is no requirement difference. Maybe reformulate as "Observability systems have special rules for propagating information, such as sampling. This may result in different requirements from other systems regarding which information should be sent downstream." "non-local" seems redundant with "sent downstream" BTW.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I prefer your wording as well! Adding it in.