Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unifying propagator formatters #89

Conversation

toumorokoshi
Copy link
Member

@toumorokoshi toumorokoshi commented Aug 16, 2019

Paired with this RFC (open-telemetry/oteps#37), an implementation for a unified propagator that acts directly on context.

This also removes the need for generics and propagators that only
consume one or the other, which would otherwise require integrators to do extra work to
wire propagators appropriately.

Modifying the API of the propagators to consume the context as a mutable
argument. By passing in the context rather than returning, this enables the
chained use of propagators, allowing for situations such as supporting
multiple trace propagation standards simulatenously.

@toumorokoshi
Copy link
Member Author

This PR begins to address the many questions that exist from #78, and the discussions started in

open-telemetry/opentelemetry-specification#210
open-telemetry/opentelemetry-specification#112

There are concerns that this PR does not currently address, that I intend to address in further commits:

I think this will ultimately culminate in a couple PRs to the opentelemetry-specification, so I intend to broaden the audience a bit outside of just python contributors.

@toumorokoshi toumorokoshi force-pushed the feature/propagators-proposal branch from 81f4c57 to f1f8bb8 Compare August 16, 2019 03:36
@a-feld
Copy link
Member

a-feld commented Aug 16, 2019

I've been looking at this PR while trying to understand the motivations for the combination of the two contexts. From the discussions, it seems that one would want to have access to both the current span context and current distributed context simultaneously for most use cases (e.g. adding dimensions to either spans or metrics from current context). Is my understanding correct?

Given that motivation, I have a couple of concerns:

  1. Certain propagators are implicitly used to serialize a particular type of context (i.e. w3c trace context propagator for SpanContext serialization and w3c correlation context propagator for DistributedContext serialization). The approach in this PR requires the propagators to have access to contexts that it does not need which breaks down some of the separation between these propagator types.

  2. DistributedContext as I understand it is scoped to a request rather than a span which, as I understand it, is part of the reasoning for keeping the two objects separate. If these two contexts are combined into a single structure, the scope becomes unified. For DistributedContext, I'm imagining there will be a single context variable for the current DistributedContext that is set at the beginning of the request and torn down at the end of the request.

The current mental model that I have is that the DistributedContext does not need to be passed anywhere and can be accessed through a retrieval API that accesses a contextvar at any point in the request (since the DistributedContext is global to the request).

I was envisioning something like this when context needs to be accessed:

from opentelemetry import distributedcontext
from opentelemetry import trace as trace_api

tracer = trace_api.trace()
span_context = tracer.get_current_span().get_context()
distributed_context = distributedcontext.get_current_context()

In this example, the DistributedContext here is not scoped to the span. I can foresee the need to call these APIs separately being problematic but I think this can be solved by creating an API which wraps the above code and returns both contexts simultaneously.

As an aside, would it make sense to add a get_current_context() on a Tracer?

@toumorokoshi
Copy link
Member Author

toumorokoshi commented Aug 17, 2019

There was some discussion around some of what you brought up, so I'll outline I think the main points here. I think in all fairness DistributedContext is based on an evolving and as-of-yet unused standard so there's a little ambiguity here.

Outlining the discussion points:

SpanContext is scoped differently than DistributedContext

DistributedContext is a OTel representation of CorrelationContext. Note that the spec states:

To correlate events between components, these components need to exchange and store a piece of information called context. Typically context consists of an originating event identifier, an originating component identity and other event properties. Context has two parts. The first part is a trace context. Trace context consists of properties crucial for event correlation. The second part is correlation context. Correlation context carries user-defined properties. These properties may be helpful for correlation scenarios. But they are not required and components may choose to not carry or store them.

I think the abstract implies that if a correlation context exists, it is scoped to the same context as the span.

Joining of Distributed / SpanContext

Assuming that the two need to be paired, I think it's valuable to have the formatter have access to both. If a vendor does choose to integrate with OTel, their formatter will probably not just need TraceState (which contains basic tracing identification date) but also access to the CorrelationContext as well (which is specifically stated as a way for vendors to propagate bespoke data).

The choice to join them ensures that it is not possible to half-propagate vendor headers by only connecting a formatter for Span but not distributed, or visa versa.

Using Retrieval APIs to retrieve SpanContext / DistributedContext

Assuming there's consensus on the need to pair Span and Distributed context together, having different retrieval APIs will result in fragmenting and exposing possible incorrect use of retrieving DistributedContext that do not match the SpanContext they should paired with.

get_current_context as part of Tracer

I think this is still up in the air. A feature that people will eventually need for OTel is to be able to better choose what does and does not propagate per integration.

For example, one may want to expose specific formatters for a public endpoint vs a private endpoint. As such, it maybe important to have a client / http endpoint pair to be configured with a different set of formatters.

Another example is attaching different exporters to different APIs.

I think there is a need to somehow bind the whole context of propagators and exporters to a specific interface. But this may be worth a discussion in it's own right, so it may be valuable for velocity's sake to only support a singleton configuration.

@carlosalberto
Copy link
Contributor

If these two contexts are combined into a single structure, the scope becomes unified.

My same feeling. There was some talk about allowing users to only use the Correlation context alone, if/as needed, so this would break that up.

@yurishkuro I remember you mentioning adding some utility class that would allow users injecting/extracting both SpanContext and DistributedContext, so this might be of interest to you ;)

Copy link
Member

@c24t c24t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good and the code is clear, but we may not want to depend on trace from context.

Is the plan to merge this in or point people to this PR for review of the concept?

of the APIs that modify it.

As it is designed to carry context specific to all telemetry use
cases, it's schema is explicit. Note that this is not intended to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cases, it's schema is explicit. Note that this is not intended to
cases, its schema is explicit. Note that this is not intended to

be an object that acts as a singleton that returns different results
based on the thread or coroutine of execution. For that, see `Context`.


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

# limitations under the License.

from opentelemetry.distributedcontext import DistributedContext
from opentelemetry.trace import SpanContext
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand it, this is the reason we handle the context in the trace package now. If we think of distributed tracing as being built on top of context propagation then the context package should be ignorant of tracing. This seems to be the motivation for having the tracer inject/extract the SpanContext from the carrier in OpenTelemetry instead of passing it into the tracer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which reminds me, there was mention in the past of possibly using such context layer independently of tracing/metrics :)



Args:
distributed: The DistributedContext for this instance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A problem for the specs repo for sure, but I don't see a good reason to call this DistributedContext instead of Tags.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed: maybe using more direct terminology would benefit the APIs all around.

distributed: The DistributedContext for this instance.
span: The SpanContext for this instance.
"""
__slots__ = ["distributed", "span"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to do this here only and not in other API classes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this choice for unified_context because I didn't want users to mistakenly assign something to the object with the wrong name.

I'm not sure about the ramifications of slots on an abstract base class: I presume we want to provide implementors the ability to add their own fields.

But I think utility containers classes should specify slots since there's no good reason not to.

@c24t
Copy link
Member

c24t commented Aug 20, 2019

Some related discussion in OpenTracing: opentracing/specification#114.

@toumorokoshi
Copy link
Member Author

@c24t for the time being I'd like to point people at this PR. I think there's some ideas that need to be hashed out here, such as the choice to join or keep the two contexts separate and the relationship of how they propagate.

I can separately work on adding additional formatters with the existing code merged in.

@toumorokoshi
Copy link
Member Author

There was some feedback from the SIG meeting around this PR, with feedback to try break out the suggested API changes into separate PRs.

I'll be moving forward with trying to work the specification team to propose some of my changes. I'd like to leave this PR open until then (next week), but let me know if we need to close.

from opentelemetry.trace import SpanContext


class UnifiedContext:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this module could easily move out of the "Context" module. I think I just added that because the module name seemed to match.

In reality I imagine this would be higher-level container data structure, in a separate module.

Adding a UnifiedContext, composing DistributedContext and SpanContext.
This will enable propagators to extract and inject values from either system,
enabling more sophisticated schemes and standards to propagate data.

This also removes the need for generics and propagators that only
consume one or the other, requiring integrators to do extra work to
wire propagators appropriately.

Modifying the API of the propagators to consume the context as a mutable
argument. By passing in the context rather than returning, this enables the
chained use of propagators, allowing for situations such as supporting
multiple trace propagation standards simulatenously.
@toumorokoshi toumorokoshi force-pushed the feature/propagators-proposal branch from f1f8bb8 to a8dc9b3 Compare September 2, 2019 18:56
Migrating formatters outside of context. The design goal is to
allow context to be it's own object agnostic of the requirements
of specific telemetry. Instead, moving it under a broader propagators
module.

Removing UnifiedContext in lieu of directly using Context.
UnifiedContext adds an unneeded layer of abstraction that would
then require more operations to migrate into context.

Removing binaryformat and httptextformat modules specific to
SpanContext and DistributedContext. These are now handled by
a single formatter.
Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find the definition of UnifiedContext in this PR. That makes it harder to reason about what the propagators are supposed to do with it and how it interacts with the runtime context.

"""
The OpenTelemetry context module provides abstraction layer on top of
thread-local storage and contextvars. The long term direction is to switch to
contextvars provided by the Python runtime library.

A global object ``Context`` is provided to access all the context related
functionalities::
functionalities:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you look at the rendered Sphinx documentation? These double-colons are not typos but rst-Syntax for code blocks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I am aware they are rst-syntax. I don't believe I modified this by hand: most likely an autoformatter did it since I touched this file. I'll take a look.

@toumorokoshi toumorokoshi changed the title Showing a proof of concept of changes to the propagation API unifying propagator formatters Sep 3, 2019
@toumorokoshi
Copy link
Member Author

@Oberon00 sorry UnifiedContext was removed due to this feedback (open-telemetry/oteps#37 (comment)). I've updated the description of this PR to better explain that it's tied closely with the RFC.

@Oberon00
Copy link
Member

Oberon00 commented Sep 3, 2019

@toumorokoshi I see. But there are still quite a few references to it in the examples and documentation of this PR.

@toumorokoshi
Copy link
Member Author

Got it. I see 4 references in HTTPTextfFormat, will push a fix.

Removing the last few mentions of UnifiedContext

Fixing rst syntax of the context module.
Python3.4 and 3.5 do not support types on class attributes.
Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I'm not too happy with this design (but I acknowledge that it seems to follow the one from the spec summit https://docs.google.com/document/d/1rgXZik2_-2fU1KpVi7nze5ygWA8NhaLE8VmD20H4P8M/edit#heading=h.vkf5zbyfzuyy very closely, so probably I should also voice my concerns elsewhere)

With this PR the (runtime) context is now used in some scenarios where it was not used previously. One use-case that seems poorly supported by this design is using "explicit context propagation" (IIRC that was the name), i.e. not setting a span as active but explicitly passing the parent span to Tracer.create_span. In that case a new temporary context would need to be created in order to inject the span context into a carrier. It seems that a temporary context is needed in all cases when extracting (the extracted span context should not become current context -- it might never become a parent span, maybe it will only be added as a link.

@@ -15,7 +15,7 @@
import abc
import typing

from opentelemetry.trace import SpanContext
from opentelemetry.context import BaseRuntimeContext

Setter = typing.Callable[[object, str, str], None]
Getter = typing.Callable[[object, str], typing.List[str]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Getter = typing.Callable[[object, str], typing.List[str]]
Getter = typing.Callable[[object, str], typing.Sequence[str]]

@@ -57,7 +58,7 @@ def extract(cls, get_from_carrier, carrier):
elif len(fields) == 4:
trace_id, span_id, sampled, _parent_span_id = fields
else:
return trace.INVALID_SPAN_CONTEXT
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some kind of error info might be useful to return.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the new system, that would be accomplished by assigning that to the global context object.

I was a bit hesitant to that because chaining propagators is now an option. But for now I can re-add since that's not a feature yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But for now it could be logged IHMO.

@@ -92,7 +93,11 @@ def extract(cls, get_from_carrier, carrier):
if sampled in cls._SAMPLE_PROPAGATE_VALUES or flags == "1":
options |= trace.TraceOptions.RECORDED

return trace.SpanContext(
# TODO: there is no standard way to retrieve
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual context used to be an implementation detail. You are making it a public interface now. When we want to switch to using just plain contextvars in the future, that may be a problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think contextvars really solves this, as you still need some sort of public interface to define where to retrieve or set context-related values.

I agree that's what's happening here. Maybe @reyang has opinions on this? I'm +1 for just being explicit with what slots should be populated with which objects.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other option is another interface which handles specifying that. But I really don't imagine this being something alternate SDK implementers want to override.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An idea: We might just specify that a context is an opaque object of unspecified type (though we may add a type alias for the real thing). Tracer, Meter, etc. implement (probably in the API layer) methods like get_current_span_context(context: OpaqueContext = CURRENT_CONTEXT) -> SpanContext and set_current_span_context(context: OpaqueContext, newspancontext: SpanContext) -> None. That way we achieve type safety (kinda).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that's what I think we have to do.

personally I'd like there to be standards around the setters /getter into context, maybe even via a wrapper object, and to have all SDKs adhere to that. I'm probably limited in imagination here but I don't feel like there's a lot of uses cases where one need to interact with the context differently.
@c24t @reyang I know you've had thoughts on this in the past. Any opinions on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I could imagine storing additional custom information in the context that doesn't fit into the predefined Span/Distributed context. E.g., (just a random example from the top of my mind) a handle to some statistics object that counts how often certain OTel operations were called.

# the SpanContext or DistributedContext from a
# context object declared in opentelemetry-api.
# this should be more standard.
context[Tracer.CONTEXT_SLOT_NAME] = trace.SpanContext(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong. In the context there is always a full span, not just a context (see Tracer.get_current_span). The parent span context should never make it to the context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take a look.

@@ -102,12 +107,15 @@ def extract(cls, get_from_carrier, carrier):

@classmethod
def inject(cls, context, set_in_carrier, carrier):
sampled = (trace.TraceOptions.RECORDED & context.trace_options) != 0
span_context = context[Tracer.CONTEXT_SLOT_NAME]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar error here: span_context is actually a Span, you need to first retrieve the SpanContext from it.

@toumorokoshi
Copy link
Member Author

@Oberon00 I think those are great points, and I'd suggest joining the conversation around the OTEP as well: open-telemetry/oteps#37

I'll think through the need for explicit context propagation: are there examples in OpenCensus or other prior art I can look at to better understand the motivation?

I'm going to do some research and come back as well. FWIW I also support a separate object to populate values into, that would then be loaded / retrieved from context. But @yurishkuro noted that may add confusion (open-telemetry/oteps#37 (comment))

from .base_context import BaseRuntimeContext

__all__ = ["Context"]


Context = None # type: typing.Optional[BaseRuntimeContext]
def create_context() -> BaseRuntimeContext:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this is useful for this module's testing, I don't think this is something that should be exposed - I don't think users are really supposed to be creating more than one context.

@carlosalberto
Copy link
Contributor

Btw, you might have seen that there is now a RFC for the proposed/mentioned changes: open-telemetry/oteps#42

@carlosalberto
Copy link
Contributor

Hey hey @toumorokoshi

So basically what I mentioned is that there's no need to have create_context(), as we will always have a single context (at least with the current context API), and as such, there's no need to explicitly pass around a BaseRuntimeContext parameter.

(I keep feeling that the name of a few methods will have a name that is does not sound 'correct', such as to_bytes() taking no parameters, but we need to work on that later, or now if you feel like it ;) )

@Oberon00
Copy link
Member

If we only have a single context that's never passed around then effectively all functions handling contexts return their results in global variables, which is a pain to unit test and not very flexible to use.

@Oberon00
Copy link
Member

BTW, if we want something quick for the Alpha would it make more sense to directly merge #121 instead of this one?

@Oberon00 Oberon00 added this to the Alpha Release milestone Sep 13, 2019
Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@toumorokoshi toumorokoshi removed this from the Alpha Release milestone Sep 19, 2019
@toumorokoshi
Copy link
Member Author

This is not needed for alpha, as #137 is merged.

@toumorokoshi
Copy link
Member Author

closing in favor of #278, please continue any discussion there.

srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this pull request Nov 1, 2020
* truncate tracestate after exceeding item limit

* lint: address circle failures

* tracestate: use slice instead of reducer

* tracestate: move truncate after reverse

* comment: add tracestate reverse reasoning
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants