Add context propagation requirements to HTTP conventions #1783

lmolkova · 2021-06-29T21:58:02Z

Blocked by #1811

Changes

- Add a requirement to propagate the context created for an HTTP client request.
- If a context is already injected, back off and don't instrument.

Motivation

Context propagation is an essential part of tracing and is de facto done by every HTTP instrumentation.
Multiple layers of instrumentation can legitimately (or by mistake) co-exist. In this case, the best solution is to let the higher-level instrumentation win:

Back off if the context is injected. If the context is already injected on the request (HTTP/gRPC/anything else), there is no good option other than to back off. The options are:

- Re-instrument, i.e. create a new span and inject the header (replacing the value or adding another one)? Then the logic that injected the header (and created the previous span) is broken. There is no way to suppress that span.
- Re-instrument, but don't inject the header? Then this instrumentation layer is broken: there is no reason to export this span.
- Don't instrument: someone above already instrumented this request and perhaps created a span, so there is nothing else to do. Nothing is broken and we didn't even create a span.

This approach also means that the user's manual instrumentation always wins, which seems like a good default to have in terms of supportability. This is really a short-term mitigation for a subset of double-instrumentation problems.
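The back-off rule can be sketched in a few lines. This is a minimal illustration, not the OpenTelemetry API: it assumes the configured propagator writes a single `traceparent` header, and `context_injected?` and `send_with_backoff` are hypothetical helper names.

```ruby
# Assumption: the configured propagator writes this single header.
TRACE_HEADER = "traceparent"

# True when some higher layer has already injected context into the request.
def context_injected?(headers)
  headers.any? { |name, value| name.to_s.downcase == TRACE_HEADER && !value.to_s.empty? }
end

# Hypothetical client wrapper: creates a span and injects context only when
# nothing has been injected yet; otherwise it leaves the request untouched.
def send_with_backoff(headers, spans)
  return :passed_through if context_injected?(headers)

  span_id = format("%016x", spans.size + 1)
  spans << span_id
  headers[TRACE_HEADER] = "00-0af7651916cd43dd8448eb211c80319c-#{span_id}-01"
  :instrumented
end
```

Calling `send_with_backoff` twice on the same headers records one span: the second call sees the header written by the first and backs off.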

Related issues
#530
#1767

lmolkova · 2021-06-29T22:03:21Z

Open questions:

- Do we want to be specific about which propagator should be used? (Arguably it's up to the downstream service and instrumentation.)
- Do we want to add an optimization to the propagator API to check whether a context is already injected? We can do it with extract, but that requires unnecessary context parsing/population when there are multiple layers of instrumentation. We can also leave it up to each language to decide.

cijothomas · 2021-06-29T22:06:29Z

specification/trace/semantic_conventions/http.md

@@ -111,6 +112,11 @@ from the `net.peer.name`
 used to look up the `net.peer.ip` that is actually connected to.
 In that case it is strongly recommended to set the `net.peer.name` attribute in addition to `http.host`.

+### Context propagation
+
+- context created for HTTP client span MUST be injected on outgoing request using configured [propagator](../../context/api-propagators.md)


This addition could apply not just to HTTP, but to every instrumentation dealing with out-of-process communication?

yes, if this direction is supported by the community, I can update other relevant specs, I can also update propagators doc to mention it.

CHANGELOG.md

specification/trace/semantic_conventions/http.md

iNikem · 2021-07-05T05:21:44Z

specification/trace/semantic_conventions/http.md

+### Context propagation
+
+- context created for HTTP client span (if valid) MUST be injected on outgoing request header using configured [propagator](../../context/api-propagators.md)
+- Exception: if outgoing HTTP request already has valid context (for configured propagator), it cannot be changed and new span MUST NOT be recorded


I oppose this change. #1738 made it clear and explicit that nested CLIENT spans are allowed. There are totally valid cases for having them: e.g. a DB call which uses HTTP as its transport. In this case I want both a span with DB semantic conventions and spans with HTTP semantic conventions.

I totally agree that only the outer-most or first CLIENT span must have its context propagated.

I see your point and agree nested client spans should be allowed.
This change does not prohibit nested client spans. This is an edge case of 2 HTTP instrumentations fighting for the same request. Arguably there is no real use-case behind 2 identical spans created on different API levels.
At the same time, DB spans should not inject context into http requests.

Maybe this is worth clarifying in this spec that only HTTP instrumentation should inject context into http request? I.e. if OTel instrumentation injects headers into HTTP requests it MUST also create an HTTP span?

> Maybe this is worth clarifying in this spec that only HTTP instrumentation should inject context into http request? I.e. if OTel instrumentation injects headers into HTTP requests it MUST also create an HTTP span?

Why? Let's take another example: messaging system with HTTP transport. PRODUCER span is created by it and it definitely has to be injected and propagated. Hm, but it probably will be injected into the message and not http request... But then do we want for http instrumentation to inject its context into http headers? Probably not?

> Arguably there is no real use-case behind 2 identical spans created on different API levels

Agree

> But then do we want for http instrumentation to inject its context into http headers? Probably not?

We probably don't need it but it probably won't hurt if we have it in both HTTP and the message.

As a special case, for AWS messaging systems (SQS, SNS), setting the right HTTP header will actually end up being translated to a message property. https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-message-metadata.html#sqs-message-system-attributes

lmolkova · 2021-07-06T15:17:39Z

With messaging, you have to allow separate context on message and transport. Reasons:

messaging systems allow batching messages on send
they allow buffering and sending absolutely independent messages in the background
retries happen and you can change the transport span (create a new one), but not the message span (or you'll end up with multiple contexts for the same message, which may be hard to debug)

Service meshes that handle messaging are the ultimate example of all the above.
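A small sketch of why the message context and the transport context have to be separate. Everything here is hypothetical (the `Message` struct, `stamp`, `build_request`, and the placeholder context strings); it only illustrates batching plus a retry.

```ruby
# A message carries its own context in its properties, set once at creation.
Message = Struct.new(:body, :properties)

def stamp(body, message_context)
  Message.new(body, { "traceparent" => message_context })
end

# One HTTP request carries the whole batch; its header holds the transport
# context, which may differ on every try.
def build_request(batch, transport_context)
  { headers: { "traceparent" => transport_context }, body: batch }
end

batch = [stamp("order-1", "ctx-message-1"), stamp("order-2", "ctx-message-2")]
first_try = build_request(batch, "ctx-transport-try-1")
retry_try = build_request(batch, "ctx-transport-try-2")
```

The retry gets a new transport context in the HTTP header, while each message in the batch keeps the context it was created with.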

> Why?

Assuming there is instrumentation for almost all HTTP clients in nearly every language, having consistent high-quality spans created for HTTP requests will bring a better user experience: vendors always know how to visualize it and how to build metrics and alerts based on it.
Letting DB instrumentation instrument HTTP requests and HTTP instrumentation only create a span (retries again? complex multi-request operations?) removes the assumption that under HTTP client span there is a server HTTP span and complicates experience.

lmolkova · 2021-07-09T01:48:30Z

@iNikem @bogdandrutu please take a look.

jkwatson · 2021-07-09T02:21:53Z

specification/trace/semantic_conventions/http.md

+
+`Context` created for HTTP client span (if valid) MUST be injected on outgoing request headers using configured [propagator](../../context/api-propagators.md). Instrumentation that injects `Context` into HTTP request headers MUST also emit a span that complies with other client requirements in this specification.
+
+**Exception**: if outgoing HTTP request already has valid `Context` (for configured propagator), it cannot be changed and new span MUST NOT be recorded.


How can this be determined using the current OpenTelemetry API? Would all http client instrumentation be required to query the propagator, grab the fields() and check to see if they exist already? How would that work if the http request object was being re-used and the fields were left over from a previous request? (see the documentation on the fields() function of the propagator to see what I'm referring to).

> How can this be determined using the current OpenTelemetry API?

I was thinking about having this logic within the propagator and have 2 options:

1. Sub-optimal, no propagator change required: instrumentation calls extract and checks whether it got a valid context. It's cheap when there is one layer of instrumentation, but if there are multiple, it involves extra context parsing and allocation.

2. Optimal: add a CanExtract method (naming is always hard) that only checks for context presence and may do minimal validity checks.

This way decision on context presence and validity stays within the propagator and multiple HTTP (and other protocol) instrumentations don't have to do it.

I suggest starting with option 1, as the perf hit affects relatively rare scenarios (multiple instrumentation layers) and is not very big in the general case of a single instrumentation layer. The optimization (option 2) can come later.
I'm also open to starting with option 2 right away and adding a new method on propagators if there is any consensus around it.
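To make the two options concrete, here is a toy propagator. It is a sketch only and not the OpenTelemetry propagator API: `ToyPropagator` is a hypothetical class, `can_extract?` stands in for the proposed (non-existent) CanExtract method, and the `traceparent` layout follows the W3C Trace Context format.

```ruby
# Toy propagator for illustration only.
class ToyPropagator
  FIELD = "traceparent"
  PATTERN = /\A([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})\z/

  # Option 1: a full extract that parses the header and allocates a context.
  def extract(carrier)
    match = PATTERN.match(carrier[FIELD].to_s)
    return nil unless match

    { trace_id: match[2], span_id: match[3], flags: match[4] }
  end

  # Option 2: a presence check with a minimal validity test (length only),
  # avoiding the parsing and allocation done by extract.
  def can_extract?(carrier)
    value = carrier[FIELD]
    value.is_a?(String) && value.length == 55
  end
end
```

With option 1 the instrumentation would back off when `extract` returns a non-nil context; with option 2 it would ask `can_extract?` and never build the context at all.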

> How would that work if the http request object was being re-used and the fields were left over from a previous request?

Great point! I believe the current spec requires cleaning up the context on reused carriers. Is there an assumption that it's not always possible?

I think it is also worth mentioning here that the context should be cleaned up after each HTTP try if the HTTP client allows reusing the same request instance.

The spec says "successive calls should clear these fields first.", so your proposed call to extract would still be operating on the previous contents of the headers, meaning that the resulting Context would always be valid, so I don't think this is a feasible approach, with either options 1 or 2.

oh, you're right. what's the feeling about changing it to clean up after rather than before? this way everyone cleans up for what they've done and never breaks anything else.
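The reused-carrier problem can be reproduced with a plain hash standing in for the request headers. This is a hedged sketch: `attempt` is a hypothetical helper, and the header values are placeholders rather than real `traceparent` values.

```ruby
# Simulates one try on a (possibly reused) header hash.
# Returns :instrumented or :backed_off.
def attempt(headers, try_number, cleanup_after:)
  # The back-off check: is a context already present on the carrier?
  return :backed_off if headers.key?("traceparent")

  headers["traceparent"] = "context-for-try-#{try_number}"
  # ... the request would be sent here ...
  headers.delete("traceparent") if cleanup_after
  :instrumented
end

# Without cleanup, the second try sees the leftover header and backs off.
reused = {}
without_cleanup = [1, 2].map { |n| attempt(reused, n, cleanup_after: false) }
# => [:instrumented, :backed_off]

# With cleanup after each try, every try is instrumented.
reused = {}
with_cleanup = [1, 2].map { |n| attempt(reused, n, cleanup_after: true) }
# => [:instrumented, :instrumented]
```

The first run shows the false positive described above: the leftover header from try 1 looks like a valid injected context on try 2.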

Heh. I have no idea...I don't think I've ever seen a web client that re-uses request objects like this, so I don't have the context for what it's even referring to. :)

it seems it was there from the very beginning (#147) and there was no discussion/explanation around this choice. It also seems that at least golang http client allows to reuse http request instances between tries. I'll raise it as a separate issue and until then this PR is blocked.

iNikem · 2021-07-12T06:47:16Z

Ok, I agree with you about messaging use-case. Different contexts propagated via message and via http connection make sense. Although I am not sure our current ecosystem (both instrumentations and backends) handles this correctly.

But I am not convinced about tying together "should I create a new CLIENT span" and "should I propagate this span". This proposal forbids creating new CLIENT span if it will not be propagated. Take the following use-case.

User space high level http client (e.g. reactive WebClient in Java) makes an http request.
Lower level "transport" layer library (Netty in this case) is invoked several times by WebClient to handle retries or redirects or circuit-breaker whatever.

Both libraries have auto-instrumentations in Java. I argue that the valid outcome should be the following:

CLIENT span with http semantic conventions should be created by WebClient instrumentation.
Its SpanContext should, obviously, be injected into the request
Nested CLIENT spans with http semantic conventions should be created by Netty instrumentation. At least in order to expose that lower-level transport layer information to the developer.

This proposal forbids creating those nested spans by Netty. I think it is wrong.

ahayworth · 2021-07-12T15:14:20Z

My understanding of this PR is that there is some desire to handle competing "http" client spans; to collapse them into one sensible span rather than multiple duplicate spans.

We've been exploring this in opentelemetry-ruby, trying to de-duplicate http client spans in our auto-instrumentation. We have an approach that works well for us, by modifying the context around a block of code. Then, our HTTP auto-instrumentation libraries merge those context values into the attributes of the outgoing client span. The result is one outgoing span, annotated with relevant attributes from a higher-level instrumentation.

In pseudocode, it looks like:

# Higher-level HTTP-like instrumentation
OpenTelemetry::Common::HTTP::ClientContext.with_attributes({ foo: "bar" }) do
  # do work, usually calling `super`
end

# Lower-level HTTP client instrumentation
attributes = OpenTelemetry::Common::HTTP::ClientContext.attributes
tracer.in_span("GET", kind: :client, attributes: attributes.merge({ other: "attributes" })) do |span|
  # do more work
end

You can see this in practice in the "higher-level" koala instrumentation (a facebook HTTP client, I believe), and the "lower-level" Net::HTTP instrumentation (standard ruby HTTP client, used by many things). The context modification bits can be found here.

I would submit this as an alternate approach to prohibiting nested spans; this allows instrumentation that knows it is just an extremely thin wrapper around an HTTP call to decorate spans with relevant attributes without confusing duplication. This approach has downsides and does not solve all of the things you're setting out to do, but I think it is worth considering.
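The idea of context-based augmentation can be shown with a self-contained toy. This is not the opentelemetry-ruby implementation: `ToyClientContext` and `low_level_get` are hypothetical names, and a thread-local variable stands in for the real context mechanism.

```ruby
# Toy stand-in for a client context that carries span attributes.
module ToyClientContext
  KEY = :toy_http_client_attributes

  # Makes attrs visible to lower layers for the duration of the block,
  # then restores whatever was there before.
  def self.with_attributes(attrs)
    previous = Thread.current[KEY]
    Thread.current[KEY] = (previous || {}).merge(attrs)
    yield
  ensure
    Thread.current[KEY] = previous
  end

  def self.attributes
    Thread.current[KEY] || {}
  end
end

# Lower-level instrumentation: the only place a "span" is recorded, with the
# higher-level attributes merged in.
def low_level_get(recorded)
  recorded << ToyClientContext.attributes.merge("http.method" => "GET")
end

recorded = []
ToyClientContext.with_attributes("koala.verb" => "get") { low_level_get(recorded) }
```

Only one entry ends up in `recorded`, carrying both the higher-level and the lower-level attributes, which is the de-duplicated outcome described above.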

lmolkova · 2021-07-12T15:52:12Z

@iNikem thanks for the useful input!

> Lower level "transport" layer library (Netty in this case) is invoked several times by WebClient to handle retries or redirects or circuit-breaker whatever.

I think there is an assumption that retries and redirects should be represented as a nested span but have to share the same context on the wire. Why so?

Redirects may or may not be traceable and users configure HTTP clients to allow/prevent auto-redirects. So we should assume any HTTP client instrumentation will create new spans and propagate new context for redirects (if app logic handles them). For the sake of consistency, we should make auto-instrumentation do the same.

Retries are part of application logic in the majority of HTTP clients. It's not feasible to instrument most HTTP clients in a way that provides a higher-level span grouping the retries without the user's help. So I believe we have to agree that each retry has its own span AND context, to bring a consistent and understandable experience.

lmolkova · 2021-07-12T15:53:44Z

@ahayworth this PR attempts to address a corner case: what happens if the HTTP request is already instrumented. There is a broader discussion in #1767 about context-based instrumentation suppression. Thanks for sharing your approach!

ahayworth · 2021-07-13T16:41:50Z

@lmolkova Yes, I understand - and I think the context propagation requirements in this PR are useful. I mentioned the context-based span augmentation that the ruby SDK does because I feel as though it can address this part of the PR:

> Exception: if outgoing HTTP request already has valid Context (for configured propagator), it cannot be changed and new span MUST NOT be recorded.

I just wanted to raise that an alternate way to address it would be through context-based augmentation: while it creates an implicit assumption that lower-level instrumentation will be present, at least in the ruby world we presume that low-level HTTP implementation will be something that many folks want. Taking that approach, there is no already-instrumented request in progress; requests are only instrumented at the lowest level, and it solves that part of the problem you've raised.

Put differently: we could instead advise instrumentation not to do this in the first place, and would perhaps avoid some of the difficulties and problems that others have raised.

(edit: I will engage in the PR you mentioned! 😄 )

lmolkova · 2021-07-13T17:35:11Z

I'm closing this PR based on @trask's feedback on #1811: reusable requests are much more common than I expected, and requiring context cleanup would bring a perf hit on the happy, popular case of no retries.

I'm trying to come up with ~~yet another~~ context marker proposal, but I believe we need more than a terminal marker (we just don't want two layers of the same e.g. HTTP instrumentation but we want to allow deeper layers of different kinds of instrumentations: DNS/streaming messages/TCP/profiling, etc).

Add context propagation requriements to HTTP conventions

6487af5

lmolkova requested review from a team June 29, 2021 21:58

github-actions bot assigned carlosalberto Jun 29, 2021

cijothomas reviewed Jun 29, 2021

changelog

d40ab59

bogdandrutu reviewed Jun 30, 2021

CHANGELOG.md (outdated, resolved)

specification/trace/semantic_conventions/http.md (outdated, resolved)

Oberon00 changed the title ~~Add context propagation requriements to HTTP conventions~~ Add context propagation requirements to HTTP conventions Jun 30, 2021

lmolkova added 2 commits June 30, 2021 20:06

add examples

71d620c

up

ee9d872

iNikem suggested changes Jul 5, 2021

lmolkova added 2 commits July 8, 2021 18:40

Merge branch 'main' of https://github.com/open-telemetry/opentelemetr…

bdc4cda

…y-specification

instrumentation that injects http header must also emit http span

ff86380

jkwatson reviewed Jul 9, 2021

lmolkova mentioned this pull request Jul 9, 2021

Cleaning up context on reusable carriers before or after injection #1811

Closed

lmolkova closed this Jul 13, 2021
