Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make creating new spanids for server spans optional #359

Closed
toumorokoshi opened this issue Nov 19, 2019 · 6 comments
Closed

Make creating new spanids for server spans optional #359

toumorokoshi opened this issue Nov 19, 2019 · 6 comments
Labels
area:api Cross language API specification issue area:span-relationships Related to span relationships spec:trace Related to the specification/trace directory

Comments

@toumorokoshi
Copy link
Member

toumorokoshi commented Nov 19, 2019

We've been working toward implementing B3 propagation properly, and have encountered a need to clarify the expected behavior for server spans.

Link for more info: open-telemetry/opentelemetry-python#236

Background

B3 propagation explicitly states the need for server spans to share the span id of the client span. This is done to enable joining them together for the consumer (I believe zipkin is the practical example here: https://zipkin.io/pages/instrumenting.html)

The most common propagation use case is to copy a trace context from a client sending an RPC request to a server receiving it.

In this case, the same span ID is used, which means that both the client and server side of an operation end up in the same node in the trace tree.

https://github.com/openzipkin/b3-propagation#overall-process

Although I don't see a lot of explicit mentions, I believe this contradicts the behavior that tracecontext is advocating, which is different span ids for client and server spans. I don't see mention of this in the w3c spec, but it was called out here:

https://github.com/open-telemetry/opentelemetry-specification/blob/27b738b74eeb10560dc0308554a1d626cb93df79/specification/data-rpc.md#grpc

Implementations MUST create a span, when the gRPC call starts, one for client-side and one for server-side.

Proposal

This ticket should be a general discussion for the problem, but I'll throw at a proposal to kick off the discussion.

I am proposing that the specification explicitly calls out that integrations / extensions should NOT create new spans, instead deferring that responsibility to the propagator. This then delegates the control of whether to create a new span or not to the propagator, enabling both B3 and tracecontext in OpenTelemetry.

@toumorokoshi
Copy link
Member Author

pulling in conversations around this that are occuring in the original python ticket:

open-telemetry/opentelemetry-python#236 (comment)

To be honest, the B3 semantics seem quite exotic to me. It would have never occurred to me that client and server would share a span, so I also operated under that assumption when I updated the HTTP semantic conventions.

@Oberon00 I don't disagree, but at the same time I think that support for the standard propagation systems is critical to OpenTelemetry's success. My company uses B3 headers internally, and it will be very difficult to argue our continued contribution to OpenTelemetry knowing that we can't really use it without widespread standard changes to all of our services.

@Oberon00
Copy link
Member

One thing to note is that this would have implications for sampling. For example, when merging OTEP 6, the attributes argument was added so that informed sampling decisions can be made. This argument would then probably be required for the extract call.

Of course, this also interacts with open-telemetry/oteps#66 "Separate Layer for Context Propagation" (almost everything does, it seems).

So assuming we would currently have start_span(context, spanargs) -> context and extract(context, extractargs) -> context, this would probably mean that we would need a span_for_server(context, spanargs, extractargs) -> context that does everything at once: Actual extraction, deciding if a new span should be created, deciding whether to sample the span (part), actual creation of the span object.

@toumorokoshi
Copy link
Member Author

Is that sample choice necessary? I figured that the Span could be annotated with various metadata that would be available to the SpanProcessor to decide whether to propagate.

Is the concern this line?

Sampling decisions are made within the start Span operation, after attributes relevant to the span have been added to the Span start operation but before a concrete Span object exists (so that either a NoOpSpan can be made, or an actual Span instance can be produced depending upon the sampler's decision).

I figured that the integration is responsible for STARTING the span, just not creating one. would that solve that issue?

@mwear
Copy link
Member

mwear commented Nov 19, 2019

Just as a point of discussion, I'm wondering if we have to support shared IDs between client and server spans. The worst thing that will happen if we don't support the shared ID use case, is that RPC spans in Zipkin backends will be represented as two operations. I imagine that other non-Zipkin tracing backends will have issues with shared span IDs, but may want to be able to use B3 as a propagation format for interoperability purposes.

I'm not sure this is the right thing to do, but it could be an option to consider.

@yurishkuro
Copy link
Member

+1 to what @mwear said - to my knowledge Zipkin backend/UI will function just fine if the server span gets a new ID.

In the worst case, this can be a configuration parameter to the SDK, which is how Jaeger SDKs support compatibility with Zipkin's shared span model - when a new span is started AND is tagged as server kind, the tracer reuses the inbound span ID for the server span.

@toumorokoshi
Copy link
Member Author

I just checked with my team and we're using B3 propagation, but chose not to emit the same spanid because of the desire to actually evaluate the server's understanding of the timing separate from the client.

Since no one has really spoken up to the value of supporting shared span IDs, I guess it's not worth tackling or changing spec to enable yet. In the meantime though, I'll be filing a PR to make the need for spawning a new span id on servers explicit in the spec.

toumorokoshi added a commit to toumorokoshi/opentelemetry-specification that referenced this issue Nov 30, 2019
Fixes open-telemetry#359. Standardizing the behavior of the creation of server span
ids helps ensure server behavior will be the same regardless of the
language used.

In practice, the negative consequences of SpanID being different for
client and server in storage systems that expect it are minimal,
at worst appearing as separate spans.
@Oberon00 Oberon00 added area:api Cross language API specification issue spec:trace Related to the specification/trace directory area:span-relationships Related to span relationships labels Sep 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:api Cross language API specification issue area:span-relationships Related to span relationships spec:trace Related to the specification/trace directory
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants