Attributes added by sampler cannot be populated on activity #953

lmolkova · 2020-07-30T17:04:16Z

[renamed]

Describe your environment. Describe any aspect of your environment relevant to the problem:
N/A

SDK version: (nuget versions of the all the relevant packages)
0.4.0-beta
.NET runtime version (.NET or .NET Core, TargetFramework in the .csproj file):
N/A (any)
Platform and OS version:
any

Steps to reproduce.

1. Configure probability sampling.
2. Create activities without a parent.
3. A random id will be generated.
4. The sampling decision will be made based on the random trace id.
5. The trace id will not be set on the activity.

opentelemetry-dotnet/src/OpenTelemetry/Sdk.cs

Lines 158 to 191 in 920b0ed

    
    internal static ActivityDataRequest ComputeActivityDataRequest(
        in ActivityCreationOptions<ActivityContext> options,
        Sampler sampler)
    {
        var isRootSpan = options.Parent.TraceId == default;

        // This is not going to be the final traceId of the Activity (if one is created), however, it is
        // needed in order for the sampling to work. This differs from other OTel SDKs in which it is
        // the Sampler always receives the actual traceId of a root span/activity.
        ActivityTraceId traceId = !isRootSpan
            ? options.Parent.TraceId
            : ActivityTraceId.CreateRandom();

        var samplingParameters = new SamplingParameters(
            options.Parent,
            traceId,
            options.Name,
            options.Kind,
            options.Tags,
            options.Links);

        var shouldSample = sampler.ShouldSample(samplingParameters);
        if (shouldSample.IsSampled)
        {
            return ActivityDataRequest.AllDataAndRecorded;
        }

        // If it is the root span select PropagationData so the trace ID is preserved
        // even if no activity of the trace is recorded (sampled per OpenTelemetry parlance).
        return isRootSpan
            ? ActivityDataRequest.PropagationData
            : ActivityDataRequest.None;
    }
}

To comply with the spec, we need a way to pass back, along with the sampling decision, the trace-id (if one was created) and the attributes that were set by the sampler.

What is the expected behavior?
The trace-id used to make the sampling decision should be reused on the activity, so that sampling is consistent across services. Attributes set by the sampler should also be propagated back from the sampler (see spec).

What is the actual behavior?

The trace id will not be set on the activity (i.e. the sampling result will be independent of the activity's trace-id and non-deterministic). An old PR about probability sampling suggests we want determinism.
Assuming another service downstream has a different sampling configuration and needs to recalculate the sampling hash, this will create inconsistent sampling decisions between upstream and downstream.
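
To make the determinism point concrete, here is a minimal sketch of a trace-id based probability decision. `TraceIdRatioSketch` and its `ShouldSample` helper are hypothetical names for illustration and are not the SDK's sampler; the only real APIs assumed are `ActivityTraceId.ToHexString()` and `Convert.ToUInt64(string, int)`.

```csharp
using System;
using System.Diagnostics;

public static class TraceIdRatioSketch
{
    // Hypothetical helper: maps the first 8 bytes (16 hex characters) of the
    // trace id to a position in [0, 1) and samples when that position falls
    // below the configured probability.
    public static bool ShouldSample(ActivityTraceId traceId, double probability)
    {
        string hex = traceId.ToHexString();
        ulong prefix = Convert.ToUInt64(hex.Substring(0, 16), 16);

        // Keep the top 53 bits so the value converts to double exactly.
        double position = (prefix >> 11) * (1.0 / (1UL << 53));
        return position < probability;
    }
}
```

The same trace id always yields the same decision for a given probability, so any service that sees the id can recompute it. If the decision is made on one random id and the activity then gets a different id, that property is lost.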

Thoughts? @tarekgh @cijothomas @reyang

cijothomas · 2020-07-30T17:20:21Z

@lmolkova This is addressed in preview8 :)
I am making a PR to update this repo to preview8 and leverage the new feature to fix this using PregenerateNewRootId

https://github.com/dotnet/designs/blob/master/accepted/2020/diagnostics/activity-improvements-2.md#automatic-trace-id--generation-in-case-of-null-parent

tarekgh · 2020-07-30T17:26:28Z

it is called AutoGenerateRootContextTraceId

lmolkova · 2020-07-30T18:14:22Z

Makes sense, thanks for the update! Great that trace-id is addressed already!
It does not seem to support attributes scenario, correct?

> ShouldSample
>
> Return value:
>
> It produces an output called SamplingResult which contains:
>
> A sampling Decision. One of the following enum values:
> ...
> A set of span Attributes that will also be added to the Span.

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#shouldsample

cijothomas · 2020-07-30T18:27:05Z

> Makes sense, thanks for the update! Great that trace-id is addressed already!
> It does not seem to support attributes scenario, correct?
>
> ShouldSample
>
> Return value:
>
> It produces an output called SamplingResult which contains:
>
> A sampling Decision. One of the following enum values:
> ...
>
> A set of span Attributes that will also be added to the Span.
>
> https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#shouldsample

It's tracked here: #941. It doesn't require a change from .NET, as this is an SDK-only detail.

lmolkova · 2020-07-30T18:29:08Z

I mean there is no way to propagate attributes generated by the sampler in GetRequestedDataUsingContext to the Activity. Am I missing something?

tarekgh · 2020-07-30T18:47:10Z

You can listen to the Activity start in the listener and add whatever attributes you need. But you are right, you cannot directly propagate attributes from inside the GetRequestedDataUsingContext callback. I think the purpose of GetRequestedDataUsingContext is just to pass the information needed to take the sampling decision, not to propagate more data to the Activity.
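
A minimal sketch of the start-event approach follows. It is written against the released `System.Diagnostics.ActivityListener` API, on the assumption that the released `Sample` callback and `ActivitySamplingResult` enum correspond to the preview-era `GetRequestedDataUsingContext` and `ActivityDataRequest` discussed here. `StartListenerSketch` and the `sketch.started` tag are hypothetical names.

```csharp
using System.Diagnostics;

public static class StartListenerSketch
{
    // Registers a listener that requests all data for one source and tags
    // every activity when it starts.
    public static ActivityListener Register(string sourceName)
    {
        var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == sourceName,

            // The sampling callback only returns a decision.
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
                ActivitySamplingResult.AllData,

            // The start callback can add tags, but it has no access to
            // anything computed inside the sampling callback above.
            ActivityStarted = activity => activity.SetTag("sketch.started", true),
        };

        ActivitySource.AddActivityListener(listener);
        return listener;
    }
}
```

This works for tags known independently of the sampler. It does not help when the value is produced by the sampling decision itself, because the two callbacks share no state.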

lmolkova · 2020-07-30T18:57:18Z

Yeah, I can add attributes in the Start event, but if the sampler generates something (which is my scenario), there would be no way to reflect it on the Activity. This scenario also seems to be part of the OTel spec, so I wonder if we consciously wanted to drop it and, if so, why. Otherwise, I'd like to plan on fixing it.

tarekgh · 2020-07-30T19:41:23Z

This scenario was not encountered or requested when we discussed the design with all parties. Could you please send a pointer to the spec regarding that?
My question now: do you see this scenario as blocking, or can it be worked around for now so that we can consider supporting it in the next release?

CC @noahfalk

lmolkova · 2020-07-30T21:01:55Z

I guess the scenario is somewhat tricky and not something that exists today at least in .NET so it was hard to expect it.

Scenario

The spec mentions a set of attributes that are returned in SamplingResult that should be populated on Span being created iff it's sampled in.

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#shouldsample

It's not crystal clear from the spec, whether it's the initial set of attributes that were used to create activity or sampler can modify them.

Sampling OTEP is more clear and clarifies that these attributes are an addition to the initial set:

> It produces an output called SamplingResult that includes:
>
> A SamplingDecision enum [NOT_RECORD, RECORD, RECORD_AND_PROPAGATE].
>
> A set of span Attributes that will also be added to the Span.
>
> These attributes will be added after the initial set of Attributes.
>
> (under discussion in separate RFC) the SamplingRate float.

My scenario is actually the last one (one under discussion): sampler would populate its rate on the Activity and it will be used across multiple services to make consistent sampling decisions regardless of the initial algorithm used.

What's the plan

So my main concern is: let's say we want to support this scenario in the next release. We'll need either a breaking change for it (returning more than the ActivityDataRequest enum) or a new callback that returns an extensible sampling result.

I propose to have a new struct that is returned, so that we comply with the OTel spec and have room for at least some extensibility here (I'm bad with naming, that's just an example).

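// requires: using System.Collections.Generic;
public readonly struct ActivitySamplingResult
{
    // Attributes produced by the sampler, to be added to the Activity.
    public readonly IEnumerable<KeyValuePair<string, object>> Attributes;
    public readonly ActivityDataRequest DataRequest;
}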

And change the GetRequestedDataUsingContext delegate to return ActivitySamplingResult.

I only see very hacky ways to solve it now (doing this in the http-in listeners), but that won't work with the ActivitySource listeners. It won't even be possible to hack it within user applications.
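
To show how such a result could be consumed, here is a self-contained sketch. Every type in it (`SketchDataRequest`, `SketchSamplingResult`, `SketchSamplingResultExtensions`) is hypothetical and stands in for the proposal above; none of these is an existing .NET or OTel API. The only real call assumed is `Activity.SetTag(string, object)`.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for the ActivityDataRequest enum discussed here.
public enum SketchDataRequest { None, PropagationData, AllData, AllDataAndRecorded }

// Hypothetical result type carrying the decision plus sampler attributes.
public readonly struct SketchSamplingResult
{
    private readonly IEnumerable<KeyValuePair<string, object>> attributes;

    public SketchSamplingResult(
        SketchDataRequest dataRequest,
        IEnumerable<KeyValuePair<string, object>> attributes)
    {
        this.DataRequest = dataRequest;
        this.attributes = attributes;
    }

    public SketchDataRequest DataRequest { get; }

    // Falls back to an empty set so a default(SketchSamplingResult) is safe.
    public IEnumerable<KeyValuePair<string, object>> Attributes =>
        this.attributes ?? Array.Empty<KeyValuePair<string, object>>();
}

public static class SketchSamplingResultExtensions
{
    // Copies sampler attributes onto the activity only when it is sampled in.
    public static void ApplyTo(this SketchSamplingResult result, Activity activity)
    {
        if (result.DataRequest != SketchDataRequest.AllData &&
            result.DataRequest != SketchDataRequest.AllDataAndRecorded)
        {
            return;
        }

        foreach (var attribute in result.Attributes)
        {
            activity.SetTag(attribute.Key, attribute.Value);
        }
    }
}
```

The point of the struct is that new fields (for example a sampling rate) can be added later without another change to the delegate signature.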

lmolkova · 2020-07-30T21:10:38Z

Here is how java does it:

https://github.com/open-telemetry/opentelemetry-java/blob/f7a52336516435d065370d83a6980893a9c8df23/sdk/src/main/java/io/opentelemetry/sdk/trace/SpanBuilderSdk.java#L247-L259

And go:

https://github.com/open-telemetry/opentelemetry-go/blob/5616fc55fc15f956f05f07d23d6febb1b195735e/sdk/trace/span.go#L240-L246

tarekgh · 2020-07-30T21:32:13Z

Could you please open a new issue in our runtime repo so we can discuss it and look at the proposal?

cijothomas · 2020-07-30T21:36:31Z

Thanks @lmolkova for sharing more details. Yes, I agree this scenario is not handled, and I also think it's not possible to work around it in the OTel SDK. We need to change the return type to return ActivityDataRequest plus attributes/tags.

cijothomas · 2020-07-31T03:38:23Z

Update: This is being discussed with the .NET runtime team, as fixing this requires changes from .NET.
Will update early next week with the conclusions.

cijothomas · 2020-08-04T22:17:44Z

dotnet/runtime#40339 - Issue tracking this in .NET side.
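
For readers on released versions of System.Diagnostics.DiagnosticSource: as far as I know, the sampling callback there can attach tags through `ActivityCreationOptions<T>.SamplingTags`, which avoids changing the callback's return type. The sketch below assumes that property and the released `Sample` / `ActivitySamplingResult` names; `SamplingTagsSketch` and the `sketch.sampling.rate` tag are hypothetical.

```csharp
using System.Diagnostics;

public static class SamplingTagsSketch
{
    // Registers a listener whose sampling callback records the rate it used
    // as a tag on the activity being created.
    public static ActivityListener Register(string sourceName, double rate)
    {
        var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == sourceName,
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
            {
                // Tags added here are copied to the Activity if one is created.
                options.SamplingTags.Add("sketch.sampling.rate", rate);
                return ActivitySamplingResult.AllDataAndRecorded;
            },
        };

        ActivitySource.AddActivityListener(listener);
        return listener;
    }
}
```

An activity started from the named source should then carry the tag without any start-event workaround.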

lmolkova added the bug Something isn't working label Jul 30, 2020

lmolkova changed the title ~~Inconsistent sampling issue if parent is not set~~ Attributes added by sampler cannot be populate on activity Jul 30, 2020

lmolkova changed the title ~~Attributes added by sampler cannot be populate on activity~~ Attributes added by sampler cannot be populated on activity Jul 30, 2020

reyang added pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package priority:p1 labels Jul 31, 2020

tarekgh mentioned this issue Aug 4, 2020

Add a way to the ActivityListeners to add more data to the Activity when it gets created dotnet/runtime#40339

Closed

This was referenced Aug 4, 2020

Fix Samplers to match spec. #941

Closed

Leverage ActivityListener.AutoGenerateRootContextTraceId #1007

Merged

cijothomas self-assigned this Aug 20, 2020

cijothomas added this to the 0.6.0-beta (Beta 3) milestone Aug 31, 2020

cijothomas mentioned this issue Aug 31, 2020

Update diagnosticsource to rc1 #1203

Merged

3 tasks

cijothomas closed this as completed in #1203 Sep 1, 2020

cijothomas mentioned this issue Sep 9, 2020

Sampler attributes must be part of activity for legacy instrumentation #1245

Closed
