Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attributes added by sampler cannot be populated on activity #953

Closed
lmolkova opened this issue Jul 30, 2020 · 14 comments · Fixed by #1203
Closed

Attributes added by sampler cannot be populated on activity #953

lmolkova opened this issue Jul 30, 2020 · 14 comments · Fixed by #1203
Assignees
Labels
bug Something isn't working pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package

Comments

@lmolkova
Copy link

lmolkova commented Jul 30, 2020

[renamed]

Describe your environment. Describe any aspect of your environment relevant to the problem:
N/A

  • SDK version: (nuget versions of the all the relevant packages)
    0.4.0-beta

  • .NET runtime version (.NET or .NET Core, TargetFramework in the .csproj file):
    N/A (any)

  • Platform and OS version:
    any

Steps to reproduce.

  • configure probability sampling
  • create activities without a parent
  • random id will be generated
  • sampling decision will be made based on random trace id
  • trace id will not be set on activity

internal static ActivityDataRequest ComputeActivityDataRequest(
in ActivityCreationOptions<ActivityContext> options,
Sampler sampler)
{
var isRootSpan = options.Parent.TraceId == default;
// This is not going to be the final traceId of the Activity (if one is created), however, it is
// needed in order for the sampling to work. This differs from other OTel SDKs in which it is
// the Sampler always receives the actual traceId of a root span/activity.
ActivityTraceId traceId = !isRootSpan
? options.Parent.TraceId
: ActivityTraceId.CreateRandom();
var samplingParameters = new SamplingParameters(
options.Parent,
traceId,
options.Name,
options.Kind,
options.Tags,
options.Links);
var shouldSample = sampler.ShouldSample(samplingParameters);
if (shouldSample.IsSampled)
{
return ActivityDataRequest.AllDataAndRecorded;
}
// If it is the root span select PropagationData so the trace ID is preserved
// even if no activity of the trace is recorded (sampled per OpenTelemetry parlance).
return isRootSpan
? ActivityDataRequest.PropagationData
: ActivityDataRequest.None;
}
}

We need a way to pass back trace-id if it was created and attributes that were set by sampler to comply with the spec along with sampling decision.

What is the expected behavior?
Trace-id used to make sampling decision should be reused for consistent sampling across services. Attributes set by sampler should also be propagated back from sampler (see spec).

What is the actual behavior?

Trace id will not be set on the activity (i.e. sampling result will be independent of trace-id and non-deterministic). Old PR about probability sampling suggests we determinism
Assuming another service downstream has a different sampling configuration and would need to recalculate sampling hash - it will create inconsistent sampling decisions between upstream and downstream.

Thoughts? @tarekgh @cijothomas @reyang

@lmolkova lmolkova added the bug Something isn't working label Jul 30, 2020
@cijothomas
Copy link
Member

@lmolkova This is addressed in preview8 :)
I am making a PR to update this repo to preview8 and leverage the new feature to fix this using PregenerateNewRootId

https://github.com/dotnet/designs/blob/master/accepted/2020/diagnostics/activity-improvements-2.md#automatic-trace-id--generation-in-case-of-null-parent

@tarekgh
Copy link
Contributor

tarekgh commented Jul 30, 2020

it is called AutoGenerateRootContextTraceId

@lmolkova
Copy link
Author

lmolkova commented Jul 30, 2020

Makes sense, thanks for the update! Great that trace-id is addressed already!
It does not seem to support attributes scenario, correct?

ShouldSample

Return value:

It produces an output called SamplingResult which contains:

  • A sampling Decision. One of the following enum values:
    ...
  • A set of span Attributes that will also be added to the Span.

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#shouldsample

@cijothomas
Copy link
Member

Makes sense, thanks for the update! Great that trace-id is addressed already!
It does not seem to support attributes scenario, correct?

ShouldSample

Return value:

It produces an output called SamplingResult which contains:

  • A sampling Decision. One of the following enum values:
    ...
  • A set of span Attributes that will also be added to the Span.

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#shouldsample

Its tracked here: #941. It doesn't require change from .NET as this is SDK only detail.

@lmolkova
Copy link
Author

I mean there is no way to propagate attributes generated by sampler inGetRequestedDataUsingContext to Activity. Am I missing something?

@tarekgh
Copy link
Contributor

tarekgh commented Jul 30, 2020

You can listen to the Activity start in the listener and add the whatever attributes. But you are right, cannot directly propagate attributes inside GetRequestedDataUsingContext callback. I think the purpose of GetRequestedDataUsingContext is just passing the information can allow taking sampling decision and not for propagating more data the Activity.

@lmolkova
Copy link
Author

yeah, I can add attributes in Start event, but if sampler generates something (which is my scenario) - there would be no way to reflect them on the Activity. This scenario also seems to be part of OTel spec, so I wonder if we cautiously wanted to drop this scenario and if so why. Otherwise, I'd like to plan on fixing it.

@tarekgh
Copy link
Contributor

tarekgh commented Jul 30, 2020

This scenario is not encountered or requested when we discussed the design with all parties. could you please send a pointer from the specs regarding that?
My question now, do you see this scenario is a blocking or this can be workaround for now and we can consider supporting that in next release?

CC @noahfalk

@lmolkova
Copy link
Author

lmolkova commented Jul 30, 2020

I guess the scenario is somewhat tricky and not something that exists today at least in .NET so it was hard to expect it.

Scenario

The spec mentions a set of attributes that are returned in SamplingResult that should be populated on Span being created iff it's sampled in.

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#shouldsample

It's not crystal clear from the spec, whether it's the initial set of attributes that were used to create activity or sampler can modify them.

Sampling OTEP is more clear and clarifies that these attributes are an addition to the initial set:

It produces an output called SamplingResult that includes:

  • A SamplingDecision enum [NOT_RECORD, RECORD, RECORD_AND_PROPAGATE].
  • A set of span Attributes that will also be added to the Span.
    • These attributes will be added after the initial set of Attributes.
    • (under discussion in separate RFC) the SamplingRate float."

My scenario is actually the last one (one under discussion): sampler would populate its rate on the Activity and it will be used across multiple services to make consistent sampling decisions regardless of the initial algorithm used.


What's the plan

So my main concern is: let's say we will want to support this scenario in the next release. We'll need a breaking change for it (return more than ActivityDataRequest enum) or a new callback that returns the extendable result of sampling.

I propose to have a new struct that is returned so that we comply with OTel spec and have a room for at least some extensibility here (I'm bad with naming, that's just an example).

public readonly struct ActivitySamplingResult
{
    public readonly IEnumberable<string, object> Attributes;
    public readonly  ActivityDataRequest DataRequest;
}

And change GetRequestedDataUsingContext deletate to return ActivitySamplingResult.

I only see very hacky ways to solve it now (do this in http-in listeners), but it won't work with the ActivitySource listeners. It won't be even possible to hack it within user applications.

@lmolkova lmolkova changed the title Inconsistent sampling issue if parent is not set Attributes added by sampler cannot be populate on activity Jul 30, 2020
@lmolkova lmolkova changed the title Attributes added by sampler cannot be populate on activity Attributes added by sampler cannot be populated on activity Jul 30, 2020
@tarekgh
Copy link
Contributor

tarekgh commented Jul 30, 2020

Could you please open a new issue in our runtime repo to discuss it and look at the proposal.

@cijothomas
Copy link
Member

Thanks @lmolkova for sharing more details. Yes I agree this scenario is not handled. And I also think its not possible to work around in OTel SDK. We need to change the return type to return ActivityDataRequest , and attributes/tags.

@cijothomas
Copy link
Member

Update: This is being discussed with .NET runtime team. As fixing this requires changes from .NET.
Will update early next week with the conclusions.

@cijothomas
Copy link
Member

dotnet/runtime#40339 - Issue tracking this in .NET side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants