-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added distributed tracing support: activity and diagnostic listener. #6853
Conversation
src/Orleans.Core/Diagnostics/ActivityPropagationGrainCallFilter.cs
Outdated
Show resolved
Hide resolved
Have you looked at |
@alexvaluyskiy, I have seen Activity source, but it is not available in netcore 3. Also, if I understand correctly, activity source just simplify working with activities and diagnostic source and my current code do the same. |
It works in NetCore 3.1, you just should a reference to the latest version of ‘ System.Diagnostics.DiagnosticSource’ |
ActivitySource allows you to add a tracing to OpenTelemetry without any additional code |
@alexvaluyskiy, It is interesting, according to msdn it is available since net5, but also there is a nuget package. But Orleans uses version 4.7 of the |
Also, the main goal of this PR for me is propagating trace id over Orleans transport to support trace headers inside |
Thank you, this looks good to me and we can accept it with the change mentioned (DiagnosticSource/DiagnosticListener subclass). We can upgrade all references to .NET 5 libs as long as they are still compatible with .NET Standard 2.0. It will take some work (must edit Directory.Build.props in the root directory to upgrade the packages). This could also be a separate package, on OrleansContrib, for example. |
@ReubenBond please confirm, that you prefer subclass to a static field. |
I'm fine with a static field if the value will never need to participate in DI. A subclass may be cleaner. |
Thanks, I will come back in a few days. |
@ReubenBond done. |
@cijothomas could you please look at this implementation? |
Ack. I'm on vacation now, will be back by Jan 1st week and take a look. |
It is strongly advised to use |
Additionally, would recommend to use OpenTelemetry concept of Propagators to propagate context, instead of propagating manually. |
@ReubenBond could you please summarize, should I cancel this PR and wait 5.0 libs, reference OpenTelemetry or it is OK for current orleans version? |
@Ilchert I was under the impression that @cijothomas' comment needed addressing. |
hey @Ilchert -- this might help you kickstart the modifications to use what @cijothomas is suggesting: https://rehansaeed.com/deep-dive-into-open-telemetry-for-net/ |
@ReubenBond thanks, I will update this PR when orleans updates dependencies to 5.x version. I believe that framework for distributed application must support distributed tracing :) |
This is the link to official instrumentation doc: https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Api#instrumenting-a-libraryapplication-with-net-activity-api |
@cijothomas thanks! |
I agree, it's very important |
@Ilchert @cijothomas -- I would love to use this functionality with Orleans 3.4.x -- if I understand the conversation above, if I reference the OpenTelemetry nuget package(s), I should be able to get this working with minimal changes, right? |
@oising, you can just upgrade Diagnostic source package to 5.0. Open telemetry does not have stable package. |
It does have stable now. (Released last week) https://www.nuget.org/packages/opentelemetry.api |
Mostly you'd just need |
src/Orleans.Core/Diagnostics/ActivityPropagationGrainCallFilter.cs
Outdated
Show resolved
Hide resolved
src/Orleans.Core/Diagnostics/ActivityPropagationGrainCallFilter.cs
Outdated
Show resolved
Hide resolved
/// <returns>The builder.</returns> | ||
public static ISiloBuilder AddActivityPropagation(this ISiloBuilder builder) | ||
{ | ||
if (Activity.DefaultIdFormat != ActivityIdFormat.W3C) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am running into systems that have DefaultIdFormat
set to hierarchical, which causes this to throw. It appears that https://devblogs.microsoft.com/aspnet/improvements-in-net-core-3-0-for-troubleshooting-and-monitoring-distributed-apps/ and other posts such as https://jimmybogard.com/building-end-to-end-diagnostics-and-tracing-a-primer-trace-context/ are recommending
Activity.DefaultIdFormat = ActivityIdFormat.W3C;
Obviously the ClientBuilderExtension would need to change as well.
The systems in question are server2016 with dotnet3.1 runtime. dotnet5.0 is not installed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello, @amccool, we have this check because internal implementation of context uses some w3c specific features, like scope, tags, etc. So, to use this extension you must change default format into W3C.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please reconsider limiting to only W3C. Many systems rely on hierarchical IDs and it's set for entire app. Hierarchical IDs help with logs stored in DBs, since prefix matches use index. For example: startswith "|f728de91a402224da40046d1234e83c5.3b48459462d8e54e."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default DistributedContextPropagator
will handle both hierarchical and W3C ActivityId formats.
UPDATE Never mind... it turns out the use of GraphQL dataloaders and the way field resolvers are lazily evaluated is causing the chain to be broken |
4744702
to
c041e61
Compare
I really like this and it is a super cool feature. But I guess to also support stuff like serializers and other filters you would have to implement this deeper into the system. But can it not be merged and then improved later? |
cc @shirhatti |
Yes, it can. I think this PR is taking the right approach, too. |
// Create activity from context directly | ||
var traceParent = RequestContext.Get(TraceParentHeaderName) as string; | ||
var traceState = RequestContext.Get(TraceStateHeaderName) as string; | ||
var parentContext = new ActivityContext(); | ||
|
||
if (traceParent is not null) | ||
{ | ||
parentContext = ActivityContext.Parse(traceParent, traceState); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would replace this logic and rely on the configured DistributedContextPropagator
instead. I'm not familiar with Orleans, but in ASP.NET Core we use the propagator registered in DI and if not present, we fall back to the globally configured DistributedContextPropagator.Current
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, was it intentional not to propagate Baggage? Or was it an omission?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DistributedContextPropagator is NET 6.0 only. I'm not sure Orleans 4.0 should be constrained to that, unless there's plans to backport it to 5.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may try to reference the latest Diagnostic Source package https://www.nuget.org/packages/System.Diagnostics.DiagnosticSource/6.0.0-preview.7.21377.19 which should be supporting the propagators and should work in .NET 5.0 too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tarekgh Oh, that would be good if it works :)
if (currentActivity is not null && | ||
currentActivity.IdFormat == ActivityIdFormat.W3C) | ||
{ | ||
RequestContext.Set(TraceParentHeaderName, currentActivity.Id); | ||
if (currentActivity.TraceStateString is not null) | ||
RequestContext.Set(TraceStateHeaderName, currentActivity.TraceStateString); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned on the incoming filter, the outgoing filter should also use a propagator.
👀 @tarekgh |
We have a PR open to implement this, courtesy of @suraciii - comments welcome |
would be good to be able to have the option to filter out system grains, similar to the dashboard. |
@amccool I agree, wonder if it can be implemented by a sampler |
The PR to add DistributedContextPropagator: #7443 There's another choice is to manage activities along with |
I've opened a follow-up PR which changes some of the details here. Please take a look if you're using this and/or are interested: #7647 |
Added distributed tracing support: added grain filters for activity propagation.
#4992