-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DiagnosticsHandler doesn't support W3C trace context correctly with aspnetcore #31862
Comments
You shouldn’t need to do that, can you provide a repro project that shows the problem? |
I've now reproduced the issue using my sample solution on Windows 10 Enterprise and Mac OsX Catalina. |
I am experiencing the same issue as well. |
Me too. Using the hack solved the issue. |
@davidfowl If one of us wanted to make a PR to fix this issue, would we make the PR to master or is there a 3.1 procedure? What is the procedure? Also, would the fix just be to say "Is diagnostic listener enabled OR is .net core" instead of just checking if the listener is enabled? |
I think the code is working 'by design', but the feedback would then be that this current design didn't match your expectations and we can discuss if it is a good design or not. The mismatched assumptions appear to be something like this: Current design - Any code is allowed to create an Activity. If you want to log an outgoing Activity ID then you need to use a DiagnosticListener (or starting in 5.0 an ActivityListener) to note when the id changes and log the new id. Having a separate Activity wrap each outbound request has the benefit that if you make multiple requests the dependencies of each can be distinguished. At this point there is also a significant back-compat consideration, we probably wouldn't want to change the default behavior but we could make it easier to opt-in to some alternate behavior. One other concern is that HttpClient is one example of the problem, but any library that transmits data can (and probably will) have similar behavior. For example I assume gRPC library, Azure/AWS/GCP SDKs, and database clients would all be examples that create a new Activity sometime after the last ILogger message is emitted. A general purpose solution to this probably isn't to prevent all these libraries from creating an Activity, but instead to ensure that we can log a message when the Activity ID changes. Perhaps what would be useful is a built-in asp.net core logger that does that? As a shorter term workaround we could probably put together an example showing what any developer could add to their own app to log the Activity ID? Glad to hear thoughts on this and thanks for the feedback! |
I agree that the Activity system within the .NET context makes sense that anybody anywhere at anytime can create a "sub" Activity to embody a piece of work. However, the original poster and others who have "me too" responded have a VERY valid point and I've spent the better part of two days tracking this down. We have services that are logging trace-id, parent-id, and span-id so that we can do correlation from logs -- and when sub activities go UNLOGGED you suddenly have child spans that link to non-existent parents. That just doesn't work. There's two big issues at play for me here: First, the W3C standard expects that you have logging for every parent/child in the hierarchy. By default this logging is NOT working as the original poster has shown. Something has to be done so that we can trivially hook into and log EVERY Activity that is created in the flow of processing a request in order to prevent some-third-party-library from messing up our logging of this crucial information. Second, the work around provided exposes a really BAD "side effect" driven behavior of the DiagnosticHandler. In order to disable the creation of the Activity inside of the DiagnosticHandler you have to meet two conditions: First, you MUST have a registered listener on @gislikonrad bravo for figuring that out, any many thanks for your discovery! But, I want to go back to how this is using a REALLY BAD side effect driven behavior. Let's look at a few scenarios:
My primary complaint is that the first two scenarios should NOT be different -- that is violation of any sense of consistency. I appreciate the work around, but I'm concerned without a proper solution that we're all going to suffer massive failures if A) they ever fix this violation of consistency or B) any of us add some third-party library that registers a listener that handles that event (maybe a library for doing automatic telemetry)... As soon as we add that type of library into our app boom these unlogged Activities come back. The only way I have to work around that is to change the Null observer into one that actually handles the event and logs the new That's at least my findings after two days of hunting for this problem. I don't understand why this default behavior is acceptable in a world of W3C tracing, and I'm glad to find others hitting the same problem. |
I would also offer this as a more appropriate way of connecting the null listener: DiagnosticListener.AllListeners.Subscribe(delegate (DiagnosticListener listener)
{
if (listener.Name == "HttpHandlerDiagnosticListener")
{
listener.Subscribe(new NullObserver(), name => false);
}
}); You need the It also looks like you can just pass |
Thanks for the feedback @mikedoug!
I assume 'they' means Microsoft? 'They' = me in this case : ) Although I wasn't the original author I do have a large say in determining how the code evolves forward. I'm also in position to say that I (or Microsoft as a whole) has no plans to change that default behavior for exactly the reason you raised - consistent or no it is the behavior people have come to take a dependency on in their apps and we want to be respectful of that.
We have been working on improving this during .NET 5. Anyone who references the new version of System.Diagnostics.DiagnosticSource.dll (either by upgrading to .NET 5 or by referencing the updated NuGet package regardless of the .NET version you use) gains access to a new type called ActivityListener. The goal of this listener is to make it simple and consistent to receive notifications when Activities start and stop. For example you could write: using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;
class Program
{
static async Task Main(string[] args)
{
ActivitySource.AddActivityListener(new ActivityListener()
{
// This controls which ActivitySources we want to listen to. We'd like all of them
ShouldListenTo = (activitySource) => true,
// These callbacks get invoked whenever an activity created by a source we are listening
// to starts or stops
ActivityStarted = (activity) => Console.WriteLine($"Start: {activity.OperationName,-30} {activity.Id}"),
ActivityStopped = (activity) => Console.WriteLine($"Stop: {activity.OperationName,-30} {activity.Id}")
});
Activity a = new Activity("Example");
a.Start();
HttpClient client = new HttpClient();
await client.GetAsync("http://bing.com");
a.Stop();
}
} Example output:
What do you think of this and is there more/different stuff you'd like to see to improve the story here? |
Also while not a direct solution to this problem there is some related work in .NET to Add Activity Ids and Context to all logs |
cc @tarekgh |
@noahfalk Yes -- "they" was referring to Microsoft or whomever wrote that original code. For me it is nonsensical. I am REALLY happy to hear that .NET 5 is going to give us the ActivityListener that will allow us to follow ANY Activity regardless of its source; that's what I have been searching for to help answer this issue of mine. Also, really glad to hear that you're working on adding all of that stuff into logs. Right now it is rather clunky and the biggest issue I have (at least with Serilog and it's That said, we really do need a mechanism to configure it so that the DiagnosticHandler skips the creation of a new Action so that we can stop using this hacky work around. That way, if we want to opt out of a sub-Action for every outgoing HTTP call (which is a legitimate desire) we can do so in a way that won't be broken by 3rd party libraries who might add a listener for the events and inadvertently enable it. The behavior bothers me because NONE of the documentation I could find (Microsoft or otherwise) made me expect to find the core libraries doing this. Everybody talks about simply service A having a spanId and when it calls service B the parentId will be the spanId from service A -- it's always kept simple. I spent a lot of time thinking Refit was doing it, and hunting through that was fun. Then, not finding anything there, I was hunting through Serilog trying to figure out why it would be doing something. It was a surprise to me to find this behavior as the default behavior in the core of the HttpClient (through the DiagnosticsHandler) of the language. I'll be less bothered if an option to explicitly disable it is provided, and the documentation shored up slightly to make this "additional Activity" more clear. Thanks for the feedback as well! |
Excellent! That is what we were hoping for!
Yep, that feature I referenced logs the Activity which is current on the thread at the time the log message is being emitted : )
Thanks, this is great feedback. I've captured your requests here and here. If you want to add any more details about what you are looking for feel free to add or fix anything I wrote there. I'll leave this issue open in case there is more feedback, but for now I am assuming that the work already in flight + the new issues I linked above covers everything that needs to happen. Thanks again! |
Closing this issue as there been no further feedback. Issues 41072 and docs/20195 remain open to track adding an opt-out mechanism and better docs respectively. |
I have two aspnetcore applications using the W3C activity id format. When I have one call the other, the tracing context in the logs is missing a link in the chain.
Expected behavior
Trace id is the same in the logs of both services.
Parent id in the logs for the second service matches the span id in the logs of the first service.
Observed behavior
Trace id is the same in the logs of both services.
Parent id in the logs for the second service does not match the span id in the logs of the first service.
This is due to what is happening in the DiagnosticsHandler.
On line 52, the DiagnosticListener (s_diagnosticListener) is queried for whether it's enabled. This is always false for .net core. In this statement, a new activity is started and not logged. The part of the method that we want to run (line 95-99) is never run.
I have currently hacked this to work in my project with the following code
The text was updated successfully, but these errors were encountered: