-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch .NET tracer to injecting both base64 & binary headers #6257
base: master
Are you sure you want to change the base?
Conversation
Datadog ReportBranch report: ❌ 1 Failed (0 Known Flaky), 451220 Passed, 2732 Skipped, 18h 26m 13.73s Total Time ❌ Failed Tests (1)
|
Execution-Time Benchmarks Report ⏱️Execution-time results for samples comparing the following branches/commits: Execution-time benchmarks measure the whole time it takes to execute a program. And are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph). gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6257) - mean (72ms) : 64, 80
. : milestone, 72,
master - mean (71ms) : 65, 78
. : milestone, 71,
section CallTarget+Inlining+NGEN
This PR (6257) - mean (981ms) : 962, 1000
. : milestone, 981,
master - mean (981ms) : 960, 1003
. : milestone, 981,
gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6257) - mean (108ms) : 105, 110
. : milestone, 108,
master - mean (108ms) : 106, 110
. : milestone, 108,
section CallTarget+Inlining+NGEN
This PR (6257) - mean (680ms) : 660, 699
. : milestone, 680,
master - mean (683ms) : 665, 702
. : milestone, 683,
gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6257) - mean (92ms) : 89, 95
. : milestone, 92,
master - mean (92ms) : 89, 95
. : milestone, 92,
section CallTarget+Inlining+NGEN
This PR (6257) - mean (632ms) : 616, 649
. : milestone, 632,
master - mean (638ms) : 623, 653
. : milestone, 638,
gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6257) - mean (191ms) : 185, 196
. : milestone, 191,
master - mean (190ms) : 186, 194
. : milestone, 190,
section CallTarget+Inlining+NGEN
This PR (6257) - mean (1,103ms) : 1072, 1134
. : milestone, 1103,
master - mean (1,094ms) : 1072, 1115
. : milestone, 1094,
gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6257) - mean (276ms) : 272, 281
. : milestone, 276,
master - mean (276ms) : 272, 279
. : milestone, 276,
section CallTarget+Inlining+NGEN
This PR (6257) - mean (877ms) : 849, 905
. : milestone, 877,
master - mean (875ms) : 850, 900
. : milestone, 875,
gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6257) - mean (265ms) : 262, 269
. : milestone, 265,
master - mean (264ms) : 260, 268
. : milestone, 264,
section CallTarget+Inlining+NGEN
This PR (6257) - mean (855ms) : 823, 887
. : milestone, 855,
master - mean (854ms) : 825, 884
. : milestone, 854,
|
Throughput/Crank Report ⚡Throughput results for AspNetCoreSimpleController comparing the following branches/commits: Cases where throughput results for the PR are worse than latest master (5% drop or greater), results are shown in red. Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards! gantt
title Throughput Linux x64 (Total requests)
dateFormat X
axisFormat %s
section Baseline
This PR (6257) (11.168M) : 0, 11167674
master (11.247M) : 0, 11246955
benchmarks/2.9.0 (11.033M) : 0, 11032866
section Automatic
This PR (6257) (7.257M) : 0, 7256823
master (7.407M) : 0, 7406927
benchmarks/2.9.0 (7.786M) : 0, 7785853
section Trace stats
master (7.695M) : 0, 7695476
section Manual
master (11.240M) : 0, 11240426
section Manual + Automatic
This PR (6257) (6.700M) : 0, 6700131
master (6.871M) : 0, 6870622
section DD_TRACE_ENABLED=0
master (10.206M) : 0, 10206195
gantt
title Throughput Linux arm64 (Total requests)
dateFormat X
axisFormat %s
section Baseline
This PR (6257) (9.559M) : 0, 9558760
master (9.563M) : 0, 9563014
benchmarks/2.9.0 (9.495M) : 0, 9494821
section Automatic
This PR (6257) (6.347M) : 0, 6346842
master (6.412M) : 0, 6411572
section Trace stats
master (6.703M) : 0, 6703390
section Manual
master (9.369M) : 0, 9369266
section Manual + Automatic
This PR (6257) (5.939M) : 0, 5939281
master (5.978M) : 0, 5977872
section DD_TRACE_ENABLED=0
master (8.996M) : 0, 8996087
gantt
title Throughput Windows x64 (Total requests)
dateFormat X
axisFormat %s
section Baseline
This PR (6257) (9.004M) : 0, 9004425
benchmarks/2.9.0 (10.020M) : 0, 10019592
section Automatic
This PR (6257) (6.012M) : 0, 6012109
benchmarks/2.9.0 (7.255M) : 0, 7255257
section Manual + Automatic
This PR (6257) (5.434M) : 0, 5433882
|
Benchmarks Report for tracer 🐌Benchmarks for #6257 compared to master:
The following thresholds were used for comparing the benchmark speeds:
Allocation changes below 0.5% are ignored. Benchmark detailsBenchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.SpanBenchmark - Slower
|
Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Benchmarks.Trace.SpanBenchmark.StartFinishScope‑net6.0 | 1.195 | 477.52 | 570.43 | |
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net6.0 | 1.170 | 399.48 | 467.49 |
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | StartFinishSpan |
net6.0 | 399ns | 0.725ns | 2.81ns | 0.00807 | 0 | 0 | 576 B |
master | StartFinishSpan |
netcoreapp3.1 | 590ns | 1.42ns | 5.51ns | 0.00775 | 0 | 0 | 576 B |
master | StartFinishSpan |
net472 | 659ns | 0.643ns | 2.32ns | 0.0916 | 0 | 0 | 578 B |
master | StartFinishScope |
net6.0 | 478ns | 0.619ns | 2.4ns | 0.00968 | 0 | 0 | 696 B |
master | StartFinishScope |
netcoreapp3.1 | 705ns | 1.06ns | 4.1ns | 0.0095 | 0 | 0 | 696 B |
master | StartFinishScope |
net472 | 844ns | 1.45ns | 5.62ns | 0.105 | 0 | 0 | 658 B |
#6257 | StartFinishSpan |
net6.0 | 467ns | 0.544ns | 2.11ns | 0.00815 | 0 | 0 | 576 B |
#6257 | StartFinishSpan |
netcoreapp3.1 | 561ns | 2.93ns | 14.9ns | 0.00781 | 0 | 0 | 576 B |
#6257 | StartFinishSpan |
net472 | 669ns | 1.35ns | 5.23ns | 0.0915 | 0 | 0 | 578 B |
#6257 | StartFinishScope |
net6.0 | 570ns | 0.721ns | 2.79ns | 0.0099 | 0 | 0 | 696 B |
#6257 | StartFinishScope |
netcoreapp3.1 | 707ns | 3.61ns | 16.5ns | 0.00951 | 0 | 0 | 696 B |
#6257 | StartFinishScope |
net472 | 877ns | 1.47ns | 5.71ns | 0.104 | 0 | 0 | 658 B |
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | RunOnMethodBegin |
net6.0 | 656ns | 0.758ns | 2.94ns | 0.00974 | 0 | 0 | 696 B |
master | RunOnMethodBegin |
netcoreapp3.1 | 903ns | 1.41ns | 5.45ns | 0.00941 | 0 | 0 | 696 B |
master | RunOnMethodBegin |
net472 | 1.06μs | 1.46ns | 5.66ns | 0.105 | 0 | 0 | 658 B |
#6257 | RunOnMethodBegin |
net6.0 | 670ns | 1.67ns | 6.48ns | 0.0097 | 0 | 0 | 696 B |
#6257 | RunOnMethodBegin |
netcoreapp3.1 | 992ns | 1.93ns | 7.47ns | 0.00901 | 0 | 0 | 696 B |
#6257 | RunOnMethodBegin |
net472 | 1.17μs | 2.8ns | 10.9ns | 0.105 | 0 | 0 | 658 B |
{ | ||
var encodedBytes = PathwayContextEncoder.Encode(context.SpanContext.PathwayContext.Value); | ||
var base64EncodedContext = Convert.ToBase64String(encodedBytes); | ||
sb.AppendFormat("\"{0}\":\"{1}\",", DataStreamsPropagationHeaders.PropagationKeyBase64, base64EncodedContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: don't use AppendFormat
it's much less performant (we can use $""
for the first one because it's embedding a constant so gets handled at compile time)
sb.AppendFormat("\"{0}\":\"{1}\",", DataStreamsPropagationHeaders.PropagationKeyBase64, base64EncodedContext); | |
sb.Append($"\"{DataStreamsPropagationHeaders.PropagationKeyBase64}\":\""); | |
.Append(base64EncodedContext) | |
.Append('"'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, replaced.
var base64EncodedContext = Convert.ToBase64String(encodedBytes); | ||
sb.AppendFormat("\"{0}\":\"{1}\",", DataStreamsPropagationHeaders.PropagationKeyBase64, base64EncodedContext); | ||
|
||
if (Tracer.Instance.Settings.IsDataStreamsLegacyHeadersEnabled) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this has been said elsewhere, but I think we should def try to pass the tracer in from externally, it's not great to have static access to a tracer instance here (apart from anything else, it's going to make your unit tests impossibly flaky 🙁 )
if (Tracer.Instance.Settings.IsDataStreamsLegacyHeadersEnabled) | ||
{ | ||
// Both PropagationKeyBase64 and PropagationKey use the Base64 encoded context | ||
sb.AppendFormat("\"{0}\":\"{1}\",", DataStreamsPropagationHeaders.PropagationKey, base64EncodedContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: don't use AppendFormat
sb.AppendFormat("\"{0}\":\"{1}\",", DataStreamsPropagationHeaders.PropagationKey, base64EncodedContext); | |
```suggestion | |
sb.Append($"\"{DataStreamsPropagationHeaders.PropagationKey}\":\""); | |
.Append(base64EncodedContext) | |
.Append('"'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, replaced.
tracer/test/Datadog.Trace.Tests/DataStreamsMonitoring/DataStreamsContextPropagatorTests.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I extracted the calls to Tracer.Instance
into additional overloads, so 🤞 the tests will pass now and not be flaky 🙂
sb.Append($"\"{DataStreamsPropagationHeaders.PropagationKeyBase64}\":\"") | ||
.Append(base64EncodedContext) | ||
.Append('"'); | ||
const string ddPathwayCtxBase64 = $"\"{DataStreamsPropagationHeaders.PropagationKeyBase64}\":\""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This declaration/extraction wasn't necessary, just proves (to me) that it's definitely a const
😄
Environment.SetEnvironmentVariable("DD_DATA_STREAMS_LEGACY_HEADERS", "false"); | ||
try | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this file are just removing the Environment
calls and try-finally
, and passing the boolean in directly
{ | ||
// Both PropagationKeyBase64 and PropagationKey use the Base64 encoded context | ||
const string ddPathwayCtx = $"\"{DataStreamsPropagationHeaders.PropagationKey}\":\""; | ||
sb.Append(ddPathwayCtx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a comma here, but for some reason adding it in doesn't seem to get picked up locally 😖
The content that this produces is "dd-pathway-ctx-base64":"kjZ85G4hViqe65Xh6mSe65Xh6mQ=""dd-pathway-ctx":"kjZ85G4hViqe65Xh6mSe65Xh6mQ=","dd-pathway-ctx-base64":"a2paODVHNGhWaXFlNjVYaDZtU2U2NVhoNm1RPQ=="
note the missing comma and duplicated key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking into this at the moment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the JSON injection done here
- It didn't properly handle
comma
values between the JSON so it produced Newtonsoft deserialization errors - It doesn't appear to be needed by my testing and was duplicating the headers.
And example of the JSONified headers now
{
"x-datadog-trace-id": "5758925048254194083",
"x-datadog-parent-id": "9282207551708089276",
"x-datadog-sampling-priority": "1",
"x-datadog-tags": "_dd.p.dm=-0,_dd.p.tid=6741191500000000",
"traceparent": "00-67411915000000004febcfd1bd4a61a3-80d10696cb024bbc-01",
"tracestate": "dd=s:1;p:80d10696cb024bbc;t.dm:-0",
"dd-pathway-ctx-base64": "R0lmbmJPT210ZUd3L2MvazZtU3cvYy9rNm1RPQ==",
"dd-pathway-ctx": "GIfnbOOmteGw/c/k6mSw/c/k6mQ="
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in be18958
|
||
// Internal for testing | ||
internal void Inject<TCarrier>(PathwayContext context, TCarrier headers, bool isDataStreamsLegacyHeadersEnabled) | ||
where TCarrier : IBinaryHeadersCollection | ||
{ | ||
if (headers is null) { ThrowHelper.ThrowArgumentNullException(nameof(headers)); } | ||
|
||
headers.Add(DataStreamsPropagationHeaders.PropagationKey, PathwayContextEncoder.Encode(context)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm this likely needs to be removed, testing locally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this, as it should be added below when checking isDataStreamsLegacyHeadersEnabled
.
However, isDataStreamsLegacyHeadersEnabled
was always false, so it appeared to be working as intended.
I fixed isDataStreamsLegacyHeadersEnabled
returning false in 47d92fe
Removed this line in f48d1dd
Reviewing the produced JSON it appears accurate to my eyes now
This appears to be handled already and doing it here just dupes the content. And it didn't add comma's between the JSON fields which ccause JSON deserialization errors. Additionally, it was re-encoding the Base64 again.
so that it isn't disabled.
This is added already below, but initially it wasn't obivous because the legacy data streams was always false.
Snapshots difference summaryThe following differences have been observed in committed snapshots. It is meant to help the reviewer. 8 occurrences of : - aws.queue.name: MyAsyncSQSQueue,
- aws.queue.url: http://localhost:00000/000000000000/MyAsyncSQSQueue,
+ aws.queue.name: MyAsyncSQSQueue2,
+ aws.queue.url: http://localhost:00000/000000000000/MyAsyncSQSQueue2,
2 occurrences of : - http.url: http://localhost:00000/000000000000/MyAsyncSQSQueue,
+ http.url: http://localhost:00000/000000000000/MyAsyncSQSQueue2,
[...]
- queuename: MyAsyncSQSQueue,
+ queuename: MyAsyncSQSQueue2,
2 occurrences of : - queuename: MyAsyncSQSQueue,
+ queuename: MyAsyncSQSQueue2,
2 occurrences of : - http.url: http://localhost:00000/000000000000/MyAsyncSQSQueue,
+ http.url: http://localhost:00000/000000000000/MyAsyncSQSQueue2,
[...]
- peer.service: MyAsyncSQSQueue,
- queuename: MyAsyncSQSQueue,
+ peer.service: MyAsyncSQSQueue2,
+ queuename: MyAsyncSQSQueue2,
2 occurrences of : - peer.service: MyAsyncSQSQueue,
- queuename: MyAsyncSQSQueue,
+ peer.service: MyAsyncSQSQueue2,
+ queuename: MyAsyncSQSQueue2,
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will follow up on to see if the tests all pass now, but they appear to locally from what I can tell.
I updated snapshots, but it looks like they were initially updated and so I just un-updated them to get them working again.
I'm not super comfortable approving this and merging now that I have altered the code to the extent that I did :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to have someone from DSM re-review it
tracer/src/Datadog.Trace/ClrProfiler/AutoInstrumentation/AWS/Shared/ContextPropagation.cs
Outdated
Show resolved
Hide resolved
@@ -138,6 +138,7 @@ internal ImmutableTracerSettings(TracerSettings settings, bool unusedParamNotToU | |||
IsActivityListenerEnabled = settings.IsActivityListenerEnabled; | |||
|
|||
_isDataStreamsMonitoringEnabled = settings.IsDataStreamsMonitoringEnabled; | |||
IsDataStreamsLegacyHeadersEnabled = settings.IsDataStreamsLegacyHeadersEnabled; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lack of adding this originally feels like it should have been caught by a test 🤔 Feels like we're either missing some generalised testing, or something in this PR?
@@ -370,18 +370,18 @@ | |||
Tags: { | |||
aws.agent: dotnet-aws-sdk, | |||
aws.operation: DeleteMessageBatch, | |||
aws.queue.name: MyAsyncSQSQueue, | |||
aws.queue.url: http://localhost:00000/000000000000/MyAsyncSQSQueue, | |||
aws.queue.name: MyAsyncSQSQueue2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand why these have all changed 😕
…hared/ContextPropagation.cs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we need to look into tests (fixing existing ones and adding new ones), so I'm changing my approval for now so this doesn't get merged by accident 😅
Summary of changes
Reason for change
Data Streams previously used binary encoding in Kafka headers. This was causing issues in cross language communication because of the difference in negative byte handling. That’s why we switched to base64 encoding.
Today, .NET is the only remaining tracer using binary encoding for Kafka, SQS & RabbitMQ.
This is causing 3 issues:
Also, byte headers are causing a crash of the .NET application (not reproduced yet).
Implementation details
DD_DATA_STREAMS_LEGACY_HEADERS
(default: true) is enabled for backward compatibility.Test coverage
I've added tests for the following cases:
Inject_WhenLegacyHeadersDisabled_DoesNotIncludeBinaryHeader
Extract_WhenBothHeadersPresent_PrefersBase64Header
InjectedHeaders_HaveCorrectFormat
InjectHeaders_WhenLegacyHeadersDisabled_DoesNotIncludeLegacyHeader
Inject_WhenLegacyHeadersEnabled_IncludesBothHeaders
Extract_WhenBase64HeaderIsMalformed_ReturnsFallbackToBinary
Other details