Skip the mass transit test to see if it solves flake issues (#5861 -> v2) #5911

Merged
merged 1 commit into from
Aug 16, 2024

andrewlock

Summary of changes

Skip the MassTransit smoke test, as it appears to be the cause of much of the flakiness.

Reason for change

We've seen a lot of errors in the CheckBuildLogsForErr stage:

CheckBuildLogsForErr: 03:08:39 [Error] An error occurred while sending data to the agent at http://127.0.0.1:39573/v0.4/traces. If the error isn't transient, please check https://docs.datadoghq.com/tracing/troubleshooting/connection_errors/?code-lang=dotnet for guidance. System.Net.Http.HttpRequestException: Error while copying content to a stream.

These seemed to get a lot worse after we disabled keep-alive, but that's anecdotal.
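
As an illustration of what a log check like this has to catch, here is a minimal Python sketch that picks Error-level lines out of a build log. It assumes lines shaped like `HH:MM:SS [Level] message`, as in the excerpt above. The `find_error_lines` helper, the regular expression, and the Information-level sample line are all hypothetical and are not taken from the repository's build scripts.

```python
import re

# Hypothetical pattern: "HH:MM:SS [Level] message", modelled on the excerpt above.
LOG_LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) \[(?P<level>\w+)\] (?P<message>.*)$"
)


def find_error_lines(lines):
    """Return (time, message) pairs for every line logged at Error level."""
    errors = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "Error":
            errors.append((match.group("time"), match.group("message")))
    return errors


sample = [
    # Made-up non-error line, included only to show that it is ignored.
    "03:08:38 [Information] Sending traces to the agent",
    # Shortened form of the error quoted in this PR.
    "03:08:39 [Error] An error occurred while sending data to the agent at "
    "http://127.0.0.1:39573/v0.4/traces.",
]

errors = find_error_lines(sample)
for time, message in errors:
    print(time, message)
```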

Implementation details

It's not entirely clear whether the problem is only coincidentally related to the MassTransit test (i.e. it's a test-ordering issue) or whether it's actually something about the test itself.

As a check, I tried skipping the test in this branch and did 4 full (all TFM) integration test runs, and didn't see the issue again. It's all still anecdotal, but I'd rather trade off this test for less flakiness here. If the problem reappears, we can look into it further.
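
To put a rough number on how anecdotal 4 clean runs is, the sketch below computes the chance of seeing that many consecutive clean runs purely by luck. It assumes runs fail independently with a fixed per-run flake rate. The rates used are hypothetical, not values measured from CI.

```python
def prob_all_clean(flake_rate: float, runs: int) -> float:
    """Probability that `runs` independent runs all pass.

    `flake_rate` is the assumed (hypothetical) chance that a single
    full run hits the flaky failure.
    """
    if not 0.0 <= flake_rate <= 1.0:
        raise ValueError("flake_rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return (1.0 - flake_rate) ** runs


# Hypothetical per-run flake rates, for illustration only.
for rate in (0.1, 0.3, 0.5):
    print(f"flake rate {rate:.0%}: P(4 clean runs) = {prob_all_clean(rate, 4):.4f}")
```

With a hypothetical 30% per-run flake rate, four clean runs in a row would still happen about 24% of the time (0.7^4 = 0.2401), so the result is suggestive rather than conclusive.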

Test coverage

Did 4 full runs and didn't see the issue again.

Other details

Backport of #5861 (as we're still getting a lot of flakiness on the release/2.x branch)

@andrewlock added the area:builds (project files, build scripts, pipelines, versioning, releases, packages), area:tests (unit tests, integration tests), and area:test-apps (apps used to test integrations) labels Aug 16, 2024
@andrewlock requested a review from a team as a code owner August 16, 2024 07:12

datadog-ddstaging bot commented Aug 16, 2024

Datadog Report

Branch report: andrew/ci/masstransit-fix-backport
Commit report: ce7f3a8
Test service: dd-trace-dotnet

✅ 0 Failed, 353745 Passed, 1797 Skipped, 14h 35m 6.14s Total Time

@andrewlock

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution-time results for the PR are worse than the latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch's t-test with a significance level of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.
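
A minimal sketch of how the two thresholds above might combine, assuming the p-value has already been produced by a Welch's t-test elsewhere. The `exceeds_thresholds` function and its parameter names are hypothetical, and measuring the relative difference against the baseline mean is an assumption rather than something the report states.

```python
def exceeds_thresholds(
    base_mean_ms: float,
    pr_mean_ms: float,
    p_value: float,
    alpha: float = 0.05,       # significance level of 5%
    min_relative: float = 0.05,  # difference must be greater than 5%
    min_absolute_ms: float = 5.0,  # and greater than 5 ms
) -> bool:
    """Return True only if the difference is significant and large enough.

    Assumption: the relative difference is taken against the baseline mean.
    """
    difference = abs(pr_mean_ms - base_mean_ms)
    is_significant = p_value < alpha
    is_large_relative = difference > min_relative * base_mean_ms
    is_large_absolute = difference > min_absolute_ms
    return is_significant and is_large_relative and is_large_absolute


# Hypothetical inputs: 4 ms on a 50 ms baseline is 8%, but under 5 ms.
print(exceeds_thresholds(50.0, 54.0, p_value=0.001))
# Hypothetical inputs: 10 ms on a 100 ms baseline passes both size checks.
print(exceeds_thresholds(100.0, 110.0, p_value=0.001))
```

Requiring both a relative and an absolute margin keeps very fast samples from being flagged over a few milliseconds of noise.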

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).
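
The report does not spell out how the p99 interval is derived. One plausible reading, sketched below, is a central 99% interval under a normal approximation, i.e. mean ± z × StdDev with z ≈ 2.576. The `p99_interval` function and the input values are hypothetical, and the actual benchmark tooling may compute the interval differently.

```python
from statistics import NormalDist


def p99_interval(mean: float, std_dev: float) -> tuple[float, float]:
    """Central 99% interval for a normal distribution (an assumed model)."""
    if std_dev < 0:
        raise ValueError("std_dev must be non-negative")
    # Two-sided 99% coverage leaves 0.5% in each tail.
    z = NormalDist().inv_cdf(0.995)
    half_width = z * std_dev
    return mean - half_width, mean + half_width


# Hypothetical run: mean of 100 ms with a standard deviation of 10 ms.
low, high = p99_interval(100.0, 10.0)
print(f"{low:.2f} ms to {high:.2f} ms")
```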

Execution time (ms) for this PR (5911), by sample, runtime, and configuration:

| Sample | Runtime | Configuration | Mean (ms) | p99 interval (ms) |
| --- | --- | --- | --- | --- |
| FakeDbCommand | .NET Framework 4.6.2 | Baseline | 77 | 62 to 92 |
| FakeDbCommand | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | 1,072 | 1050 to 1093 |
| FakeDbCommand | .NET Core 3.1 | Baseline | 111 | 107 to 114 |
| FakeDbCommand | .NET Core 3.1 | CallTarget+Inlining+NGEN | 782 | 763 to 801 |
| FakeDbCommand | .NET 6 | Baseline | 94 | 92 to 97 |
| FakeDbCommand | .NET 6 | CallTarget+Inlining+NGEN | 723 | 705 to 742 |
| HttpMessageHandler | .NET Framework 4.6.2 | Baseline | 191 | 188 to 195 |
| HttpMessageHandler | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | 1,155 | 1128 to 1182 |
| HttpMessageHandler | .NET Core 3.1 | Baseline | 277 | 272 to 281 |
| HttpMessageHandler | .NET Core 3.1 | CallTarget+Inlining+NGEN | 947 | 926 to 969 |
| HttpMessageHandler | .NET 6 | Baseline | 265 | 260 to 270 |
| HttpMessageHandler | .NET 6 | CallTarget+Inlining+NGEN | 930 | 908 to 951 |

@andrewlock

Benchmarks Report for tracer 🐌

Benchmarks for #5911 compared to master:

  • 1 benchmark is faster, with geometric mean 1.128
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test with a significance level of 5%
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | StartStopWithChild | net6.0 | 7.87μs | 42.6ns | 319ns | 0.0162 | 0.00808 | 0 | 5.42 KB |
| master | StartStopWithChild | netcoreapp3.1 | 9.85μs | 54.6ns | 349ns | 0.0145 | 0.00966 | 0 | 5.62 KB |
| master | StartStopWithChild | net472 | 16μs | 39.9ns | 154ns | 1.03 | 0.318 | 0.0955 | 6.07 KB |
| #5911 | StartStopWithChild | net6.0 | 7.75μs | 43.5ns | 275ns | 0.0184 | 0.00737 | 0 | 5.43 KB |
| #5911 | StartStopWithChild | netcoreapp3.1 | 9.77μs | 53.7ns | 304ns | 0.0145 | 0.00482 | 0 | 5.62 KB |
| #5911 | StartStopWithChild | net472 | 16μs | 43.2ns | 167ns | 1.02 | 0.297 | 0.0939 | 6.06 KB |

Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | WriteAndFlushEnrichedTraces | net6.0 | 472μs | 288ns | 1.12μs | 0 | 0 | 0 | 2.7 KB |
| master | WriteAndFlushEnrichedTraces | netcoreapp3.1 | 632μs | 224ns | 838ns | 0 | 0 | 0 | 2.7 KB |
| master | WriteAndFlushEnrichedTraces | net472 | 855μs | 505ns | 1.89μs | 0.428 | 0 | 0 | 3.3 KB |
| #5911 | WriteAndFlushEnrichedTraces | net6.0 | 483μs | 397ns | 1.54μs | 0 | 0 | 0 | 2.7 KB |
| #5911 | WriteAndFlushEnrichedTraces | netcoreapp3.1 | 639μs | 277ns | 1.07μs | 0 | 0 | 0 | 2.7 KB |
| #5911 | WriteAndFlushEnrichedTraces | net472 | 827μs | 291ns | 1.09μs | 0.414 | 0 | 0 | 3.3 KB |

Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | SendRequest | net6.0 | 192μs | 1.02μs | 6.79μs | 0.184 | 0 | 0 | 18.45 KB |
| master | SendRequest | netcoreapp3.1 | 209μs | 1.18μs | 7.91μs | 0.207 | 0 | 0 | 20.61 KB |
| master | SendRequest | net472 | 0.000596ns | 0.000353ns | 0.00132ns | 0 | 0 | 0 | 0 b |
| #5911 | SendRequest | net6.0 | 188μs | 1.04μs | 6.55μs | 0.187 | 0 | 0 | 18.45 KB |
| #5911 | SendRequest | netcoreapp3.1 | 215μs | 1.23μs | 11.3μs | 0.218 | 0 | 0 | 20.61 KB |
| #5911 | SendRequest | net472 | 0.00107ns | 0.00058ns | 0.00209ns | 0 | 0 | 0 | 0 b |

Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | WriteAndFlushEnrichedTraces | net6.0 | 566μs | 1.87μs | 7.25μs | 0.561 | 0 | 0 | 41.73 KB |
| master | WriteAndFlushEnrichedTraces | netcoreapp3.1 | 664μs | 2.09μs | 7.82μs | 0.326 | 0 | 0 | 41.98 KB |
| master | WriteAndFlushEnrichedTraces | net472 | 855μs | 4.18μs | 17.7μs | 8.63 | 2.47 | 0.411 | 53.28 KB |
| #5911 | WriteAndFlushEnrichedTraces | net6.0 | 550μs | 2.68μs | 11.7μs | 0.553 | 0 | 0 | 41.64 KB |
| #5911 | WriteAndFlushEnrichedTraces | netcoreapp3.1 | 694μs | 3.03μs | 10.9μs | 0.349 | 0 | 0 | 41.94 KB |
| #5911 | WriteAndFlushEnrichedTraces | net472 | 844μs | 3.84μs | 14.9μs | 8.08 | 2.55 | 0.425 | 53.31 KB |

Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | ExecuteNonQuery | net6.0 | 1.17μs | 0.924ns | 3.58ns | 0.0143 | 0 | 0 | 1.02 KB |
| master | ExecuteNonQuery | netcoreapp3.1 | 1.66μs | 0.756ns | 2.73ns | 0.0133 | 0 | 0 | 1.02 KB |
| master | ExecuteNonQuery | net472 | 2.05μs | 1.8ns | 6.96ns | 0.157 | 0 | 0 | 987 B |
| #5911 | ExecuteNonQuery | net6.0 | 1.27μs | 1.37ns | 4.74ns | 0.0146 | 0 | 0 | 1.02 KB |
| #5911 | ExecuteNonQuery | netcoreapp3.1 | 1.8μs | 2.66ns | 9.97ns | 0.0133 | 0 | 0 | 1.02 KB |
| #5911 | ExecuteNonQuery | net472 | 2.04μs | 2.67ns | 10.4ns | 0.157 | 0 | 0 | 987 B |

Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | CallElasticsearch | net6.0 | 1.18μs | 0.457ns | 1.77ns | 0.0136 | 0 | 0 | 976 B |
| master | CallElasticsearch | netcoreapp3.1 | 1.47μs | 0.46ns | 1.72ns | 0.0133 | 0 | 0 | 976 B |
| master | CallElasticsearch | net472 | 2.51μs | 1.61ns | 6.23ns | 0.158 | 0 | 0 | 995 B |
| master | CallElasticsearchAsync | net6.0 | 1.21μs | 0.956ns | 3.7ns | 0.0133 | 0 | 0 | 952 B |
| master | CallElasticsearchAsync | netcoreapp3.1 | 1.59μs | 0.972ns | 3.64ns | 0.0136 | 0 | 0 | 1.02 KB |
| master | CallElasticsearchAsync | net472 | 2.65μs | 1.82ns | 7.06ns | 0.166 | 0 | 0 | 1.05 KB |
| #5911 | CallElasticsearch | net6.0 | 1.16μs | 1.28ns | 4.79ns | 0.0134 | 0 | 0 | 976 B |
| #5911 | CallElasticsearch | netcoreapp3.1 | 1.51μs | 2.72ns | 9.82ns | 0.0129 | 0 | 0 | 976 B |
| #5911 | CallElasticsearch | net472 | 2.41μs | 1.06ns | 3.98ns | 0.157 | 0.0012 | 0 | 995 B |
| #5911 | CallElasticsearchAsync | net6.0 | 1.32μs | 0.691ns | 2.68ns | 0.0132 | 0 | 0 | 952 B |
| #5911 | CallElasticsearchAsync | netcoreapp3.1 | 1.69μs | 0.823ns | 3.19ns | 0.0135 | 0 | 0 | 1.02 KB |
| #5911 | CallElasticsearchAsync | net472 | 2.69μs | 1.06ns | 4.11ns | 0.167 | 0 | 0 | 1.05 KB |

Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | ExecuteAsync | net6.0 | 1.17μs | 0.935ns | 3.5ns | 0.0134 | 0 | 0 | 952 B |
| master | ExecuteAsync | netcoreapp3.1 | 1.68μs | 0.805ns | 3.12ns | 0.0125 | 0 | 0 | 952 B |
| master | ExecuteAsync | net472 | 1.81μs | 1.03ns | 3.85ns | 0.145 | 0 | 0 | 915 B |
| #5911 | ExecuteAsync | net6.0 | 1.26μs | 1.01ns | 3.77ns | 0.0132 | 0 | 0 | 952 B |
| #5911 | ExecuteAsync | netcoreapp3.1 | 1.58μs | 0.394ns | 1.47ns | 0.0127 | 0 | 0 | 952 B |
| #5911 | ExecuteAsync | net472 | 1.82μs | 0.69ns | 2.67ns | 0.145 | 0 | 0 | 915 B |

Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | SendAsync | net6.0 | 4.08μs | 1.74ns | 6.5ns | 0.0305 | 0 | 0 | 2.22 KB |
| master | SendAsync | netcoreapp3.1 | 5.04μs | 2.85ns | 11.1ns | 0.038 | 0 | 0 | 2.76 KB |
| master | SendAsync | net472 | 7.78μs | 1.67ns | 6.24ns | 0.497 | 0 | 0 | 3.15 KB |
| #5911 | SendAsync | net6.0 | 4.03μs | 1.53ns | 5.94ns | 0.0303 | 0 | 0 | 2.22 KB |
| #5911 | SendAsync | netcoreapp3.1 | 5.15μs | 1.93ns | 7.22ns | 0.036 | 0 | 0 | 2.76 KB |
| #5911 | SendAsync | net472 | 7.82μs | 2.02ns | 7.8ns | 0.497 | 0 | 0 | 3.15 KB |

Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | EnrichedLog | net6.0 | 1.5μs | 0.663ns | 2.48ns | 0.0234 | 0 | 0 | 1.64 KB |
| master | EnrichedLog | netcoreapp3.1 | 2.12μs | 0.486ns | 1.75ns | 0.0221 | 0 | 0 | 1.64 KB |
| master | EnrichedLog | net472 | 2.71μs | 4.48ns | 17.4ns | 0.249 | 0 | 0 | 1.57 KB |
| #5911 | EnrichedLog | net6.0 | 1.55μs | 0.732ns | 2.74ns | 0.0226 | 0 | 0 | 1.64 KB |
| #5911 | EnrichedLog | netcoreapp3.1 | 2.24μs | 0.907ns | 3.39ns | 0.0224 | 0 | 0 | 1.64 KB |
| #5911 | EnrichedLog | net472 | 2.77μs | 2.28ns | 8.84ns | 0.249 | 0 | 0 | 1.57 KB |

Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | EnrichedLog | net6.0 | 114μs | 163ns | 630ns | 0.0575 | 0 | 0 | 4.28 KB |
| master | EnrichedLog | netcoreapp3.1 | 120μs | 146ns | 566ns | 0 | 0 | 0 | 4.28 KB |
| master | EnrichedLog | net472 | 149μs | 159ns | 615ns | 0.673 | 0.224 | 0 | 4.46 KB |
| #5911 | EnrichedLog | net6.0 | 117μs | 208ns | 779ns | 0.0576 | 0 | 0 | 4.28 KB |
| #5911 | EnrichedLog | netcoreapp3.1 | 119μs | 240ns | 931ns | 0 | 0 | 0 | 4.28 KB |
| #5911 | EnrichedLog | net472 | 147μs | 117ns | 422ns | 0.658 | 0.219 | 0 | 4.46 KB |

Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | EnrichedLog | net6.0 | 3.19μs | 0.898ns | 3.48ns | 0.0303 | 0 | 0 | 2.2 KB |
| master | EnrichedLog | netcoreapp3.1 | 4.34μs | 1.63ns | 6.3ns | 0.0298 | 0 | 0 | 2.2 KB |
| master | EnrichedLog | net472 | 4.83μs | 1.27ns | 4.75ns | 0.319 | 0 | 0 | 2.02 KB |
| #5911 | EnrichedLog | net6.0 | 3.02μs | 0.53ns | 2.05ns | 0.0302 | 0 | 0 | 2.2 KB |
| #5911 | EnrichedLog | netcoreapp3.1 | 4.14μs | 1.07ns | 3.86ns | 0.029 | 0 | 0 | 2.2 KB |
| #5911 | EnrichedLog | net472 | 5μs | 1.41ns | 5.45ns | 0.321 | 0 | 0 | 2.02 KB |

Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | SendReceive | net6.0 | 1.33μs | 0.654ns | 2.53ns | 0.016 | 0 | 0 | 1.14 KB |
| master | SendReceive | netcoreapp3.1 | 1.78μs | 1.33ns | 5.13ns | 0.0152 | 0 | 0 | 1.14 KB |
| master | SendReceive | net472 | 2.24μs | 2.28ns | 8.82ns | 0.183 | 0.00112 | 0 | 1.16 KB |
| #5911 | SendReceive | net6.0 | 1.34μs | 0.497ns | 1.86ns | 0.0162 | 0 | 0 | 1.14 KB |
| #5911 | SendReceive | netcoreapp3.1 | 1.75μs | 1.75ns | 6.57ns | 0.0156 | 0 | 0 | 1.14 KB |
| #5911 | SendReceive | net472 | 2.27μs | 2.06ns | 8ns | 0.183 | 0 | 0 | 1.16 KB |

Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | EnrichedLog | net6.0 | 2.79μs | 0.556ns | 2.08ns | 0.0222 | 0 | 0 | 1.6 KB |
| master | EnrichedLog | netcoreapp3.1 | 3.88μs | 1.29ns | 5ns | 0.0214 | 0 | 0 | 1.65 KB |
| master | EnrichedLog | net472 | 4.43μs | 1.19ns | 4.46ns | 0.323 | 0 | 0 | 2.04 KB |
| #5911 | EnrichedLog | net6.0 | 2.88μs | 0.83ns | 3.11ns | 0.0216 | 0 | 0 | 1.6 KB |
| #5911 | EnrichedLog | netcoreapp3.1 | 3.81μs | 3.52ns | 13.6ns | 0.0209 | 0 | 0 | 1.65 KB |
| #5911 | EnrichedLog | net472 | 4.31μs | 2.47ns | 9.57ns | 0.323 | 0 | 0 | 2.04 KB |

Benchmarks.Trace.SpanBenchmark - Faster 🎉 Same allocations ✔️

Faster 🎉 in #5911

| Benchmark | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
| --- | --- | --- | --- | --- |
| Benchmarks.Trace.SpanBenchmark.StartFinishScope‑net6.0 | 1.128 | 547.60 | 485.28 | |
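
As a worked check of the figures above, the following Python sketch recomputes the base/diff ratio from the two medians and applies the size thresholds quoted earlier (greater than 10% and 0.3 ns). The function names are hypothetical, the Mann–Whitney U significance test is deliberately left out, and measuring the relative difference against the base median is an assumption.

```python
from statistics import geometric_mean


def speed_ratio(base_median_ns: float, diff_median_ns: float) -> float:
    """Ratio of base to diff medians; values above 1 mean the diff is faster."""
    return base_median_ns / diff_median_ns


def is_large_enough(
    base_median_ns: float,
    diff_median_ns: float,
    min_relative: float = 0.10,  # greater than 10%
    min_absolute_ns: float = 0.3,  # and greater than 0.3 ns
) -> bool:
    """Size check only; the statistical significance test is not modelled."""
    delta = abs(base_median_ns - diff_median_ns)
    return delta > min_relative * base_median_ns and delta > min_absolute_ns


# Medians reported for StartFinishScope on net6.0.
ratio = speed_ratio(547.60, 485.28)
# With a single faster benchmark, the geometric mean is just that ratio.
summary = geometric_mean([ratio])

print(round(ratio, 3))
print(is_large_enough(547.60, 485.28))
```

The difference of 62.32 ns is above 10% of the base median (54.76 ns), which is consistent with this benchmark being the only one reported as faster.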

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | StartFinishSpan | net6.0 | 401ns | 0.224ns | 0.868ns | 0.00809 | 0 | 0 | 576 B |
| master | StartFinishSpan | netcoreapp3.1 | 567ns | 0.387ns | 1.5ns | 0.0079 | 0 | 0 | 576 B |
| master | StartFinishSpan | net472 | 598ns | 0.647ns | 2.5ns | 0.0916 | 0 | 0 | 578 B |
| master | StartFinishScope | net6.0 | 547ns | 0.227ns | 0.878ns | 0.00964 | 0 | 0 | 696 B |
| master | StartFinishScope | netcoreapp3.1 | 747ns | 0.654ns | 2.53ns | 0.0093 | 0 | 0 | 696 B |
| master | StartFinishScope | net472 | 889ns | 0.598ns | 2.32ns | 0.104 | 0 | 0 | 658 B |
| #5911 | StartFinishSpan | net6.0 | 399ns | 0.285ns | 1.07ns | 0.00802 | 0 | 0 | 576 B |
| #5911 | StartFinishSpan | netcoreapp3.1 | 554ns | 0.63ns | 2.27ns | 0.00768 | 0 | 0 | 576 B |
| #5911 | StartFinishSpan | net472 | 627ns | 0.515ns | 2ns | 0.0916 | 0 | 0 | 578 B |
| #5911 | StartFinishScope | net6.0 | 485ns | 0.385ns | 1.49ns | 0.00978 | 0 | 0 | 696 B |
| #5911 | StartFinishScope | netcoreapp3.1 | 702ns | 0.985ns | 3.81ns | 0.00919 | 0 | 0 | 696 B |
| #5911 | StartFinishScope | net472 | 835ns | 0.816ns | 3.16ns | 0.104 | 0 | 0 | 658 B |

Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| master | RunOnMethodBegin | net6.0 | 652ns | 0.672ns | 2.6ns | 0.00986 | 0 | 0 | 696 B |
| master | RunOnMethodBegin | netcoreapp3.1 | 929ns | 0.986ns | 3.82ns | 0.00924 | 0 | 0 | 696 B |
| master | RunOnMethodBegin | net472 | 1.06μs | 0.959ns | 3.71ns | 0.104 | 0 | 0 | 658 B |
| #5911 | RunOnMethodBegin | net6.0 | 589ns | 0.49ns | 1.9ns | 0.00971 | 0 | 0 | 696 B |
| #5911 | RunOnMethodBegin | netcoreapp3.1 | 900ns | 0.845ns | 3.27ns | 0.00947 | 0 | 0 | 696 B |
| #5911 | RunOnMethodBegin | net472 | 1.09μs | 1.38ns | 5.35ns | 0.104 | 0 | 0 | 658 B |

@andrewlock merged commit ed6c355 into release/2.x Aug 16, 2024
50 of 54 checks passed
@andrewlock deleted the andrew/ci/masstransit-fix-backport branch August 16, 2024 12:37
@github-actions bot added this to the vNext-v2 milestone Aug 16, 2024