[Perf][Mono] 50 % Regression System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue, other Queue Benchmarks #46684

naricc · 2021-01-07T17:22:24Z

50 % regression in System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue and 4 other concurrent data structure benchmarks.

Based on the time stamps and changes, it looks like the only plausible cause for this is: 8e6415d

@kouvel Could that change have caused this regression?

Although it could theoretically be anything else in this date range: 8e913a2...acd4855

Run Information

Architecture	x64
OS	ubuntu 18.04
Baseline	03e4f5e36410203a7ebab55ac177c166406ea1fe
Compare	be27e294a12514638bb5337c7fe5cc72d93a443a

Regressions in System.Collections.Concurrent.AddRemoveFromSameThreads

Benchmark	Baseline	Test	Test/Base	Baseline IR	Compare IR	IR Ratio	Baseline ETL	Compare ETL
ConcurrentQueue	80.34 ms	294.75 ms	3.67

Related Issue on x64 Windows

[Perf 54%] System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue

Related Issue on x86 Windows

[Perf 55%] System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.Concurrent.AddRemoveFromSameThreads<Int32>*'

.

Payloads

Baseline
Compare

Histogram

System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue(Size: 2000000)

[ 66866642.407 ;  96580088.564) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[ 96580088.564 ; 126293534.721) | 
[126293534.721 ; 145294481.863) | 
[145294481.863 ; 165677844.647) | @
[165677844.647 ; 195391290.803) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[195391290.803 ; 214722165.028) | @@@
[214722165.028 ; 244435611.185) | 
[244435611.185 ; 256983764.647) | 
[256983764.647 ; 282366221.572) | @@@@@@@@@
[282366221.572 ; 312079667.728) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[312079667.728 ; 331663785.078) | @

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Run Information

Architecture	x64
OS	ubuntu 18.04
Baseline	03e4f5e36410203a7ebab55ac177c166406ea1fe
Compare	be27e294a12514638bb5337c7fe5cc72d93a443a

Regressions in System.Buffers.Tests.RentReturnArrayPoolTests

Benchmark	Baseline	Test	Test/Base	Baseline IR	Compare IR	IR Ratio	Baseline ETL	Compare ETL
SingleParallel	1.31 μs	1.82 μs	1.39
ProducerConsumer	375.67 ns	718.61 ns	1.91

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Buffers.Tests.RentReturnArrayPoolTests<Byte>*'

.

Payloads

Baseline
Compare

Histogram

System.Buffers.Tests.RentReturnArrayPoolTests.SingleParallel(RentalSize: 4096, ManipulateArray: False, Async: True, UseSharedPool: True)

[1290.581 ; 1354.437) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[1354.437 ; 1438.053) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[1438.053 ; 1501.909) | 
[1501.909 ; 1565.765) | 
[1565.765 ; 1601.169) | 
[1601.169 ; 1669.893) | @@@
[1669.893 ; 1752.036) | @@@@@@@@@@@@@@@@@@@@
[1752.036 ; 1815.893) | @@@@@@@@@@@@@@@@@@@@@@@
[1815.893 ; 1900.578) | @@@@@@@@@@@@@@@@@@@@@@@@@@

System.Buffers.Tests.RentReturnArrayPoolTests.ProducerConsumer(RentalSize: 4096, ManipulateArray: False, Async: True, UseSharedPool: True)

[ 317.396 ;  345.744) | @
[ 345.744 ;  398.837) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[ 398.837 ;  447.892) | @@@@@
[ 447.892 ;  507.888) | @
[ 507.888 ;  560.981) | @@@@@@
[ 560.981 ;  583.242) | 
[ 583.242 ;  642.394) | @@@@@@@@@@@@@@@
[ 642.394 ;  695.487) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[ 695.487 ;  745.330) | @@@@@@@@@@@@@@
[ 745.330 ;  800.163) | @@@@@
[ 800.163 ;  853.256) | 
[ 853.256 ;  906.349) | 
[ 906.349 ;  959.443) | 
[ 959.443 ; 1012.536) | 
[1012.536 ; 1042.868) | 
[1042.868 ; 1095.961) | @@@@@@@

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Run Information

Architecture	x64
OS	ubuntu 18.04
Baseline	03e4f5e36410203a7ebab55ac177c166406ea1fe
Compare	be27e294a12514638bb5337c7fe5cc72d93a443a

Regressions in System.Collections.Concurrent.AddRemoveFromSameThreads

Benchmark	Baseline	Test	Test/Base	Baseline IR	Compare IR	IR Ratio	Baseline ETL	Compare ETL
ConcurrentQueue	89.44 ms	315.83 ms	3.53

Related Issue on x64 Windows

[Perf 55%] System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue

Related Issue on x86 Windows

[Perf 55%] System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.Concurrent.AddRemoveFromSameThreads<String>*'

.

Payloads

Baseline
Compare

Histogram

System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue(Size: 2000000)

[ 74979309.824 ; 107057845.804) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[107057845.804 ; 139136381.783) | 
[139136381.783 ; 162908924.360) | 
[162908924.360 ; 180526631.235) | @
[180526631.235 ; 212605167.215) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[212605167.215 ; 244683703.194) | 
[244683703.194 ; 279535952.931) | 
[279535952.931 ; 304010762.260) | @@@@@@
[304010762.260 ; 336089298.240) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[336089298.240 ; 359113695.090) | @@@

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Run Information

Architecture	x64
OS	ubuntu 18.04
Baseline	03e4f5e36410203a7ebab55ac177c166406ea1fe
Compare	be27e294a12514638bb5337c7fe5cc72d93a443a

Regressions in System.Threading.Tests.Perf_ThreadPool

Benchmark	Baseline	Test	Test/Base	Baseline IR	Compare IR	IR Ratio	Baseline ETL	Compare ETL
QueueUserWorkItem_WaitCallback_Throughput	2.04 secs	2.93 secs	1.43

Related Issue on x64 Windows

[Perf 25%] System.Threading.Tests.Perf_ThreadPool.QueueUserWorkItem_WaitCallback_Throughput

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Threading.Tests.Perf_ThreadPool*'

.

Payloads

Baseline
Compare

Histogram

System.Threading.Tests.Perf_ThreadPool.QueueUserWorkItem_WaitCallback_Throughput(WorkItemsPerCore: 20000000)

[1991644608.011 ; 2108683013.538) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[2108683013.538 ; 2225721419.064) | 
[2225721419.064 ; 2332261561.437) | 
[2332261561.437 ; 2449299966.963) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[2449299966.963 ; 2566338372.490) | 
[2566338372.490 ; 2654947608.828) | 
[2654947608.828 ; 2744420636.989) | @@@
[2744420636.989 ; 2861459042.516) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@
[2861459042.516 ; 3027483990.369) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[3027483990.369 ; 3133549085.991) | @@@@@@@

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

The text was updated successfully, but these errors were encountered:

ghost · 2021-01-07T17:22:26Z

Tagging subscribers to this area: @BrzVlad
See info in area-owners.md if you want to be subscribed.

Issue Details

50 % regression in System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue and 4 other concurrent data structure benchmarks.

Based on the time stamps and changes, it looks like the only plausible cause for this is: 8e6415d

@kouvel Could that change have caused this regression?

Although it could theoretically be anything else in this date range: 8e913a2...acd4855

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[2108683013.538 ; 2225721419.064) |
[2225721419.064 ; 2332261561.437) |
[2332261561.437 ; 2449299966.963) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[2449299966.963 ; 2566338372.490) |
[2566338372.490 ; 2654947608.828) |
[2654947608.828 ; 2744420636.989) | @@@
[2744420636.989 ; 2861459042.516) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@
[2861459042.516 ; 3027483990.369) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[3027483990.369 ; 3133549085.991) | @@@@@@@



### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

</details>

<table>
  <tr>
    <th align="left">Author:</th>
    <td>naricc</td>
  </tr>
  <tr>
    <th align="left">Assignees:</th>
    <td>-</td>
  </tr>
  <tr>
    <th align="left">Labels:</th>
    <td>

`area-Codegen-Interpreter-mono`, `tenet-performance`

</td>
  </tr>
  <tr>
    <th align="left">Milestone:</th>
    <td>-</td>
  </tr>
</table>
</details>

stephentoub · 2021-01-07T17:29:11Z

@naricc, was there a corresponding 50% improvement in the same benchmarks after #44265 went in?

DrewScoggins · 2021-01-07T18:28:15Z

No there was not. Here is the graph

You can see that we got a win, but then after we reverted the change we are worse than we were before.

kouvel · 2021-01-07T20:18:25Z

It's likely 8e6415d that caused the ConcurrentQueue regressions, that was expected. Based on what I saw before, the test is not very realistic in measuring contended performance and any spin-waiting in ConcurrentQueue.Enqueue/Dequeue would have the effect of turning this benchmark into a more single-threaded benchmark and its perf benefits from that. The large gain in the microbenchmark from #44265 also coincided with most of the large regressions in #45716. From the perf data in #46120 it seems clear that the change works well, even though it regresses this perf test.

Not sure about the other tests, will have to see.

DrewScoggins · 2021-01-08T00:29:51Z

I filed an issue with all the tests that we are seeing with this issue, it is here #46714. Mostly they regressed tests are in Concurrent* classes and we had a few in RentReturnArrayPool tests.

Based on what I saw before, the test is not very realistic in measuring contended performance and any spin-waiting in ConcurrentQueue.Enqueue/Dequeue would have the effect of turning this benchmark into a more single-threaded benchmark and its perf benefits from that. The large gain in the microbenchmark from #44265 also coincided with most of the large regressions in #45716. From the perf data in #46120 it seems clear that the change works well, even though it regresses this perf test.

Given this can we get a work item cut to get rid of these tests and put in place more realistic ones? Doesn't seem super useful to have around if when they regress we can't take any meaningful action.

kouvel · 2021-01-08T00:44:31Z

Yea it would probably be reasonable to remove those tests. The ASP.NET tests seem to be better and seem to sufficiently cover this type of change for ConcurrentQueue at least, but not sure about the other Concurrent* tests. It's not easy to create realistic concurrent tests esp. in BDN style tests, I typically don't use BDN for those types of tests.

kouvel · 2021-01-08T00:53:04Z

Actually it might be ok to keep the tests. They may not be good at assessing policy changes like in spin-waiting, but they could still identify regressions in pure overhead not involving policy changes. For a policy-type change we can just consider it a re-baseline since it's actually measuring something different from before.

kouvel · 2021-07-07T21:42:16Z

Closing, as these are one-off differences and we can re-baseline on the new numbers

naricc added tenet-performance Performance related issue area-Codegen-Interpreter-mono labels Jan 7, 2021

Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Jan 7, 2021

kouvel added area-System.Threading and removed area-Codegen-Interpreter-mono untriaged New issue has not been triaged by the area owner labels Jan 7, 2021

kouvel self-assigned this Jan 7, 2021

kouvel added this to the 6.0.0 milestone Jan 7, 2021

DrewScoggins mentioned this issue Jan 8, 2021

Regressions in RentReturn ArrayPool and ConcurrentQueue #46714

Closed

kouvel closed this as completed Jul 7, 2021

ghost locked as resolved and limited conversation to collaborators Aug 6, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Perf][Mono] 50 % Regression System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue, other Queue Benchmarks #46684

[Perf][Mono] 50 % Regression System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue, other Queue Benchmarks #46684

naricc commented Jan 7, 2021 •

edited

Loading

Payloads

Histogram

System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue(Size: 2000000)

Docs

Payloads

Histogram

System.Buffers.Tests.RentReturnArrayPoolTests.SingleParallel(RentalSize: 4096, ManipulateArray: False, Async: True, UseSharedPool: True)

System.Buffers.Tests.RentReturnArrayPoolTests.ProducerConsumer(RentalSize: 4096, ManipulateArray: False, Async: True, UseSharedPool: True)

Docs

Payloads

Histogram

System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue(Size: 2000000)

Docs

Payloads

Histogram

System.Threading.Tests.Perf_ThreadPool.QueueUserWorkItem_WaitCallback_Throughput(WorkItemsPerCore: 20000000)

Docs

ghost commented Jan 7, 2021

stephentoub commented Jan 7, 2021

DrewScoggins commented Jan 7, 2021

kouvel commented Jan 7, 2021 •

edited

Loading

DrewScoggins commented Jan 8, 2021

kouvel commented Jan 8, 2021

kouvel commented Jan 8, 2021

kouvel commented Jul 7, 2021

[Perf][Mono] 50 % Regression System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue, other Queue Benchmarks #46684

[Perf][Mono] 50 % Regression System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue, other Queue Benchmarks #46684

Comments

naricc commented Jan 7, 2021 • edited Loading

Run Information

Regressions in System.Collections.Concurrent.AddRemoveFromSameThreads

Related Issue on x64 Windows

Related Issue on x86 Windows

Repro

Payloads

Histogram

System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue(Size: 2000000)

Docs

Run Information

Regressions in System.Buffers.Tests.RentReturnArrayPoolTests

Related Issue on x64 Windows

Related Issue on x64 Windows

Related Issue on x86 Windows

Related Issue on x86 Windows

Repro

Payloads

Histogram

System.Buffers.Tests.RentReturnArrayPoolTests.SingleParallel(RentalSize: 4096, ManipulateArray: False, Async: True, UseSharedPool: True)

System.Buffers.Tests.RentReturnArrayPoolTests.ProducerConsumer(RentalSize: 4096, ManipulateArray: False, Async: True, UseSharedPool: True)

Docs

Run Information

Regressions in System.Collections.Concurrent.AddRemoveFromSameThreads

Related Issue on x64 Windows

Related Issue on x86 Windows

Repro

Payloads

Histogram

System.Collections.Concurrent.AddRemoveFromSameThreads.ConcurrentQueue(Size: 2000000)

Docs

Run Information

Regressions in System.Threading.Tests.Perf_ThreadPool

Related Issue on x64 Windows

Repro

Payloads

Histogram

System.Threading.Tests.Perf_ThreadPool.QueueUserWorkItem_WaitCallback_Throughput(WorkItemsPerCore: 20000000)

Docs

ghost commented Jan 7, 2021

stephentoub commented Jan 7, 2021

DrewScoggins commented Jan 7, 2021

kouvel commented Jan 7, 2021 • edited Loading

DrewScoggins commented Jan 8, 2021

kouvel commented Jan 8, 2021

kouvel commented Jan 8, 2021

kouvel commented Jul 7, 2021

naricc commented Jan 7, 2021 •

edited

Loading

kouvel commented Jan 7, 2021 •

edited

Loading