Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regressions in System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock #80447

Closed
performanceautofiler bot opened this issue Jan 10, 2023 · 12 comments · Fixed by #86046
Closed
Assignees
Milestone

Comments

@performanceautofiler
Copy link

performanceautofiler bot commented Jan 10, 2023

Run Information

Architecture x64
OS Windows 10.0.18362
Baseline 83ce1253cf71348776fec4464e3fad1277fac7f0
Compare 5a10aa6a3fa43d9d0e30e48ae22f2767a358af41
Diff Diff

Regressions in System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
Count - Duration of single invocation 4.25 ms 4.94 ms 1.16 0.00 False

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock*'

Payloads

Baseline
Compare

Histogram

System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "[a-zA-Z]+ing", Options: Compiled)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 4.944040256410257 > 4.470104023437501.
IsChangePoint: Marked as a change because one of 1/5/2023 12:24:13 PM, 1/10/2023 1:28:02 AM falls between 1/1/2023 3:57:25 AM and 1/10/2023 1:28:02 AM.
IsRegressionStdDev: Marked as regression because -46.744375261168585 (T) = (0 -4768527.395233576) / Math.Sqrt((253083086.14412147 / (27)) + (2248016736.454753 / (20))) is less than -2.0141033888794695 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (27) + (20) - 2, .025) and -0.12129495544576763 = (4252696.734319884 - 4768527.395233576) / 4252696.734319884 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@performanceautofiler performanceautofiler bot added the untriaged New issue has not been triaged by the area owner label Jan 10, 2023
@EgorBo EgorBo changed the title [Perf] Windows/x64: 3 Regressions on 1/5/2023 3:38:13 PM Regressions in System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock Jan 10, 2023
@EgorBo EgorBo removed the untriaged New issue has not been triaged by the area owner label Jan 10, 2023
@EgorBo EgorBo transferred this issue from dotnet/perf-autofiling-issues Jan 10, 2023
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jan 10, 2023
@EgorBo
Copy link
Member

EgorBo commented Jan 10, 2023

Regressed via #78927 cc @MihaZupan

@ghost
Copy link

ghost commented Jan 10, 2023

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details

Run Information

Architecture x64
OS Windows 10.0.18362
Baseline 83ce1253cf71348776fec4464e3fad1277fac7f0
Compare 5a10aa6a3fa43d9d0e30e48ae22f2767a358af41
Diff Diff

Regressions in System.Buffers.Tests.ReadOnlySequenceTests<Char>

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
IterateGetPositionTenSegments - Duration of single invocation 88.64 ns 94.53 ns 1.07 0.07 False

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Buffers.Tests.ReadOnlySequenceTests&lt;Char&gt;*'

Payloads

Baseline
Compare

Histogram

System.Buffers.Tests.ReadOnlySequenceTests<Char>.IterateGetPositionTenSegments


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 94.52713939332814 > 93.02948390094744.
IsChangePoint: Marked as a change because one of 12/6/2022 6:34:01 PM, 12/17/2022 10:44:50 PM, 1/5/2023 12:24:13 PM, 1/10/2023 1:28:02 AM falls between 1/1/2023 3:57:25 AM and 1/10/2023 1:28:02 AM.
IsRegressionStdDev: Marked as regression because -17.118853775804876 (T) = (0 -95.47996438655115) / Math.Sqrt((0.6120732677830997 / (27)) + (2.4221451343567497 / (20))) is less than -2.0141033888794695 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (27) + (20) - 2, .025) and -0.07294293228394086 = (88.9888562696488 - 95.47996438655115) / 88.9888562696488 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

### Run Information
Architecture x64
OS Windows 10.0.18362
Baseline 83ce1253cf71348776fec4464e3fad1277fac7f0
Compare 5a10aa6a3fa43d9d0e30e48ae22f2767a358af41
Diff Diff

Regressions in System.Collections.ContainsKeyFalse<Int32, Int32>

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
IDictionary - Duration of single invocation 4.91 μs 6.06 μs 1.24 0.20 False

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Collections.ContainsKeyFalse&lt;Int32, Int32&gt;*'

Payloads

Baseline
Compare

Histogram

System.Collections.ContainsKeyFalse<Int32, Int32>.IDictionary(Size: 512)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 6.060088645548168 > 5.343659686397301.
IsChangePoint: Marked as a change because one of 11/18/2022 8:48:12 PM, 12/1/2022 7:29:59 PM, 12/20/2022 11:23:14 PM, 12/28/2022 5:13:50 PM, 1/5/2023 12:24:13 PM, 1/10/2023 1:28:02 AM falls between 1/1/2023 3:57:25 AM and 1/10/2023 1:28:02 AM.
IsRegressionStdDev: Marked as regression because -8.619112141642141 (T) = (0 -5912.352564087677) / Math.Sqrt((49704.028608457615 / (27)) + (105325.01627739171 / (20))) is less than -2.0141033888794695 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (27) + (20) - 2, .025) and -0.1401200372941473 = (5185.728143257173 - 5912.352564087677) / 5185.728143257173 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Run Information

Architecture x64
OS Windows 10.0.18362
Baseline 83ce1253cf71348776fec4464e3fad1277fac7f0
Compare 5a10aa6a3fa43d9d0e30e48ae22f2767a358af41
Diff Diff

Regressions in System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
Count - Duration of single invocation 4.25 ms 4.94 ms 1.16 0.00 False

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock*'

Payloads

Baseline
Compare

Histogram

System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "[a-zA-Z]+ing", Options: Compiled)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 4.944040256410257 > 4.470104023437501.
IsChangePoint: Marked as a change because one of 1/5/2023 12:24:13 PM, 1/10/2023 1:28:02 AM falls between 1/1/2023 3:57:25 AM and 1/10/2023 1:28:02 AM.
IsRegressionStdDev: Marked as regression because -46.744375261168585 (T) = (0 -4768527.395233576) / Math.Sqrt((253083086.14412147 / (27)) + (2248016736.454753 / (20))) is less than -2.0141033888794695 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (27) + (20) - 2, .025) and -0.12129495544576763 = (4252696.734319884 - 4768527.395233576) / 4252696.734319884 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Author: performanceautofiler[bot]
Assignees: dakersnar
Labels:

area-System.Text.RegularExpressions, os-windows, tenet-performance, tenet-performance-benchmarks, arch-x64, untriaged

Milestone: -

@EgorBo EgorBo assigned MihaZupan and unassigned dakersnar Jan 10, 2023
@EgorBo
Copy link
Member

EgorBo commented Jan 10, 2023

Looks like that PR introduced a lot of improvements dotnet/perf-autofiling-issues#11463 so feel free to close this issue if this single regression can be ignored

@stephentoub
Copy link
Member

It'd be good at least to understand why it regressed.

@MihaZupan
Copy link
Member

MihaZupan commented Jan 10, 2023

Pattern: [a-zA-Z]+ing

The diff (for the source generator at least) looks like

ReadOnlySpan<char> span = inputSpan.Slice(pos);
for (int i = 0; i < span.Length; i++)
{
    if (char.IsAsciiLetter(span[i]))
    {
        base.runtextpos = pos + i;
        return true;
    }
}
int iteration = 0;
while ((uint)iteration < (uint)slice.Length && char.IsAsciiLetter(slice[iteration]))
{
    iteration++;
}

turned into

int i = inputSpan.Slice(pos).IndexOfAny(Utilities.s_asciiLetters);
if (i >= 0)
{
    base.runtextpos = pos + i;
    return true;
}
int iteration = slice.IndexOfAnyExcept(Utilities.s_asciiLetters);
if (iteration < 0)
{
    iteration = slice.Length;
}

Looking at what the offsets look like for the sherlock input.
value: number of appearances

NextPositionIterationFrequency
Value of 'i' at https://gist.github.com/MihaZupan/79bf362f42b69c0500adc5879559b788#file-regex-cs-L92

0: 84301
1: 19250
2: 2761
6: 1524
5: 770
7: 233
3: 192
4: 148
8: 17
9: 14
10: 6
12: 5
13: 4
19: 1
43: 1
38: 1
22: 1
30: 1
27: 1
25: 1
14: 1
50: 1
29: 1
20: 1
15: 1
23: 1
MatchAtCurrentIterationFrequency
Value of 'iteration' at https://gist.github.com/MihaZupan/79bf362f42b69c0500adc5879559b788#file-regex-cs-L123

3: 24737
4: 21127
2: 20633
5: 11621
6: 8801
1: 6501
7: 6416
8: 4118
9: 2581
10: 1468
11: 646
12: 354
13: 170
14: 52
15: 9
17: 2
18: 1
16: 1

The makeshift test code to generate the numbers above: gist.

So in the vast majority of cases, we're seeing matches very close to the start of the input, so the overhead of calling into the vectorized path shows up. And with the compiled version the overhead of calling into IndexOfAnyValues is higher than the source-gened version (the instances aren't static readonlys). I haven't run it through a profiler but it seems plausible given the number of matches at position 0.

(by "match" I just mean the loop exited, not an actual Regex match)

@stephentoub
Copy link
Member

stephentoub commented Jan 11, 2023

Thanks for the investigation.

And with the compiled version the overhead of calling into IndexOfAnyValues is higher than the source-gened version (the instances aren't static readonlys)

Hmm, that's an issue I hadn't previously noticed. So there's no devirtualization/inlining happening as part of the use of IndexOfAnyValues with Compiled? We may need to rethink that part.

@MihaZupan
Copy link
Member

If I change the compiled version to use a static readonly in this case, it's ~5% faster, bringing it in line with the source-gened version.

|          Method |     Mean |     Error |    StdDev |
|---------------- |---------:|----------:|----------:|
|  Count_Compiled | 3.463 ms | 0.0154 ms | 0.0144 ms | // Main
|  Count_Compiled | 3.256 ms | 0.0132 ms | 0.0124 ms | // Static readonly
| Count_SourceGen | 3.249 ms | 0.0143 ms | 0.0134 ms |

The traces look about how you'd expect
image
vs.
image


As an experiment, I also added a fast-path to catch the offset == 0 case in TryFindNextPossibleStartingPosition before calling into IndexOfAnyValues.

Method Mean Error Ratio
Count_SourceGen 3.225 ms 0.0152 ms 1.00
Count_SourceGen_Modified 3.041 ms 0.0135 ms 0.94

@stephentoub
Copy link
Member

If I change the compiled version to use a static readonly in this case, it's ~5% faster, bringing it in line with the source-gened version.

What happens if you remove the readonly? I'm curious how much of the gap was due to lookup of the instance vs the devirtualization/inlining/etc. of the IndexOfAny implementation?

@MihaZupan
Copy link
Member

Method Toolchain Mean Error Ratio
Count_Compiled main 3.473 ms 0.0103 ms 1.00
Count_Compiled static 3.431 ms 0.0355 ms 0.99
Count_Compiled static-readonly 3.213 ms 0.0107 ms 0.93

@stephentoub
Copy link
Member

stephentoub commented Jan 12, 2023

Thanks, so it is primarily the devirtualization/inlining. I'm not currently sure what to do about that. I did consider a fast-path like you experimented with where we check the next character directly and only subsequently fall back to IndexOfAny, but that just kicks the can down the road: you still have the same issue if the character is the second one, and the more checks you add, the more expensive you make it for the longer cases. Maybe just checking one is a sweet spot, but we'd need to do a lot of perf validation.

@dakersnar
Copy link
Contributor

dakersnar commented Jan 12, 2023

Same regression on arm64: dotnet/perf-autofiling-issues#11579

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label May 10, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label May 15, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jun 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants