Fixes for NonBacktracking NFA mode #72199

olsaarik · 2022-07-14T17:16:11Z

This PR fixes bugs uncovered by setting the DFA->NFA transition state limit to zero.

The main issue was that the non-capturing NFA execution mode lacked the logic for simulating backtracking. NFA transitions are now taken as follows: source states are handled in priority order, and after the transitions for a source state have been added to the set of next states, that source state is checked for nullability. If it is nullable, the backtracking engine would prefer to end the match there rather than explore lower-priority paths, so all remaining source states are skipped.
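The transition step described above can be sketched roughly as follows. This is a minimal Python illustration, not the C# code in this PR: `take_nfa_transitions`, `successors`, and `is_nullable` are hypothetical names, states are plain integers, and `successors` is assumed to already be specialized to the current input character.

```python
from typing import Callable, Iterable, List


def take_nfa_transitions(
    source_states: List[int],
    successors: Callable[[int], Iterable[int]],
    is_nullable: Callable[[int], bool],
    simulate_backtracking: bool = True,
) -> List[int]:
    """Build the next state list from source states ordered by priority."""
    next_states: List[int] = []
    seen = set()
    for source in source_states:  # highest priority first
        for target in successors(source):
            if target not in seen:  # keep only the highest-priority occurrence
                seen.add(target)
                next_states.append(target)
        # A backtracking engine would end the match at a nullable state
        # instead of exploring lower-priority paths, so stop here.
        if simulate_backtracking and is_nullable(source):
            break
    return next_states
```

With `simulate_backtracking=False` (the assumed stand-in for the reverse phase), every source state contributes its transitions and nothing is skipped.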

Backtracking simulation must be disabled in the second reverse phase, which is indicated by the presence of a DisableBacktrackingSimulation node at the top level. To cache that check, I added a new bit to the ContextIndependentState flags. That brought the number of booleans in the "state info" tuples to five, which started to feel unwieldy, so I refactored the code to return the underlying enum directly and added extension methods to keep its use concise. The enum is now called StateFlags.
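To illustrate the shape of that refactoring, here is a small Python sketch of a flags enum with helper functions standing in for the extension methods. The member and helper names are assumptions made for illustration; only the StateFlags name and the existence of five flags come from the description above.

```python
from enum import IntFlag


class StateFlags(IntFlag):
    # Member names are hypothetical; the description only says there are five flags.
    NONE = 0
    IS_INITIAL = 1
    IS_DEADEND = 2
    IS_NULLABLE = 4
    CAN_BE_NULLABLE = 8
    SIMULATES_BACKTRACKING = 16


# Helpers playing the role of the extension methods: each one tests a single bit,
# so callers pass one value around instead of a tuple of five booleans.
def is_deadend(flags: StateFlags) -> bool:
    return bool(flags & StateFlags.IS_DEADEND)


def is_nullable(flags: StateFlags) -> bool:
    return bool(flags & StateFlags.IS_NULLABLE)


def simulates_backtracking(flags: StateFlags) -> bool:
    return bool(flags & StateFlags.SIMULATES_BACKTRACKING)
```

A caller would compute the flags once per state and query them as needed, for example `is_nullable(flags) and simulates_backtracking(flags)`.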

Setting the DFA state limit to zero also uncovered an existing bug affecting both the DFA and NFA modes: the first matching phase was not actually limiting the input for the inner loop to 1000 characters at a time. The relevant unit test hid this because the pattern switched from DFA to NFA mode after ~10000 characters, and that switch also triggered a timeout check, which let the test pass. I'm guessing I introduced this in my latest refactoring of the matching loops. This is now fixed.
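The intended loop structure, where the inner loop sees at most 1000 characters before control returns to a timeout check, can be sketched like this. It is a Python illustration under stated assumptions, not the matcher's code: `scan_with_timeout`, `step`, and the injectable `clock` are hypothetical, and only the 1000-character chunk size comes from the description.

```python
import time

CHUNK_SIZE = 1000  # characters handed to the inner loop between timeout checks


def scan_with_timeout(text, step, state, timeout_seconds, clock=time.monotonic):
    """Run step() over text in bounded chunks, checking the deadline between chunks."""
    deadline = clock() + timeout_seconds
    pos = 0
    while pos < len(text):
        # Limit the inner loop's input so a long input cannot starve the check.
        end = min(pos + CHUNK_SIZE, len(text))
        for i in range(pos, end):
            state = step(state, text[i])
        pos = end
        if clock() > deadline:
            raise TimeoutError("match timed out")
    return state
```

The bug described above corresponds to `end` not being capped, so the inner loop would run over the whole remaining input and the deadline would never be consulted in between.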

I ran the benchmarks and there are no significant slowdowns. In fact, some tests show a double-digit percentage speedup, which, if I had to guess, is due to the StateFlags refactoring.

summary:
better: 40, geomean: 1.056
worse: 5, geomean: 1.042
total diff: 46

Slower	diff/base	Base Median (ns)	Diff Median (ns)
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: ".{0,2}(Tom\|Sawyer\|Huckleberry\|Finn)", Options: NonBacktracking)	1.06	49273537.50	52179975.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_SliceSlice.Count(Options: Compiled)	1.05	419165650.00	439475700.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 4, Options: NonBacktracking)	1.05	125.20	131.23
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 1, Options: None)	1.03	114.91	118.19
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: ".{0,2}(Tom\|Sawyer\|Huckleberry\|Finn)", Options: None)	1.03	3424692200.00	3517604600.00

Faster	base/diff	Base Median (ns)	Diff Median (ns)	Modality
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: ".{2,4}(Tom\|Sawyer\|Huckleberry\|Finn)", Options: NonBacktracking)	1.18	50448975.00	42751350.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "(?s).*", Options: NonBacktracking)	1.17	3470643.75	2969993.75
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "\w+\s+Holmes", Options: NonBacktracking)	1.14	2050385.29	1793558.39
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "\w+\s+Holmes\s+\w+", Options: NonBacktracking)	1.13	2026005.74	1798357.78
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: ".*", Options: NonBacktracking)	1.10	5641125.00	5124262.50	bimodal
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Count(Pattern: "[\w]+://[^/\\s?#]+[^\\s?#]+(?:\?[^\\s#])?(?:#[^\\s])?", Options: NonBacktracking)	1.09	3007115.88	2756977.72
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 2, Options: NonBacktracking)	1.09	133.60	122.82
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "[a-zA-Z]+ing", Options: NonBacktracking)	1.07	4532033.04	4224599.18
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Count(Pattern: "[\w\.+-]+@[\w\.-]+\.[\w\.-]+", Options: Compiled)	1.07	559869.61	522074.69
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "Sher[a-z]+\|Hol[a-z]+", Options: NonBacktracking)	1.07	127917.56	119873.66
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "\p{Ll}", Options: NonBacktracking)	1.06	29418175.00	27644175.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 0, Options: NonBacktracking)	1.06	80.42	75.65
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "[^\\n]*", Options: NonBacktracking)	1.06	5669218.48	5337582.98
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 6, Options: NonBacktracking)	1.06	91.26	86.06
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: "Twain", Options: NonBacktracking)	1.06	2462680.00	2323636.25
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "(?i)Sherlock", Options: NonBacktracking)	1.06	101281.09	95562.82
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "\b\w+n\b", Options: NonBacktracking)	1.05	5851841.86	5577358.89
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 2, Options: Compiled)	1.05	60.10	57.30
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "(?i)Sherlock Holmes", Options: NonBacktracking)	1.05	99986.30	95350.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "Holmes.{0,25}Watson\|Watson.{0,25}Holmes", Options: NonBacktracking)	1.05	111889.09	106935.70
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "(?i)Holmes", Options: NonBacktracking)	1.04	509017.50	489625.95
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Count(Pattern: "[\w\.+-]+@[\w\.-]+\.[\w\.-]+", Options: None)	1.04	586542.94	565490.07
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 3, Options: NonBacktracking)	1.04	170.55	164.45
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 8, Options: NonBacktracking)	1.04	88.83	85.74
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: "Twain", Options: Compiled)	1.04	2234738.54	2158540.63
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "Holmes", Options: NonBacktracking)	1.03	85098.89	82319.31
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: "[a-z]shing", Options: Compiled)	1.03	2331271.88	2257754.46
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: "[a-z]shing", Options: NonBacktracking)	1.03	2409907.50	2335412.50	several?
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: "Tom.{10,25}river\|river.{10,25}Tom", Options: NonBacktracking)	1.03	9022710.34	8750925.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 9, Options: NonBacktracking)	1.03	90.22	87.58
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Count(Pattern: "[\w]+://[^/\\s?#]+[^\\s?#]+(?:\?[^\\s#])?(?:#[^\\s])?", Options: Compiled)	1.03	1348814.58	1311024.22
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: "Huck[a-zA-Z]+\|Saw[a-zA-Z]+", Options: None)	1.03	4728007.81	4599918.75
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 13, Options: NonBacktracking)	1.03	113.54	110.51
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 1, Options: NonBacktracking)	1.03	477.27	464.64
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "Sherlock Holmes", Options: NonBacktracking)	1.03	50447.27	49124.54
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "Sherlock\|Holmes\|Watson", Options: NonBacktracking)	1.03	154421.26	150426.52
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "Sherlock\s+Holmes", Options: NonBacktracking)	1.02	55327.18	54023.55
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "[a-q][^u-z]{13}x", Options: NonBacktracking)	1.02	52648.28	51436.88
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 0, Options: Compiled)	1.02	35.72	34.92
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatch(Id: 7, Options: NonBacktracking)	1.02	82.40	80.62
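Note that the two tables use different ratios: "Slower" reports diff/base and "Faster" reports base/diff, so both are at least 1. A short Python check against the first row of each table, using hypothetical helper names, shows how the ratio column follows from the medians:

```python
def slower_ratio(base_median_ns: float, diff_median_ns: float) -> float:
    """diff/base: how many times slower the diff build is than the base build."""
    return diff_median_ns / base_median_ns


def faster_ratio(base_median_ns: float, diff_median_ns: float) -> float:
    """base/diff: how many times faster the diff build is than the base build."""
    return base_median_ns / diff_median_ns


# First "Slower" row: Leipzig ".{0,2}(Tom|Sawyer|Huckleberry|Finn)", NonBacktracking
print(round(slower_ratio(49273537.50, 52179975.00), 2))  # 1.06

# First "Faster" row: Leipzig ".{2,4}(Tom|Sawyer|Huckleberry|Finn)", NonBacktracking
print(round(faster_ratio(50448975.00, 42751350.00), 2))  # 1.18
```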

ghost · 2022-07-14T17:16:31Z

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	olsaarik
Assignees:	-
Labels:	`area-System.Text.RegularExpressions`
Milestone:	-

...ies/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/StateFlags.cs

....Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexMatcher.cs

olsaarik · 2022-07-18T20:02:17Z

Thanks for merging @stephentoub!

Performance after the last changes looks good. These numbers were measured on a different day than the baseline, so I wouldn't read too much into them, but things were actually faster after the fixes from review:

summary:
better: 62, geomean: 1.062
total diff: 62

No Slower results for the provided threshold = 1% and noise filter = 0.3 ns.

Faster	base/diff	Base Median (ns)	Diff Median (ns)	Modality
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.25	2558190.57	2050762.70
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.24	50448975.00	40529712.50
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.17	49273537.50	42085900.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.17	2026005.74	1733377.93
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.17	2050385.29	1754689.58
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.16	3470643.75	2987188.54
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.14	133.60	117.59
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.12	5669218.48	5068547.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Count(Pattern	1.11	3007115.88	2715355.06
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.11	154421.26	139541.72
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.10	2462680.00	2235627.50
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Count(Pattern	1.09	559869.61	512012.50
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.09	4532033.04	4145720.34
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.08	88.83	82.60
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.07	91.26	85.24
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.07	90.22	84.44
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.07	101281.09	94870.49
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.07	5851841.86	5483600.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.06	4728007.81	4466753.13
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.06	95.38	90.23
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.06	29418175.00	27849100.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.05	12263900.00	11668622.73	several?
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.05	96.96	92.38
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.05	1208215.87	1152697.54
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.05	127917.56	122068.16
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.05	2141091.50	20433158.33
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.05	85098.89	81316.58
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Count(Pattern	1.05	586542.94	561259.38
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.05	60.10	57.51
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.04	34.60	33.12
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.04	80.42	77.02
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.04	111889.09	107226.27
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.04	11298804.76	10836638.89
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.04	99986.30	95944.69
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Ctor(Pattern:	1.04	2628.73	2524.87
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Count(Pattern	1.04	1348814.58	1295922.60
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.04	82.40	79.18
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.04	52648.28	50610.60
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.04	50447.27	48517.62
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.04	59.18	56.99
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.04	633551.50	611131.97
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.04	113.54	109.63
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Count(Pattern	1.04	573463.89	553987.05
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Count(Pattern	1.03	2184305.36	2111239.06
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.03	2456756.70	2376029.46
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.03	136.35	132.05
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.03	2372175.45	2299385.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Ctor(Pattern:	1.03	19191.53	18627.51
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.03	509017.50	494399.41
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.03	74978950.00	72835575.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Mariomkas.Ctor(Pattern:	1.03	201457.96	195718.82
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.03	83.15	80.86
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_SliceSlice.Count(Option	1.03	2007518350.00	1953346950.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.03	2666647.50	2597185.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.02	662402400.00	646265700.00
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.02	51337.29	50201.75
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.02	265.11	259.29
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.02	227.19	222.43
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.02	110.69	108.52
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern:	1.02	12877521.88	12631209.38
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count	1.02	55327.18	54365.20
System.Text.RegularExpressions.Tests.Perf_Regex_Industry_BoostDocs_Simple.IsMatc	1.02	34.13	33.57

olsaarik added 3 commits July 13, 2022 16:14

Fix NFA mode backtracking simulation

f2d9d9f

Refactor to StateFlags

bb9c3e2

Fix bug in timeout check

774dd55

dotnet-issue-labeler bot added the area-System.Text.RegularExpressions label Jul 14, 2022

ghost assigned olsaarik Jul 14, 2022

olsaarik requested a review from joperezr July 14, 2022 17:16

olsaarik requested a review from stephentoub July 14, 2022 17:16

joperezr reviewed Jul 14, 2022

...ies/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/StateFlags.cs (outdated, resolved)

stephentoub approved these changes Jul 14, 2022

Changes from review

bd9eac2

stephentoub approved these changes Jul 14, 2022

joperezr approved these changes Jul 14, 2022

stephentoub merged commit c5759fa into dotnet:main Jul 15, 2022

olsaarik deleted the fix-nfa branch July 18, 2022 20:00

ghost locked as resolved and limited conversation to collaborators Aug 18, 2022
