Enable RegexOptions.RightToLeft and lookbehinds in compiler / source generator #66280
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions

Issue Details

For .NET 7 we rewrote RegexCompiler as we were writing the source generator, and in doing so we left out support for RegexOptions.RightToLeft as well as lookbehinds (which are implemented via RightToLeft). This adds support for both. I initially started adding support incrementally for individual constructs in lookbehinds, but from a testing perspective it made more sense to add it all at once, since all of the RightToLeft tests then also validate the constructs that appear in lookbehinds.
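For readers less familiar with the two features named in the description, here is a minimal sketch of what they do from the caller's side. The inputs, class name, and method names are made up for illustration and are not taken from the PR or its tests; RegexOptions.Compiled is used only because this PR concerns RegexCompiler.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class RightToLeftDemo
{
    // With RightToLeft, scanning starts at the end of the input,
    // so matches are reported from the last one to the first.
    public static string[] NumbersRightToLeft(string input) =>
        new Regex(@"\d+", RegexOptions.RightToLeft | RegexOptions.Compiled)
            .Matches(input)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();

    // Positive lookbehind: match digits only when a '$' sits immediately before them.
    public static string AmountAfterDollar(string input) =>
        new Regex(@"(?<=\$)\d+", RegexOptions.Compiled).Match(input).Value;

    static void Main()
    {
        Console.WriteLine(string.Join(",", NumbersRightToLeft("a1 b22 c333"))); // 333,22,1
        Console.WriteLine(AmountAfterDollar("10 apples cost $25"));             // 25
    }
}
```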
Just to make sure it's clear, with this our entire regex corpus is supported with both RegexCompiler and the source generator:
The only cases that still aren't supported are those that use the new RegexOptions.NonBacktracking or those with super duper deep nesting, which none of the expressions we see in real life get anywhere close to.
src/libraries/System.Text.RegularExpressions/gen/RegexGenerator.Emitter.cs
string expr = !rtl ?
    $"{sliceSpan}[{sliceStaticPos}]" :
    "inputSpan[pos - 1]";
NIT on "inputSpan": Should we consider saving this as a label instead, to avoid issues if for some reason we rename the parameter in the future?
We hard code "inputSpan" and "pos" everywhere already. You mean changing it everywhere? Or something specific to this use?
I'm not against creating a local for it, but not in this PR :)
No need to do it now, I only brought it up here since you modified this particular line but I understand this is not the only instance of it, so we can do this later instead if we feel it would be good.
Happy to extract it or see it extracted separately.
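As a rough illustration of what the suggested extraction could look like, the sketch below pulls the hard-coded identifier strings into named constants. EmitterNames, EmitterSketch, and CharAtExpr are hypothetical names invented for this sketch; the actual emitter hard-codes "inputSpan" and "pos" inline, as noted above, and this is not how the PR is written.

```csharp
using System;

static class EmitterNames
{
    // Hypothetical constants standing in for the identifiers the generated code uses.
    public const string InputSpan = "inputSpan";
    public const string Pos = "pos";
}

static class EmitterSketch
{
    // Builds the text of the C# expression that reads the character to examine:
    // the static slice position when matching left to right, or the character
    // just before the current position when matching right to left.
    public static string CharAtExpr(bool rtl, string sliceSpan, int sliceStaticPos) =>
        !rtl
            ? $"{sliceSpan}[{sliceStaticPos}]"
            : $"{EmitterNames.InputSpan}[{EmitterNames.Pos} - 1]";

    static void Main()
    {
        Console.WriteLine(CharAtExpr(false, "slice", 2)); // slice[2]
        Console.WriteLine(CharAtExpr(true, "slice", 2));  // inputSpan[pos - 1]
    }
}
```

With constants like these, renaming the parameter would mean changing one declaration rather than every string literal that mentions it.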
yield return (@"abc(?!XXX)\w+", "abcXXXdef", RegexOptions.None, 0, 9, false, string.Empty);

// Zero-width positive lookbehind assertion: Actual - "(\\w){6}(?<=XXX)def"
// Zero-width positive lookbehind assertion
NIT: The comment says positive lookbehind, but there also seem to be negative ones. Of course, don't reset CI just for this.
Ok, I'll fix the comment separately. Thanks.
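To make the positive/negative distinction in this nit concrete, here is a small sketch contrasting the two assertion kinds. The patterns are simplified and the inputs are assumptions chosen for illustration; they are not the cases from the test file under review.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookbehindDemo
{
    // Positive lookbehind: "def" matches only if "XXX" comes immediately before it.
    public static bool PrecededByXxx(string input) =>
        Regex.IsMatch(input, @"(?<=XXX)def", RegexOptions.Compiled);

    // Negative lookbehind: "def" matches only if "XXX" does NOT come immediately before it.
    public static bool NotPrecededByXxx(string input) =>
        Regex.IsMatch(input, @"(?<!XXX)def", RegexOptions.Compiled);

    static void Main()
    {
        Console.WriteLine(PrecededByXxx("abcXXXdef"));    // True
        Console.WriteLine(NotPrecededByXxx("abcXXXdef")); // False
        Console.WriteLine(PrecededByXxx("abcYYYdef"));    // False
        Console.WriteLine(NotPrecededByXxx("abcYYYdef")); // True
    }
}
```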
Improvement on win-x64 dotnet/perf-autofiling-issues#3994
Replaces #66127
Fixes #62345