Regex.Match(string, int, int) throws an exception when using a pattern with '^' or '$' and the Multiline option #55557

Closed
strobelleder opened this issue Jul 13, 2021 · 2 comments · Fixed by #55574

When calling the Regex.Match overload that also takes a start index and a length into the input string, with a pattern containing '^' and '$' and the Multiline option, the method incorrectly throws an ArgumentOutOfRangeException if a match can be 'started' but cannot be completed.

string input   = "ABC\n";
Regex  pattern = new Regex(@"^A$", RegexOptions.Multiline);
Match m = pattern.Match(input, 0, 2); // <-- Argument out of range exception (Parameter 'start'), for lengths 2 or 3.

(My guess: the match can be 'started' at the 'A', but the '$' then matches neither an end of line nor the end of the text.)

Other combinations, such as the pattern '^A' (without '$'), also caused the exception (for different yet similar input strings, start indices and lengths), but we couldn't quite figure out when and why.
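To help narrow down which start/length combinations are affected, the report above can be turned into a small probe. This is only a sketch: the class name `MatchRangeProbe` and the helper `Probe` are hypothetical names introduced here, and the loop simply tries every length for the input from the report. On a runtime without the bug, each call should either match or report no match; on an affected runtime, some lengths are expected to report the exception instead.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchRangeProbe
{
    // Hypothetical helper: runs Regex.Match(input, beginning, length) and
    // describes the outcome as text, so runtimes can be compared side by side.
    public static string Probe(Regex regex, string input, int beginning, int length)
    {
        try
        {
            Match m = regex.Match(input, beginning, length);
            return m.Success ? $"match '{m.Value}' at index {m.Index}" : "no match";
        }
        catch (ArgumentOutOfRangeException ex)
        {
            // This is the failure described in the report (Parameter 'start').
            return $"ArgumentOutOfRangeException (Parameter '{ex.ParamName}')";
        }
    }

    public static void Main()
    {
        string input = "ABC\n";
        var pattern = new Regex(@"^A$", RegexOptions.Multiline);

        // Try every length from index 0; the report names lengths 2 and 3 as failing.
        for (int length = 1; length <= input.Length; length++)
        {
            Console.WriteLine($"length {length}: {Probe(pattern, input, 0, length)}");
        }
    }
}
```

Swapping in other patterns (for example '^A') and other start indices in `Main` may help pin down the remaining failing combinations.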

Configuration:

OS: Windows 10 Enterprise, 20H2
Arch: x64
Target framework: .NET 5
Installed:
.NET SDK 5.0.301 (x64)
Microsoft.AspNetCore.App 5.0.7
Microsoft.NETCore.App 5.0.7
Microsoft.WindowsDesktop.App 5.0.7
(And some previous versions too.)
Visual Studio:
Professional 2019 16.10.2

Regression:

This worked fine using .NET Framework 4.7.

Other information:

The exact exception says:
System.ArgumentOutOfRangeException: 'Specified argument was out of the range of valid values. (Parameter 'start')'
With the stack trace being:
at System.Text.RegularExpressions.RegexInterpreter.FindFirstChar()
at System.Text.RegularExpressions.RegexRunner.Scan(Regex regex, String text, Int32 textbeg, Int32 textend, Int32 textstart, Int32 prevlen, Boolean quick, TimeSpan timeout)
at System.Text.RegularExpressions.Regex.Run(Boolean quick, Int32 prevlen, String input, Int32 beginning, Int32 length, Int32 startat)
at System.Text.RegularExpressions.Regex.Match(String input, Int32 beginning, Int32 length)
at the code above

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Text.RegularExpressions untriaged New issue has not been triaged by the area owner labels Jul 13, 2021
ghost commented Jul 13, 2021

Tagging subscribers to this area: @eerhardt, @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

@stephentoub stephentoub added bug and removed untriaged New issue has not been triaged by the area owner labels Jul 13, 2021
@stephentoub stephentoub added this to the 6.0.0 milestone Jul 13, 2021
@stephentoub
Member

Thanks. This appears to be a regression introduced as part of all the work we did on regex in .NET 5:

C:\Users\stoub\Desktop\tmp> dotnet run -f net48
C:\Users\stoub\Desktop\tmp> dotnet run -f netcoreapp3.1
C:\Users\stoub\Desktop\tmp> dotnet run -f net5.0
Unhandled exception. System.ArgumentOutOfRangeException: Specified argument was out of the range of valid values. (Parameter 'start')
   at System.Text.RegularExpressions.RegexInterpreter.FindFirstChar()
   at System.Text.RegularExpressions.RegexRunner.Scan(Regex regex, String text, Int32 textbeg, Int32 textend, Int32 textstart, Int32 prevlen, Boolean quick, TimeSpan timeout)
   at System.Text.RegularExpressions.Regex.Run(Boolean quick, Int32 prevlen, String input, Int32 beginning, Int32 length, Int32 startat)
   at System.Text.RegularExpressions.Regex.Match(String input, Int32 beginning, Int32 length)
   at tmp.Program.Main(String[] args) in C:\Users\stoub\Desktop\tmp\Program.cs:line 14
C:\Users\stoub\Desktop\tmp> dotnet run -f net6.0
Unhandled exception. System.ArgumentOutOfRangeException: Specified argument was out of the range of valid values. (Parameter 'start')
   at System.Text.RegularExpressions.RegexInterpreter.FindFirstChar() in System.Text.RegularExpressions.dll:token 0x60001d9+0x32d
   at System.Text.RegularExpressions.RegexRunner.Scan(Regex regex, String text, Int32 textbeg, Int32 textend, Int32 textstart, Int32 prevlen, Boolean quick, TimeSpan timeout) in System.Text.RegularExpressions.dll:token 0x6000291+0xbb
   at System.Text.RegularExpressions.Regex.Run(Boolean quick, Int32 prevlen, String input, Int32 beginning, Int32 length, Int32 startat) in System.Text.RegularExpressions.dll:token 0x6000158+0x2a
   at System.Text.RegularExpressions.Regex.Match(String input, Int32 beginning, Int32 length) in System.Text.RegularExpressions.dll:token 0x600016b+0x0
   at tmp.Program.Main(String[] args) in C:\Users\stoub\Desktop\tmp\Program.cs:line 14
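
For readers stuck on an affected runtime, one possible workaround is sketched below. It is an assumption, not something proposed in this thread: it relies on the failure being specific to the overload that takes a start index and a length, and it has not been verified against .NET 5. The helper `MatchRange` and the class name are hypothetical. It matches against the extracted substring, which gives the anchors the same view of the text, and then shifts the reported index back into the coordinates of the original string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubstringMatchWorkaround
{
    // Hypothetical helper: matches only within input[beginning, beginning + length)
    // by extracting that range first, then reports the index relative to the
    // original input. Returns Index -1 and an empty Value when nothing matches.
    public static (bool Success, int Index, string Value) MatchRange(
        Regex regex, string input, int beginning, int length)
    {
        Match m = regex.Match(input.Substring(beginning, length));
        return m.Success
            ? (true, beginning + m.Index, m.Value)
            : (false, -1, string.Empty);
    }

    public static void Main()
    {
        var pattern = new Regex(@"^A$", RegexOptions.Multiline);

        // The call from the report, expressed through the helper.
        var result = MatchRange(pattern, "ABC\n", 0, 2);
        Console.WriteLine(result.Success ? $"match at {result.Index}" : "no match");
    }
}
```

The trade-off is an extra string allocation per call, and the helper returns a tuple rather than a Match object because a Match cannot be re-based onto a different input string.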

@stephentoub stephentoub self-assigned this Jul 13, 2021
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jul 13, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 13, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Aug 12, 2021