
RegexOptions.Multiline not matching text #40577

Closed
VS-ux opened this issue Aug 8, 2020 · 6 comments
Labels
area-System.Text.RegularExpressions untriaged New issue has not been triaged by the area owner

Comments

@VS-ux

VS-ux commented Aug 8, 2020

Description

  • Using RegexOptions.Multiline on a string whose lines are separated by \r\n, this regex matches only the last line instead of every line:
    ^(?:[a-zA-Z]\:|\\\\[\w\.]+\\[\w.$]+)\\(?:[\w]+\\)*\w([\w.])+$

Configuration

  • .NET Core 3.1
  • Windows 10 20180
  • x64
  • It is not specific to this config.

Regression?

  • Did this work in a previous build or release of .NET Core, or from .NET Framework? If you can try a previous release or build to find out, that can help us narrow down the problem. If you don't know, that's OK.
    No.

Other information

string textToMatch = "C:\\testpath1.txt\r\nC:\\testpath2.txt";
int matchCount = Regex.Matches(textToMatch, @"^(?:[a-zA-Z]\:|\\\\[\w\.]+\\[\w.$]+)\\(?:[\w]+\\)*\w([\w.])+$", RegexOptions.Compiled | RegexOptions.Multiline).Count;

Console.WriteLine(matchCount); // 1, but if you swap \r\n with \n, it returns 2
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Text.RegularExpressions untriaged New issue has not been triaged by the area owner labels Aug 8, 2020
@ghost

ghost commented Aug 8, 2020

Tagging subscribers to this area: @eerhardt, @pgovind
See info in area-owners.md if you want to be subscribed.

@danmoseley
Member

Can you please try to minimize the repro code/pattern/text? Like last time, this often makes things clearer.

@VS-ux
Author

VS-ux commented Aug 8, 2020

@danmosemsft Hi! The text is just above.
Here is the code to reproduce the issue:

string textToMatch = "C:\\testpath1.txt\r\nC:\\testpath2.txt";
int matchCount = Regex.Matches(textToMatch, @"^(?:[a-zA-Z]\:|\\\\[\w\.]+\\[\w.$]+)\\(?:[\w]+\\)*\w([\w.])+$", RegexOptions.Compiled | RegexOptions.Multiline).Count;

Console.WriteLine(matchCount); // 1, but if you swap \r\n with \n, it returns 2

@danmoseley
Member

Yes, thanks, I see that. I’m asking you to minimize the pattern, please.

@VS-ux
Author

VS-ux commented Aug 8, 2020

Oops, sorry. Here is the simplified version:

string textToMatch = "hello\r\nasd";
int matchCount = Regex.Matches(textToMatch, @"^[a-z]+$", RegexOptions.Compiled | RegexOptions.Multiline).Count;

Console.WriteLine(matchCount); // 1, but if you swap \r\n with \n, it returns 2

The issue is that RegexOptions.Multiline doesn't treat \r as part of the line break, so the \r stays at the end of the line and $ does not match there.

@danmoseley
Member

Right, it only understands \n. The complete description is here:
#25598 (comment)
Do the docs need clarification? Feel free to submit a PR if so (pencil icon in top right of doc).

I’m going to close this as it’s working as designed. The AnyNewLine feature would help!
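
For anyone landing here with the same problem, here is a minimal sketch of one possible workaround, not an official recommendation. It assumes it is acceptable to change the pattern so that an optional \r is allowed before the end of the line. The class and method names (NewlineDemo, CountMatches) are hypothetical and used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class NewlineDemo
{
    // Hypothetical helper: counts matches of a pattern with RegexOptions.Multiline.
    public static int CountMatches(string text, string pattern) =>
        Regex.Matches(text, pattern, RegexOptions.Multiline).Count;

    static void Main()
    {
        string text = "hello\r\nasd";

        // With Multiline, '$' matches only before '\n' or at the end of the input,
        // so the first line (followed by '\r') does not match.
        Console.WriteLine(CountMatches(text, @"^[a-z]+$"));         // 1

        // Workaround sketch: a lookahead tolerates an optional '\r' before '$'
        // without including the '\r' in the matched value.
        Console.WriteLine(CountMatches(text, @"^[a-z]+(?=\r?$)"));  // 2
    }
}
```

Another option, if changing the pattern is not practical, is to normalize the input first, for example with text.Replace("\r\n", "\n"), and keep the original pattern.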

@ghost ghost locked as resolved and limited conversation to collaborators Dec 7, 2020