RegexOptions.Multiline should be same as line by line #40566
Comments
What are the line endings in your test file: \n or \r\n? If the latter, then in the first code snippet above the $ will match just before the \n, that is, after each \r (since we don’t have the AnyNewLine setting yet). This means your pattern will capture the \r. In the second snippet, the reader strips the \r and \n line terminators, so the pattern would not capture a \r. However, I wouldn’t expect your pattern to match a different number of times in this case. What result do you see? Can you share a test file?
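A minimal sketch of the \r\n case described in this comment, assuming a made-up two-line sample string and a hypothetical helper named `CaptureFileNames` (neither is from the issue). It applies the issue's pattern with RegexOptions.Multiline and inspects the last capture group:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the file-name capture (group 2) of every match
// of the issue's pattern, applied to the whole text with RegexOptions.Multiline.
string[] CaptureFileNames(string text) =>
    Regex.Matches(text, @"^(.+)/([^/]+)$", RegexOptions.Multiline)
         .Cast<Match>()
         .Select(m => m.Groups[2].Value)
         .ToArray();

// Hypothetical sample: two paths separated by a Windows line ending (\r\n).
string[] names = CaptureFileNames("dir/a.txt\r\ndir/b.txt");

foreach (string name in names)
{
    // [^/] accepts '\r', and $ only matches just before '\n' (or at the end),
    // so the first capture keeps a trailing '\r'.
    bool containsCr = name.IndexOf('\r') >= 0;
    Console.WriteLine($"length {name.Length}, contains CR: {containsCr}");
}
// prints:
// length 6, contains CR: True
// length 5, contains CR: False
```

Both lines still match, so the match count is unaffected here; only the captured text differs.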
@danmosemsft Thank you for your reply! The test file I used is actually a random binary file, but the results can be seen in a simple test case scenario:
In the "matchCountLineByLine" section, I try to simulate reading a text file that contains 2 file paths line by line.
Hi! OK, so I uploaded a System.Memory.dll binary test file that I did some tests on. Here is the code:
For the code that is NOT commented out, I get 148 matches. For the code that is commented out (reading line by line), I get 166 matches.
I suggest you create a fixed test file, then progressively reduce it until it is as small as possible yet still repros the problem, and the reason will become clear. You may need a hex editor to see the line endings.
@danmosemsft I think the reason is that a binary file like System.Memory.dll contains some lone \r characters that are not part of a \r\n pair. The Multiline regex does not treat those as line endings, so it is unable to match there.
Aha, yes! In StreamReader I see that ReadLine will treat a lone \r as a line ending. That is unusual in real text; I believe it was used by old Macs (classic Mac OS). A lone \r will not be treated as a line boundary by ^ and $ (again, until we have AnyNewLine). I think this problem is explained; can we close the issue now?
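The lone \r difference can be reproduced with a short sketch. The sample string and the helper names `CountWholeText` and `CountLineByLine` are hypothetical, and a StringReader stands in for the StreamReader on the assumption that both split lines the same way (on \n, \r\n, or a lone \r):

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

const string Pattern = @"^(.+)/([^/]+)$";

// Counts matches over the whole text in one pass with RegexOptions.Multiline.
int CountWholeText(string text) =>
    Regex.Matches(text, Pattern, RegexOptions.Multiline).Count;

// Counts matches line by line, relying on ReadLine to split the text.
int CountLineByLine(string text)
{
    int count = 0;
    using (var reader = new StringReader(text))
    {
        while (reader.ReadLine() is string line)
        {
            count += Regex.Matches(line, Pattern).Count;
        }
    }
    return count;
}

// Hypothetical sample: the first two paths are separated by a lone '\r'.
string sample = "dir/a.txt\rdir/b.txt\ndir/c.txt";

Console.WriteLine(CountWholeText(sample));  // 2: the lone '\r' is not a line boundary for ^ and $
Console.WriteLine(CountLineByLine(sample)); // 3: ReadLine splits on the lone '\r' as well
```

With Multiline, the first two paths are consumed as a single match, which is consistent with the line-by-line count being the higher one in the report.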
@danmosemsft Yes, sure.
Using RegexOptions.Multiline should be the same as reading a file line by line, I think.
Reading an entire file and then matching it with RegexOptions.Multiline should give the same result as matching it line by line.
Here is a code sample:
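```csharp
string path = @"PathToFile";
using (StreamReader reader = new StreamReader(path))
{
    // Whole-file variant:
    //string s = reader.ReadToEnd();
    //int matchCount = Regex.Matches(s, @"^(.+)/([^/]+)$", RegexOptions.Compiled | RegexOptions.Multiline).Count;

    // I think the above should give the same count as reading line by line:
    int matchCountLineByLine = 0;
    string line;
    while ((line = reader.ReadLine()) != null)
        matchCountLineByLine += Regex.Matches(line, @"^(.+)/([^/]+)$", RegexOptions.Compiled).Count;
}
```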
This is probably related to #25598
Maybe I'm just doing it completely wrong, and I apologize if this is very stupid.