-
Notifications
You must be signed in to change notification settings - Fork 382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance: Eliminate Regex overhead in AvoidTrailingWhitespace -> Speedup of 5% (PowerShell 5.1) or 2.5 % (PowerShell 7.1-preview.2) #1465
Changes from 5 commits
9a6db4c
095c2f1
eab6312
172ebdd
9456321
c35ec92
aedbc13
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -36,52 +36,67 @@ public IEnumerable<DiagnosticRecord> AnalyzeScript(Ast ast, string fileName) | |
|
||
var diagnosticRecords = new List<DiagnosticRecord>(); | ||
|
||
string[] lines = Regex.Split(ast.Extent.Text, @"\r?\n"); | ||
string[] lines = ast.Extent.Text.Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None); | ||
|
||
for (int lineNumber = 0; lineNumber < lines.Length; lineNumber++) | ||
{ | ||
var line = lines[lineNumber]; | ||
|
||
var match = Regex.Match(line, @"\s+$"); | ||
if (match.Success) | ||
if (line.Length == 0) | ||
{ | ||
var startLine = lineNumber + 1; | ||
var endLine = startLine; | ||
var startColumn = match.Index + 1; | ||
var endColumn = startColumn + match.Length; | ||
|
||
var violationExtent = new ScriptExtent( | ||
new ScriptPosition( | ||
ast.Extent.File, | ||
startLine, | ||
startColumn, | ||
line | ||
), | ||
new ScriptPosition( | ||
ast.Extent.File, | ||
endLine, | ||
endColumn, | ||
line | ||
)); | ||
|
||
var suggestedCorrections = new List<CorrectionExtent>(); | ||
suggestedCorrections.Add(new CorrectionExtent( | ||
violationExtent, | ||
string.Empty, | ||
ast.Extent.File | ||
)); | ||
continue; | ||
} | ||
bergmeister marked this conversation as resolved.
Show resolved
Hide resolved
|
||
if (line[line.Length - 1] != ' ' && | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would this be better as There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @rjmholt because of
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My thinking here is actually just that PowerShell uses that API to see whitespace. Given how we split the string already, it's possibly dangerous to go by unicode whitespace, but possibly not... I suspect that really this won't make much difference; leaving non-ASCII whitespace at the ends of lines isn't something I can imagine being an issue for anyone really. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, so that sounds more like a tendency to use |
||
line[line.Length - 1] != '\t') | ||
{ | ||
continue; | ||
} | ||
bergmeister marked this conversation as resolved.
Show resolved
Hide resolved
|
||
int startColumnOfTrailingWhitespace = 1; | ||
for (int i = line.Length - 2; i > 0; i--) | ||
{ | ||
if (line[i] != ' ' && line[i] != '\t') | ||
{ | ||
startColumnOfTrailingWhitespace = i + 2; | ||
break; | ||
} | ||
} | ||
|
||
diagnosticRecords.Add( | ||
new DiagnosticRecord( | ||
String.Format(CultureInfo.CurrentCulture, Strings.AvoidTrailingWhitespaceError), | ||
var startLine = lineNumber + 1; | ||
bergmeister marked this conversation as resolved.
Show resolved
Hide resolved
|
||
var endLine = startLine; | ||
bergmeister marked this conversation as resolved.
Show resolved
Hide resolved
|
||
var startColumn = startColumnOfTrailingWhitespace; | ||
bergmeister marked this conversation as resolved.
Show resolved
Hide resolved
|
||
var endColumn = line.Length + 1; | ||
bergmeister marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
var violationExtent = new ScriptExtent( | ||
new ScriptPosition( | ||
ast.Extent.File, | ||
startLine, | ||
startColumn, | ||
line | ||
), | ||
new ScriptPosition( | ||
ast.Extent.File, | ||
endLine, | ||
endColumn, | ||
line | ||
)); | ||
|
||
var suggestedCorrections = new List<CorrectionExtent>(); | ||
suggestedCorrections.Add(new CorrectionExtent( | ||
violationExtent, | ||
GetName(), | ||
GetDiagnosticSeverity(), | ||
ast.Extent.File, | ||
null, | ||
suggestedCorrections | ||
string.Empty, | ||
ast.Extent.File | ||
)); | ||
} | ||
|
||
diagnosticRecords.Add( | ||
new DiagnosticRecord( | ||
String.Format(CultureInfo.CurrentCulture, Strings.AvoidTrailingWhitespaceError), | ||
violationExtent, | ||
GetName(), | ||
GetDiagnosticSeverity(), | ||
ast.Extent.File, | ||
null, | ||
suggestedCorrections | ||
)); | ||
} | ||
|
||
return diagnosticRecords; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes me wonder: if we're just trying to find the extents of trailing whitespace, there's no need to split the string at all; we should just read through ourselves without allocating all these strings... But too much burden for this PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yh, I hear what you say, I guess for perf what counts is the 80-20 rule :-) Technically speaking
string.IndexOf
would probably the fastest way of finding the indices where\s\r
or\s\n
occurs....I'm aware of lot's of other small micro optimisations that one can make and even tried some but they didn't have a measurable outcome. Therefore I am focussed on just fixing what gives at least a measurable return.