Optimise Scanner performance #226
Conversation
LGTM 😄👍
Would you be able to add a changelog entry and reference the original author as well?
@yonaskolb PR rebased and entry added to the changelog
So I did some benchmarking using SwagGen as a test harness and XCTest to measure. Results in seconds:
So while this PR improves things slightly, it seems something was added after the original #207 that slowed things down even more. After some more investigation, it seems #167 caused the slowdown.
@ilyapuchka as the author of #167, do you have any other insights here?
Interestingly, I just timed #225 and it was 0.37s
@yonaskolb I suppose something around here might have caused performance issues: https://github.com/stencilproject/Stencil/pull/167/files#diff-91493e167a3f6240a7cc916d9504d439R25
I noticed recently that parsing things like
I've tried to tokenize
So the new lexer does solve the issue but introduces a new problem. The fact that we did not have a testing context showing the improvement in #207 makes it hard to see this as a working optimisation. And since the original author does not seem to use Stencil anymore, maybe we should just drop this entirely, see if it is reported again, and just address the existing performance issue. What do you think?
I agree, we should rather address the performance regression introduced by #167
@yonaskolb can you create a PR with your performance tests?
@yonaskolb You've mentioned a few times that you timed the tests. Are you just running them locally and checking how long yourself (with
I've just been timing locally using an XCTest in SwagGen while pointing to a specific Stencil commit. I don't know of a way to run performance tests on CI; to fail on performance regressions you need to set a baseline, and that won't be stable across different machines.
I don't think we need a test that "fails", more a test that might report the change, either in CI or here in the PR. For example, SwiftLint reports performance changes in the PR: realm/SwiftLint#2396.
Now, that whole process would be a separate PR; but just for this commit, so we can test locally, a simple test that runs a few times and logs the average time would be nice.
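A "runs a few times and logs the average time" helper could be as simple as the sketch below. This is an illustrative snippet, not code from the PR; the name `averageTime` and the use of `Date` for wall-clock timing are assumptions.

```swift
import Foundation

// Hypothetical helper: run `body` several times and return the average
// wall-clock duration in seconds, so regressions show up as a number
// you can eyeball in the test log.
func averageTime(runs: Int = 5, _ body: () -> Void) -> TimeInterval {
    var total: TimeInterval = 0
    for _ in 0..<runs {
        let start = Date()
        body()
        total += Date().timeIntervalSince(start)
    }
    return total / Double(runs)
}
```

A test would then call this with a render of a large template and `print` the result, without asserting against a machine-dependent baseline.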
We definitely need to configure Danger on those repos at some point ❤️
@Liquidsoul I created #235 for the mentioned issue
So, going through the comments again:
@djbe the
I get what you mean, but a crash is an order of magnitude worse. And like I said, it's not really an error unless you define what the syntax should be for such edge cases.
Sources/Lexer.swift
Outdated
```swift
content = substring
return (string, result)
}
for (index, char) in zip(0..., content.unicodeScalars) {
```
Is there a performance difference between this and `content.unicodeScalars.enumerated()`?
It's bad practice in Swift to use `.enumerated()` just to be able to iterate with an index.
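For context, the two iteration styles being compared can be sketched like this (illustrative snippet, not from the PR); both pair each unicode scalar with an integer offset, but `zip(0..., …)` states the intent directly, while `enumerated()` nominally yields element offsets rather than general-purpose indices:

```swift
let scalars = "a{b{".unicodeScalars

// Style used in the PR: pair a counting integer with each scalar.
var viaZip: [Int] = []
for (index, char) in zip(0..., scalars) where char == "{" {
    viaZip.append(index)
}

// Alternative asked about in the review.
var viaEnumerated: [Int] = []
for (offset, char) in scalars.enumerated() where char == "{" {
    viaEnumerated.append(offset)
}
// Both collect [1, 3] for this input.
```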
The new test will have to be updated to the Spectre 9.0 format
@yonaskolb What do you mean, the new format? The only difference that I saw in that PR was just an
Sources/Lexer.swift
Outdated
```swift
return result
} else if char == tokenChar {
```
```swift
else {
    foundChar = char == tokenChar
}
```
```swift
let startIndex = content.index(content.startIndex, offsetBy: index)
let result = String(content[..<startIndex])
content = String(content[content.index(after: startIndex)...])
range = range.upperBound..<originalContent.index(range.upperBound, offsetBy: index + 1)
```
why are `content` and `range` needed when `result` is returned right after they are calculated? I can't see them being used. ...ah, it's a state mutation 🤦‍♂️
can we use `...` instead of `+ 1`?
Yeah, it took me a while before I grokked the whole thing 😄
Using `...` gives a `ClosedRange` instead of a `Range`, so we can't use that, unfortunately.
Sources/Lexer.swift
Outdated
```swift
if foundChar && char == Scanner.tokenEndDelimiter {
    let startIndex = content.index(content.startIndex, offsetBy: index)
    let result = String(content[..<startIndex])
    content = String(content[content.index(after: startIndex)...])
```
why not just use `prefix`/`suffix`?
I'll see if it impacts performance.
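A minimal sketch of what that suggestion would look like, assuming an integer offset `index` to a delimiter character (the values `content` and `index` here are illustrative, not the lexer's actual state):

```swift
// Split a string around a delimiter at integer offset `index` using
// prefix/dropFirst (the complement of prefix) instead of computing
// String.Index values manually with index(_:offsetBy:).
let content = "foo%bar"
let index = 3 // offset of the delimiter character "%"

let result = String(content.prefix(index))      // text before the delimiter
let rest = String(content.dropFirst(index + 1)) // text after the delimiter
```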
Sources/Lexer.swift
Outdated
```swift
content = String(content[startIndex...])
range = range.upperBound..<originalContent.index(range.upperBound, offsetBy: index - 1)
return (char, result)
} else if char == Scanner.tokenStartDelimiter {
```
```swift
else {
    foundBrace = char == Scanner.tokenStartDelimiter
}
```
Can we add a performance test?
I don't know, you can only really test it with really (really) large templates, or a whole bunch of templates. Maybe leave that for a separate PR?
We could just add whatever project you are using right now to measure the improvement and use its templates in a test?
I'm using SwagGen (what @yonaskolb was using); it's a whole project with a whole bunch of templates, and the test isn't perfect, as it's a full render, not just the lexer.
I've added a performance test based on the template in #224, repeated a few times 😆. There's no reporting yet; we'll leave that for a different PR (Danger).
@yonaskolb @ilyapuchka any final thoughts? It'd be ideal to have this in for 0.13.
Would definitely need a PR dedicated to adding even a little doc-comment on that code; it's always confusing, especially for someone who doesn't work on it daily like me 😉
…ment under Ubuntu
Cleanup readability a little bit
Rewrite original scan function so it's available
Syntax improvements
Fix deprecation warnings in Lexer
Cleanup some syntax issues
This is a rebase of #207
✅ What I did:
❌ What I did not do: