Optimise Scanner performance #226

Liquidsoul · 2018-08-24T12:40:20Z

This is a rebase of #207

✅ What I did:

squash of the commits
integrate the upstream changes about range reporting to fix the tests during the rebase process
remove Swift 3 from the build matrix

❌ What I did not do:

benchmark if there is an actual performance improvement

yonaskolb

LGTM 😄👍
Would you be able to add a changelog entry and reference the original author as well?

Liquidsoul · 2018-09-01T10:16:31Z

@yonaskolb PR rebased and entry added to the changelog

yonaskolb · 2018-09-02T05:55:46Z

So I did some benchmarking using SwagGen as a test harness and using XCTest to measure just the stencil file generation

Results in seconds:

0.11.0 - 0.24
Optimise Scanner performance #207 - 0.15
0.12.0 - 5.9
Swift 4.1 #228 (this pr) - 4.5

So while this PR improves things slightly, it seems something was added after the original #207 that slowed things down even more.

After some more investigation it seems #167 caused the slowdown
before: 0.25
after: 6.0

yonaskolb · 2018-09-02T06:13:33Z

@ilyapuchka as the author of #167 do you have any other insights here?

yonaskolb · 2018-09-02T06:18:22Z

Interestingly, I just timed #225 and it was 0.37s

ilyapuchka · 2018-09-06T21:39:26Z

@yonaskolb I suppose something around here might have caused performance issues https://github.com/stencilproject/Stencil/pull/167/files#diff-91493e167a3f6240a7cc916d9504d439R25

ilyapuchka · 2018-09-06T21:45:01Z

I noticed recently that parsing things like func some() {{% some tag %} fails due to usage of {{%. I see that there are some changes to the parsing of starting tokens here and wonder if this is still the case with this change and, if so, if it can be addressed @Liquidsoul

Liquidsoul · 2018-09-07T14:12:42Z

I've tried to tokenize func some() {{% if %} using this updated Lexer and the existing one. Here are the results:

the new lexer does not handle this well and crashes even.
the old lexer fails to parse the if tag but does not stop the execution.

So the new lexer does not solve the issue and introduces a new problem.
Instead of failing to capture the tag as the old one does, the new lexer crashes.
I see that as a blocking regression.

The fact that we did not have a testing context showing the improvement in #207 makes it hard to see this as a working optimisation. And since the original author does not seem to use Stencil anymore, maybe we should just drop this entirely, see if this is reported again, and just address the existing performance issue.

What do you think?

ilyapuchka · 2018-09-07T14:25:11Z

I agree, we had better address the performance regression introduced by #167

ilyapuchka · 2018-09-08T13:05:35Z

@yonaskolb can you create a PR with your performance tests?

djbe · 2018-09-10T11:37:30Z

Swift 4.1 #228 has been merged so this should be next.
Swift 4.1 #228 introduced a performance loss in this commit: 93ccc56
Investigate if String/Substring is the performance issue, and how to solve this.

djbe · 2018-09-10T11:38:23Z

@yonaskolb You've mentioned a few times that you timed the tests. Are you just running them locally and checking how long they take yourself (with time diff or something), or do you have an actual test case we could run on CI?

yonaskolb · 2018-09-10T11:45:05Z

I've just been timing locally using an XCTest in SwagGen while pointing to a specific Stencil commit. I don't know of a way to run performance tests on CI: to fail on performance regressions you need to set a baseline, and that won't be stable across different machines.

djbe · 2018-09-10T11:47:04Z

I don't think we need a test that "fails", more a test that might report the change, either in CI, or here in the PR. For example, SwiftLint reports performance changes in the PR: realm/SwiftLint#2396.

djbe · 2018-09-10T11:47:50Z

Now that whole process would be a separate PR, but just for this commit, so we can test locally, a simple test that runs a few times, and logs the average time would be nice.
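A minimal sketch of the kind of local timing helper described above: it runs a workload a few times and logs the average. The names `averageTime` and `sampleTemplate` are hypothetical, and the workload is a stand-in; a real test would tokenize or render the template with Stencil.

```swift
import Foundation

/// Runs `work` `iterations` times and returns the average wall-clock
/// duration of one run, in seconds. Hypothetical helper, not Stencil API.
func averageTime(iterations: Int, _ work: () -> Void) -> TimeInterval {
  precondition(iterations > 0, "need at least one iteration")
  var total: TimeInterval = 0
  for _ in 0..<iterations {
    let start = Date()
    work()
    total += Date().timeIntervalSince(start)
  }
  return total / Double(iterations)
}

// Stand-in workload (assumption): count the "{" scalars in a large,
// repeated template instead of running the real lexer.
let sampleTemplate = String(
  repeating: "Hello {{ name }}! {% if x %}yes{% endif %}\n",
  count: 10_000
)
var braceCount = 0
let average = averageTime(iterations: 5) {
  braceCount = sampleTemplate.unicodeScalars.filter { $0 == "{" }.count
}
print("average: \(average)s over 5 runs")
```

Wall-clock timings like this vary between machines, so the number is only useful for comparing two commits on the same machine.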

AliSoftware · 2018-09-10T13:22:28Z

We definitely need to configure Danger on those repos at some point ❤️

ilyapuchka · 2018-09-11T17:27:50Z

@Liquidsoul I created #235 for the mentioned issue

djbe · 2018-09-11T21:58:08Z

So, going through the comments again:

Errors logs improvements #167 (error log improvements) ~~added a massive slowdown, we should try to figure out why and solve this in a separate PR.~~ Solved with Performance improvements #230.
This PR seems to have a net positive effect, similar to the old Optimise Scanner performance #207.
Both this and the old Optimise Scanner performance #207 PR add a bug with {{%, this needs to be solved before merging this.
We need to decide what {{% ... should do, and add a test for it (see Parsing {{% is incorrectly parsed #235).
We need some form of performance tests to keep track of things.

Liquidsoul · 2018-09-12T08:02:21Z

@djbe the {{% error already exists in the current code base, this is not something that's introduced by this change.
However, as I said here, this PR introduces a crash instead of a parsing failure when there is a syntax error in the stencil file.
So as I said, if we want to implement this optimisation, we may wish to rewrite this to ensure that no regressions are introduced.

djbe · 2018-09-12T08:39:18Z

I get what you mean, but a crash is an order of magnitude worse. And like I said, it’s not really an error unless you define what the syntax should be for such edge cases.

yonaskolb · 2018-09-23T00:21:32Z

Sources/Lexer.swift

-          content = substring
-          return (string, result)
-        }
+    for (index, char) in zip(0..., content.unicodeScalars) {


Is there a performance difference between this and content.unicodeScalars.enumerated?

It's bad practice in Swift to use .enumerated() just to be able to iterate with an index.
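To make the point about `.enumerated()` concrete, here is a small sketch (it does not measure performance). On a whole collection, `zip(0..., …)` and `enumerated()` produce the same zero-based counters. The difference shows up on a slice, where the counter from `enumerated()` is an offset rather than a valid index. The sample values are made up for illustration.

```swift
let content = "a{{ b }}"

// Both spell "pair each scalar with a zero-based counter".
let viaZip = zip(0..., content.unicodeScalars).map { $0.0 }
let viaEnumerated = content.unicodeScalars.enumerated().map { $0.offset }

// On a slice, enumerated() still counts from zero, while the slice
// keeps the indices of its base collection.
let numbers = [10, 20, 30, 40]
let slice = numbers[2...]
let offsets = slice.enumerated().map { $0.offset }    // counters
let indices = zip(slice.indices, slice).map { $0.0 }  // real indices

print(viaZip == viaEnumerated, offsets, indices)
```

Writing `zip(0..., …)` makes it explicit that the integer is an offset, which matches how the diff later converts it with `content.index(content.startIndex, offsetBy: index)`.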

yonaskolb · 2018-09-23T06:56:48Z

The new test will have to be updated to the Spectre 9.0 format

djbe · 2018-09-23T10:34:04Z

@yonaskolb What do you mean by the new format? The only difference that I saw in that PR was an XCTestCase class added in each file, which this PR already has since it's been rebased on master.

ilyapuchka · 2018-09-23T10:48:40Z

Sources/Lexer.swift

        return result
+      } else if char == tokenChar {


else { foundChar = char == tokenChar }

ilyapuchka · 2018-09-23T10:50:20Z

Sources/Lexer.swift

+        let startIndex = content.index(content.startIndex, offsetBy: index)
+        let result = String(content[..<startIndex])
+        content = String(content[content.index(after: startIndex)...])
+        range = range.upperBound..<originalContent.index(range.upperBound, offsetBy: index + 1)


~~why content and range are needed when result is returned right after they are calculated? I can't see them being used.~~ ah, it's a state mutation 🤦‍♂️

can use ... instead of + 1 ?

Yeah, it took me a while before I grokked the whole thing 😄

Using ... gives a ClosedRange instead of a Range, so we can't use that unfortunately.
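A short sketch of the type difference mentioned above, using made-up sample text. `..<` with `index + 1` and `...` with `index` cover the same characters, but only the first produces a `Range<String.Index>` that can be assigned to a `Range` property. Converting with `relative(to:)` is shown as one possible alternative, not as what the PR does.

```swift
let text = "hello {{ name }}"
let lower = text.startIndex
let index = 5

// Half-open form, as in the diff: Range<String.Index>
let halfOpen = lower..<text.index(lower, offsetBy: index + 1)
// Closed form: ClosedRange<String.Index>, a different type
let closed = lower...text.index(lower, offsetBy: index)

var range: Range<String.Index> = halfOpen
// range = closed                  // would not compile: type mismatch
range = closed.relative(to: text)  // explicit conversion to Range

print(text[range])
```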

ilyapuchka · 2018-09-23T10:53:57Z

Sources/Lexer.swift

+      if foundChar && char == Scanner.tokenEndDelimiter {
+        let startIndex = content.index(content.startIndex, offsetBy: index)
+        let result = String(content[..<startIndex])
+        content = String(content[content.index(after: startIndex)...])


why not just use prefix/suffix?

I'll see if it impacts performance.
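For reference, a sketch of what the prefix/suffix-style spelling might look like next to the index-based slicing from the diff. The sample string and offset are assumptions for illustration. It only shows that both forms yield the same pieces and says nothing about their relative speed.

```swift
let content = "name }} rest"
let index = 5  // offset of the first "}" in this sample

// Index-based slicing, as in the diff
let startIndex = content.index(content.startIndex, offsetBy: index)
let resultA = String(content[..<startIndex])
let restA = String(content[content.index(after: startIndex)...])

// prefix/dropFirst spelling of the same split
let resultB = String(content.prefix(index))
let restB = String(content.dropFirst(index + 1))

print(resultA == resultB, restA == restB)
```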

ilyapuchka · 2018-09-23T10:55:44Z

Sources/Lexer.swift

+        content = String(content[startIndex...])
+        range = range.upperBound..<originalContent.index(range.upperBound, offsetBy: index - 1)
+        return (char, result)
+      } else if char == Scanner.tokenStartDelimiter {


else { foundBrace = char == Scanner.tokenStartDelimiter }
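A self-contained sketch of the flag-based scan that this suggestion simplifies. `offsetOfTokenStart` and its `tokenChars` parameter are hypothetical names, not the PR's actual API. The point is the shape of the loop: the `else` branch just records whether the current scalar is the start delimiter.

```swift
/// Returns the offset of the first "{" that is immediately followed by
/// one of `tokenChars`, or nil if there is none. Hypothetical helper.
func offsetOfTokenStart(
  in content: String,
  tokenChars: Set<Unicode.Scalar> = ["{", "%", "#"]
) -> Int? {
  var foundBrace = false
  for (offset, char) in zip(0..., content.unicodeScalars) {
    if foundBrace && tokenChars.contains(char) {
      return offset - 1  // offset of the "{" seen on the previous step
    } else {
      // Suggested form: assign the flag instead of branching on it
      foundBrace = char == "{"
    }
  }
  return nil
}

print(offsetOfTokenStart(in: "func some() {{% if %}") as Any)
```

With the input from this thread, a scan like this matches `{{` first, so the `{{%` case is read as the start of a variable rather than a tag. That is the ambiguity tracked in #235.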

ilyapuchka · 2018-09-23T11:00:32Z

Can we add a performance test?

djbe · 2018-09-23T11:01:44Z

I don't know, you can only really test it with really (really) large templates, or a whole bunch of templates. Maybe leave that for a separate PR?

ilyapuchka · 2018-09-23T11:05:36Z

we could just add whatever project you are using right now to measure improvement and use its templates in a test?

djbe · 2018-09-23T11:07:57Z

I'm using SwagGen (what @yonaskolb was using), it's a whole project, with a whole bunch of templates, and the test isn't perfect as it's a full render, not just the lexer.

djbe · 2018-09-23T22:17:13Z

I've added a performance test based on the template in #224, repeated a few times 😆.

There's no reporting yet, we'll leave that for a different PR (Danger).

djbe · 2018-09-25T00:49:14Z

@yonaskolb @ilyapuchka any final thoughts? It'd be ideal to have this in for 0.13.

AliSoftware

Would definitely need a PR dedicated to add even a little doc-comment on that code; always confusing especially for someone who doesn't work on it daily like me 😉

Sources/Lexer.swift


This was referenced Aug 24, 2018

[WIP] Faster scanner rebase #225

Closed

Optimise Scanner performance #207

Closed

Liquidsoul force-pushed the faster-scanner branch from 1ff1fcb to 7d480bd Compare August 27, 2018 08:52

yonaskolb mentioned this pull request Aug 30, 2018

Use SwiftStencilKit yonaskolb/SwagGen#111

Merged

yonaskolb previously approved these changes Sep 1, 2018


yonaskolb mentioned this pull request Sep 1, 2018

Swift 4.1 #228

Merged

Liquidsoul force-pushed the faster-scanner branch from 0d23104 to de4bc65 Compare September 1, 2018 10:15

yonaskolb mentioned this pull request Sep 2, 2018

Critical performance issue #224

Closed

djbe mentioned this pull request Sep 11, 2018

Add Package.swift for Swift 4.2 #233

Closed

djbe mentioned this pull request Sep 21, 2018

Allow conditions in variable node #243

Merged

djbe force-pushed the faster-scanner branch from de4bc65 to 3b5c995 Compare September 21, 2018 22:55

djbe mentioned this pull request Sep 22, 2018

Parsing {{% is incorrectly parsed #235

Closed

djbe force-pushed the faster-scanner branch from cbf2a5a to 0bfe195 Compare September 23, 2018 01:55

yonaskolb reviewed Sep 23, 2018


djbe force-pushed the faster-scanner branch from 0bfe195 to 3c34933 Compare September 23, 2018 10:31

ilyapuchka reviewed Sep 23, 2018


djbe force-pushed the faster-scanner branch 2 times, most recently from c44698c to f0eca34 Compare September 23, 2018 11:44

djbe mentioned this pull request Sep 24, 2018

Release 0.13.0 (or 0.12.2) #250

Closed

djbe mentioned this pull request Sep 25, 2018

Release 0.13.0 #251

Merged

AliSoftware approved these changes Sep 25, 2018



ethorpe and others added 6 commits September 26, 2018 00:33

Rewrites scanner for better performance. This is primarily an improvement under Ubuntu Cleanup readability a little bit Rewrite original scan function so it's available. Syntax improvements Fix deprecation warnings in Lexer Cleanup some syntax issues lexer t t

07a6b2a

Add test for crashing

4f84627

Add changelog entry

e77bd22

Add lexer test for escape sequence

652dcd2

Add performance test (no reporting yet)

fff93f1

Code documentation

cb4e514

djbe force-pushed the faster-scanner branch from ca5d2b2 to cb4e514 Compare September 25, 2018 22:33

djbe merged commit 6f9bb3e into stencilproject:master Sep 25, 2018

Liquidsoul deleted the faster-scanner branch June 4, 2019 08:57
