Speeeeed #1486
Comments
Out of curiosity, how did you conclude that these described parts were the bottlenecks? Just runtime complexity analysis? Or was a profiling tool used?
Basically been a combo of: …
Also, you can run with the DEBUG environment variable set: $ DEBUG=babel babel script.js
It might be worthwhile to check in the benchmark program to run after major features to make sure we're not regressing, and to also have something to point to when doing performance work. I don't have experience with automated benchmarks. Maybe someone from @lodash or other perf-focused projects can help.
Any advice/references @jdalton? Performance regression tests would actually be amazing.
For me the biggest problem is Babel's startup time. Perhaps this isn't a problem for people who use JavaScript build tools like Gulp, but for the rest of us, it is. You can shave off about 100-150ms (don't remember exactly, sorry) by using …
Are you using an npm release or a locally linked copy? npm releases are going to be far quicker since the internal templates will be precompiled.
@sebmck npm. Here's Babel's startup time over versions (run several times): …
The 530ms figure is for a virtual machine, not sure why it's slower, but whatever. Anyway, we've doubled our startup time since the early days. Here's the time bundled: …
One idea I've had is making a persistent Babel "server" and writing a CLI application with the same interface that contacts the server, similar to Nailgun for Java. It would also let us take advantage of the JIT, since currently any optimization is thrown away on each invocation.
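A minimal sketch of that shape, assuming babel-core's transform() API and an arbitrary port (the file names and the port are made up for illustration): a long-lived process keeps Babel and the JIT's warmed-up code in memory, and a thin client just pipes a file through it.

```js
// server.js - hypothetical persistent compile server, not Babel's actual CLI
var net = require('net');
var babel = require('babel-core'); // stays loaded across requests

net.createServer({ allowHalfOpen: true }, function (socket) {
  var source = '';
  socket.setEncoding('utf8');
  socket.on('data', function (chunk) { source += chunk; });
  socket.on('end', function () {
    try {
      // the expensive require/setup cost was paid once, at server startup
      socket.end(babel.transform(source).code);
    } catch (err) {
      socket.end('/* compile error: ' + err.message + ' */');
    }
  });
}).listen(7331); // arbitrary port
```

```js
// client.js - thin CLI with the same feel as "babel file.js": node client.js file.js
var net = require('net');
var fs = require('fs');

var socket = net.connect(7331, function () {
  socket.end(fs.readFileSync(process.argv[2], 'utf8'));
});
socket.pipe(process.stdout);
```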
Probably has largely to do with just the additional lines of code/dependencies and that a lot of stuff is done at runtime (visitor normalisation etc).
I believe @thejameskyle had some thoughts on this. It's something TypeScript does and it gives IDEs access to internal TypeScript information to improve integration. Not sure how relevant it'd be for Babel but it's possibly something worth looking into.
We already do that in the React Packager (currently hidden in React Native, but we have plans on releasing it as a standalone thing). And I think webpack does it for you when you use the dev server? If not, I think webpack might be a good place to add that.
@amasad Yeah, plenty of JS build tools do this already. It would be nice if our CLI application did this too, for those of us who don't use a JS tool as the top-level driver in the build. Maybe this isn't so common in the JS world and I'm the only person who has this problem?
So after I merged #1472, compiling the Ember source went from 50s to 44s and Traceur went from 30s to 24s. After commit f657598, Ember now compiles in 35s and Traceur in 19s. It was a relatively small change with a huge improvement; hopefully there are places where more of the same optimisations can be done. Any help in finding them is much appreciated!
A simple way to do it is to have … If anyone wants to wire up some tests under …
@megawac The issue is that simple benchmarks are pretty useless. Performance depends completely on feature use and AST structure, so getting realistic and practical benchmarks is extremely difficult.
Sure, but it simplifies the process of detecting regressions in certain operations such as parsing, transformations, regeneration, etc. Further, it makes retesting the perf effects of a changeset easier than profiling manually. Of course, changing how any operation works may change performance; this just makes it easier to determine by how much.
Benchmarking against large codebases is the only really productive kind I've found (happy to be proven wrong). Even though the perf work I've done has increased performance by ~40%, I've noticed no change in the time it takes to run the Babel tests, for example.
The benchmark could simply compare the compile time of Babel, regenerator, bluebird and lodash (to pick three largish examples that are already dependencies) against the current master, as @megawac said. It doesn't matter that the other three libraries aren't using ES6 features, because Babel still has to do almost the same amount of work regardless, and Babel itself uses enough ES6 features to make the measurements meaningful.
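As a rough sketch of what such a check could look like, assuming babel-core is installed and that the fixture directories point at local checkouts of those projects (the paths and script name are hypothetical; nothing like this exists in the repo yet):

```js
// benchmark/compile.js - time whole-codebase compiles against the current checkout (sketch)
var fs = require('fs');
var path = require('path');
var babel = require('babel-core');

// hypothetical fixture locations; swap in wherever the codebases are checked out
var fixtures = ['fixtures/lodash', 'fixtures/bluebird', 'fixtures/regenerator'];

// recursively collect .js files under a directory
function jsFiles(dir) {
  return fs.readdirSync(dir).reduce(function (files, name) {
    var full = path.join(dir, name);
    if (fs.statSync(full).isDirectory()) return files.concat(jsFiles(full));
    if (/\.js$/.test(name)) files.push(full);
    return files;
  }, []);
}

fixtures.forEach(function (dir) {
  var files = jsFiles(dir);
  var start = Date.now();
  files.forEach(function (file) {
    babel.transform(fs.readFileSync(file, 'utf8'), { filename: file });
  });
  console.log('%s: %d files in %dms', dir, files.length, Date.now() - start);
});
```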
Nope. It has to do significantly more work when certain ES6 features are used. In fact, sometimes additional passes of the entire AST are done when you use a specific feature.
@sebmck ok sure, but this is just a starting point. As time goes on, larger Babel-powered codebases will become available, and Babel itself will presumably start using more and more of its own features now that it is self-hosted.
I really don't want to have to gut regenerator 😢 @benjamn any ideas/suggestions?
I don't have enough context here. Is this just AST transform + traversal time? If you're doing a single pass now, can you feed just generator functions into Regenerator?
Scope tracking is used pretty sparingly in regenerator, so I suspect it might be feasible to switch to Babel's traversal and scope logic, if that seems best.
😮 awesome!
@sebmck How much time is spent parsing relative to the total time? The reason I ask is that I started working on a project to update an existing AST based on edit operations like …
@KevinB7 Parsing time is insignificant and relatively easy to optimise compared to everything else.
Following ba19bd3, the scope tracking has been optimised into a single traversal (instead of multiple). Ember core now compiles in 24s (from 28s).
Following #1753 and #1752 (thanks @loganfsmyth!) and some other previous performance patches since the last update, Ember core now compiles in 18s and Traceur in 10s. ✨
I tried to play with parsing performance for big codebases, and while I could improve it by ~25%, the actual difference in seconds is pretty low, so I'm not sure whether it will affect the overall result. https://twitter.com/RReverser/status/617334262086410240
@RReverser it might add up. I'm not sure how much time is spent on parsing on our codebase now, but we can probably find out. Are there any trade-offs to checking in your optimization?
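One crude way to find out, sketched with vanilla acorn standing in for the internal parser (an assumption, so the numbers are only approximate) and babel-core for the full pipeline:

```js
// how much of a file's compile time is parsing? (rough sketch)
var fs = require('fs');
var acorn = require('acorn');      // stand-in for Babel's acorn-based parser
var babel = require('babel-core');

var file = process.argv[2];
var code = fs.readFileSync(file, 'utf8');

var t0 = Date.now();
acorn.parse(code, { ecmaVersion: 6 });
var parseMs = Date.now() - t0;

var t1 = Date.now();
babel.transform(code, { filename: file });
var totalMs = Date.now() - t1;

console.log('parse: %dms, full transform: %dms (~%d%%)',
  parseMs, totalMs, Math.round(100 * parseMs / totalMs));
```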
I'm currently somewhat blocked by #1920 for this commit :( In any case, I just optimised the case of large codebases (in the sense of file count) like MaterialUI (which has 980 files). And, as you can see from the tweet screenshot, the difference is not that huge. I'll let you know as soon as I get the issue fixed and this thing committed.
Related: gulpjs/gulp#1188
@michaelBenin If I had to guess, that's more likely to be #626. Transpilation speeds are actually pretty good these days, and gulpfiles are generally pretty small.
@sebmck how do you want to handle this? Do you want to clean this issue up a bit with the latest, close it in favor of separate tickets, or close it as something Babel will just always be working on?
Closing this, as a lot has happened since the last comment. If there are still areas that should be looked at we should create separate issues for them.
Babel could be much faster! Here are some possible areas to look into:
Checklist:
- es6.tailCall
- es6.blockScoping
- es6.objectSuper
- es6.classes
- es6.constants
- _shadowFunctions
Code generator
While this isn't a massive issue and usually doesn't impact most projects, the code generator's performance is pretty poor on large outputs. This is partly due to the copious amounts of lookbehinds and attempts to make the code as "pretty" as possible. There is a lot of room for improvement, and this is an area where micro-optimisations pay off at a huge scale. The relevant folder is here.
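As a concrete example of the kind of micro-optimisation that pays off here (a generic sketch, not the actual generator code): instead of slicing or regexing the end of the ever-growing output string, remember the last emitted character so the check is O(1).

```js
// sketch: avoid "lookbehind" over a growing output string by remembering the last char
function OutputBuffer() {
  this.parts = [];   // join once at the end instead of repeated string concatenation
  this.last = '';
}

OutputBuffer.prototype.push = function (str) {
  if (!str) return;
  this.parts.push(str);
  this.last = str[str.length - 1];
};

OutputBuffer.prototype.newline = function () {
  // O(1) check instead of something like /\n\s*$/.test(this.get())
  if (this.last !== '\n') this.push('\n');
};

OutputBuffer.prototype.get = function () {
  return this.parts.join('');
};
```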
Grouping more transformers
#1472 brought the ability to merge multiple transformers into the same AST traversal. This impacts the way internal transformers are written, but after a bit of getting used to it I think I actually prefer this way. Regardless, transformers are now split into 6 groups. The internal transformers can be viewed here.
Reducing these groups may be complicated due to the various concerns that each one may have, e.g. es3.memberExpressionLiterals needs to be run on the entire tree: it needs to visit every single MemberExpression, even if they've been modified or dynamically inserted.
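To illustrate the payoff of grouping in miniature (a generic sketch, not Babel's internal plumbing): merge the visitor handlers of several transformers and fire them all from a single walk, rather than walking the whole tree once per transformer.

```js
// sketch: run several transformers' handlers in a single AST walk
function mergeVisitors(visitors) {
  var merged = {};
  visitors.forEach(function (visitor) {
    Object.keys(visitor).forEach(function (type) {
      (merged[type] = merged[type] || []).push(visitor[type]);
    });
  });
  return merged;
}

function traverse(node, merged, parent) {
  if (!node || typeof node.type !== 'string') return;
  (merged[node.type] || []).forEach(function (handler) { handler(node, parent); });
  Object.keys(node).forEach(function (key) {
    var child = node[key];
    if (Array.isArray(child)) {
      child.forEach(function (c) { traverse(c, merged, node); });
    } else if (child && typeof child.type === 'string') {
      traverse(child, merged, node);
    }
  });
}

// usage (hypothetical visitor objects): one pass fires both transformers' handlers
// traverse(ast, mergeVisitors([memberExpressionLiterals, someOtherTransformer]));
```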
Optimising existing transformers
Some transformers spawn "subtraversals". This is problematic as it significantly negates the intention of minimising traversals. For example, the es6.constants transformer previously visited every single node that has its "own" scope, then spawned another "subtraversal" that checked all the child nodes for reassignments. That means a lot of unnecessary visiting. Instead, with a75af0a, the transformer traversal is used and a scope binding lookup is done. This technique (not that ingenious, since the previous way was crap) could be used on the es6.blockScoping and _shadowFunctions (this does the arrow function this and arguments aliasing) transformers.
Optimising scope tracking
Similar to optimising existing transformers, the current scope tracking does multiple passes and has room for optimisation. This could all be done in a single pass: when hitting a binding, look up the tree for the current scope to attach it to.
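A toy version of that single-pass idea, assuming plain ESTree-shaped nodes (illustrative only; Babel's real scope tracking handles far more cases): push a scope when entering a function and register each binding against whatever scope is currently on top of the stack as it is encountered, instead of re-walking bodies in later passes.

```js
// sketch: collect bindings in one pass by keeping a stack of open scopes
function trackScopes(ast) {
  var programScope = { node: ast, bindings: {} };
  var scopeStack = [programScope];
  var scopes = [programScope];

  (function walk(node) {
    if (!node || typeof node.type !== 'string') return;

    var opensScope = /^(FunctionDeclaration|FunctionExpression|ArrowFunctionExpression)$/.test(node.type);
    if (opensScope) {
      var scope = { node: node, bindings: {} };
      scopeStack.push(scope);
      scopes.push(scope);
      node.params.forEach(function (param) {
        if (param.type === 'Identifier') scope.bindings[param.name] = param;
      });
    }

    // a declaration attaches itself to whatever scope is currently on top of the stack
    if (node.type === 'VariableDeclarator' && node.id.type === 'Identifier') {
      scopeStack[scopeStack.length - 1].bindings[node.id.name] = node;
    }

    Object.keys(node).forEach(function (key) {
      var child = node[key];
      if (Array.isArray(child)) child.forEach(walk);
      else if (child && typeof child.type === 'string') walk(child);
    });

    if (opensScope) scopeStack.pop();
  })(ast);

  return scopes;
}
```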
Attach comments in parser
Currently estraverse is used to attach comments. This isn't great: it means an entire traversal is required just to attach comments. This could be moved into the parser, similar to espree, which is used in ESLint.
Include comments in token stream
Currently tokens and comments are concatenated together and sorted. This is so newlines can be retained between nodes. This is relatively inefficient since you're sorting a large array of possibly millions of elements.
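Since the token list and the comment list are each already in source order, an O(n) merge would avoid the concatenate-and-sort step entirely; a generic sketch, assuming both arrays carry a numeric start offset:

```js
// sketch: merge two source-ordered streams instead of concat + sort
function mergeByStart(tokens, comments) {
  var out = [];
  var i = 0;
  var j = 0;
  while (i < tokens.length && j < comments.length) {
    if (tokens[i].start <= comments[j].start) out.push(tokens[i++]);
    else out.push(comments[j++]);
  }
  while (i < tokens.length) out.push(tokens[i++]);
  while (j < comments.length) out.push(comments[j++]);
  return out;
}
```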
Address regenerator
Regenerator uses ast-types, which means it has its own set of scope tracking and traversal logic. It's slower than Babel's and is actually relatively heavy, especially on large node trees. There has been a lot of controversy about merging it into Babel and iterating on it from there. Nonetheless, it's a solution that needs to be considered if all other avenues are unavailable.
I welcome contributions of any kind, so any help is extremely appreciated!
cc @amasad @DmitrySoshnikov @gaearon @stefanpenner @babel/contributors