Speeeeed #1486
Comments
Out of curiosity, how did you conclude that these described parts were the bottlenecks? Just runtime complexity analysis? Or was a profiling tool used?
Basically been a combo of: …
Also, you can run with the DEBUG environment variable set: $ DEBUG=babel babel script.js
It might be worthwhile to check in the benchmark program to run after major features to make sure we're not regressing, and to also have something to point to when doing performance work. I don't have experience with automated benchmarks. Maybe someone from @lodash or other perf-focused projects can help.
Any advice/references @jdalton? Performance regression tests would actually be amazing.
For me the biggest problem is Babel's startup time. Perhaps this isn't a problem for people who use JavaScript build tools like Gulp, but for the rest of us, it is. You can shave off about 100-150ms (don't remember exactly, sorry) by using …
Are you using an npm release or a locally linked copy? npm releases are going to be far quicker since the internal templates will be precompiled.
@sebmck npm. Here's Babel's startup time over versions (run several times): …
The 530ms figure is for a virtual machine, not sure why it's slower, but whatever. Anyway, we've doubled our startup time since the early days. Here's the time bundled: …
One idea I've had is making a persistent Babel "server" and writing a CLI application with the same interface that contacts the server, similar to Nailgun for Java. It would also let us take advantage of the JIT, since currently any optimization is thrown away on each invocation.
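A minimal sketch of that shape, assuming babel-core's transform() API and an arbitrary port (the file names and the port are made up for illustration): a long-lived process keeps Babel and the JIT's warmed-up code in memory, and a thin client just pipes a file through it.

```js
// server.js - hypothetical persistent compile server, not Babel's actual CLI
var net = require('net');
var babel = require('babel-core'); // stays loaded across requests

net.createServer({ allowHalfOpen: true }, function (socket) {
  var source = '';
  socket.setEncoding('utf8');
  socket.on('data', function (chunk) { source += chunk; });
  socket.on('end', function () {
    try {
      // the expensive require/setup cost was paid once, at server startup
      socket.end(babel.transform(source).code);
    } catch (err) {
      socket.end('/* compile error: ' + err.message + ' */');
    }
  });
}).listen(7331); // arbitrary port
```

```js
// client.js - thin CLI with the same feel as "babel file.js": node client.js file.js
var net = require('net');
var fs = require('fs');

var socket = net.connect(7331, function () {
  socket.end(fs.readFileSync(process.argv[2], 'utf8'));
});
socket.pipe(process.stdout);
```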
Probably has largely to do with just the additional lines of code/dependencies and that a lot of stuff is done at runtime (visitor normalisation etc).
I believe @thejameskyle had some thoughts on this. It's something TypeScript does and it gives IDEs access to internal TypeScript information to improve integration. Not sure how relevant it'd be for Babel but it's possibly something worth looking into.
We already do that in the React Packager (currently hidden in React Native, but we have plans on releasing it as a standalone thing). And I think webpack does it for you when you use the dev server? If not, I think webpack might be a good place to add that.
@amasad Yeah, plenty of JS build tools do this already. It would be nice if our CLI application did this too, for those of us who don't use a JS tool as the top-level driver in the build. Maybe this isn't so common in the JS world and I'm the only person who has this problem?
So after I merged #1472, compiling the Ember source went from 50s to 44s and Traceur went from 30s to 24s. After commit f657598, Ember now compiles in 35s and Traceur in 19s. It was a relatively small change with a huge improvement; hopefully there are places where more of the same optimisations can be done. Any help in finding them is much appreciated!
A simple way to do it is to have … If anyone wants to wire up some tests under …
@megawac The issue is that simple benchmarks are pretty useless. Performance depends completely on feature use and AST structure, so getting realistic and practical benchmarks is extremely difficult.
Sure, but it simplifies the process of detecting regressions in certain operations such as parsing, transformations, regeneration, etc. Further, it makes retesting the perf effects of a changeset easier than profiling manually. Of course, changing how any operation works may change performance; this just makes it easier to determine by how much.
Benchmarking against large codebases is the only really productive kind I've found (happy to be proven wrong). Even though the perf work I've done has increased performance by ~40%, I've noticed no change in the time it takes to run the Babel tests, for example.
The benchmark could simply compare the compile time of Babel, regenerator, bluebird and lodash (to pick three largish examples that are already dependencies) against the current master, as @megawac said. It doesn't matter that the other three libraries aren't using ES6 features, because Babel still has to do almost the same amount of work regardless, and Babel itself uses enough ES6 features to make the measurements meaningful.
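As a rough sketch of what such a check could look like, assuming babel-core is installed and that the fixture directories point at local checkouts of those projects (the paths and script name are hypothetical; nothing like this exists in the repo yet):

```js
// benchmark/compile.js - time whole-codebase compiles against the current checkout (sketch)
var fs = require('fs');
var path = require('path');
var babel = require('babel-core');

// hypothetical fixture locations; swap in wherever the codebases are checked out
var fixtures = ['fixtures/lodash', 'fixtures/bluebird', 'fixtures/regenerator'];

// recursively collect .js files under a directory
function jsFiles(dir) {
  return fs.readdirSync(dir).reduce(function (files, name) {
    var full = path.join(dir, name);
    if (fs.statSync(full).isDirectory()) return files.concat(jsFiles(full));
    if (/\.js$/.test(name)) files.push(full);
    return files;
  }, []);
}

fixtures.forEach(function (dir) {
  var files = jsFiles(dir);
  var start = Date.now();
  files.forEach(function (file) {
    babel.transform(fs.readFileSync(file, 'utf8'), { filename: file });
  });
  console.log('%s: %d files in %dms', dir, files.length, Date.now() - start);
});
```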
Nope. It has to do significantly more work when certain ES6 features are used. In fact, sometimes additional passes of the entire AST are done when you use a specific feature.
@sebmck ok sure, but this is just a starting point. As time goes on, larger Babel-powered codebases will become available, and Babel itself will presumably start using more and more of its own features now that it is self-hosted.
I really don't want to have to gut regenerator 😢 @benjamn any ideas/suggestions?
I don't have enough context here. Is this just AST transform + traversal time? If you're doing a single pass now, can you feed just generator functions into Regenerator?
Scope tracking is used pretty sparingly in regenerator, so I suspect it might be feasible to switch to Babel's traversal and scope logic, if that seems best.
😮 awesome!
@sebmck How much time is spent parsing relative to the total time? The reason I ask is that I started working on a project to update an existing AST based on edit operations like …
@KevinB7 Parsing time is insignificant and relatively easy to optimise compared to everything else.
Following ba19bd3, the scope tracking has been optimised into a single traversal (instead of multiple). Ember core now compiles in 24s (from 28s).
Following #1753 and #1752 (thanks @loganfsmyth!) and some other previous performance patches since the last update, Ember core now compiles in 18s and Traceur in 10s. ✨
I tried to play with parsing performance for big codebases, and while I could improve it by ~25%, the actual difference in seconds is pretty low, so I'm not sure whether it will affect the overall result. https://twitter.com/RReverser/status/617334262086410240
@RReverser it might add up. I'm not sure how much time is spent on parsing on our codebase now, but we can probably find out. Are there any trade-offs to checking in your optimization?
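One crude way to find out, sketched with vanilla acorn standing in for the internal parser (an assumption, so the numbers are only approximate) and babel-core for the full pipeline:

```js
// how much of a file's compile time is parsing? (rough sketch)
var fs = require('fs');
var acorn = require('acorn');      // stand-in for Babel's acorn-based parser
var babel = require('babel-core');

var file = process.argv[2];
var code = fs.readFileSync(file, 'utf8');

var t0 = Date.now();
acorn.parse(code, { ecmaVersion: 6 });
var parseMs = Date.now() - t0;

var t1 = Date.now();
babel.transform(code, { filename: file });
var totalMs = Date.now() - t1;

console.log('parse: %dms, full transform: %dms (~%d%%)',
  parseMs, totalMs, Math.round(100 * parseMs / totalMs));
```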
I'm currently somewhat blocked by #1920 for this commit :( In any case, I just optimised the case of large codebases (in the sense of file count) like MaterialUI (which has 980 files). And, as you can see from the tweet screenshot, the difference is not that huge. I'll let you know as soon as I get the issue fixed and this thing committed.
Related: gulpjs/gulp#1188
@michaelBenin If I had to guess, that's more likely to be #626. Transpilation speeds are actually pretty good these days, and gulpfiles are generally pretty small.
@sebmck how do you want to handle this? Do you want to clean this issue up a bit with the latest, close it in favor of separate tickets, or close it as something Babel will just always be working on?
Closing this, as a lot has happened since the last comment. If there are still areas that should be looked at we should create separate issues for them.
Babel could be much faster! Here are some possible areas to look into:
Checklist:
- es6.tailCall
- es6.blockScoping
- es6.objectSuper
- es6.classes
- es6.constants
- _shadowFunctions
Code generator
While this isn't a massive issue and usually doesn't impact most projects, the code generator's performance is pretty poor on large outputs. This is partly due to the copious amounts of lookbehinds and attempts to make the code as "pretty" as possible. There is a lot of room for improvement, and this is an area where micro-optimisations pay off at a huge scale. The relevant folder is here.
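As a concrete example of the kind of micro-optimisation that pays off here (a generic sketch, not the actual generator code): instead of slicing or regexing the end of the ever-growing output string, remember the last emitted character so the check is O(1).

```js
// sketch: avoid "lookbehind" over a growing output string by remembering the last char
function OutputBuffer() {
  this.parts = [];   // join once at the end instead of repeated string concatenation
  this.last = '';
}

OutputBuffer.prototype.push = function (str) {
  if (!str) return;
  this.parts.push(str);
  this.last = str[str.length - 1];
};

OutputBuffer.prototype.newline = function () {
  // O(1) check instead of something like /\n\s*$/.test(this.get())
  if (this.last !== '\n') this.push('\n');
};

OutputBuffer.prototype.get = function () {
  return this.parts.join('');
};
```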
Grouping more transformers
#1472 brought the ability to merge multiple transformers into the same AST traversal. This impacts the way internal transformers are written, but after a bit of getting used to it I think I actually prefer this way. Regardless, transformers are now split into 6 groups. The internal transformers can be viewed here.
Reducing these groups may be complicated due to the various concerns that each one may have, e.g. es3.memberExpressionLiterals needs to be run on the entire tree: it needs to visit every single MemberExpression, even if they've been modified or dynamically inserted.
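To illustrate the payoff of grouping in miniature (a generic sketch, not Babel's internal plumbing): merge the visitor handlers of several transformers and fire them all from a single walk, rather than walking the whole tree once per transformer.

```js
// sketch: run several transformers' handlers in a single AST walk
function mergeVisitors(visitors) {
  var merged = {};
  visitors.forEach(function (visitor) {
    Object.keys(visitor).forEach(function (type) {
      (merged[type] = merged[type] || []).push(visitor[type]);
    });
  });
  return merged;
}

function traverse(node, merged, parent) {
  if (!node || typeof node.type !== 'string') return;
  (merged[node.type] || []).forEach(function (handler) { handler(node, parent); });
  Object.keys(node).forEach(function (key) {
    var child = node[key];
    if (Array.isArray(child)) {
      child.forEach(function (c) { traverse(c, merged, node); });
    } else if (child && typeof child.type === 'string') {
      traverse(child, merged, node);
    }
  });
}

// usage (hypothetical visitor objects): one pass fires both transformers' handlers
// traverse(ast, mergeVisitors([memberExpressionLiterals, someOtherTransformer]));
```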
Optimising existing transformers
Some transformers spawn "subtraversals". This is problematic as it significantly negates the intention of minimising traversals. For example, the es6.constants transformer previously visited every single node that has its "own" scope, then spawned another "subtraversal" that checked all the child nodes for reassignments. That means a lot of unnecessary visiting. Instead, with a75af0a, the transformer traversal is used and a scope binding lookup is done. This technique (not that ingenious, since the previous way was crap) could be used on the es6.blockScoping and _shadowFunctions (this does the arrow function this and arguments aliasing) transformers.
Optimising scope tracking
Similar to optimising existing transformers, the current scope tracking does multiple passes and has room for optimisation. This could all be done in a single pass: when hitting a binding, look up the tree for the current scope to attach it to.
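A toy version of that single-pass idea, assuming plain ESTree-shaped nodes (illustrative only; Babel's real scope tracking handles far more cases): push a scope when entering a function and register each binding against whatever scope is currently on top of the stack as it is encountered, instead of re-walking bodies in later passes.

```js
// sketch: collect bindings in one pass by keeping a stack of open scopes
function trackScopes(ast) {
  var programScope = { node: ast, bindings: {} };
  var scopeStack = [programScope];
  var scopes = [programScope];

  (function walk(node) {
    if (!node || typeof node.type !== 'string') return;

    var opensScope = /^(FunctionDeclaration|FunctionExpression|ArrowFunctionExpression)$/.test(node.type);
    if (opensScope) {
      var scope = { node: node, bindings: {} };
      scopeStack.push(scope);
      scopes.push(scope);
      node.params.forEach(function (param) {
        if (param.type === 'Identifier') scope.bindings[param.name] = param;
      });
    }

    // a declaration attaches itself to whatever scope is currently on top of the stack
    if (node.type === 'VariableDeclarator' && node.id.type === 'Identifier') {
      scopeStack[scopeStack.length - 1].bindings[node.id.name] = node;
    }

    Object.keys(node).forEach(function (key) {
      var child = node[key];
      if (Array.isArray(child)) child.forEach(walk);
      else if (child && typeof child.type === 'string') walk(child);
    });

    if (opensScope) scopeStack.pop();
  })(ast);

  return scopes;
}
```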
Attach comments in parser
Currently estraverse is used to attach comments. This isn't great: it means an entire traversal is required just to attach comments. This could be moved into the parser, similar to espree, which is used in ESLint.
Include comments in token stream
Currently tokens and comments are concatenated together and sorted. This is so newlines can be retained between nodes. This is relatively inefficient since you're sorting a large array of possibly millions of elements.
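Since the token list and the comment list are each already in source order, an O(n) merge would avoid the concatenate-and-sort step entirely; a generic sketch, assuming both arrays carry a numeric start offset:

```js
// sketch: merge two source-ordered streams instead of concat + sort
function mergeByStart(tokens, comments) {
  var out = [];
  var i = 0;
  var j = 0;
  while (i < tokens.length && j < comments.length) {
    if (tokens[i].start <= comments[j].start) out.push(tokens[i++]);
    else out.push(comments[j++]);
  }
  while (i < tokens.length) out.push(tokens[i++]);
  while (j < comments.length) out.push(comments[j++]);
  return out;
}
```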
Address regenerator
Regenerator uses ast-types, which means it has its own set of scope tracking and traversal logic. It's slower than Babel's and is actually relatively heavy, especially on large node trees. There has been a lot of controversy about merging it into Babel and iterating on it from there. Nonetheless, it's a solution that needs to be considered if all other avenues are unavailable.
I welcome contributions of any kind, so any help is extremely appreciated!
cc @amasad @DmitrySoshnikov @gaearon @stefanpenner @babel/contributors