perf(rome_cli): replace global allocator with jemalloc #3237
Have you compared performance with mimalloc? Depending on a different allocator for every platform increases the risk of platform-specific issues and requires us to test every change on every platform. That's why I would prefer to either use the default system allocator on all platforms (maybe except Windows) or replace the default on all platforms.
You know... this will make it more difficult for me to prove that any of my work reducing allocation is any good :P
Would you mind explaining why replacing the allocator makes it harder to prove the value of reducing allocations? Forgive me if this is a stupid question.
If every single allocation becomes cheaper, then reducing the number of allocations has less effect on overall performance. @ematipico Would you mind also comparing the memory consumption? Do you have links that compare the different allocators?
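One way to keep allocation-reduction work measurable regardless of which allocator is in use is to count allocations rather than time them. The following is a minimal sketch using only the standard library; `CountingAlloc` and `measure` are hypothetical names, not part of this PR, and the wrapper delegates to the system allocator purely for illustration.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: counts every allocation, then delegates to the
/// system allocator. Not the allocator used in this PR.
struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);
static BYTES_REQUESTED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        BYTES_REQUESTED.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns its result together with the number of allocations
/// and the number of bytes requested while it ran. The counters are global,
/// so allocations made by other threads during the call are included.
fn measure<T>(f: impl FnOnce() -> T) -> (T, usize, usize) {
    let count_before = ALLOCATIONS.load(Ordering::Relaxed);
    let bytes_before = BYTES_REQUESTED.load(Ordering::Relaxed);
    let value = f();
    let count = ALLOCATIONS.load(Ordering::Relaxed) - count_before;
    let bytes = BYTES_REQUESTED.load(Ordering::Relaxed) - bytes_before;
    (value, count, bytes)
}
```

Wrapping a parse or format call in `measure` gives a count that does not depend on how fast each individual allocation is, so a drop in that count is still visible after an allocator swap.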
Thanks |
Technically we are already depending on different allocators for different platforms, since we use whatever allocator is provided by the OS. I don't know if Also, what's the impact of this change on the size of the |
What was your reasoning why you preferred |
The reason why I tried I will give it a whirl in terms of memory usages once I manage to download again all the necessary software. We can close it if you think it's not worth it. |
For Windows the reason was simply that |
The results seem very promising! My main concern is that I want to avoid having one custom allocator for every platform that we support. Using |
So, should we just switch to this allocator for all operating systems except Windows?
Probably. But it would still be interesting to get some more numbers:
|
7806668 to 3e557de
!bench_formatter |
Analyzer Benchmark Results
|
Parser Benchmark Results
|
Formatter Benchmark Results
|
Uhm, what happened with typescript? |
Something went wrong during the download of the file:
|
@rome/staff I updated the description of the issue with some findings around memory consumption. |
I'm not sure the allocation numbers are representative: they only measure how much memory was requested from and freed by the system allocator, and that is expected to be lower with a custom allocator that reuses allocations before requesting new pages from the system allocator.
However, you shared some numbers over Slack which are promising:
"I checked with Activity Monitor. With jemalloc the memory usage peaks at around 20 MB; main instead peaks at around 45 MB."
I think the results overall are very promising and I verified them on my local machine. There are a few cases where performance regresses or does not improve, but the numbers greatly improve for all benchmarks traversing the CST.
| group | jemalloc (ratio) | jemalloc (time) | jemalloc (throughput) | main (ratio) | main (time) | main (throughput) |
| --- | --- | --- | --- | --- | --- | --- |
| analyzer/css.js | 1.00 | 1026.2±11.93µs | 11.3 MB/sec | 1.24 | 1268.7±5.37µs | 9.2 MB/sec |
| analyzer/index.js | 1.00 | 2.9±0.05ms | 11.1 MB/sec | 1.16 | 3.4±0.01ms | 9.6 MB/sec |
| analyzer/lint.ts | 1.00 | 1541.7±7.90µs | 27.0 MB/sec | 1.16 | 1782.6±239.92µs | 23.3 MB/sec |
| analyzer/parser.ts | 1.00 | 3.6±0.05ms | 13.5 MB/sec | 1.15 | 4.2±0.01ms | 11.7 MB/sec |
| analyzer/router.ts | 1.00 | 2.4±0.02ms | 25.4 MB/sec | 1.23 | 3.0±0.01ms | 20.7 MB/sec |
| analyzer/statement.ts | 1.00 | 3.3±0.06ms | 10.9 MB/sec | 1.30 | 4.2±0.01ms | 8.4 MB/sec |
| analyzer/typescript.ts | 1.00 | 5.1±0.05ms | 10.6 MB/sec | 1.25 | 6.4±0.11ms | 8.5 MB/sec |
| formatter/checker.ts | 1.00 | 220.9±1.81ms | 11.8 MB/sec | 1.15 | 253.8±4.23ms | 10.2 MB/sec |
| formatter/compiler.js | 1.00 | 123.4±3.05ms | 8.5 MB/sec | 1.13 | 139.6±0.90ms | 7.5 MB/sec |
| formatter/d3.min.js | 1.00 | 102.9±0.91ms | 2.5 MB/sec | 1.10 | 112.7±1.26ms | 2.3 MB/sec |
| formatter/dojo.js | 1.00 | 6.7±0.02ms | 10.2 MB/sec | 1.16 | 7.8±0.03ms | 8.8 MB/sec |
| formatter/ios.d.ts | 1.00 | 140.1±0.76ms | 13.3 MB/sec | 1.17 | 164.1±2.29ms | 11.4 MB/sec |
| formatter/jquery.min.js | 1.00 | 28.9±0.34ms | 2.9 MB/sec | 1.10 | 31.8±0.85ms | 2.6 MB/sec |
| formatter/math.js | 1.00 | 207.7±1.46ms | 3.1 MB/sec | 1.17 | 243.5±4.20ms | 2.7 MB/sec |
| formatter/parser.ts | 1.00 | 4.7±0.02ms | 10.4 MB/sec | 1.13 | 5.3±0.07ms | 9.3 MB/sec |
| formatter/pixi.min.js | 1.00 | 115.9±0.86ms | 3.8 MB/sec | 1.12 | 130.0±1.78ms | 3.4 MB/sec |
| formatter/react-dom.production.min.js | 1.00 | 34.8±0.13ms | 3.3 MB/sec | 1.12 | 39.1±0.38ms | 2.9 MB/sec |
| formatter/react.production.min.js | 1.00 | 1710.6±5.75µs | 3.6 MB/sec | 1.13 | 1940.7±8.91µs | 3.2 MB/sec |
| formatter/router.ts | 1.00 | 3.7±0.02ms | 16.8 MB/sec | 1.13 | 4.1±0.05ms | 14.9 MB/sec |
| formatter/tex-chtml-full.js | 1.00 | 267.5±1.42ms | 3.4 MB/sec | 1.10 | 295.0±3.07ms | 3.1 MB/sec |
| formatter/three.min.js | 1.00 | 129.2±0.59ms | 4.5 MB/sec | 1.11 | 143.5±0.84ms | 4.1 MB/sec |
| formatter/typescript.js | 1.00 | 842.4±13.77ms | 11.3 MB/sec | 1.15 | 972.7±15.36ms | 9.8 MB/sec |
| formatter/vue.global.prod.js | 1.00 | 44.5±0.48ms | 2.7 MB/sec | 1.12 | 49.6±0.84ms | 2.4 MB/sec |
| parser/checker.ts | 1.09 | 61.2±1.92ms | 42.5 MB/sec | 1.00 | 56.1±0.49ms | 46.3 MB/sec |
| parser/compiler.js | 1.00 | 33.3±0.39ms | 31.5 MB/sec | 1.06 | 35.1±0.19ms | 29.8 MB/sec |
| parser/d3.min.js | 1.00 | 21.6±0.75ms | 12.1 MB/sec | 1.01 | 21.7±0.07ms | 12.1 MB/sec |
| parser/dojo.js | 1.00 | 1879.6±25.53µs | 36.5 MB/sec | 1.08 | 2.0±0.01ms | 33.8 MB/sec |
| parser/ios.d.ts | 1.10 | 51.4±0.91ms | 36.3 MB/sec | 1.00 | 46.9±0.72ms | 39.8 MB/sec |
| parser/jquery.min.js | 1.00 | 5.7±0.09ms | 14.5 MB/sec | 1.04 | 5.9±0.01ms | 13.9 MB/sec |
| parser/math.js | 1.05 | 42.4±1.74ms | 15.3 MB/sec | 1.00 | 40.5±1.97ms | 16.0 MB/sec |
| parser/parser.ts | 1.00 | 1311.6±11.85µs | 37.1 MB/sec | 1.10 | 1444.8±12.49µs | 33.7 MB/sec |
| parser/pixi.min.js | 1.05 | 26.6±0.52ms | 16.5 MB/sec | 1.00 | 25.4±0.09ms | 17.2 MB/sec |
| parser/react-dom.production.min.js | 1.00 | 7.9±0.04ms | 14.6 MB/sec | 1.00 | 7.9±0.07ms | 14.6 MB/sec |
| parser/react.production.min.js | 1.00 | 386.0±5.42µs | 15.9 MB/sec | 1.12 | 433.3±16.39µs | 14.2 MB/sec |
| parser/router.ts | 1.00 | 1108.6±12.05µs | 55.4 MB/sec | 1.10 | 1214.3±23.91µs | 50.6 MB/sec |
| parser/tex-chtml-full.js | 1.10 | 58.7±1.90ms | 15.5 MB/sec | 1.00 | 53.4±0.21ms | 17.1 MB/sec |
| parser/three.min.js | 1.06 | 30.9±1.11ms | 19.0 MB/sec | 1.00 | 29.1±0.06ms | 20.2 MB/sec |
| parser/typescript.js | 1.03 | 240.5±2.73ms | 39.5 MB/sec | 1.00 | 234.5±17.22ms | 40.5 MB/sec |
| parser/vue.global.prod.js | 1.00 | 9.9±0.16ms | 12.2 MB/sec | 1.00 | 9.9±0.03ms | 12.2 MB/sec |
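For readers unfamiliar with this layout, the ratio next to each time appears to be that side's mean divided by the faster of the two means, so the faster side reads 1.00. This is an assumption about how the comparison tool derives the column, not something stated in the thread, but it is consistent with the rows above. A small sketch with a hypothetical helper `relative_factors`:

```rust
/// Divides both mean times by the smaller one, so the faster side is 1.0.
/// Both inputs must be expressed in the same unit (for example µs).
fn relative_factors(jemalloc_mean: f64, main_mean: f64) -> (f64, f64) {
    let best = jemalloc_mean.min(main_mean);
    (jemalloc_mean / best, main_mean / best)
}

/// Formats a factor with two decimals, matching the table's precision.
fn format_factor(factor: f64) -> String {
    format!("{:.2}", factor)
}
```

Feeding in the analyzer/css.js means (1026.2 µs and 1268.7 µs) reproduces the 1.00 and 1.24 shown in that row. Rows whose displayed times are rounded to one decimal may differ slightly in the last digit, since the table was presumably computed from unrounded means.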
Summary
This PR replaces the default global allocator with `jemallocator` on all operating systems except Windows, where it is not supported.
Test Plan
Here are some benchmarks from my machine (macOS):
parser
formatter
analyzer
Memory consumption
I used `cargo instruments`, though I am not an expert with the trace tool provided by Xcode. I am sharing the screenshots and the trace file here in case someone knows it better than I do.
I ran the formatter on the `src` folder of the prettier repository, without `--write`.
jemalloc
main
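As a rough illustration of the approach in the Summary, a global allocator can be swapped on every platform except Windows with a `cfg` gate. The sketch below uses only the standard library: `CustomAlloc` is a hypothetical stand-in that simply forwards to the system allocator, where the PR would presumably register the allocator type exported by `jemallocator` (for example `jemallocator::Jemalloc`) instead. `allocator_name` is also hypothetical and exists only to make the selection observable.

```rust
// Everything custom is compiled only on non-Windows targets; on Windows the
// default system allocator stays in place because nothing is registered.
#[cfg(not(target_os = "windows"))]
mod custom {
    use std::alloc::{GlobalAlloc, Layout, System};

    /// Hypothetical stand-in for a third-party allocator type.
    pub struct CustomAlloc;

    unsafe impl GlobalAlloc for CustomAlloc {
        unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
            unsafe { System.alloc(layout) }
        }

        unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
            unsafe { System.dealloc(ptr, layout) }
        }
    }
}

#[cfg(not(target_os = "windows"))]
#[global_allocator]
static GLOBAL: custom::CustomAlloc = custom::CustomAlloc;

/// Reports which allocator this build selected.
fn allocator_name() -> &'static str {
    if cfg!(target_os = "windows") {
        "system"
    } else {
        "custom"
    }
}
```

Keeping the gate to a single `cfg` predicate means there are only two configurations to test: the custom allocator everywhere it is supported, and the system default on Windows.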