-
Notifications
You must be signed in to change notification settings - Fork 326
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Better
release
build time; new maximum-performance production
pro…
…file. (#3498) ### Pull Request Description Using the new tooling (#3491), I investigated the **performance / compile-time tradeoff** of different codegen options for release mode builds. By scripting the testing procedure, I was able to explore many possible combinations of options, which is important because their interactions (on both application performance and build time) are complex. I found **two candidate profiles** that offer specific advantages over the current `release` settings (`baseline`): - `thin16`: Supports incremental compiles in 1/3 the time of `baseline` in common cases. Application runs about 2% slower than `baseline`. - `fat1-O4`: Application performs 13% better than `baseline`. Compile time is almost 3x `baseline`, and non-incremental. (See key in first chart for the settings defining these profiles.) We can build faster or run faster, though not in the same build. Because the effect sizes are large enough to be impactful to developer and user experience, respectively, I think we should consider having it both ways. We could **split the `release` profile** into two profiles to serve different purposes: - `release`: A profile that supports fast developer iteration, while offering realistic performance. - `production`: A maximally-optimized profile, for nightly builds and actual releases. Since `wasm-pack` doesn't currently support custom profiles (rustwasm/wasm-pack#1111), we can't use a Cargo profile for `production`; however, we can implement our own profile by overriding rustc flags. ### Performance details ![perf](https://user-images.githubusercontent.com/1047859/170788530-ab6d7910-5253-4a2b-b432-8bfa0b4735ba.png) As you can see, `thin16` is slightly slower than `baseline`; `fat1-O4` is dramatically faster. <details> <summary>Methodology (click to show)</summary> I developed a procedure for benchmarking "whole application" performance, using the new "open project" workflow (which opens the IDE and loads a complex project), and some statistical analysis to account for variance. To gather this data: Build the application with profiling: `./run.sh ide build --profiling-level=debug` Run the `open_project` workflow repeatedly: `for i in $(seq 0 9); do dist/ide/linux-unpacked/enso --entry-point profile --workflow open_project --save-profile open_project_thin16_${i}.json; done` For each profile recorded, take the new `total_self_time` output of the `intervals` tool; gather into CSV: `echo $(for i in $(seq 0 9); do target/rust/debug/intervals < open_project_thin16_${i}.json | tail -n1 | awk '{print $2}'; do` (Note that the output of intervals should not be considered stable; this command may need modification in the future. Eventually it would be nice to support formatted outputs...) The data is ready to graph. I used the `boxplot` method of the [seaborn](https://seaborn.pydata.org/index.html) package, in order to show the distribution of data. </details> #### Build times ![thin16](https://user-images.githubusercontent.com/1047859/170788539-1578e41b-bc30-4f30-9b71-0b0181322fa5.png) In the case of changing a file in `enso-prelude`, with the current `baseline` settings rebuilding takes over 3 minutes. With the `thin16` settings, the same rebuild completes in 40 seconds. (To gather this data on different hardware or in the future, just run the new `bench-build.sh` script for each case to be measured.)
- Loading branch information
Showing
7 changed files
with
244 additions
and
83 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.