Apply BOLT optimizations without rebuilding LLVM #107723

Kobzol · 2023-02-06T12:33:46Z

This PR adds an explicit BOLT bootstrap step that applies BOLT on the fly when LLVM artifacts are copied to a sysroot. It does this only once per bootstrap invocation; the result is cached. This avoids one LLVM rebuild in the Linux CI dist build.
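The "once per invocation, result is cached" behaviour can be illustrated with a small sketch. This is not the bootstrap code (bootstrap is written in Rust and has its own step cache); the names `PostprocessCache` and `fake_bolt` are hypothetical and only show the general idea of memoizing an expensive postprocessing step by input path.

```python
from pathlib import Path
from typing import Callable, Dict


class PostprocessCache:
    """Runs an expensive postprocessing function at most once per input path."""

    def __init__(self, postprocess: Callable[[Path], Path]) -> None:
        self._postprocess = postprocess
        self._results: Dict[Path, Path] = {}
        self.runs = 0  # how many times the expensive step actually ran

    def get(self, artifact: Path) -> Path:
        # Only run the postprocessing step the first time a path is requested.
        if artifact not in self._results:
            self.runs += 1
            self._results[artifact] = self._postprocess(artifact)
        return self._results[artifact]


def fake_bolt(artifact: Path) -> Path:
    # Hypothetical stand-in: a real step would invoke the BOLT tool here.
    return artifact.with_name(artifact.name + ".bolt")


cache = PostprocessCache(fake_bolt)
first = cache.get(Path("lib/libLLVM.so"))
second = cache.get(Path("lib/libLLVM.so"))  # served from the cache
```

Copying the same artifact into several sysroots would then call `cache.get` repeatedly while the expensive step runs only once.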

r? @jyn514

Kobzol · 2023-02-06T12:33:51Z

@bors try

bors · 2023-02-06T12:34:01Z

⌛ Trying commit e686fc13c0b16035839e374ead4ff9fef0d68cc1 with merge 5866a9a1a511dcca9566a367b0ccdf69e2b0aedf...

jyn514 · 2023-02-06T15:05:48Z

I don't have time for reviews right now.

r? @Mark-Simulacrum cc @nikic

bors · 2023-02-06T15:19:05Z

☀️ Try build successful - checks-actions
Build commit: 5866a9a1a511dcca9566a367b0ccdf69e2b0aedf (5866a9a1a511dcca9566a367b0ccdf69e2b0aedf)

Kobzol · 2023-02-06T15:26:51Z

Hmm, the build worked, but BOLT was executed multiple times and the whole build wasn't very fast. We should probably add the durations of the individual bootstrap steps to the CI timer first.

jyn514 · 2023-02-06T15:52:20Z

> We should probably add information about the individual bootstrap step durations into the CI timer first.

Doesn't that already exist? I remember seeing RUSTC-TIMER log output or something like that.

Kobzol · 2023-02-06T15:54:02Z

Yes, it's printed during the build and stored in metrics.json, but it's not parsed by the Python PGO script yet, so we cannot see the aggregated information in the nice timing table at the end. I'll send a PR soon that will fix that.
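As a rough illustration of what "parsing and aggregating" step durations could look like, here is a minimal sketch. The JSON layout below (`steps`, `name`, `duration_sec`, `children`) is an assumption made up for this example and is not the actual metrics.json schema; `duration_sec` is assumed to exclude time spent in child steps, so summing does not double count.

```python
import json
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical sample data; the real metrics.json format may differ.
SAMPLE = """
{"steps": [
  {"name": "llvm", "duration_sec": 100.0, "children": []},
  {"name": "rustc", "duration_sec": 40.0, "children": [
    {"name": "llvm", "duration_sec": 5.5, "children": []}
  ]}
]}
"""


def aggregate_durations(
    steps: List[dict], totals: Optional[Dict[str, float]] = None
) -> Dict[str, float]:
    """Sum the duration of every step by name, walking nested child steps."""
    if totals is None:
        totals = defaultdict(float)
    for step in steps:
        totals[step["name"]] += step["duration_sec"]
        aggregate_durations(step.get("children", []), totals)
    return dict(totals)


totals = aggregate_durations(json.loads(SAMPLE)["steps"])
for name, seconds in sorted(totals.items()):
    print(f"{name}: {seconds:.1f}s")
```

A table like this, printed at the end of the build, would show which steps (for example repeated BOLT runs) dominate the total time.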

Kobzol · 2023-02-07T11:53:03Z

@bors try

bors · 2023-02-07T11:53:11Z

⌛ Trying commit 250528f2590671e8d865d663c38aab0620a66916 with merge 6b1d08dcfe231d72d8f2310c341771dd724d43fa...

bors · 2023-02-07T14:03:28Z

☀️ Try build successful - checks-actions
Build commit: 6b1d08dcfe231d72d8f2310c341771dd724d43fa (6b1d08dcfe231d72d8f2310c341771dd724d43fa)

Kobzol · 2023-02-07T14:22:19Z

It seems that with this optimization (however it's implemented in the end), we can get to ~2h 5m Linux dist time. I'll first check whether perf has regressed, though.

@rust-timer build 6b1d08dcfe231d72d8f2310c341771dd724d43fa

rust-timer · 2023-02-07T17:14:59Z

Finished benchmarking commit (6b1d08dcfe231d72d8f2310c341771dd724d43fa): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.0%	[0.4%, 1.4%]	3
Regressions ❌ (secondary)	3.2%	[1.7%, 4.3%]	8
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.0%	[0.4%, 1.4%]	3

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-3.1%	[-3.1%, -3.1%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-3.1%	[-3.1%, -3.1%]	1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.1%	[-2.1%, -2.0%]	2
All ❌✅ (primary)	-	-	0

lqd · 2023-02-07T17:23:50Z

these benchmarks are currently noisy

nikic · 2023-02-07T17:43:51Z

Any thoughts on the variant described in #107521 (comment)? I think this approach (and #107521 as well) is going to be something of a dead end, because it leaves behind some problems that can't really be addressed while keeping the general approach of reusing LLVM artifacts from a previous build. Apart from the unnecessary final rustc rebuild (which is at least fairly fast), we also do the previous rustc build with a BOLT-instrumented LLVM, so that build ends up being slow. The other consideration is how this is going to generalize to optimizing rustc itself with BOLT, where sharing artifacts from previous builds would be less straightforward.

Kobzol · 2023-02-07T18:12:28Z

I discussed your idea with @jyn514 (https://rust-lang.zulipchat.com/#narrow/stream/326414-t-infra.2Fbootstrap/topic/bootstrap.20LLVM.20postprocess.20step), and based on that discussion, I implemented a BOLT bootstrap step (I've just pushed it) that applies BOLT changes on the fly when LLVM is copied to the sysroot. But I'm not sure if it's exactly what you had in mind.

> Apart from the unnecessary final rustc rebuild (which is at least fairly fast), we also do the previous rustc build with a BOLT-instrumented LLVM, so that build ends up being slow.

This could be solved simply by only performing the BOLT steps in stage2/dist, right? We want to apply BOLT when we copy LLVM to the stage2 sysroot, but not before.
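The gating suggested here can be sketched as a simple predicate. This is only an illustration under assumptions: `should_apply_bolt` and its parameters are hypothetical names, not bootstrap APIs, and the real decision would be made inside the Rust bootstrap code.

```python
def should_apply_bolt(stage: int, bolt_enabled: bool, target_stage: int = 2) -> bool:
    """Apply BOLT only when LLVM is copied into the target (stage2) sysroot."""
    return bolt_enabled and stage >= target_stage


def describe_sysroot_llvm(stage: int, bolt_enabled: bool) -> str:
    # Earlier stages keep the LLVM artifacts exactly as they were built.
    if should_apply_bolt(stage, bolt_enabled):
        return f"stage{stage}: LLVM with BOLT applied"
    return f"stage{stage}: LLVM as built"


for stage in (0, 1, 2):
    print(describe_sysroot_llvm(stage, bolt_enabled=True))
```

With such a check, the intermediate rustc build would not be slowed down by a BOLT-processed LLVM before stage2.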

> The other consideration is how this is going to generalize to optimizing rustc itself with BOLT, where sharing artifacts from previous builds would be less straightforward.

BOLT for rustc has been quite lackluster so far, so I wouldn't worry about it at the moment; first I'd like to focus on LLVM to reduce the CI time as soon as possible. But I suppose that for rustc it wouldn't be that different: we can apply the BOLT step after the stage2 build finishes, and we'll just need to make sure that we bust the cache before the next build.

Kobzol · 2023-02-07T22:05:39Z

@bors try @rust-timer queue

nikic · 2023-02-16T08:20:25Z

This looks reasonable to me. Thanks for working on it!

Mark-Simulacrum · 2023-02-25 (review)

r=me unless these cleanups seem like improvements worth making

src/bootstrap/dist.rs

Kobzol · 2023-03-04T16:37:07Z

@bors try @rust-timer queue

bors · 2023-03-04T16:37:15Z

⌛ Trying commit 9aad2ad with merge 341579aa2fdf71f726bc9f49d02b0160e27a9edf...

bors · 2023-03-04T18:45:29Z

☀️ Try build successful - checks-actions
Build commit: 341579aa2fdf71f726bc9f49d02b0160e27a9edf (341579aa2fdf71f726bc9f49d02b0160e27a9edf)

rust-timer · 2023-03-04T20:56:13Z

Finished benchmarking commit (341579aa2fdf71f726bc9f49d02b0160e27a9edf): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

This benchmark run did not return any relevant results for this metric.

Kobzol · 2023-03-04T22:05:34Z

Perf looks good. Rebased on master and fixed nits.

@rustbot ready

Mark-Simulacrum · 2023-03-04T22:26:43Z

@bors r+

Thanks!

bors · 2023-03-04T22:26:45Z

📌 Commit 9aad2ad has been approved by Mark-Simulacrum

It is now in the queue for this repository.

bors · 2023-03-05T05:15:53Z

⌛ Testing commit 9aad2ad with merge 35636f9...

bors · 2023-03-05T07:56:17Z

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 35636f9 to master...

rust-timer · 2023-03-05T09:31:40Z

Finished benchmarking commit (35636f9): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.9%	[3.9%, 3.9%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.8%	[0.8%, 0.8%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.8%	[0.8%, 0.8%]	1

rustbot assigned jyn514 Feb 6, 2023

rustbot assigned Mark-Simulacrum and unassigned jyn514 Feb 6, 2023

Kobzol force-pushed the bootstrap-bolt branch from e686fc1 to 250528f on February 7, 2023 11:52

rustbot added the perf-regression label (Performance regression.) Feb 7, 2023

Mark-Simulacrum approved these changes Feb 25, 2023

Mark-Simulacrum added the S-waiting-on-author label (Status: This is awaiting some action (such as code changes or more information) from the author.) Feb 27, 2023

Kobzol added 4 commits March 4, 2023 16:37

Apply BOLT optimizations without rebuilding LLVM

c5d65aa

Create BOLT build steps to avoid running BOLT multiple times on the same file

bfc220a

Try to avoid the last rustc rebuild

91bb563

Add check for dry run

9aad2ad

Kobzol force-pushed the bootstrap-bolt branch from 8cefd27 to 9aad2ad on March 4, 2023 16:36

rustbot added the S-waiting-on-perf label (Status: Waiting on a perf run to be completed.) Mar 4, 2023

rustbot removed the S-waiting-on-perf label (Status: Waiting on a perf run to be completed.) Mar 4, 2023

bors added the merged-by-bors label (This PR was explicitly merged by bors.) Mar 5, 2023

bors merged commit 35636f9 into rust-lang:master Mar 5, 2023

rustbot added this to the 1.70.0 milestone Mar 5, 2023

Kobzol deleted the bootstrap-bolt branch March 5, 2023 08:27
