Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Directly save a byte representation of the dep-graph and work-product index #83322

Closed
wants to merge 6 commits into from

Conversation

cjgillot
Copy link
Contributor

Those files are internal to the incremental engine. They are not meant to be portable.

r? @ghost

@cjgillot
Copy link
Contributor Author

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 20, 2021
@bors
Copy link
Contributor

bors commented Mar 20, 2021

⌛ Trying commit de6aaf996d41dddc68ba0f1e5702c42ecd316236 with merge 6a12039846406dd55e03e6296ebaaab7cd258012...

@bors
Copy link
Contributor

bors commented Mar 20, 2021

☀️ Try build successful - checks-actions
Build commit: 6a12039846406dd55e03e6296ebaaab7cd258012 (6a12039846406dd55e03e6296ebaaab7cd258012)

@rust-timer
Copy link
Collaborator

Queued 6a12039846406dd55e03e6296ebaaab7cd258012 with parent 41b315a, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (6a12039846406dd55e03e6296ebaaab7cd258012): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Mar 20, 2021
@cjgillot
Copy link
Contributor Author

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 20, 2021
@bors
Copy link
Contributor

bors commented Mar 20, 2021

⌛ Trying commit a1ea84fbd3fd67d65505013e7d031cf30c6f5d10 with merge 13d82bed22285425f21da9852b33173df9108b4e...

@bors
Copy link
Contributor

bors commented Mar 20, 2021

☀️ Try build successful - checks-actions
Build commit: 13d82bed22285425f21da9852b33173df9108b4e (13d82bed22285425f21da9852b33173df9108b4e)

@rust-timer
Copy link
Collaborator

Queued 13d82bed22285425f21da9852b33173df9108b4e with parent 61edfd5, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (13d82bed22285425f21da9852b33173df9108b4e): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 21, 2021
@cjgillot cjgillot marked this pull request as ready for review March 21, 2021 11:30
@cjgillot
Copy link
Contributor Author

r? @michaelwoerister

@michaelwoerister
Copy link
Member

Thanks for the PR, @cjgillot!

What's the difference between "raw" and "opaque"? The opaque encoder/decoder are already meant to be the "raw" serialization framework. Is there anything that the "raw" version does that the opaque version could not be made to do?

@cjgillot
Copy link
Contributor Author

The opaque serialization uses LEB128 encoding, so as to have an architecture independent file. The raw encoding dumps bytes as they are.

@michaelwoerister
Copy link
Member

Ah, that makes sense. How does that affect the size of the dep-graph file?

@cjgillot
Copy link
Contributor Author

It's difficult to say. LEB128 is variable-length, so encoding small numbers takes less room. In the best case, the LEB128 reduces size by 75%, but in the worst case LEB128 overhead is 15%.
I expect the dep-graph file to get slightly larger with this change. I shall run some benchmarks to get a relevant size estimate.


macro_rules! write_raw {
($enc:expr, $value:expr, $int_ty:ty) => {{
let bytes = $value.to_ne_bytes();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please use to_le_bytes instead? This is just as fast on little endian systems, but for example makes it possible to move the incr cache to a big endian system and use then --target.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an actual use case? This PR makes the file format unportable (because of isize/usize mainly), with the objective to memmap part of it. If portability is required, I need to change the implementation to make sure of it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I can't name any. I do remember some talk about maybe changing crate metadata to be an incr comp snapshot or something like that in the future. I can't remember the details. In that case portability is very important for cross-compilation.

@michaelwoerister
Copy link
Member

@rust-lang/wg-compiler-performance How far are we away from collecting file sizes via self-profiling and perf.rlo? It should be quite easy to add the recording part to measureme.

@rylev
Copy link
Member

rylev commented Mar 26, 2021

How far are we away from collecting file sizes via self-profiling and perf.rlo? It should be quite easy to add the recording part to measureme.

No one is currently working on this as far as I know. However, I'm also not aware of any reason why this would not be possible to add.

@bors
Copy link
Contributor

bors commented Nov 20, 2021

⌛ Trying commit bf32e3f8ff612fba7ad7e860a661575a1d91b293 with merge c16d991f9b3d8ee4b2b0f31533933924006b013e...

@bors
Copy link
Contributor

bors commented Nov 20, 2021

☀️ Try build successful - checks-actions
Build commit: c16d991f9b3d8ee4b2b0f31533933924006b013e (c16d991f9b3d8ee4b2b0f31533933924006b013e)

@rust-timer
Copy link
Collaborator

Queued c16d991f9b3d8ee4b2b0f31533933924006b013e with parent 93542a8, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (c16d991f9b3d8ee4b2b0f31533933924006b013e): comparison url.

Summary: This change led to very large relevant mixed results 🤷 in compiler performance.

  • Very large improvement in instruction counts (up to -5.1% on incr-unchanged builds of stm32f4)
  • Moderate regression in instruction counts (up to 0.6% on full builds of diesel)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

@rustbot rustbot added perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Nov 20, 2021
@cjgillot
Copy link
Contributor Author

As expected, there is a ~30% hit on file sizes. This is sizeable. What should be the decision process?
Pros:

  • reduction of instruction count;
  • this is a prerequisite for memmapping those files.

Cons:

  • larger disk usage;
  • slower load time from disk.

We can also wait for a preliminary implementation of memmap dep-graph to benchmark it.

@Mark-Simulacrum
Copy link
Member

A few questions:

Do we have a sense of whether the disk usage is caused by some types/queries in particular? I'm primarily wondering if we can apply some optimizations to shrink the file size impact, since it seems somewhat non-obvious that the impact is strictly necessary to get the mmap-ability.

Do we know what the instruction count improvements are primarily coming from? Can we get those benefits without the disk space increase, or at least, at smaller cost? (e.g., is it due to avoiding variable-sized integer encoding/decoding?)

Also, is the impact limited just to incremental artifacts (I presume so?) -- if so, then the 30-40% impact is likely to be less for smaller workspaces, since their target directory size is probably dominated by compiled dependencies, not the leaf crate's incremental data. For rustc and other large workspaces (where incremental is applied to many crates), though, a 30-40% increase is going to be rather painful -- I'm worried that it may make it harder for folks to use incremental.

@michaelwoerister
Copy link
Member

Do we have a sense of whether the disk usage is caused by some types/queries in particular?

If I'm reading the code correctly this is only about the DepGraph in particular. So what we encode here are mostly node indices, fingerprints, and DepNodes (which internally are a pair of discriminant and fingerprint).

slower load time from disk.

Is this observable somewhere?

Some ideas on reducing the file size:

  • Do we know how DepNode discriminants are encoded? Two bytes should certainly be enough, but worst case they might be encoded as eight bytes.
  • The performance data seems to suggest that we encode about 3.2 million DepNodes for the largest crates (see incr_comp_encode_dep_graph execution count of incr-full build). We might get away with encoding DepNode indices with just 3 bytes, allowing for up to ~16 million nodes.

Regarding the decision process: I don't know :) My thoughts:

  • The instruction count reduction is nice. But it is in a range where it's not entirely clear if it is worth the increased disk space usage.
  • The increased disk space usage is significant. But it isn't so big that it's completely out of the question that we accept it (IMO).
  • The instruction count reduction doesn't really translate to wall-time improvements because we already load the graph in the background.

Having a memmapped implementation that doesn't require any up-front decoding might indeed be helpful for making a decision here.

@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 12, 2021
@michaelwoerister michaelwoerister added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 16, 2021
@bors
Copy link
Contributor

bors commented Feb 20, 2022

☔ The latest upstream changes (presumably #94174) made this pull request unmergeable. Please resolve the merge conflicts.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer
Copy link
Collaborator

The job mingw-check failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
    Checking rustc_serialize v0.0.0 (/checkout/compiler/rustc_serialize)
    Checking petgraph v0.5.1
    Checking object v0.28.1
    Checking gimli v0.26.1
error: expected `;`, found keyword `self`
   --> compiler/rustc_serialize/src/raw.rs:468:42
    |
468 |         self.emit_raw_bytes(v.as_bytes())
    |                                          ^ help: add `;` here
469 |         self.emit_u8(STR_SENTINEL)
    |         ---- unexpected token
    Checking rustc_macros v0.1.0 (/checkout/compiler/rustc_macros)
error[E0308]: mismatched types
   --> compiler/rustc_serialize/src/raw.rs:468:9
    |
    |
468 |         self.emit_raw_bytes(v.as_bytes())
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^- help: consider using a semicolon here
    |         expected `()`, found enum `Result`
    |
    = note: expected unit type `()`
                    found enum `Result<(), std::io::Error>`

@bors
Copy link
Contributor

bors commented May 20, 2022

☔ The latest upstream changes (presumably #95418) made this pull request unmergeable. Please resolve the merge conflicts.

@cjgillot cjgillot closed this Jul 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression Performance regression. S-blocked Status: Blocked on something else such as an RFC or other implementation work. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.