Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compiler performance when compiling built-in derives is worse than desired #80118

Open
rylev opened this issue Dec 17, 2020 · 0 comments
Open
Labels
A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@rylev
Copy link
Member

rylev commented Dec 17, 2020

This is a follow-up issue to #80050 which was closed due to the implementation not being 100% correct.

Currently, adding #[derive] annotations for built-in traits such as a PartialOrd or Debug causes slower compile times than if the user implements these traits by hand. We should aim for getting these derives to be as close to "free" as possible when it comes to amount of time it takes to compile the code.

The Current State

We currently track the performance of derives in the rustc-perf "derives" benchmark.

PartialOrd is by far the most expensive to derive of the traits in std. The derive benchmark with #[derive(PartialEq, PartialOrd)] takes 9.1s on my machine while just #[derive(PartialEq)] takes 1.9s.

The following is a short experiment where we create a struct with one field 10,000 times and benchmark what happens when the structs have various derives on them. The code is generated using the following ruby script:

File.open('src/lib.rs', 'w') do |file|
  0..10_000.times do |n|
      file.write("pub struct MyType#{n} { pub field: i32 }\n")
  end
end
  • Base (i.e., no derives): 061.s
  • #[derive(Debug)]: 13.33s
  • #[derive(PartialEq)]: 9.78s
  • #[derive(PartialEq, PartialOrd)]: 47.46s
  • #[derive(Clone)]: 6.32s
  • #[derive(Clone, Copy)]: 5.32s
  • #[derive(PartialEq, Eq)]: 14.13s
  • #[derive(Default)]: 4.33s
  • #[derive(Hash)]: 6.52s

It should be noted that it does not appear that any of the overhead is coming from the code expansion itself. When compiling the traits with the code copy/pasted from output from cargo expand, the performance is comparable to deriving the traits.

However, there's definitely some wiggle room beyond "just making compilation in general faster, will make derives faster". I manually implemented Debug and ran the test again, and it compiles 10% faster.

It seems like the plurality of time (~13%) for most of these derives is being spent in type checking though a deeper investigation into each class of derive is warranted. You can see a summarize comparison of derive vs. expanded vs manually implemented here.

Discussion

This topic is already being discussed on Zulip.

Some discussion has suggested we might want to look into MIR shims as a possible solution but since MIR still needs to undergo type checking and we're spending most of the time in type checking, this probably won't help.

@jyn514 jyn514 added A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 17, 2020
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 9, 2021
Fix derived PartialOrd operators

The derived implementation of `partial_cmp` compares matching fields one
by one, stopping the computation when the result of a comparison is not
equal to `Some(Equal)`.

On the other hand the derived implementation for `lt`, `le`, `gt` and
`ge` continues the computation when the result of a field comparison is
`None`, consequently those operators are not transitive and inconsistent
with `partial_cmp`.

Fix the inconsistency by using the default implementation that fall-backs
to the `partial_cmp`. This also avoids creating very deeply nested
closures that were quite costly to compile.

Fixes rust-lang#81373.
Helps with rust-lang#81278, rust-lang#80118.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

2 participants