Nonlinear compile time blow-up with deeply nested types #38528
Comments
This may be the effect of the projection cache.
cc @alexcrichton This is obviously a show-stopper for futures.
I can do some profiling. I've had some plans for improving collection/trans that I think may be related. One question to try and answer is what percentage of this is just "we are making more code" vs "we are wasting time doing things in trait selection that could be cached". I have observed the latter from time to time and had some thoughts on how to fix it.
I've also seen this before, with tokio-socks5 as well. Removing just a handful of the trait objects in that file makes the compile time of the crate shoot from 2.34s to 89.52s (!!)
Here's a simple example of something that takes a very long time to compile: future::ok::<(),()>(()).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())).and_then(|()| Ok(())); |
Closed #40280 as a duplicate, moved some example code into the issue header.
Just a stupid question ‒ the „translation item collection“, which is one of the two culprits here, is listed under „MIR optimisations“. Does it really have to happen at all in a non-optimised debug build?
It's natural to read it that way, but the headers actually come after their group, so it's part of
This seems hard enough that fixing it will take some time, so I thought of a workaround, in case someone else is interested. It uses the trick of placing trait objects to split the chains of modifiers, but only in testing/debug builds, where compilation speed matters, while keeping the complex but hopefully faster concrete types in release builds. This one is for streams (that's what I needed), but it can obviously work for futures or other things as well:
// Debug/test builds: hide the concrete stream type behind a trait object.
#[cfg(debug_assertions)]
fn test_boxed<T, E, S>(s: S) -> Box<Stream<Item = T, Error = E>>
    where S: Stream<Item = T, Error = E> + 'static
{
    Box::new(s)
}
// Release builds: return the concrete type unchanged.
#[cfg(not(debug_assertions))]
fn test_boxed<S>(s: S) -> S {
    s
}
(I didn't find any better config option than the
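The same debug-only boxing idea can be sketched with nothing but the standard library. The version below is an assumption-laden analogue, not the code from the comment above: it swaps `Stream` for `Iterator` so that it builds without the futures crate, and the helper `doubled_evens` is a hypothetical caller added only to show that call sites look identical in both build modes.

```rust
// Debug builds: erase the concrete iterator type behind a trait object,
// so later adapters nest around a short `Box<dyn Iterator>` type.
#[cfg(debug_assertions)]
fn test_boxed<'a, I>(it: I) -> Box<dyn Iterator<Item = I::Item> + 'a>
where
    I: Iterator + 'a,
{
    Box::new(it)
}

// Release builds: pass the concrete type through unchanged.
#[cfg(not(debug_assertions))]
fn test_boxed<I: Iterator>(it: I) -> I {
    it
}

// Hypothetical caller: the chain is split at `test_boxed`, and the code
// compiles the same way whichever variant of `test_boxed` is active.
fn doubled_evens(limit: u32) -> Vec<u32> {
    let evens = (0..limit).filter(|n| n % 2 == 0);
    test_boxed(evens).map(|n| n * 2).collect()
}

fn main() {
    println!("{:?}", doubled_evens(10));
}
```

Because both variants return something that implements `Iterator`, the rest of the chain does not need its own `cfg` attributes; only the helper differs between build modes.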
Compilation of tests is *really* slow, most likely due to rust-lang/rust#41696 or rust-lang/rust#38528
Repeating another bad case from #42941:
extern crate futures;
use futures::{future, IntoFuture, Future};
fn main() {
    let t: std::result::Result<(), ()> = Ok(());
    let f = t
        .into_future()
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()))
        .and_then(|_| future::ok(()));
    f.wait();
}
The code above takes ~750s to compile on my laptop (you can make it shorter/longer by removing/adding
The extra ~250s is spent between
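For anyone who wants to time the test case at different chain depths, a small generator saves editing the chain by hand. This is only a sketch: `gen_repro` is a hypothetical helper name, the default depth of 18 simply matches the number of `and_then` calls in the example above, and the emitted program still needs the same futures crate version as that example in order to build.

```rust
/// Builds the source text of the test program with `depth` chained
/// `and_then` calls (hypothetical helper, not part of any crate).
fn gen_repro(depth: usize) -> String {
    let mut src = String::new();
    src.push_str("extern crate futures;\n");
    src.push_str("use futures::{future, IntoFuture, Future};\n");
    src.push_str("fn main() {\n");
    src.push_str("    let t: std::result::Result<(), ()> = Ok(());\n");
    src.push_str("    let f = t\n        .into_future()");
    for _ in 0..depth {
        src.push_str("\n        .and_then(|_| future::ok(()))");
    }
    src.push_str(";\n    f.wait();\n}\n");
    src
}

fn main() {
    // Depth comes from the first command-line argument, defaulting to 18.
    let depth = std::env::args()
        .nth(1)
        .and_then(|s| s.parse().ok())
        .unwrap_or(18);
    print!("{}", gen_repro(depth));
}
```

Redirecting the output into a source file for each depth and timing the builds gives one data point per depth, which makes it easier to see whether growth looks polynomial or exponential.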
Another observation from #42941 about why this is becoming even more important is that the newly-released hyper |
triage: P-high We are raising to "high priority" to at least do some investigation and try to determine whether revised trait solving strategies will be of use here. |
This commit should be reverted once that issue has been resolved.
status: waiting for niko |
Bump. This really hurts |
@arielb1 You seem to have created this FIXME. Can you give me some hints on implementing cross-infcx cache? |
I have launched a PoC in #48296 and can confirm that the exponential part should be resolved. I still observe some polynomial-time behavior, which makes rustc choke at about 50
@hcpl I suspect you copied the wrong code for the specialized version above, as I observe the same time-passes behavior and the code lacks Can you check again, and if possible, provide the correct specialized version code? |
Update: although I have fixed the normalization part, typeck still seems to be exponential in time and very memory-hungry. I'm investigating the cause of this second issue.
I have implemented a fix for the typeck part too, and I believe there are no more exponential algorithms in rustc.
@ishitatsuyuki the code is correct, but both I've changed filenames to |
Fix exponential projection complexity on nested types
This implements solution 1 from #38528 (comment). The code quality is currently extremely poor, but we can improve it during review.
Blocking issues:
- we probably don't want a quadratic deduplication for obligations.
- is there an alternative to deduplication?
Based on #48315. Needs changelog. Noticeable improvement on compile time is expected.
Fix #38528
Close #39684
Close #43757
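To make the "quadratic deduplication" concern concrete, here is a minimal sketch contrasting a pairwise-comparison dedup with a hash-based one. It is not the PR's code and does not use rustc's obligation types: plain strings stand in for obligations, and both function names are made up for illustration.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Quadratic: every incoming item is compared against everything kept so far.
fn dedup_quadratic<T: PartialEq>(items: Vec<T>) -> Vec<T> {
    let mut kept: Vec<T> = Vec::new();
    for item in items {
        if !kept.contains(&item) {
            kept.push(item);
        }
    }
    kept
}

/// Expected-linear: a hash set remembers what has already been seen,
/// while `kept` preserves first-occurrence order.
fn dedup_hashed<T: Eq + Hash + Clone>(items: Vec<T>) -> Vec<T> {
    let mut seen = HashSet::new();
    let mut kept = Vec::new();
    for item in items {
        // `insert` returns false if the value was already present.
        if seen.insert(item.clone()) {
            kept.push(item);
        }
    }
    kept
}

fn main() {
    // Stand-in "obligations"; real ones are structured values, not strings.
    let obligations = vec!["A: Future", "B: Future", "A: Future", "C: Sized", "B: Future"];
    println!("{:?}", dedup_quadratic(obligations.clone()));
    println!("{:?}", dedup_hashed(obligations));
}
```

The hashed variant trades extra memory and a `Clone` bound for avoiding the pairwise scan; whether that trade is right for the compiler is exactly the open question raised in the blocking issues.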
Consider changing assert! to debug_assert! when it calls visit_with
The perf run from rust-lang#52956 revealed that there were 3 benchmarks that benefited most from changing `assert!`s to `debug_assert!`s:
- issue rust-lang#46449: avg -4.7% for -check
- deeply-nested (AKA rust-lang#38528): avg -3.4% for -check
- regression rust-lang#31157: avg -3.2% for -check
I analyzed their fixing PRs and decided to look for potentially heavy assertions in the files they modified. I noticed that all of the non-trivial ones contained indirect calls to `visit_with()`. It might be a good idea to consider changing `assert!` to `debug_assert!` in those places in order to get the performance wins shown by the benchmarks.
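The difference between the two macros can be sketched outside the compiler. In the example below the invariant check `sorted_ascending` is a hypothetical stand-in for an expensive assertion such as one that walks a type via `visit_with()`; none of these functions come from rustc.

```rust
/// Hypothetical expensive invariant check: walks the whole slice.
fn sorted_ascending(values: &[u32]) -> bool {
    values.windows(2).all(|w| w[0] <= w[1])
}

/// `assert!` runs the check in every build, including optimized ones.
fn largest_checked(values: &[u32]) -> Option<u32> {
    assert!(sorted_ascending(values), "input must be sorted");
    values.last().copied()
}

/// `debug_assert!` runs the check only when debug assertions are enabled,
/// which by default means non-optimized builds.
fn largest_debug_checked(values: &[u32]) -> Option<u32> {
    debug_assert!(sorted_ascending(values), "input must be sorted");
    values.last().copied()
}

fn main() {
    let values = [1, 3, 5, 8];
    println!("{:?}", largest_checked(&values));
    println!("{:?}", largest_debug_checked(&values));
}
```

The trade-off is that a violated invariant goes unnoticed in builds without debug assertions, so the switch only makes sense where the check guards against internal bugs rather than bad input.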
remove obligation dedup from `impl_or_trait_obligations` Looking at the examples from rust-lang#38528 they all seem to compile fine even without this and it seems like this might be unnecessary effort
- we need buffers in the deserialization methods, so I went ahead and added them to the serialization methods too
- `boxed` in the `and_then` chain apparently makes both runtime performance and compile time usable, see rust-lang/rust#38528
Compiling the postgres-tokio crate at sfackler/rust-postgres@d27518b goes from 5 seconds to 45 seconds on nightly if the two .boxed() calls in the middle of this call chain are removed: https://github.com/sfackler/rust-postgres/blob/d27518ba76d76ccaa59b3ccd63e981bd8bd0ef33/postgres-tokio/src/lib.rs#L342-L408.
Looks like 15 seconds is spent in translation item collection, and 39 seconds is spent in translation:
Things are significantly worse on 1.13 - 2 minutes in translation!
Some discussion in IRC: https://botbot.me/mozilla/rust-internals/2016-12-22/?msg=78294648&page=1
cc @aturon
UPDATE: #40280 was closed as a duplicate of this. It had the following sample code:
--nmatsakis