Missing caching for HRTB projection equality bounds (for<'x> T: Trait<'x, Assoc = ...>).
#99188
Comments
This is weird. I think this is one of those cases where I wish we had better logging tools for capturing how traits are solved. What are the projection predicates that we try to cache? Probably add a debug call to I wonder if this is caused by the Then, the caching might not be in projection predicates, but instead around proving
I've noticed a lot of tracing usage in When I started looking into this, I adjusted the |
Well, I think we're getting better at adding in I have wanted to try out https://pernos.co/ for myself. Niko has tried it and liked it. Haven't convinced myself it's worth the expense right now. Anyways, if you'd like to find some time to chat in sync about this, maybe that could work. It's hard to know exactly where to start probing without a |
I don't really know for sure when I'll be available for a sync chat in the next couple of weeks, but I did just post a full write-up about looking for that bug, which includes some tracing work, in case that's helpful: https://fasterthanli.me/articles/when-rustc-explodes. Happy to share any of the materials I've collected there anywhere that's helpful.
Just quickly skimmed to the RUSTC_LOG section. Those |
I was looking into why we get all that messy output, and it turns out that So really, the problem is all in rust/compiler/rustc_infer/src/traits/structural_impls.rs Lines 18 to 30 in c80dde4
An easy thing to try is to just change Also, there have to be better ways to print something than |
I would expect those to be created in the leaves, by the uncached calls. If you look further down, there's a list of
My rough understanding is that if However, what that list is missing is the exact nesting of Frankly, we should be able to just use a text-based nesting, and just read off the log without separate tools, once the printing is fixed, none of this has any reason to be as noisy as it is today. |
Related to #20304?
I applied this suggestion and this made the I generated outputs for levels 1 through 4 with commands like:
The depth 4 output is 25MB so it wouldn't upload on GitHub Gist. |
Okay this is a bit confusing, but AFAICT the problem only starts to show up at depth 3. And "depth 2" is still pretty limited in that it contains
Depth 3 investigation (cleaned up output for depth 3)
$ curl -s https://gist.githubusercontent.com/fasterthanlime/c3f5ab67fd6b12d198ef2dd841c31115/raw/756847298cd40474e6df1c8440805c8e884a0995/depth3.txt | rg 'evaluate_predicate_recursively' | sed -E 's/([0-9].* DEBUG )?rustc_\S*::\S* //;s/obligation=Obligation\(//;s/, depth=.*//;s/[┐│├─]/ /g;s/^ //'
(): Sized
(): Sized
(): Sized
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
for<'x> (): Trait<'x>
(): Termination
for<'x> <&&() as Trait<'x>>::A == ()
for<'x> <&() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::D == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
&(): Sized
for<'x> &(): Trait<'x>
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&&() as Trait<'x>>::B == ()
for<'x> <&() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::D == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
&(): Sized
for<'x> &(): Trait<'x>
for<'x> <&&() as Trait<'x>>::C == ()
for<'x> <&() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::D == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
&(): Sized
for<'x> &(): Trait<'x>
for<'x> <&&() as Trait<'x>>::D == ()
for<'x> <&() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::D == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
&(): Sized
for<'x> &(): Trait<'x>
&&(): Sized
for<'x> &&(): Trait<'x>
for<'x> <&() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::D == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
&(): Sized
for<'x> &(): Trait<'x>
for<'x> <&() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
for<'x> <&() as Trait<'x>>::D == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::C == ()
for<'x> <() as Trait<'x>>::D == ()
(): Sized
for<'x> (): Trait<'x>
&(): Sized
for<'x> &(): Trait<'x>
We can further simplify that output by removing non-projection bounds (and their children), and
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <&&() as Trait<'x>>::A == ()
for<'x> <&() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <&() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <&&() as Trait<'x>>::B == ()
for<'x> <&() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <&() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <&() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
for<'x> <&() as Trait<'x>>::B == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::B == ()
Now that's still very messy, but mostly because of the syntax, and we can strip most of it out:
<()>::{A,B}
<&&()>::A
<&()>::A
<()>::{A,B}
<&()>::B
<()>::{A,B}
<&&()>::B
<&()>::A
<()>::{A,B}
<&()>::B
<()>::{A,B}
<&()>::A
<()>::{A,B}
<&()>::B
<()>::{A,B}
By removing all the leaves, we can find the "uncached" set (in the sense of
<&&()>::A
<&()>::A
<&()>::B
<&&()>::B
<&()>::A
<&()>::B
<&()>::A
<&()>::B
And that's the heart of the problem: there are 3 instances of each of
Now this is all very messy and frankly hard to keep track of in the full log. What about 1 associated type?
for<'x> <() as Trait<'x>>::A == ()
for<'x> <&&() as Trait<'x>>::A == ()
for<'x> <&() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::A == ()
for<'x> <&() as Trait<'x>>::A == ()
for<'x> <() as Trait<'x>>::A == ()
Now hang on a minute, that's triangular! (i.e. quadratic)
(As of posting this I'm still trying to demo that the single associated type case is non-linear, but it's tricky)
Single associated type example:
#![recursion_limit = "100000"]
trait Trait<'a> {
type Assoc;
fn method() {}
}
impl<T> Trait<'_> for (T,)
where
for<'x> T: Trait<'x, Assoc = ()>,
{
type Assoc = ();
}
impl Trait<'_> for () {
type Assoc = ();
}
type _4<T> = ((((T,),),),);
type _16<T> = _4<_4<_4<_4<T>>>>;
type _64<T> = _16<_16<_16<_16<T>>>>;
type _128<T> = _64<_64<T>>;
pub fn main() {
// Something that takes long enough to be measurable.
type X<T> = _128<T>;
#[cfg(factor = "1")]
<X<()> as Trait>::method();
#[cfg(factor = "2")]
<X<X<()>> as Trait>::method();
#[cfg(factor = "3")]
<X<X<X<()>>> as Trait>::method();
#[cfg(factor = "4")]
<X<X<X<X<()>>>> as Trait>::method();
}
$ curl -O https://gist.githubusercontent.com/eddyb/d17d5303c544c19f78569392b635c813/raw/0aa1c947c91450f3d2e227b9d58d9ea19fe6ebf3/proj-tri.rs
$ command time -f 'took %Us' rustc proj-tri.rs --emit=metadata --cfg 'factor = "1"'
took 0.24s
$ command time -f 'took %Us' rustc proj-tri.rs --emit=metadata --cfg 'factor = "2"'
took 1.02s
$ command time -f 'took %Us' rustc proj-tri.rs --emit=metadata --cfg 'factor = "3"'
took 2.83s
$ command time -f 'took %Us' rustc proj-tri.rs --emit=metadata --cfg 'factor = "4"'
took 6.48s
So not only is it at least quadratic, but other effects accumulate and it could plausibly reach cubic (though it may take minutes/hours) and higher powers (not sure if this makes it "super-polynomial" - hard to know without figuring out the asymptote, i.e. do the powers converge?). (Out of curiosity I also tried |
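To make the "at least quadratic" reading of those timings concrete, here is a minimal sketch that derives an apparent exponent from the four measurements above. The timings are copied from the `command time` output; the helper name `apparent_exponent` and the power-law model t ≈ c · factor^p are assumptions made for illustration, not anything rustc reports.

```rust
// (factor, user seconds) pairs copied from the `command time` runs above.
const TIMINGS: [(f64, f64); 4] = [(1.0, 0.24), (2.0, 1.02), (3.0, 2.83), (4.0, 6.48)];

/// Apparent exponent `p` under the assumed model t = c * factor^p,
/// estimated from two measurements: p = ln(t2 / t1) / ln(f2 / f1).
fn apparent_exponent(base: (f64, f64), other: (f64, f64)) -> f64 {
    (other.1 / base.1).ln() / (other.0 / base.0).ln()
}

fn main() {
    let base = TIMINGS[0];
    for &point in &TIMINGS[1..] {
        println!(
            "factor {}: apparent exponent {:.2}",
            point.0,
            apparent_exponent(base, point)
        );
    }
}
```

Under this model every estimate comes out above 2 and the estimates grow with the factor, which is consistent with "at least quadratic, with other effects accumulating". Four data points are too few to say where the exponent settles.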
Okay I kept missing it but I was using some of the gnarly printing as an example for why we should replace So That's why the hack to remove the restriction does nothing: it's not getting cached in the polymorphic form! I wonder if this is a fundamental limitation of normalizing under binders (cc @jackh726), or if we can actually have EDIT: so the code I had been looking at before was this: rust/compiler/rustc_trait_selection/src/traits/select/mod.rs Lines 647 to 655 in 1c7b36d
But that does nothing (other than print an rust/compiler/rustc_infer/src/traits/project.rs Lines 196 to 217 in 052495d
So I made a few changes (branch is at eddyb@1ac2dca / eddyb@218932c / eddyb@63f5e68 at the time of posting), and then used For viewing it I recommend further filtering, e.g.: Locally with You can see how Note that because we currently hide |
Pushed another commit (eddyb@20e2693 - note that if you want to cherry-pick you'll also need eddyb@63f5e68), with a hacky fix attempt (also cache under the original key if the result has no placeholders anywhere).
before:
$ command time -f 'took %Us' rustc +A proj-exp.rs --emit=metadata --cfg 'depth = "8"'
took 1.72s
$ command time -f 'took %Us' rustc +A proj-exp.rs --emit=metadata --cfg 'depth = "9"'
took 7.32s
after:
$ command time -f 'took %Us' rustc +A proj-exp.rs --emit=metadata --cfg 'depth = "8"'
took 0.05s
$ command time -f 'took %Us' rustc +A proj-exp.rs --emit=metadata --cfg 'depth = "9"'
took 0.05s
So finally we have a confirmed hypothesis! (though no idea if we can actually land such a fix)
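The before/after gap can be illustrated with a toy evaluator. This is only a sketch of why a cache hit matters here, not the change in the commits above: the `Evaluator` type, its `(depth, assoc)` cache key, and the rule that each layer re-checks every associated type of the layer below are all assumptions made for illustration, and nothing in it models binders or placeholders.

```rust
use std::collections::HashMap;

/// Toy stand-in for a predicate evaluator (hypothetical, not rustc code).
struct Evaluator {
    assoc_types: u32,
    /// `None` disables caching; the key is (type depth, associated type index).
    cache: Option<HashMap<(u32, u32), bool>>,
    evaluations: u64,
}

impl Evaluator {
    fn evaluate(&mut self, depth: u32, assoc: u32) -> bool {
        if let Some(cache) = &self.cache {
            if let Some(&result) = cache.get(&(depth, assoc)) {
                return result; // cache hit: no work repeated
            }
        }
        self.evaluations += 1;
        let mut ok = true;
        if depth > 0 {
            // Assumed rule: a bound on one layer needs every bound on the layer below.
            for a in 0..self.assoc_types {
                ok &= self.evaluate(depth - 1, a);
            }
        }
        if let Some(cache) = &mut self.cache {
            cache.insert((depth, assoc), ok);
        }
        ok
    }
}

/// Number of uncached evaluations needed for one bound at the given depth.
fn count_evaluations(assoc_types: u32, depth: u32, use_cache: bool) -> u64 {
    let cache = if use_cache { Some(HashMap::new()) } else { None };
    let mut evaluator = Evaluator { assoc_types, cache, evaluations: 0 };
    evaluator.evaluate(depth, 0);
    evaluator.evaluations
}

fn main() {
    for depth in [8, 9] {
        println!(
            "depth {depth}: {} evaluations without cache, {} with cache",
            count_evaluations(4, depth, false),
            count_evaluations(4, depth, true)
        );
    }
}
```

In this toy model with 4 associated types, going from depth 8 to depth 9 multiplies the uncached work by about 4, while the cached count grows by only 4 evaluations. That is the same shape as the measured 1.72s to 7.32s jump before the fix and the flat 0.05s after it.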
I tried this on the real-world codebase where this bug was originally discovered.
cargo check
Without the fix (nightly 2022-06-08):
With @eddyb's "hacky fix":
Note: the post-workaround version of that code (just don't have tower services that borrow anything) typechecks in 1.28s.
cargo build
Without the fix (nightly 2022-06-08):
With @eddyb's "hacky fix":
The post-workaround version of the code builds in 4.1s. Maybe other unfortunate cache interactions during codegen?
Removing So it's not just codegen in general, but something else in your crate that is somehow codegen-only. (This may be the first time that e.g. Oh and one last thing: unless you build a "release"
Could this be related to #95402? I worked a while back on minimizing it, but I wasn't able to interpret the tracing logs. To repeat my example code:
pub trait Foo: Circular {}
pub trait Circular {
type Unit;
}
impl<'a> Foo for &'a () {}
impl<'a> Circular for &'a ()
where
&'a (): Circular<Unit = ()>,
{
type Unit = ();
}
@LegionMammal978 I don't think so, you're not using |
Originally reduced from a production application using the tower crate - see @fasterthanlime's reduced repro repo for more background (though note that it exhibits several other failure modes as well).
The above example currently takes an exponential amount of time to compile, based on the type depth:
With every extra type layer, the time increases by ~4x, and that aligns well with there being 4 associated types.
While this example is a bit silly, it doesn't take more than two associated types (both constrained at once) to cause the issue (although at higher depth or with a larger constant factor from having additional bounds).
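The ~4x-per-layer observation matches a simple recurrence. The sketch below assumes a simplified model in which evaluating one projection bound on a layer requires re-evaluating every associated type bound on the layer below, with nothing cached; the function names are hypothetical and the model ignores the extra trait-predicate work visible in the logs.

```rust
/// Evaluations needed for one projection bound at `depth` when nothing is cached,
/// under the assumed recurrence E(0) = 1, E(d) = 1 + m * E(d - 1).
fn uncached_evals(assoc_types: u64, depth: u32) -> u64 {
    if depth == 0 {
        1
    } else {
        1 + assoc_types * uncached_evals(assoc_types, depth - 1)
    }
}

/// With a perfect cache, each (layer, associated type) pair below the top is evaluated once.
fn cached_evals(assoc_types: u64, depth: u32) -> u64 {
    1 + assoc_types * depth as u64
}

fn main() {
    for m in [1u64, 2, 4] {
        for depth in 0..=5u32 {
            println!(
                "assoc types {m}, depth {depth}: uncached {}, cached {}",
                uncached_evals(m, depth),
                cached_evals(m, depth)
            );
        }
    }
}
```

In this model two associated types already double the work per layer and four quadruple it, while a single associated type stays linear. The triangular behaviour seen for the single associated type case in the comments therefore comes from effects this sketch does not capture.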
And I'm guessing it might be one of the main causes for applications built with the tower crate to experience long compile times (its main trait, Service, has Response and Error as two associated types), in large part because that's the kind of codebase this was originally reduced from (as per the note at the top of this issue).
The reason I suspect caching is a combination of factors:
evaluate_predicate_recursively found an exponential ramp in terms of duplicates (i.e. the number of times each unique obligation shows up), and many of them were ProjectionPredicates
-Z self-profile, though RUSTC_LOG might also work (and not require compiler changes)
ProjectionPredicates' had a bound lifetime listed in their Binder
ProjectionCacheKey returns None iff there are bound variables
ProjectionCacheKey not holding a ParamEnv is probably risky long-term)
for<'x> T: Trait<'x, ... with T: Trait<'static, ... removes the exponential slowdown
So the next step was to try always caching (warning: this is actually an unsound quick hack, ProjectionCacheKey should be modified to carry a Binder<ProjectionTy> instead of a ProjectionTy):
However, the above hack does not appear to help and we're unsure why - the main other kind of Predicate that evaluate_predicate_recursively appears to process is TraitPredicate, which IIUC is almost always locally cached?
(EDIT: turns out that the caching is more complex than initially assumed and requires further changes - see #99188 (comment) for a success)
Then again, there are other reductions that don't involve ProjectionPredicate and still cause their own exponential curves, so there might be several different caching issues.
I primarily wanted to open this issue on its own because there's definitely something weird with ProjectionCacheKey (even if a proper fix might be way more involved than just it), and there's a clear correlation between the number of associated types (being constrained) and the exponential curve.
cc @rust-lang/types