Typechecking memory use #31849

Closed
asajeffrey opened this issue Feb 23, 2016 · 17 comments
Labels
A-type-system Area: Type system

Comments

@asajeffrey

Typechecking the Rust library at https://github.com/asajeffrey/wasm/tree/typechecking-blowup uses more than 10G of memory. The program uses a lot of nested generic traits and lifetime polymorphism. The timing details are:

$ multirust run nightly cargo rustc -- -Z time-passes
   Compiling wasm v0.0.1 (file:///home/ajeffrey/github/asajeffrey/wasm)
time: 0.001; rss: 49MB  parsing
time: 0.000; rss: 49MB  configuration 1
time: 0.000; rss: 49MB  recursion limit
time: 0.000; rss: 49MB  gated macro checking
time: 0.000; rss: 49MB  crate injection
time: 0.003; rss: 54MB  macro loading
time: 0.000; rss: 54MB  plugin loading
time: 0.000; rss: 54MB  plugin registration
time: 0.025; rss: 56MB  expansion
time: 0.001; rss: 56MB  complete gated feature checking 1
time: 0.007; rss: 56MB  configuration 2
time: 0.000; rss: 56MB  gated configuration checking
time: 0.003; rss: 56MB  maybe building test harness
time: 0.003; rss: 56MB  prelude injection
time: 0.000; rss: 56MB  checking that all macro invocations are gone
time: 0.000; rss: 56MB  checking for inline asm in case the target doesn't support it
time: 0.000; rss: 56MB  complete gated feature checking 2
time: 0.000; rss: 56MB  const fn bodies and arguments
time: 0.002; rss: 56MB  assigning node ids
time: 0.003; rss: 62MB  lowering ast -> hir
time: 0.001; rss: 62MB  indexing hir
time: 0.000; rss: 62MB  attribute checking
time: 0.002; rss: 62MB  early lint checks
time: 0.001; rss: 60MB  external crate/lib resolution
time: 0.000; rss: 60MB  language item collection
time: 0.005; rss: 68MB  resolution
time: 0.000; rss: 68MB  lifetime resolution
time: 0.000; rss: 68MB  looking for entry point
time: 0.000; rss: 68MB  looking for plugin registrar
time: 0.001; rss: 69MB  region resolution
time: 0.000; rss: 69MB  loop checking
time: 0.000; rss: 69MB  static item recursion checking
time: 0.003; rss: 71MB  type collecting
time: 0.000; rss: 71MB  variance inference
time: 0.027; rss: 89MB  coherence checking
time: 0.014; rss: 89MB  wf checking
time: 0.004; rss: 89MB  item-types checking

at which point the process starts using large amounts of time and memory.

The same behaviour happens with nightly-2015-12-01, so this is not a regression from the recent rewrite of typechecking.

Discussed on IRC with @eddyb, @Manishearth and @aturon.

@nikomatsakis
Contributor

cc me

asajeffrey pushed a commit to asajeffrey/parsell that referenced this issue Feb 23, 2016
@nikomatsakis
Contributor

On IRC, @asajeffrey posted this self-contained example that exhibits the same phenomenon:

pub trait Upcast<T> {
    fn upcast(self) -> T;
}

impl<S1, S2, T1, T2> Upcast<(T1, T2)> for (S1, S2)
    where S1: Upcast<T1>,
          S2: Upcast<T2>,
{
    fn upcast(self) -> (T1, T2) { (self.0.upcast(), self.1.upcast()) }
}

impl Upcast<()> for ()
{
    fn upcast(self) -> () { () }
}

pub trait ToStatic {
    type Static: 'static;
    fn to_static(self) -> Self::Static where Self: Sized;
}

impl<T, U> ToStatic for (T, U)
    where T: ToStatic,
          U: ToStatic
{
    type Static = (T::Static, U::Static);
    fn to_static(self) -> Self::Static { (self.0.to_static(), self.1.to_static()) }
}

impl ToStatic for ()
{
    type Static = ();
    fn to_static(self) -> () { () }
}


trait Factory {
    type Output;
    fn build(&self) -> Self::Output;
}

impl<S,T> Factory for (S, T)
    where S: Factory,
          T: Factory,
          S::Output: ToStatic,
          <S::Output as ToStatic>::Static: Upcast<S::Output>,
{
    type Output = (S::Output, T::Output);
    fn build(&self) -> Self::Output { (self.0.build().to_static().upcast(), self.1.build()) }
}

impl Factory for () {
    type Output = ();
    fn build(&self) -> Self::Output { () }
}

fn main() {
    // Any more parens and it triggers the playpen timeout!
    let it = ((((((),()),()),()),()),());
    it.build();
}

@asajeffrey
Author

You beat me to it :) https://play.rust-lang.org/?gist=5183e32b3f0fcdfccf61

@asajeffrey
Author

There is a work-around for this example: https://play.rust-lang.org/?gist=ed22452c4eda71821a6d

@asajeffrey
Author

Sad to say the work-around doesn't scale up completely. The problem is the interaction between this issue and #30867 (comment).

The work-around for this issue is to turn what were type parameters into associated types, while the work-around for #30867 (comment) is to turn what were associated types into type parameters. There is a tension between the two.
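
For illustration, here is a minimal sketch of that tension (hypothetical trait names, not the code from either playground link): the same upcasting relationship written with a type parameter versus with an associated type.

```rust
// Type-parameter form: one Self type can implement UpcastParam<T> for
// many different T, which is flexible but gives the trait solver a
// larger search space at every level of nesting.
pub trait UpcastParam<T> {
    fn upcast(self) -> T;
}

// Associated-type form: the target is determined by Self alone, so
// nested obligations become projections rather than searches, at the
// cost of allowing only one target type per implementer.
pub trait UpcastAssoc {
    type Target;
    fn upcast_assoc(self) -> Self::Target;
}
```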

@asajeffrey
Author

Chaining iterators has a similar problem: https://play.rust-lang.org/?gist=a0c05fbf5c43234fb7c3
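
Roughly speaking (an illustrative sketch, not the linked gist), the shape of the problem is that every `.chain()` call wraps the iterator in another adaptor type, so the type grows with nesting depth much like the nested tuples above:

```rust
fn main() {
    // Each .chain() nests another Chain<_, _> layer, so the full type is
    // Chain<Chain<Chain<Chain<Range<i32>, Range<i32>>, Range<i32>>, ...>, ...>
    // and the solver checks trait obligations at every layer.
    let it = (0..1)
        .chain(1..2)
        .chain(2..3)
        .chain(3..4)
        .chain(4..5);
    let total: i32 = it.sum();
    println!("{}", total);
}
```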

@asajeffrey
Author

A possible work-around for chaining iterators is to distinguish between top-down and bottom-up iterators: https://play.rust-lang.org/?gist=8532619e7f8bae0ffbef

@asajeffrey
Author

Separating out the projection for the iterator from the iterator function itself seems to do the trick... https://play.rust-lang.org/?gist=1cf9886dd074e073cd6a

@asajeffrey
Author

I implemented the work-around for parsell. The trick was to separate the bottom-up computation of associated types from their top-down use as type parameters. You can see this, for example, in http://asajeffrey.github.io/parsell/target/doc/parsell/trait.StatefulInfer.html, which used to be one trait but is now given as the union of a trait for bottom-up calculation of an Output associated type (http://asajeffrey.github.io/parsell/target/doc/parsell/trait.HasOutput.html) and a trait for top-down use of that type (http://asajeffrey.github.io/parsell/target/doc/parsell/trait.Stateful.html).

It's a bit roundabout, but it does the job of reducing type-checking space/time.
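
A stripped-down sketch of that split (hypothetical signatures; the real parsell traits carry more type parameters and methods, see the linked docs):

```rust
// Bottom-up: the Output type is computed from Self as an associated type.
pub trait HasOutput {
    type Output;
}

// Top-down: the output is supplied as an ordinary type parameter, so
// users of the trait don't force the solver to re-project Self::Output
// at every level of nesting.
pub trait Stateful<Output> {
    fn done(self) -> Output;
}

// One possible way to tie the two together at a use site: require that
// the explicitly supplied Output agrees with the computed one.
fn run<P>(p: P) -> P::Output
where
    P: HasOutput + Stateful<<P as HasOutput>::Output>,
{
    p.done()
}
```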

@asajeffrey
Author

The work-around took the memory use for compiling https://github.com/asajeffrey/wasm/tree/9297d84654c74d24f1f8cc0ff96e468a39c12bb2 down from more than 10G to about 200M. It compiles in about 45 sec, which isn't perfect but is a lot better than never terminating, which is what was happening before.

@steveklabnik added the A-type-system Area: Type system label Mar 11, 2016
@nikomatsakis
Contributor

As part of some work towards lazy normalization, I added a cache to the associated type projection machinery. This achieved a 10x improvement on the test case that @asajeffrey posted a little while back (with two levels of parens added):

lunch-box. time rustc --stage0 ~/tmp/issue-31849.rs

real    0m6.787s
user    0m6.538s
sys     0m0.236s
lunch-box. time rustc --stage2 ~/tmp/issue-31849.rs

real    0m0.663s
user    0m0.601s
sys     0m0.064s

@nikomatsakis
Contributor

Added two more levels of parens and re-ran the test. Results:

lunch-box. time rustc project-cache-issue-31849.rs

real    0m1.830s
user    0m1.761s
sys     0m0.070s
lunch-box. time rustc --stage0 project-cache-issue-31849.rs

real    1m43.955s
user    1m41.408s
sys     0m2.428s

I hope to improve the cache and it'd be interesting to see if we can bring those numbers down, but obviously this is a big improvement.

nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Mar 16, 2016
@asajeffrey
Author

Yay! Let me know when there's a PR I can play with.

@eddyb
Member

eddyb commented Mar 16, 2016

@asajeffrey The commits above suggest there is a branch.

@asajeffrey
Author

@eddyb yes, just waiting for it to settle down before trying it out!

@asajeffrey
Author

And there's now a PR: #32287.

bors added a commit that referenced this issue Jun 4, 2016
Projection cache and better warnings for #32330

This PR does three things:

- it lays the groundwork for the more precise subtyping rules discussed in #32330, but does not enable them;
- it issues warnings when the result of a leak-check or subtyping check relies on a late-bound region which will later become early-bound when #32330 is fixed;
- it introduces a cache for projection in the inference context.

I'm not 100% happy with the approach taken by the cache here, but it seems like a step in the right direction. It results in big wins on some test cases, but not as big as with previous versions -- I think because it is caching the `Vec<Obligation>` (whereas before I just returned the normalized type with an empty vector). However, that change was needed to fix an ICE in @alexcrichton's future-rs module (I haven't fully tracked down the cause of that ICE yet). Also, because trans/the collector use a fresh inference context for every call to `fulfill_obligation`, they don't profit nearly as much from this cache as they ought to.
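
To make the trade-off concrete, here is a purely conceptual sketch (toy types only; rustc's real projection cache lives in the inference context and works on interned types, not strings) of what caching the nested obligations alongside the normalized type means:

```rust
use std::collections::HashMap;

// Toy stand-ins for the real structures.
type Ty = String;
type Obligation = String;

struct ProjectionCache {
    // Key: the projection being normalized, e.g. "<S::Output as ToStatic>::Static".
    // Value: the normalized type plus the nested obligations that must still
    // be proven whenever the cached entry is reused.
    map: HashMap<String, (Ty, Vec<Obligation>)>,
}

impl ProjectionCache {
    fn normalize_with<F>(&mut self, key: &str, compute: F) -> (Ty, Vec<Obligation>)
    where
        F: FnOnce() -> (Ty, Vec<Obligation>),
    {
        if let Some(hit) = self.map.get(key) {
            // Cache hit: hand back both the type and its obligations,
            // rather than the type with an empty obligation list.
            return hit.clone();
        }
        let computed = compute();
        self.map.insert(key.to_string(), computed.clone());
        computed
    }
}
```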

Still, here are the results from the future-rs `retry.rs`:

```
06:26 <nmatsakis> time: 6.246; rss: 44MB  item-bodies checking
06:26 <nmatsakis> time: 54.783; rss: 63MB   translation item collection
06:26 <nmatsakis> time: 140.086; rss: 86MB    translation

06:26 <nmatsakis> time: 0.361; rss: 46MB  item-bodies checking
06:26 <nmatsakis> time: 5.299; rss: 63MB    translation item collection
06:26 <nmatsakis> time: 12.140; rss: 86MB translation
```

~~Another example is the example from #31849. For that, I get 34s to run item-bodies without any cache. The version of the cache included here takes 2s to run item-bodies type-checking. An alternative version which doesn't track nested obligations takes 0.2s, but that version ICEs on @alexcrichton's future-rs (and may well be incorrect, I've not fully convinced myself of that). So, a definite win, but I think there's definitely room for further progress.~~

Pushed a modified version which improves performance of the case from #31849:

```
lunch-box. time rustc --stage0 ~/tmp/issue-31849.rs  -Z no-trans
real    0m33.539s
user    0m32.932s
sys     0m0.570s
lunch-box. time rustc --stage2 ~/tmp/issue-31849.rs  -Z no-trans
real    0m0.195s
user    0m0.154s
sys     0m0.042s
```

Some sort of cache is also needed for unblocking further work on lazy normalization, since that will lean even more heavily on the cache, and will also require cycle detection.

r? @arielb1
@nikomatsakis
Contributor

I think this particular issue has effectively been solved for quite some time now. Certainly the examples that @asajeffrey gave no longer cause issues in the playpen.
