Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compiler hangs after 'write metadata stage' #37311

Closed
mglagla opened this issue Oct 20, 2016 · 18 comments · Fixed by #37789
Closed

Compiler hangs after 'write metadata stage' #37311

mglagla opened this issue Oct 20, 2016 · 18 comments · Fixed by #37789
Assignees
Labels
A-typesystem Area: The type system P-high High priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@mglagla
Copy link
Contributor

mglagla commented Oct 20, 2016

Hello everyone,

while trying to refactor a small toy project (Link), I've hit a bug where the compiler never finishes compiling:

calcium $ cargo rustc -Ztime-passes -Ztime-llvm-passes
   Compiling calcium v0.5.0 (file:///data/Code/calcium)
warning: the option `Z` is unstable and should only be used on the nightly compiler, but it is currently accepted for backwards compatibility; this will soon change, see issue #31847 for more details

time: 0.002; rss: 43MB  parsing
time: 0.000; rss: 43MB  configuration
time: 0.000; rss: 43MB  recursion limit
time: 0.000; rss: 43MB  crate injection
time: 0.000; rss: 43MB  plugin loading
time: 0.000; rss: 43MB  plugin registration
time: 0.038; rss: 82MB  expansion
time: 0.000; rss: 82MB  maybe building test harness
time: 0.001; rss: 82MB  assigning node ids
time: 0.000; rss: 82MB  checking for inline asm in case the target doesn't support it
time: 0.000; rss: 82MB  complete gated feature checking
time: 0.000; rss: 82MB  collecting defs
time: 0.009; rss: 94MB  external crate/lib resolution
time: 0.000; rss: 94MB  early lint checks
time: 0.000; rss: 94MB  AST validation
time: 0.004; rss: 94MB  name resolution
time: 0.001; rss: 94MB  lowering ast -> hir
time: 0.001; rss: 98MB  indexing hir
time: 0.000; rss: 98MB  attribute checking
time: 0.000; rss: 98MB  language item collection
time: 0.000; rss: 98MB  lifetime resolution
time: 0.000; rss: 98MB  looking for entry point
time: 0.000; rss: 98MB  looking for plugin registrar
time: 0.001; rss: 98MB  region resolution
time: 0.000; rss: 98MB  loop checking
time: 0.000; rss: 98MB  static item recursion checking
time: 0.000; rss: 98MB  load_dep_graph
time: 0.008; rss: 98MB  type collecting
time: 0.000; rss: 98MB  variance inference
time: 0.050; rss: 104MB coherence checking
time: 0.012; rss: 104MB wf checking
time: 0.003; rss: 104MB item-types checking
time: 0.112; rss: 106MB item-bodies checking
time: 0.000; rss: 106MB drop-impl checking
time: 0.007; rss: 106MB const checking
time: 0.001; rss: 106MB privacy checking
time: 0.000; rss: 106MB stability index
time: 0.001; rss: 106MB intrinsic checking
time: 0.000; rss: 106MB effect checking
time: 0.002; rss: 106MB match checking
time: 0.001; rss: 106MB liveness checking
time: 0.003; rss: 108MB rvalue checking
time: 0.009; rss: 111MB MIR dump
time: 0.006; rss: 114MB MIR passes
time: 0.009; rss: 114MB borrow checking
time: 0.000; rss: 114MB reachability checking
time: 0.001; rss: 114MB death checking
time: 0.001; rss: 114MB stability checking
time: 0.000; rss: 114MB unused lib feature checking
warning: variant is never used: `MismatchingParens`, #[warn(dead_code)] on by default
  --> src/parse.rs:33:5
   |
33 |     MismatchingParens,
   |     ^^^^^^^^^^^^^^^^^

time: 0.009; rss: 114MB lint checking
time: 0.000; rss: 114MB resolving dependency formats
time: 0.006; rss: 114MB Prepare MIR codegen passes
  time: 0.019; rss: 114MB   write metadata
^C
calcium $ 

I had to stop because even after a minute it did not finish.

This is the offending commit on branch "no-alloc", the previous commit on branch master finished very quickly.

The only difference to master is this code in parse.rs, starting on line 97:

let mut ref_c = 1;
let mut sub_tokens = Vec::new();

while ref_c > 0 {
    let token = if let Some(t) = it.next() {
        t.clone()
    } else { return Err(ParseError::MismatchingParens) };

        match token {
            Token::Operator(Symbol::OpenParen) => {
                ref_c += 1;
            },
            Token::Operator(Symbol::CloseParen) => {
                ref_c -= 1;
            },
            _ => {},
        }

    if ref_c > 0 {
        sub_tokens.push(token);
    }
}

let mut sub_iter = sub_tokens.iter().peekable();
parse_expr(&mut sub_iter, 0)              

has been replaced with this code:

let mut ref_c = 1;
let mut sub_iter = it.filter_map(|t| {
    match *t {
        Token::Operator(Symbol::OpenParen) => {
            ref_c += 1;
        },
        Token::Operator(Symbol::CloseParen) => {
            ref_c -= 1;
        },
        _ => {},
    };

    if ref_c > 0 {
        Some(t)
    } else {
        None
    }
})
.fuse()
.peekable();

parse_expr(&mut sub_iter, 0)

rustc version is 1.12.0 (3191fba 2016-09-23)

@pnkfelix pnkfelix added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-nominated labels Oct 20, 2016
@eddyb
Copy link
Member

eddyb commented Oct 20, 2016

parse_expr(&mut sub_iter, 0)   

This is wrong because it results in wrapping It in an iterator, and when recursing back to the same line, it wraps that type again, etc. ad infinitum. I believe this is usually called "polymorphic recursion".
The simplest fix would be to cast the iterator (at the cost of dynamic dispatch):

parse_expr(&mut sub_iter as &mut Peekable<Iterator<Item=&'a Token>>, 0)

A better solution would involve rewriting all of this code, but I'm not sure exactly how that can be done.

However, the compiler is supposed to error here, so the bug is that it goes into a loop instead.

@nikomatsakis
Copy link
Contributor

Discussed in the compiler team meeting. This is likely a cycle in trans (possibly the collector?), quite likely triggered by a fn recursively calling itself with a derived closure, or otherwise growing infinitely. Perhaps we are missing a recursion detector in the cycle collector or elsewhere to detect infinite monomorphization problems?

Ah, @eddyb wrote that. =)

@arielb1 arielb1 self-assigned this Oct 20, 2016
@nikomatsakis
Copy link
Contributor

triage: P-high

@rust-highfive rust-highfive added P-high High priority and removed I-nominated labels Oct 20, 2016
@arielb1
Copy link
Contributor

arielb1 commented Oct 20, 2016

This looks like another case of "slow exponential recursion", except instead of a function calling 2 different functions, it creates a type of double the size:

parse_prefix::<τ> requires
    parse_prefix::<Fuse<FilterMap<&mut Peekable<τ>, [parse_prefix closure]::<τ>>>>

This type, when interned, has O(n) size, but trait operations fold it, which requires O(2^n) time. That... actually looks like it could be a part of some realistic example. Not sure what to do.

@arielb1
Copy link
Contributor

arielb1 commented Oct 21, 2016

Minified:

trait Mirror {
    type Image;
}

impl<T> Mirror for T { type Image = T; }

#[allow(unconditional_recursion)]
fn recurse<T>() { 
    recurse::<<(T, T) as Mirror>::Image>();
}

fn main() {
    recurse::<()>();
}

@TimNN
Copy link
Contributor

TimNN commented Oct 21, 2016

This is a regression from 1.8.0 to 1.9.0.

On nightly, this was introduced between nightly-2016-03-18 and nightly-2016-03-20 (Changes).

@alexcrichton alexcrichton added the regression-from-stable-to-stable Performance or correctness regression from one stable version to another. label Oct 21, 2016
@arielb1
Copy link
Contributor

arielb1 commented Oct 21, 2016

What did 1.8.0 do then?

@arielb1
Copy link
Contributor

arielb1 commented Oct 21, 2016

#32080 is the probable suspect - it did replace half of the old trans bugs with new ones.

@TimNN
Copy link
Contributor

TimNN commented Oct 21, 2016

Reported a recursion limit:

hang.rs:8:1: 10:2 error: reached the recursion limit during monomorphization
hang.rs: 8 fn recurse<T>() {
hang.rs: 9     recurse::<<(T, T) as Mirror>::Image>();
hang.rs:10 }

@arielb1
Copy link
Contributor

arielb1 commented Oct 21, 2016

A variant using traits hangs on all versions:

trait Mirror {
    type Image;
}

impl<T> Mirror for T { type Image = T; }

trait Foo {
    fn recurse(&self);
}

impl<T> Foo for T {
    #[allow(unconditional_recursion)]
    fn recurse(&self) { 
        (self, self).recurse();
    }
}

fn main() {
    ().recurse();
}

@mglagla
Copy link
Contributor Author

mglagla commented Oct 21, 2016

@eddyb

A better solution would involve rewriting all of this code, but I'm not sure exactly how that can be done.

Yeah, it was my first attempt at refactoring that part, so i expected the compiler to give me some errors. Instead it hung.

@TimNN

This is a regression from 1.8.0 to 1.9.0.

Did you mean @arielb1 first minified example? I just tried to build my project on 1.8.0 and it still hung (i.e. compiled for half an hour without finishing).

@TimNN
Copy link
Contributor

TimNN commented Oct 21, 2016

@mglagla: Yes, that was @arielb1's first example.

@arielb1
Copy link
Contributor

arielb1 commented Oct 21, 2016

Sure enough. All of these examples create an exponentially-long type, and if we do anything to it before the recursion limit is reached, we hang (in fact, my "reduced" example hangs while printing the overflow error message).

@arielb1 arielb1 removed the regression-from-stable-to-stable Performance or correctness regression from one stable version to another. label Oct 27, 2016
@brson brson added the A-typesystem Area: The type system label Nov 3, 2016
@nikomatsakis
Copy link
Contributor

I wonder if we need another kind of limit -- something that caps the maximum size a type is allowed to be (measured...somehow). Then we can abort when we hit that limit instead?

@brson brson added the regression-from-stable-to-stable Performance or correctness regression from one stable version to another. label Nov 3, 2016
@brson
Copy link
Contributor

brson commented Nov 3, 2016

Retagging as a regression. I don't see a reason it's not. This specific symptom did not used to occur. Now it does.

@eddyb
Copy link
Member

eddyb commented Nov 3, 2016

@nikomatsakis We can easily track "total number of children" for each Ty, that's one option.

@arielb1
Copy link
Contributor

arielb1 commented Nov 3, 2016

This can be fixed by adding a limit to type sizes at the entry to monomorphization. Will do that.

@brson
Copy link
Contributor

brson commented Dec 1, 2016

Just waiting on #37789. Thanks @arielb1 !

bors added a commit that referenced this issue Dec 2, 2016
limit the length of types in monomorphization

This adds the new insta-stable `#![type_size_limit]` crate attribute to control
the limit, and is obviously a [breaking-change] fixable by that.

Fixes #37311.

r? @nikomatsakis
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-typesystem Area: The type system P-high High priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants