Instantiate fewer copies of a closure inside a generic function #46477

dtolnay · 2017-12-03T20:15:58Z

In serde-rs/json#386 we observed that a disproportionately large share of serde_json's LLVM IR lines and compile time is due to a tiny closure inside a generic function. In fact, this closure contributes more LLVM IR than all but 5 significantly larger functions.

The generic function needs to be instantiated lots of times, but the closure does not capture anything that would be affected by the type parameter.

Simplified example:

// cargo rustc -- --emit=llvm-ir
pub fn f() {
    g::<bool>();
    g::<usize>();
}

fn g<T>() -> usize {
    let n = 1;
    let closure = || n;
    closure()
}

This gives the expected 1 copy of f and 2 copies of g, but unexpectedly 2 copies of g::{{closure}} in the IR.
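
Until the compiler can do this by itself, one manual workaround is to hoist the closure into a non-generic helper so that it no longer inherits `T` from `g`. The following is only a sketch of that idea: `make_closure` is a hypothetical name, and `f` returns a value here (unlike the example above) purely so the result can be checked.

```rust
// Hypothetical helper: it has no type parameters, so the closure it
// returns should be instantiated once, no matter how many `g::<T>` exist.
fn make_closure(n: usize) -> impl Fn() -> usize {
    move || n
}

fn g<T>() -> usize {
    let n = 1;
    // The closure is now defined outside of `g`, so it does not depend on `T`.
    let closure = make_closure(n);
    closure()
}

pub fn f() -> usize {
    g::<bool>() + g::<usize>()
}
```

With this shape I would expect the IR to still contain 2 copies of `g` but only a single copy of the closure body, though I have not measured it for this exact snippet.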

Mark-Simulacrum · 2017-12-03T22:21:57Z

cc @michaelwoerister @eddyb @arielb1

Seems like this has potential for some fairly large wins across Rust.

eddyb · 2017-12-03T22:45:30Z

This is a subset of being able to detect parameter dependence from MIR, and sharing instances in the monomorphization collector based on it.
Should be relatively straightforward nowadays.

EDIT: in fact, I think all you need is to implement TypeVisitor::visit_ty and put MIR through it, accumulating a bitset of "does this type parameter appear", at least on the analysis side.
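
To make the shape of that analysis concrete, here is a toy model of it. Nothing below is rustc's real API: `Ty`, `visit_ty` and `used_params` are hypothetical stand-ins, a "body" is reduced to the list of types it mentions, and parameter indices are assumed to be below 64 so that they fit in a `u64` bitset.

```rust
// Toy model only: these are NOT rustc's types, just stand-ins for illustration.
pub enum Ty {
    Int,
    /// The generic parameter with the given index in the enclosing item.
    Param(u32),
    Ref(Box<Ty>),
    Tuple(Vec<Ty>),
}

/// Walk one type and set bit `i` of `mask` whenever parameter `i` appears in it.
fn visit_ty(ty: &Ty, mask: &mut u64) {
    match ty {
        Ty::Int => {}
        Ty::Param(i) => *mask |= 1u64 << *i,
        Ty::Ref(inner) => visit_ty(inner, mask),
        Ty::Tuple(elems) => {
            for elem in elems {
                visit_ty(elem, mask);
            }
        }
    }
}

/// "Put the body through it": accumulate the bitset over every type in the body.
pub fn used_params(body_types: &[Ty]) -> u64 {
    let mut mask = 0;
    for ty in body_types {
        visit_ty(ty, &mut mask);
    }
    mask
}
```

A body whose mask comes back as 0 mentions no type parameter at all, which is the case for the closure in the original example, so all instantiations of the parent could share one copy of it.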

michaelwoerister · 2017-12-04T09:24:00Z

Interesting find!

jonhoo · 2018-01-18T05:13:46Z

Just to leave a breadcrumb for later, there are other good suggestions for similar kinds of optimizations that can be done in this internals thread.

Eh2406 · 2019-09-20T02:44:48Z

This came up in conversation at a meetup recently. Several of us thought it would be interesting to see how big an impact it makes. None of us have any experience working on the compiler. How hard is this for a new contributor? Is there mentorship available? Alternatively, does someone want to do some kind of remote presentation for our meetup, guiding us on this?

davidtwco · 2019-10-24T16:56:37Z

Assigning this to myself, going to be working on this optimisation as my master’s thesis.

@rustbot claim

cuviper · 2020-02-25T19:19:11Z

in fact, I think all you need is to implement TypeVisitor::visit_ty and put MIR through it, accumulating a bitset of "does this type parameter appear", at least on the analysis side.

Would this see through associated types?

The question of type sizes came up again in the users forum, akin to #62429. I gave this example of how things can go bad, and how to manually fix it:

fn multiply<I>(iter: I, x: f64) -> impl Iterator<Item = f64>
where
    I: Iterator,
    I::Item: Into<f64>,
{
    iter.map(move |item| x * item.into())
}

fn multiply2<I>(iter: I, x: f64) -> impl Iterator<Item = f64>
where
    I: Iterator,
    I::Item: Into<f64>,
{
    fn mul<T: Into<f64>>(x: f64) -> impl Fn(T) -> f64 {
        move |item| x * item.into()
    }
    iter.map(mul(x))
}

fn iter() -> impl Iterator<Item = i32> {
    (0..10).map(|i| i * 42)
}

pub fn foo() {
    let _ = multiply(iter(), 2.0);
    let _ = multiply2(iter(), 2.0);
}

This creates expanded types like this:

; playground::foo
; Function Attrs: nonlazybind uwtable
define void @_ZN10playground3foo17hbce2de427f35bc00E() unnamed_addr #1 !dbg !161 {
start:
  %_3 = alloca %"core::iter::adapters::Map<core::iter::adapters::Map<core::ops::range::Range<i32>, iter::{{closure}}>, multiply2::mul::{{closure}}<i32>>", align 8
  %_1 = alloca %"core::iter::adapters::Map<core::iter::adapters::Map<core::ops::range::Range<i32>, iter::{{closure}}>, multiply::{{closure}}<core::iter::adapters::Map<core::ops::range::Range<i32>, iter::{{closure}}>>>", align 8
...

It would be nice if multiply::{{closure}} could automatically be reduced like I did manually for multiply2::mul::{{closure}}. It seems to me that "does this type parameter appear" would have to see through to the associated type I::Item, and not count that as an appearance of I itself.

eddyb · 2020-02-25T19:44:13Z

It seems to me that "does this type parameter appear" would have to see through to the associated type I::Item, and not count that as an appearance of I itself.

How would this work, replace I::Item with a generic parameter?
You can do it manually like this, but it seems harder for the compiler:

fn multiply3(
    iter: impl Iterator<Item = impl Into<f64>>,
    x: f64,
) -> impl Iterator<Item = f64> {
    iter.map(move |item| x * item.into())
}

For the record, that is equivalent to:

fn multiply3<I, T>(iter: I, x: f64) -> impl Iterator<Item = f64>
where
    I: Iterator<Item = T>,
    T: Into<f64>,
{
    iter.map(move |item| x * item.into())
}

I think we need to wait for @davidtwco's work to be merged before we can even consider something like this.

You might also want to consider iter.map(Into::into).map(move |y| x * y).
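
Spelled out as a full function, that suggestion could look like the sketch below. `multiply4` is a hypothetical name, and the `y: f64` annotation is added only to keep the closure's input type explicit.

```rust
fn multiply4<I>(iter: I, x: f64) -> impl Iterator<Item = f64>
where
    I: Iterator,
    I::Item: Into<f64>,
{
    // The conversion is done by `Into::into` itself, so the closure body
    // only ever sees `f64` and never mentions `I` or `I::Item`.
    iter.map(Into::into).map(move |y: f64| x * y)
}
```

The closure is still nested in a function that is generic over `I`, so by itself this does not change how it is instantiated today; it just makes the body trivially independent of `I` for an analysis like the one discussed above.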

Keep in mind that monomorphization happens based on generics, so you'd have to come up with some generics that still encapsulate the fact that there's a type which is needed by the Into::into call, and that's much harder when they're not the type-checking generics (for which you can simply not replace some params with their args).

Actually, there is probably a trick we can use: we can have the same generics as if I was unused, but then add a <I as Iterator>::Item == X bound to the ParamEnv for every X we monomorphize on.
That way the compiler doesn't have to invent generics, and IMO that's also the way I would want to handle monomorphizing only based on the size/align of a type but nothing else.
(we'd have bounds in the ParamEnv describing those properties)

cuviper · 2020-02-25T20:07:35Z

It seems to me that "does this type parameter appear" would have to see through to the associated type I::Item, and not count that as an appearance of I itself.

How would this work, replace I::Item with a generic parameter?

Something like that, yes. (Internal only to the construction of the closure -- we wouldn't want to silently affect the user's API.) I'm sure it is a harder request for the compiler, but this issue was cited as a possible solution to replace #62429 -- in a lot of those cases, the whole point was to be generic on the Item type rather than the broader iterator.

Your Item = impl ... trick is neat for my specific example, but I don't think that will always apply. Those real cases on Iterators are dealing with parameters of Self (like Map<I, F>) and then doing something in a closure with I::Item or Self::Item. We also can't change the API of those methods to add new type parameters, whether explicit or impl ....
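
For illustration, here is roughly the kind of manual rewrite #62429 applied in those cases, reduced to a standalone adapter. `MyMap` is a hypothetical type made up for this sketch; the point is that the helper `map_fold` is generic only over the item and accumulator types, not over the whole iterator `I`.

```rust
pub struct MyMap<I, F> {
    iter: I,
    f: F,
}

impl<I, F> MyMap<I, F> {
    pub fn new(iter: I, f: F) -> Self {
        MyMap { iter, f }
    }
}

// The closure lives in a free function whose generics are just the item
// type `T`, the mapped type `B` and the accumulator `Acc` (plus the two
// callables), so it is not tied to the type of the underlying iterator.
fn map_fold<T, B, Acc>(
    mut f: impl FnMut(T) -> B,
    mut g: impl FnMut(Acc, B) -> Acc,
) -> impl FnMut(Acc, T) -> Acc {
    move |acc, elt| g(acc, f(elt))
}

impl<B, I, F> Iterator for MyMap<I, F>
where
    I: Iterator,
    F: FnMut(I::Item) -> B,
{
    type Item = B;

    fn next(&mut self) -> Option<B> {
        self.iter.next().map(&mut self.f)
    }

    // We cannot add type parameters to `fold`, but we can move the closure out of it.
    fn fold<Acc, G>(self, init: Acc, g: G) -> Acc
    where
        G: FnMut(Acc, B) -> Acc,
    {
        self.iter.fold(init, map_fold(self.f, g))
    }
}
```

This keeps the public signature of `fold` untouched, which is exactly the constraint here, at the cost of writing the helper by hand for every such closure.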

You might also want to consider iter.map(Into::into).map(move |y| x * y).

Sure, but that was already an artificial example, just trying to show the scope of generics.

Actually, there is probably a trick we can use: we can have the same generics as if I was unused, but then add a <I as Iterator>::Item == X bound to the ParamEnv for every X we monomorphize on.
That way the compiler doesn't have to invent generics, and IMO that's also the way I would want to handle monomorphizing only based on the size/align of a type but nothing else.
(we'd have bounds in the ParamEnv describing those properties)

I don't know enough of these details, but it sounds plausible to me! :)

eddyb · 2020-02-25T20:34:29Z

To expand a bit, the monomorphization is keyed today on:

fn multiply::<Map<Range<i32>, iter::{closure#0}>>::{closure#0};
fn multiply2::mul::<i32>::{closure#0};
fn multiply3::<Map<Range<i32>, iter::{closure#0}>, i32>::{closure#0};

with @davidtwco's work, it should look like this:

fn multiply::<Map<Range<i32>, iter::{closure#0}>>::{closure#0};
fn multiply2::mul::<i32>::{closure#0};
fn multiply3::<I, i32>::{closure#0}
where
    I: Sized,
    I: Iterator,
    <I as Iterator>::Item == i32,
    i32: Sized,
;

(I'm using the version of multiply3 with a named I just to make things clearer)

Now, that where clause I wrote there is the ParamEnv, i.e. how the compiler tracks "bounds" in scope, and we might have it from the start because e.g. &mut I only has a known layout if I: Sized is known (makes more sense for e.g. Vec::len I guess).

If T would also be unused, you'd get the fully generic ParamEnv, i.e.:

fn multiply3::<I, T>::{closure#0}
where
    I: Sized,
    I: Iterator,
    <I as Iterator>::Item == T,
    T: Sized,
;

And you can see there that the T = i32 version I had at first is literally the same except with i32 instead of T, effectively a "partial substitution".

Anyway, the neat thing is that you get the <I as Iterator>::Item == i32 bound in scope "for free" with multiply3 + @davidtwco's initial approach, meaning the MIR body of the closure could actually use I::Item instead of T and it would still resolve as i32.

So codegen wouldn't need to be changed in order to do this monomorphization:

fn multiply::<I>::{closure#0}
where
    I: Sized,
    I: Iterator,
    <I as Iterator>::Item == i32,
;

As you can see, it's literally multiply3 minus the second type parameter and the redundant-after-substitution i32: Sized bound.

But the fully generic form is this (note the lack of any mention of I::Item):

fn multiply::<I>::{closure#0}
where
    I: Sized,
    I: Iterator,
;

So you have to come up with that extra bound and inject it into the ParamEnv.

The good news is that you would "just" need the analysis that I isn't used, only <I as Iterator>::Item is (which isn't that hard, ty::layout also has some special-casing for "type parameter or associated type projection"), and then generate this bound:

<I as Iterator>::Item == <Map<Range<i32>, iter::{closure#0}> as Iterator>::Item

which normalizes to (note that the type after == has no generics):

<I as Iterator>::Item == i32

davidtwco · 2020-03-06T10:40:50Z

For those following along at home, there's a PR up for my work so far - #69749.

Mark-Simulacrum added the I-compiletime, P-medium, and T-compiler labels Dec 3, 2017

dtolnay mentioned this issue Dec 5, 2017

Option::map_or is being instantiated too much serde-rs/json#391

Closed

TimNN added the C-enhancement label Dec 5, 2017

michaelwoerister mentioned this issue Jan 22, 2018

Tracking Issue for Incremental Compilation #47660

Open

32 tasks

bjorn3 mentioned this issue Feb 11, 2018

[WIP] Deduplicate instances #48139

Closed

4 tasks

michaelwoerister mentioned this issue Feb 26, 2018

Compiler Performance Tracking Issue #48547

Open

ishitatsuyuki added the WG-compiler-performance label May 9, 2018

est31 mentioned this issue Jul 8, 2019

Rust platform size #61978

Closed

eddyb mentioned this issue Aug 18, 2019

Reduce the genericity of closures in the iterator traits #62429

Merged

Centril mentioned this issue Aug 18, 2019

Deduplicate closures #63660

Closed

comex mentioned this issue Sep 22, 2019

Allow "ABI agnostic" generics in FFI imports. rust-lang/rfcs#2770

Open

bluss mentioned this issue Oct 1, 2019

use try_fold instead of try_for_each to reduce compile time #64885

Merged

rustbot assigned davidtwco Oct 24, 2019

est31 mentioned this issue Dec 9, 2019

Don't evaluate promoteds for each monomorphization if it does not depend on generic parameters #67176

Open

the8472 mentioned this issue Jan 22, 2020

perf: Use for_each in Vec::extend #68046

Closed

eddyb mentioned this issue Mar 6, 2020

Polymorphization #69749

Merged

stuhood mentioned this issue May 1, 2020

Port the bulk of the process_execution crate to async/await pantsbuild/pants#9676

Merged

bors closed this as completed in b52522a Jul 21, 2020

lcnr mentioned this issue Aug 5, 2020

Update details of polymorphization working group rust-lang/compiler-team#342

Merged

danielhenrymantilla mentioned this issue Oct 7, 2020

(Lack of) Polymorphization can lead to an unnecessarily recursive type & make compilation fail #77664

Open

panstromek mentioned this issue Mar 15, 2021

Closures in generic code can create duplicate monomorphizations #83010

Closed
