Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instantiate fewer copies of a closure inside a generic function #46477

Closed
dtolnay opened this issue Dec 3, 2017 · 11 comments · Fixed by #69749
Closed

Instantiate fewer copies of a closure inside a generic function #46477

dtolnay opened this issue Dec 3, 2017 · 11 comments · Fixed by #69749
Assignees
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-compiler-performance Working group: Compiler Performance

Comments

@dtolnay
Copy link
Member

dtolnay commented Dec 3, 2017

In serde-rs/json#386 we observed that a disproportionately large amount of serde_json lines of LLVM IR and compile time are due to a tiny closure inside a generic function. In fact this closure contributes more LLVM IR than all but 5 significantly larger functions.

The generic function needs to be instantiated lots of times, but the closure does not capture anything that would be affected by the type parameter.

Simplified example:

// cargo rustc -- --emit=llvm-ir
pub fn f() {
    g::<bool>();
    g::<usize>();
}

fn g<T>() -> usize {
    let n = 1;
    let closure = || n;
    closure()
}

This gives the expected 1 copy of f and 2 copies of g, but unexpectedly 2 copies of g::{{closure}} in the IR.

@Mark-Simulacrum

@Mark-Simulacrum
Copy link
Member

cc @michaelwoerister @eddyb @arielb1

Seems like this has potential for some fairly large wins across Rust.

@Mark-Simulacrum Mark-Simulacrum added I-compiletime Issue: Problems and improvements with respect to compile times. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 3, 2017
@eddyb
Copy link
Member

eddyb commented Dec 3, 2017

This is a subset of being able to detect parameter dependence from MIR, and sharing instances on the monomorphization collector based on it.
Should be relatively straight-forward nowadays.

EDIT: in fact, I think all you need is to implement TypeVisitor::visit_ty and put MIR through it, accumulating a bitset of "does this type parameter appear", at least on the analysis side.

@michaelwoerister
Copy link
Member

Interesting find!

@TimNN TimNN added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Dec 5, 2017
@jonhoo
Copy link
Contributor

jonhoo commented Jan 18, 2018

Just to leave a breadcrumb for later, there are other good suggestions for similar kinds of optimizations that can be done in this internals thread.

@Eh2406
Copy link
Contributor

Eh2406 commented Sep 20, 2019

This came up in conversation at a meetup recently. Several of us thought it would be interesting to see how big in impact it makes. None of us have any experience working on the compiler. How hard is this for a new contributor? Is there mentorship available? Alternatively does someone want to do some kind of remote presentation for our meetup, guiding us on this?

@davidtwco
Copy link
Member

Assigning this to myself, going to be working on this optimisation as my master’s thesis.

@rustbot claim

@cuviper
Copy link
Member

cuviper commented Feb 25, 2020

in fact, I think all you need is to implement TypeVisitor::visit_ty and put MIR through it, accumulating a bitset of "does this type parameter appear", at least on the analysis side.

Would this see through associated types?

The question of type sizes came up again in the users forum, akin to #62429. I gave this example of how things can go bad, and how to manually fix it:

fn multiply<I>(iter: I, x: f64) -> impl Iterator<Item = f64>
where
    I: Iterator,
    I::Item: Into<f64>,
{
    iter.map(move |item| x * item.into())
}

fn multiply2<I>(iter: I, x: f64) -> impl Iterator<Item = f64>
where
    I: Iterator,
    I::Item: Into<f64>,
{
    fn mul<T: Into<f64>>(x: f64) -> impl Fn(T) -> f64 {
        move |item| x * item.into()
    }
    iter.map(mul(x))
}

fn iter() -> impl Iterator<Item = i32> {
    (0..10).map(|i| i * 42)
}

pub fn foo() {
    let _ = multiply(iter(), 2.0);
    let _ = multiply2(iter(), 2.0);
}

This creates expanded types like this:

; playground::foo
; Function Attrs: nonlazybind uwtable
define void @_ZN10playground3foo17hbce2de427f35bc00E() unnamed_addr #1 !dbg !161 {
start:
  %_3 = alloca %"core::iter::adapters::Map<core::iter::adapters::Map<core::ops::range::Range<i32>, iter::{{closure}}>, multiply2::mul::{{closure}}<i32>>", align 8
  %_1 = alloca %"core::iter::adapters::Map<core::iter::adapters::Map<core::ops::range::Range<i32>, iter::{{closure}}>, multiply::{{closure}}<core::iter::adapters::Map<core::ops::range::Range<i32>, iter::{{closure}}>>>", align 8
...

It would be nice if multiply::{{closure}} could automatically be reduced like I did manually for multiply2::mul::{{closure}}. It seems to me that "does this type parameter appear" would have to see through to the associated type I::Item, and not count that as an appearance of I itself.

@eddyb
Copy link
Member

eddyb commented Feb 25, 2020

It seems to me that "does this type parameter appear" would have to see through to the associated type I::Item, and not count that as an appearance of I itself.

How would this work, replace I::Item with a generic parameter?
You can do it manually like this, but it seems harder for the compiler:

fn multiply3(
    iter: impl Iterator<Item = impl Into<f64>>,
    x: f64,
) -> impl Iterator<Item = f64> {
    iter.map(move |item| x * item.into())
}

for the record, that is equivalent to:

fn multiply3<I, T>(iter: I, x: f64) -> impl Iterator<Item = f64>
where
    I: Iterator<Item = T>,
    T: Into<f64>,
{
    iter.map(move |item| x * item.into())
}

I think we need to wait for @davidtwco's work to be merged before we can even consider something like this.

You might also want to consider iter.map(Into::into).map(move |y| x * y).

Keep in mind that monomorphization happens based on generics, so you'd have to come up with some generics that still encapsulate the fact that there's a type which is needed by the Into::into call, and that's much harder when they're not the type-checking generics (for which you can simply not replace some params with their args).

Actually, there is probably a trick we can use: we can have the same generics as if I was unused, but then add a <I as Item>::Item == X bound to the ParamEnv for every X we monomorphize on.
That way the compiler doesn't have to invent generics, and IMO that's also the way I would want to handle monomorphizing only based on the size/align of a type but nothing else.
(we'd have bounds in the ParamEnv describing those properties)

@cuviper
Copy link
Member

cuviper commented Feb 25, 2020

It seems to me that "does this type parameter appear" would have to see through to the associated type I::Item, and not count that as an appearance of I itself.

How would this work, replace I::Item with a generic parameter?

Something like that, yes. (Internal only to the construction of the closure -- we wouldn't want to silently affect the user's API.) I'm sure it is a harder request for the compiler, but this issue was cited as a possible solution to replace #62429 -- in a lot of those cases, the whole point was to be generic on the Item type rather than the broader iterator.

Your Item = impl ... trick is neat for my specific example, but I don't think that will always apply. Those real cases on Iterators are dealing with parameters of Self (like Map<I, F>) and then doing something in a closure with I::Item or Self::Item. We also can't change the API of those methods to add new type parameters, whether explicit or impl ....

You might also want to consider iter.map(Into::into).map(move |y| x * y).

Sure, but that was already an artificial example, just trying to show the scope of generics.

Actually, there is probably a trick we can use: we can have the same generics as if I was unused, but then add a <I as Item>::Item == X bound to the ParamEnv for every X we monomorphize on.
That way the compiler doesn't have to invent generics, and IMO that's also the way I would want to handle monomorphizing only based on the size/align of a type but nothing else.
(we'd have bounds in the ParamEnv describing those properties)

I don't know enough of these details, but it sounds plausible to me! :)

@eddyb
Copy link
Member

eddyb commented Feb 25, 2020

To expand a bit, the monomorphization is keyed today on:

fn multiply::<Map<Range<i32>, iter::{closure#0}>>::{closure#0};
fn multiply2::mul::<i32>::{closure#0};
fn multiply3::<Map<Range<i32>, iter::{closure#0}>, i32>::{closure#0};

with @davidtwco's work, it should look like this:

fn multiply::<Map<Range<i32>, iter::{closure#0}>>::{closure#0};
fn multiply2::mul::<i32>::{closure#0};
fn multiply3::<I, i32>::{closure#0}
where
    I: Sized,
    I: Iterator,
    <I as Iterator>::Item == i32,
    i32: Sized,
;

(I'm using the version of multiply3 with a named I just to make things clearer)

Now, that where clause I wrote there is the ParamEnv, i.e. how the compiler tracks "bounds" in scope, and we might have it from the start because e.g. &mut I only has a known layout if I: Sized is known (makes more sense for e.g. Vec::len I guess).

If T would also be unused, you'd get the fully generic ParamEnv, i.e.:

fn multiply3::<I, T>::{closure#0}
where
    I: Sized,
    I: Iterator,
    <I as Iterator>::Item == T,
    T: Sized,
;

And you can see there that the T = i32 version I had at first is literally the same except with i32 instead of T, effectively a "partial substitution".

Anyway, the neat thing is that you get the <I as Iterator>::Item == i32 bound in scope "for free" with multiply3 + @davidtwco's initial approach, meaning the MIR body of the closure could actually use I::Item instead of T and it would still resolve as i32.

So codegen wouldn't need to be changed in order to do this monomorphization:

fn multiply::<I>::{closure#0}
where
    I: Sized,
    I: Iterator,
    <I as Iterator>::Item == i32,
;

As you can see, it's literally multiply3 minus the second type parameter and the redundant-after-substitution i32: Sized bound.

But the fully generic form is this (note the lack of any mention of I::Item):

fn multiply::<I>::{closure#0}
where
    I: Sized,
    I: Iterator,
;

So you have to come up with that extra bound and inject it into the ParamEnv.

The good news is that you would "just" need the analysis that I isn't used, only <I as Iterator>::Item is (which isn't that hard, ty::layout also has some special-casing for "type parameter or associated type projection"), and then generate this bound:

<I as Iterator>::Item == <Map<Range<i32>, iter::{closure#0}> as Iterator>::Item

which normalizes to (note that the type after == has no generics):

<I as Iterator>::Item == i32

@davidtwco
Copy link
Member

davidtwco commented Mar 6, 2020

For those following along at home, there's a PR up for my work so far - #69749.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-compiler-performance Working group: Compiler Performance
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants