Suboptimal inlining decisions #49541
Comments
Random thought: maybe this calls for an annotation to force-inline at the caller level.
Cc @rust-lang/wg-codegen
I'd rather not consider more annotations. We already need to tell ppl that, yes there is We could probably more aggressively inline in the presence of constant information.
I wonder how that interacts with #49479.
MergeFunctions could help in the case where the entire callers are identical (edit:) and it runs before inlining. Which is not a very interesting case IMO for something like

I believe the root cause of this issue is that LLVM's inlining cost heuristic special-cases functions with internal linkage and just one call site (because in those cases, inlining doesn't increase code size since you can eliminate the function entirely). If there are multiple call sites, IIRC it doesn't do anything to account for the fact that inlining all call sites would still allow eliminating the function. Probably because it can't actually know whether all call sites will inline it. Seems difficult to solve in general.
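To picture the single-call-site special case described above, here is a toy model. It is not LLVM's code: the `Callee` struct, `toy_should_inline`, and both constants are made up for illustration, and the real heuristic weighs many more factors.

```rust
/// Hypothetical summary of a callee as a toy inliner might see it.
pub struct Callee {
    /// Made-up cost estimate for the callee's body.
    pub cost: u32,
    /// Whether the function is invisible outside the current module.
    pub internal_linkage: bool,
    /// Number of places that call the function.
    pub call_sites: u32,
}

/// Made-up baseline threshold, not LLVM's actual value.
const TOY_THRESHOLD: u32 = 225;
/// Made-up bonus for "inlining here lets us delete the callee".
const TOY_SINGLE_CALL_BONUS: u32 = 15_000;

/// Toy decision: the bonus applies only when there is exactly one call
/// site, so a second caller falls back to the plain threshold even though
/// inlining both callers would still make the callee dead.
pub fn toy_should_inline(callee: &Callee) -> bool {
    let mut threshold = TOY_THRESHOLD;
    if callee.internal_linkage && callee.call_sites == 1 {
        threshold += TOY_SINGLE_CALL_BONUS;
    }
    callee.cost <= threshold
}
```

With the same made-up cost of 400, a callee with one call site passes the check and a callee with two call sites does not, which matches the shape of the behaviour reported in this issue.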
If inlining decreases the code size of the caller, we should always inline regardless of heuristics. Idk how to detect such cases before actually doing the inlining.
That too is part of the inlining heuristic. Of course, it's only a heuristic, and it could always be better, but tuning it further is notoriously fickle.
All the functions involved in conversions between
The problem is that there is no scalpel for edge cases when the called function is not annotated at all, which is the case here.
One way to look at it is that part of the problem is this all-or-nothing property of inlining. Either the function is always inlined or not (AIUI). I'd argue it should be decided case by case. There may be both cases where it makes sense for the function not to be inlined and cases where it doesn't, in the same codebase.
@glandium You misunderstand: inlining in LLVM is a per-call-site decision (though it is true that most of the heuristic only looks at the function to inline, not at the call site).
Inline most of the code paths for conversions with boxed slices

This helps with the specific problem described in rust-lang#49541, obviously without making any large change to how inlining works in the general case.

Everything involved in the conversions is made `#[inline]`, except for the `<Vec<T>>::into_boxed_slice` entry point which is made `#[inline(always)]` after checking that duplicating the function mentioned in the issue prevented its inlining if I only annotate it with `#[inline]`.

For the record, that function was:

```rust
pub fn foo() -> Box<[u8]> {
    vec![0].into_boxed_slice()
}
```

To help the inliner's job, we also hoist a `self.capacity() != self.len` check in `<Vec<T>>::shrink_to_fit` and mark it as `#[inline]` too.
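For readers unfamiliar with the "hoist the check" pattern the commit message mentions, here is a rough sketch on a stand-in type. `Buf`, `from_vec`, and `shrink_slow` are hypothetical names, and the `#[inline(never)]` on the slow path is my own choice for illustration; this is not the actual standard library patch.

```rust
/// Hypothetical wrapper type standing in for `Vec<T>` in this sketch.
pub struct Buf {
    data: Vec<u8>,
}

impl Buf {
    pub fn from_vec(data: Vec<u8>) -> Self {
        Buf { data }
    }

    /// Cheap wrapper: only the capacity/length comparison is a candidate
    /// for inlining into callers. When the caller's capacity and length
    /// are known to match, the branch can fold away entirely.
    #[inline]
    pub fn shrink_to_fit(&mut self) {
        if self.data.capacity() != self.data.len() {
            self.shrink_slow();
        }
    }

    /// Out-of-line slow path that does the actual reallocation.
    #[inline(never)]
    fn shrink_slow(&mut self) {
        self.data.shrink_to_fit();
    }

    /// Entry point, force-inlined in the spirit of the commit message.
    #[inline(always)]
    pub fn into_boxed_slice(mut self) -> Box<[u8]> {
        self.shrink_to_fit();
        self.data.into_boxed_slice()
    }
}
```

The idea is that the inliner only has to accept a small comparison at each call site, while the bulky reallocation code exists once.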
Triage; no change
I suspect this can happen in more cases, but here is how I observed this:
This compiles to:
Which is pretty much to the point.
Now duplicate the function, so that you now have two functions calling `into_boxed_slice()`, and the compiler decides not to inline it at all anymore. Which:
- `Vec::into_boxed_slice` implementation (63 lines of assembly)
- `ptr::drop_in_place`
The threshold to stop inlining seems pretty low for this particular case, and even if it might make sense for some uses across the codebase to not be inlined, when the result of inlining is clearly beneficial, it would be good if we could still inline the calls where it's a win.
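For reference, a minimal reproduction along the lines described above might look like the following. `foo` is the function quoted in the fix's commit message; `bar` is a hypothetical name for the duplicate, since the issue only says the function was duplicated.

```rust
// The function quoted in the commit message for this issue.
pub fn foo() -> Box<[u8]> {
    vec![0].into_boxed_slice()
}

// Hypothetical duplicate: a second caller of `into_boxed_slice()`, which
// is what reportedly stopped the call from being inlined in either place.
pub fn bar() -> Box<[u8]> {
    vec![0].into_boxed_slice()
}
```

Building this as a library in release mode and comparing the assembly with and without `bar` should show the difference, though the exact output depends on the compiler version.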