At opt-levels <= 1 the arithmetic operation methods do not get inlined, preventing other optimisations #75598
Comments
Relatedly, #74362 is an instance where not inlining a function that wraps an intrinsic inhibits DCE.
My naive assumption is that
Using the compiler explorer link, it seems like simply tagging the
This is not true for
Example of prevented optimization.
…nline-always-arithmetic, r=nagisa
Add some #[inline(always)] to arithmetic methods of integers
I tried to add it only to methods which return results of intrinsics and don't have any branching. Branching could make the performance of debug builds (`-Copt-level=0`) worse. The main goal of the changes is to allow wider optimizations at `-Copt-level=1`. Closes: rust-lang#75598 r? `@nagisa`
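As a rough illustration of the pattern this change describes, here is a minimal sketch of a branch-free method that only forwards to a primitive operation and is marked `#[inline(always)]`. The `Meters` type and its `add_wrapping` method are hypothetical names invented for this example; they are not part of the standard library or of the PR.

```rust
/// Hypothetical newtype, used only for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Meters(pub u32);

impl Meters {
    /// Thin wrapper with no branching: it only forwards to `u32::wrapping_add`.
    /// `#[inline(always)]` asks the compiler to inline it even at low opt-levels.
    #[inline(always)]
    pub fn add_wrapping(self, rhs: Meters) -> Meters {
        Meters(self.0.wrapping_add(rhs.0))
    }
}

fn main() {
    let total = Meters(u32::MAX).add_wrapping(Meters(2));
    // u32::MAX + 2 wraps around to 1.
    assert_eq!(total, Meters(1));
    println!("{:?}", total);
}
```

A method with branches is a less obvious candidate, since forcing its body into every call site could make unoptimised builds larger and slower.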
…m-ou-se Make wrapping_neg() use wrapping_sub(), #[inline(always)] This is a follow-up change to the fix for rust-lang#75598. It simplifies the implementation of wrapping_neg() for all integer types by just calling 0.wrapping_sub(self) and always inlines it. This leads to much less assembly code being emitted for opt-level≤1 and thus much better performance for debug-compiled code. Background is [this discussion on the internals forum](https://internals.rust-lang.org/t/why-does-rust-generate-10x-as-much-unoptimized-assembly-as-gcc/14930).
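The equivalence this follow-up relies on can be checked with a small sketch. The free function `wrapping_neg_via_sub` is a hypothetical stand-in written for `i32` only; the actual change lives inside the standard library's integer macros and is not reproduced here.

```rust
/// Hypothetical helper mirroring the described approach for `i32`:
/// negation expressed as `0 - x` with wrapping semantics.
#[inline(always)]
pub fn wrapping_neg_via_sub(x: i32) -> i32 {
    0i32.wrapping_sub(x)
}

fn main() {
    // Compare against the standard library method on a few edge cases.
    for x in [0, 1, -1, 42, i32::MAX, i32::MIN] {
        assert_eq!(wrapping_neg_via_sub(x), x.wrapping_neg());
    }
    // i32::MIN has no positive counterpart, so it negates to itself.
    assert_eq!(wrapping_neg_via_sub(i32::MIN), i32::MIN);
}
```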
Consider code like this:
compiler explorer
At `-Copt-level=1` and lower, the versions that use the method call compile to a function call rather than the direct operation. Once the inlining fails, other optimisations that could have been done are inhibited, especially at `-Copt-level=1`.

More generally speaking, I wonder if we might want to give these methods a special annotation that would make the compiler generate the instructions directly, much like it does for intrinsics right now. These seem basic enough that `#[inline(always)]` might not be good enough (it being just a hint) and also possibly more expensive than necessary (something still needs to do the inlining).
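For readers who want to reproduce the comparison, the following is a minimal sketch of the kind of code in question. It is an assumption-based illustration, not the snippet from the original report, and the function names `add_operator` and `add_method` are hypothetical. Compiling it with `rustc -C opt-level=1 --emit asm` (or pasting it into a compiler explorer) is one way to compare the two functions.

```rust
/// Operator form. When overflow checks are enabled this also carries
/// an overflow check; otherwise it is a plain addition.
pub fn add_operator(a: u32, b: u32) -> u32 {
    a + b
}

/// Method form. Whether this becomes a direct instruction or a call to
/// `u32::wrapping_add` depends on the method being inlined.
pub fn add_method(a: u32, b: u32) -> u32 {
    a.wrapping_add(b)
}

fn main() {
    // For inputs that do not overflow, both forms compute the same value.
    assert_eq!(add_operator(2, 3), add_method(2, 3));
    // Only the method form is defined to wrap on overflow.
    assert_eq!(add_method(u32::MAX, 1), 0);
}
```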