Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mul nuw+lshr exact should fold to a single multiplication (when the latter is a factor) #54824

Closed
scottmcm opened this issue Apr 9, 2022 · 2 comments

Comments

@scottmcm
Copy link

scottmcm commented Apr 9, 2022

Alive for the specific example, with opt+llc: https://alive2.llvm.org/ce/z/7oofsh
+label: llvm:optimizations

I tried the following:

define i64 @src(i64 noundef %0) {
start:
  %1 = mul nuw nsw i64 %0, 52
  %2 = lshr exact i64 %1, 2
  ret i64 %2
}

Since 52 = 4 * 13 I expected to see that fold down to just a single mul:

define i64 @tgt(i64 noundef %0) {
start:
  %1 = mul nuw nsw i64 %0, 13
  ret i64 %1
}

But it doesn't -- the mul and lshr are left in the optimized IR, and aren't folded by the target either:

src:                                    # @src
        imul    rax, rdi, 52
        shr     rax, 2
        ret

Whereas the single-multiplication version ends up not even needing an imul:

tgt:                                    # @tgt
        lea     rax, [rdi + 2*rdi]
        lea     rax, [rdi + 4*rax]
        ret

So in general, %1 = mul nuw %0, CONST1; %2 = lshr exact %1, CONST2 should simplify to a single mul nuw whenever CONST1 is a multiple of 1 << CONST2.

In fact, it looks like this already happens in opt if written with div: https://alive2.llvm.org/ce/z/L-VouQ

  %1 = mul nuw i64 %0, 52
  %2 = udiv exact i64 %1, 4
=>
  %1 = mul nuw nsw i64 %0, 13

So that code path just needs to take into account places where the udiv -> lshr transformation had already happened.

@bcl5980
Copy link
Contributor

bcl5980 commented Apr 9, 2022

if (match(Op0, m_NUWMul(m_Value(X), m_APInt(MulC))) &&

I think we can add the combination here.

@bcl5980
Copy link
Contributor

bcl5980 commented Apr 9, 2022

I'm trying to fix the issue on https://reviews.llvm.org/D123453

bcl5980 added a commit that referenced this issue Apr 20, 2022
Baseline tests for D123453(issue #54824)
bors added a commit to rust-lang-ci/rust that referenced this issue Sep 18, 2022
…, r=Mark-Simulacrum

Add a codegen test for `slice::from_ptr_range`

I noticed back in rust-lang#95579 that this didn't optimize as well as it should.

It's better now, after rust-lang#95837 changed the code in `from_ptr_range` and llvm/llvm-project#54824 was fixed in LLVM 15.

So here's a test to keep it generating the good version.
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
…atter is a factor

if c is divisible by (1 << ShAmtC), we can fold this pattern:
lshr (mul nuw x, c), ShAmtC -> mul nuw x, (c >> ShAmtC)

https://alive2.llvm.org/ce/z/ox4wAt

Fix llvm/llvm-project#54824

Reviewed By: spatel, lebedev.ri, craig.topper

Differential Revision: https://reviews.llvm.org/D123453
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants