[DAGCombine] Transform shl X, cttz(Y) to mul (Y & -Y), X if cttz is unsupported #85066

Merged: 5 commits, May 29, 2024

dtcxzyw (Member) commented Mar 13, 2024

This patch folds `shl X, cttz(Y)` to `mul (Y & -Y), X` if cttz is unsupported by the target.
Alive2: https://alive2.llvm.org/ce/z/AtLN5Y
Fixes #84763.
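
The identity behind the fold can be checked by brute force on a small type. The following is a minimal sketch in plain C++, not LLVM code: the helper names (`cttz8`, `shlCttz`, `mulLowBit`, `foldHoldsForAllI8`) are made up for illustration. It relies on `Y & -Y` isolating the lowest set bit of `Y`, which equals `1 << cttz(Y)`, so multiplying by it matches the shift. `Y == 0` is skipped because the shift amount would then be out of range.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zeros of a non-zero 8-bit value (illustrative helper).
inline unsigned cttz8(uint8_t Y) {
  unsigned N = 0;
  while (((Y >> N) & 1u) == 0)
    ++N;
  return N;
}

// Original form: shl X, cttz(Y), truncated to 8 bits.
inline uint8_t shlCttz(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>(X << cttz8(Y));
}

// Folded form: mul (Y & -Y), X, truncated to 8 bits.
inline uint8_t mulLowBit(uint8_t X, uint8_t Y) {
  uint8_t LowBit = static_cast<uint8_t>(Y & (0u - Y)); // lowest set bit of Y
  return static_cast<uint8_t>(LowBit * X);
}

// Exhaustively compare both forms for every X and every non-zero Y.
inline bool foldHoldsForAllI8() {
  for (unsigned Y = 1; Y < 256; ++Y)
    for (unsigned X = 0; X < 256; ++X)
      if (shlCttz(static_cast<uint8_t>(X), static_cast<uint8_t>(Y)) !=
          mulLowBit(static_cast<uint8_t>(X), static_cast<uint8_t>(Y)))
        return false;
  return true;
}
```

Calling `foldHoldsForAllI8()` from a test driver exercises all 255 × 256 input pairs. The Alive2 link above remains the authoritative proof; this only makes the arithmetic concrete.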

github-actions bot commented Mar 13, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

nikic (Contributor) commented Mar 13, 2024

Does this need to happen in CGP for some reason, or would DAGCombine also work?

dtcxzyw (Member, Author) commented Mar 13, 2024

> Does this need to happen in CGP for some reason, or would DAGCombine also work?

I implemented this in CGP to avoid duplicating the logic in GISel.
See also the comment #81404 (comment).

nikic (Contributor) commented Mar 13, 2024

CGP should only be used for transforms that require cross-block reasoning, which does not seem to be the case here. Aspirationally GlobalISel does not need CGP at all, because it can perform those optimizations itself. (Realistically, we are far from that...)

@dtcxzyw dtcxzyw requested a review from arsenm March 13, 2024 15:26
wangpc-pp (Contributor) left a comment

I agree this should be done in DAGCombiner.

llvm/test/CodeGen/RISCV/shl-cttz.ll (outdated review thread)
llvm/lib/CodeGen/CodeGenPrepare.cpp (outdated review thread)
// shl X, cttz(Y) -> mul (Y & -Y), X if cttz is unsupported on the target.
Value *Y;
if (match(I->getOperand(1),
          m_OneUse(m_Intrinsic<Intrinsic::cttz>(m_Value(Y))))) {

Collaborator:

You can match an intrinsic without specifying a match for all operands? That's surprising.

dtcxzyw (Member, Author) commented Mar 14, 2024

> I agree this should be done in DAGCombiner.

@arsenm Any comments?

arsenm (Contributor) commented Mar 14, 2024

> > I agree this should be done in DAGCombiner.
>
> @arsenm Any comments?

Yes, this is straightforward combining. The downside is that you then have to do it twice, in the DAG and in GISel.

@dtcxzyw dtcxzyw force-pushed the perf/shl-cttz-to-mul-lsb branch from 7a9c015 to 43e36d8 Compare March 21, 2024 14:42
RKSimon (Collaborator) left a comment

Please move this into DAGCombine (and GISel if you want to handle both)

@dtcxzyw dtcxzyw force-pushed the perf/shl-cttz-to-mul-lsb branch from 43e36d8 to 41b7713 Compare March 22, 2024 17:53
@dtcxzyw dtcxzyw changed the title [CodeGenPrepare] Transform shl X, cttz(Y) to mul (Y & -Y), X if cttz is unsupported [DAGCombine] Transform shl X, cttz(Y) to mul (Y & -Y), X if cttz is unsupported Mar 22, 2024
@llvmbot llvmbot added the llvm:SelectionDAG SelectionDAGISel as well label Mar 22, 2024
llvmbot (Member) commented Mar 22, 2024

@llvm/pr-subscribers-llvm-selectiondag

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch folds `shl X, cttz(Y)` to `mul (Y & -Y), X` if cttz is unsupported by the target.
Alive2: https://alive2.llvm.org/ce/z/AtLN5Y
Fixes #84763.


Patch is 28.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/85066.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+12)
  • (added) llvm/test/CodeGen/RISCV/shl-cttz.ll (+807)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index dcd0310734ad72..a77054d1e33d61 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -9962,6 +9962,18 @@ SDValue DAGCombiner::visitSHL(SDNode *N) {
     if (SDValue NewSHL = visitShiftByConstant(N))
       return NewSHL;
 
+  // fold (shl X, cttz(Y)) -> (mul (Y & -Y), X) if cttz is unsupported on the
+  // target.
+  if ((N1.getOpcode() == ISD::CTTZ || N1.getOpcode() == ISD::CTTZ_ZERO_UNDEF) &&
+      N1.hasOneUse() && !TLI.isOperationLegalOrCustom(ISD::CTTZ, VT) &&
+      TLI.isOperationLegalOrCustom(ISD::MUL, VT)) {
+    SDValue Y = N1.getOperand(0);
+    SDLoc DL(N);
+    SDValue NegY = DAG.getNode(ISD::SUB, DL, VT, DAG.getConstant(0, DL, VT), Y);
+    SDValue And = DAG.getNode(ISD::AND, DL, VT, Y, NegY);
+    return DAG.getNode(ISD::MUL, DL, VT, And, N0);
+  }
+
   if (SimplifyDemandedBits(SDValue(N, 0)))
     return SDValue(N, 0);
 
diff --git a/llvm/test/CodeGen/RISCV/shl-cttz.ll b/llvm/test/CodeGen/RISCV/shl-cttz.ll
new file mode 100644
index 00000000000000..e3ed16d4971410
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/shl-cttz.ll
@@ -0,0 +1,807 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc -mtriple=riscv32 -mattr=+m -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefix=RV32I
+; RUN: llc -mtriple=riscv32 -mattr=+m,+zbb -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefix=RV32ZBB
+; RUN: llc -mtriple=riscv64 -mattr=+m -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefixes=RV64I,RV64IILLEGALI32
+; RUN: llc -mtriple=riscv64 -mattr=+m,+zbb -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefixes=RV64ZBB,RV64ZBBILLEGALI32
+; RUN: llc -mtriple=riscv64 -mattr=+m -riscv-experimental-rv64-legal-i32 -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefixes=RV64I,RV64ILEGALI32
+; RUN: llc -mtriple=riscv64 -mattr=+m,+zbb -riscv-experimental-rv64-legal-i32 -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefixes=RV64ZBB,RV64ZBBLEGALI32
+
+define i8 @shl_cttz_i8(i8 %x, i8 %y) {
+; RV32I-LABEL: shl_cttz_i8:
+; RV32I:       # %bb.0: # %entry
+; RV32I-NEXT:    addi a2, a1, -1
+; RV32I-NEXT:    not a1, a1
+; RV32I-NEXT:    and a1, a1, a2
+; RV32I-NEXT:    srli a2, a1, 1
+; RV32I-NEXT:    andi a2, a2, 85
+; RV32I-NEXT:    sub a1, a1, a2
+; RV32I-NEXT:    andi a2, a1, 51
+; RV32I-NEXT:    srli a1, a1, 2
+; RV32I-NEXT:    andi a1, a1, 51
+; RV32I-NEXT:    add a1, a2, a1
+; RV32I-NEXT:    srli a2, a1, 4
+; RV32I-NEXT:    add a1, a1, a2
+; RV32I-NEXT:    andi a1, a1, 15
+; RV32I-NEXT:    sll a0, a0, a1
+; RV32I-NEXT:    ret
+;
+; RV32ZBB-LABEL: shl_cttz_i8:
+; RV32ZBB:       # %bb.0: # %entry
+; RV32ZBB-NEXT:    ctz a1, a1
+; RV32ZBB-NEXT:    sll a0, a0, a1
+; RV32ZBB-NEXT:    ret
+;
+; RV64IILLEGALI32-LABEL: shl_cttz_i8:
+; RV64IILLEGALI32:       # %bb.0: # %entry
+; RV64IILLEGALI32-NEXT:    addi a2, a1, -1
+; RV64IILLEGALI32-NEXT:    not a1, a1
+; RV64IILLEGALI32-NEXT:    and a1, a1, a2
+; RV64IILLEGALI32-NEXT:    srli a2, a1, 1
+; RV64IILLEGALI32-NEXT:    andi a2, a2, 85
+; RV64IILLEGALI32-NEXT:    subw a1, a1, a2
+; RV64IILLEGALI32-NEXT:    andi a2, a1, 51
+; RV64IILLEGALI32-NEXT:    srli a1, a1, 2
+; RV64IILLEGALI32-NEXT:    andi a1, a1, 51
+; RV64IILLEGALI32-NEXT:    add a1, a2, a1
+; RV64IILLEGALI32-NEXT:    srli a2, a1, 4
+; RV64IILLEGALI32-NEXT:    add a1, a1, a2
+; RV64IILLEGALI32-NEXT:    andi a1, a1, 15
+; RV64IILLEGALI32-NEXT:    sll a0, a0, a1
+; RV64IILLEGALI32-NEXT:    ret
+;
+; RV64ZBBILLEGALI32-LABEL: shl_cttz_i8:
+; RV64ZBBILLEGALI32:       # %bb.0: # %entry
+; RV64ZBBILLEGALI32-NEXT:    ctz a1, a1
+; RV64ZBBILLEGALI32-NEXT:    sll a0, a0, a1
+; RV64ZBBILLEGALI32-NEXT:    ret
+;
+; RV64ILEGALI32-LABEL: shl_cttz_i8:
+; RV64ILEGALI32:       # %bb.0: # %entry
+; RV64ILEGALI32-NEXT:    addi a2, a1, -1
+; RV64ILEGALI32-NEXT:    not a1, a1
+; RV64ILEGALI32-NEXT:    and a1, a1, a2
+; RV64ILEGALI32-NEXT:    srliw a2, a1, 1
+; RV64ILEGALI32-NEXT:    andi a2, a2, 85
+; RV64ILEGALI32-NEXT:    subw a1, a1, a2
+; RV64ILEGALI32-NEXT:    andi a2, a1, 51
+; RV64ILEGALI32-NEXT:    srliw a1, a1, 2
+; RV64ILEGALI32-NEXT:    andi a1, a1, 51
+; RV64ILEGALI32-NEXT:    add a1, a2, a1
+; RV64ILEGALI32-NEXT:    srliw a2, a1, 4
+; RV64ILEGALI32-NEXT:    add a1, a1, a2
+; RV64ILEGALI32-NEXT:    andi a1, a1, 15
+; RV64ILEGALI32-NEXT:    sllw a0, a0, a1
+; RV64ILEGALI32-NEXT:    ret
+;
+; RV64ZBBLEGALI32-LABEL: shl_cttz_i8:
+; RV64ZBBLEGALI32:       # %bb.0: # %entry
+; RV64ZBBLEGALI32-NEXT:    ctzw a1, a1
+; RV64ZBBLEGALI32-NEXT:    sllw a0, a0, a1
+; RV64ZBBLEGALI32-NEXT:    ret
+entry:
+  %cttz = call i8 @llvm.cttz.i8(i8 %y, i1 true)
+  %res = shl i8 %x, %cttz
+  ret i8 %res
+}
+
+define i8 @shl_cttz_constant_i8(i8 %y) {
+; RV32I-LABEL: shl_cttz_constant_i8:
+; RV32I:       # %bb.0: # %entry
+; RV32I-NEXT:    addi a1, a0, -1
+; RV32I-NEXT:    not a0, a0
+; RV32I-NEXT:    and a0, a0, a1
+; RV32I-NEXT:    srli a1, a0, 1
+; RV32I-NEXT:    andi a1, a1, 85
+; RV32I-NEXT:    sub a0, a0, a1
+; RV32I-NEXT:    andi a1, a0, 51
+; RV32I-NEXT:    srli a0, a0, 2
+; RV32I-NEXT:    andi a0, a0, 51
+; RV32I-NEXT:    add a0, a1, a0
+; RV32I-NEXT:    srli a1, a0, 4
+; RV32I-NEXT:    add a0, a0, a1
+; RV32I-NEXT:    andi a0, a0, 15
+; RV32I-NEXT:    li a1, 4
+; RV32I-NEXT:    sll a0, a1, a0
+; RV32I-NEXT:    ret
+;
+; RV32ZBB-LABEL: shl_cttz_constant_i8:
+; RV32ZBB:       # %bb.0: # %entry
+; RV32ZBB-NEXT:    ctz a0, a0
+; RV32ZBB-NEXT:    li a1, 4
+; RV32ZBB-NEXT:    sll a0, a1, a0
+; RV32ZBB-NEXT:    ret
+;
+; RV64IILLEGALI32-LABEL: shl_cttz_constant_i8:
+; RV64IILLEGALI32:       # %bb.0: # %entry
+; RV64IILLEGALI32-NEXT:    addi a1, a0, -1
+; RV64IILLEGALI32-NEXT:    not a0, a0
+; RV64IILLEGALI32-NEXT:    and a0, a0, a1
+; RV64IILLEGALI32-NEXT:    srli a1, a0, 1
+; RV64IILLEGALI32-NEXT:    andi a1, a1, 85
+; RV64IILLEGALI32-NEXT:    subw a0, a0, a1
+; RV64IILLEGALI32-NEXT:    andi a1, a0, 51
+; RV64IILLEGALI32-NEXT:    srli a0, a0, 2
+; RV64IILLEGALI32-NEXT:    andi a0, a0, 51
+; RV64IILLEGALI32-NEXT:    add a0, a1, a0
+; RV64IILLEGALI32-NEXT:    srli a1, a0, 4
+; RV64IILLEGALI32-NEXT:    add a0, a0, a1
+; RV64IILLEGALI32-NEXT:    andi a0, a0, 15
+; RV64IILLEGALI32-NEXT:    li a1, 4
+; RV64IILLEGALI32-NEXT:    sll a0, a1, a0
+; RV64IILLEGALI32-NEXT:    ret
+;
+; RV64ZBBILLEGALI32-LABEL: shl_cttz_constant_i8:
+; RV64ZBBILLEGALI32:       # %bb.0: # %entry
+; RV64ZBBILLEGALI32-NEXT:    ctz a0, a0
+; RV64ZBBILLEGALI32-NEXT:    li a1, 4
+; RV64ZBBILLEGALI32-NEXT:    sll a0, a1, a0
+; RV64ZBBILLEGALI32-NEXT:    ret
+;
+; RV64ILEGALI32-LABEL: shl_cttz_constant_i8:
+; RV64ILEGALI32:       # %bb.0: # %entry
+; RV64ILEGALI32-NEXT:    addi a1, a0, -1
+; RV64ILEGALI32-NEXT:    not a0, a0
+; RV64ILEGALI32-NEXT:    and a0, a0, a1
+; RV64ILEGALI32-NEXT:    srliw a1, a0, 1
+; RV64ILEGALI32-NEXT:    andi a1, a1, 85
+; RV64ILEGALI32-NEXT:    subw a0, a0, a1
+; RV64ILEGALI32-NEXT:    andi a1, a0, 51
+; RV64ILEGALI32-NEXT:    srliw a0, a0, 2
+; RV64ILEGALI32-NEXT:    andi a0, a0, 51
+; RV64ILEGALI32-NEXT:    add a0, a1, a0
+; RV64ILEGALI32-NEXT:    srliw a1, a0, 4
+; RV64ILEGALI32-NEXT:    add a0, a0, a1
+; RV64ILEGALI32-NEXT:    andi a0, a0, 15
+; RV64ILEGALI32-NEXT:    li a1, 4
+; RV64ILEGALI32-NEXT:    sllw a0, a1, a0
+; RV64ILEGALI32-NEXT:    ret
+;
+; RV64ZBBLEGALI32-LABEL: shl_cttz_constant_i8:
+; RV64ZBBLEGALI32:       # %bb.0: # %entry
+; RV64ZBBLEGALI32-NEXT:    ctzw a0, a0
+; RV64ZBBLEGALI32-NEXT:    li a1, 4
+; RV64ZBBLEGALI32-NEXT:    sllw a0, a1, a0
+; RV64ZBBLEGALI32-NEXT:    ret
+entry:
+  %cttz = call i8 @llvm.cttz.i8(i8 %y, i1 true)
+  %res = shl i8 4, %cttz
+  ret i8 %res
+}
+
+define i16 @shl_cttz_i16(i16 %x, i16 %y) {
+; RV32I-LABEL: shl_cttz_i16:
+; RV32I:       # %bb.0: # %entry
+; RV32I-NEXT:    addi a2, a1, -1
+; RV32I-NEXT:    not a1, a1
+; RV32I-NEXT:    and a1, a1, a2
+; RV32I-NEXT:    srli a2, a1, 1
+; RV32I-NEXT:    lui a3, 5
+; RV32I-NEXT:    addi a3, a3, 1365
+; RV32I-NEXT:    and a2, a2, a3
+; RV32I-NEXT:    sub a1, a1, a2
+; RV32I-NEXT:    lui a2, 3
+; RV32I-NEXT:    addi a2, a2, 819
+; RV32I-NEXT:    and a3, a1, a2
+; RV32I-NEXT:    srli a1, a1, 2
+; RV32I-NEXT:    and a1, a1, a2
+; RV32I-NEXT:    add a1, a3, a1
+; RV32I-NEXT:    srli a2, a1, 4
+; RV32I-NEXT:    add a1, a1, a2
+; RV32I-NEXT:    andi a2, a1, 15
+; RV32I-NEXT:    slli a1, a1, 20
+; RV32I-NEXT:    srli a1, a1, 28
+; RV32I-NEXT:    add a1, a2, a1
+; RV32I-NEXT:    sll a0, a0, a1
+; RV32I-NEXT:    ret
+;
+; RV32ZBB-LABEL: shl_cttz_i16:
+; RV32ZBB:       # %bb.0: # %entry
+; RV32ZBB-NEXT:    ctz a1, a1
+; RV32ZBB-NEXT:    sll a0, a0, a1
+; RV32ZBB-NEXT:    ret
+;
+; RV64IILLEGALI32-LABEL: shl_cttz_i16:
+; RV64IILLEGALI32:       # %bb.0: # %entry
+; RV64IILLEGALI32-NEXT:    addi a2, a1, -1
+; RV64IILLEGALI32-NEXT:    not a1, a1
+; RV64IILLEGALI32-NEXT:    and a1, a1, a2
+; RV64IILLEGALI32-NEXT:    srli a2, a1, 1
+; RV64IILLEGALI32-NEXT:    lui a3, 5
+; RV64IILLEGALI32-NEXT:    addiw a3, a3, 1365
+; RV64IILLEGALI32-NEXT:    and a2, a2, a3
+; RV64IILLEGALI32-NEXT:    sub a1, a1, a2
+; RV64IILLEGALI32-NEXT:    lui a2, 3
+; RV64IILLEGALI32-NEXT:    addiw a2, a2, 819
+; RV64IILLEGALI32-NEXT:    and a3, a1, a2
+; RV64IILLEGALI32-NEXT:    srli a1, a1, 2
+; RV64IILLEGALI32-NEXT:    and a1, a1, a2
+; RV64IILLEGALI32-NEXT:    add a1, a3, a1
+; RV64IILLEGALI32-NEXT:    srli a2, a1, 4
+; RV64IILLEGALI32-NEXT:    add a1, a1, a2
+; RV64IILLEGALI32-NEXT:    andi a2, a1, 15
+; RV64IILLEGALI32-NEXT:    slli a1, a1, 52
+; RV64IILLEGALI32-NEXT:    srli a1, a1, 60
+; RV64IILLEGALI32-NEXT:    add a1, a2, a1
+; RV64IILLEGALI32-NEXT:    sll a0, a0, a1
+; RV64IILLEGALI32-NEXT:    ret
+;
+; RV64ZBBILLEGALI32-LABEL: shl_cttz_i16:
+; RV64ZBBILLEGALI32:       # %bb.0: # %entry
+; RV64ZBBILLEGALI32-NEXT:    ctz a1, a1
+; RV64ZBBILLEGALI32-NEXT:    sll a0, a0, a1
+; RV64ZBBILLEGALI32-NEXT:    ret
+;
+; RV64ILEGALI32-LABEL: shl_cttz_i16:
+; RV64ILEGALI32:       # %bb.0: # %entry
+; RV64ILEGALI32-NEXT:    addi a2, a1, -1
+; RV64ILEGALI32-NEXT:    not a1, a1
+; RV64ILEGALI32-NEXT:    and a1, a1, a2
+; RV64ILEGALI32-NEXT:    srliw a2, a1, 1
+; RV64ILEGALI32-NEXT:    lui a3, 5
+; RV64ILEGALI32-NEXT:    addi a3, a3, 1365
+; RV64ILEGALI32-NEXT:    and a2, a2, a3
+; RV64ILEGALI32-NEXT:    subw a1, a1, a2
+; RV64ILEGALI32-NEXT:    lui a2, 3
+; RV64ILEGALI32-NEXT:    addi a2, a2, 819
+; RV64ILEGALI32-NEXT:    and a3, a1, a2
+; RV64ILEGALI32-NEXT:    srliw a1, a1, 2
+; RV64ILEGALI32-NEXT:    and a1, a1, a2
+; RV64ILEGALI32-NEXT:    add a1, a3, a1
+; RV64ILEGALI32-NEXT:    srliw a2, a1, 4
+; RV64ILEGALI32-NEXT:    add a1, a1, a2
+; RV64ILEGALI32-NEXT:    andi a2, a1, 15
+; RV64ILEGALI32-NEXT:    slli a1, a1, 52
+; RV64ILEGALI32-NEXT:    srli a1, a1, 60
+; RV64ILEGALI32-NEXT:    add a1, a2, a1
+; RV64ILEGALI32-NEXT:    sllw a0, a0, a1
+; RV64ILEGALI32-NEXT:    ret
+;
+; RV64ZBBLEGALI32-LABEL: shl_cttz_i16:
+; RV64ZBBLEGALI32:       # %bb.0: # %entry
+; RV64ZBBLEGALI32-NEXT:    ctzw a1, a1
+; RV64ZBBLEGALI32-NEXT:    sllw a0, a0, a1
+; RV64ZBBLEGALI32-NEXT:    ret
+entry:
+  %cttz = call i16 @llvm.cttz.i16(i16 %y, i1 true)
+  %res = shl i16 %x, %cttz
+  ret i16 %res
+}
+
+define i16 @shl_cttz_constant_i16(i16 %y) {
+; RV32I-LABEL: shl_cttz_constant_i16:
+; RV32I:       # %bb.0: # %entry
+; RV32I-NEXT:    addi a1, a0, -1
+; RV32I-NEXT:    not a0, a0
+; RV32I-NEXT:    and a0, a0, a1
+; RV32I-NEXT:    srli a1, a0, 1
+; RV32I-NEXT:    lui a2, 5
+; RV32I-NEXT:    addi a2, a2, 1365
+; RV32I-NEXT:    and a1, a1, a2
+; RV32I-NEXT:    sub a0, a0, a1
+; RV32I-NEXT:    lui a1, 3
+; RV32I-NEXT:    addi a1, a1, 819
+; RV32I-NEXT:    and a2, a0, a1
+; RV32I-NEXT:    srli a0, a0, 2
+; RV32I-NEXT:    and a0, a0, a1
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    srli a1, a0, 4
+; RV32I-NEXT:    add a0, a0, a1
+; RV32I-NEXT:    andi a1, a0, 15
+; RV32I-NEXT:    slli a0, a0, 20
+; RV32I-NEXT:    srli a0, a0, 28
+; RV32I-NEXT:    add a0, a1, a0
+; RV32I-NEXT:    li a1, 4
+; RV32I-NEXT:    sll a0, a1, a0
+; RV32I-NEXT:    ret
+;
+; RV32ZBB-LABEL: shl_cttz_constant_i16:
+; RV32ZBB:       # %bb.0: # %entry
+; RV32ZBB-NEXT:    ctz a0, a0
+; RV32ZBB-NEXT:    li a1, 4
+; RV32ZBB-NEXT:    sll a0, a1, a0
+; RV32ZBB-NEXT:    ret
+;
+; RV64IILLEGALI32-LABEL: shl_cttz_constant_i16:
+; RV64IILLEGALI32:       # %bb.0: # %entry
+; RV64IILLEGALI32-NEXT:    addi a1, a0, -1
+; RV64IILLEGALI32-NEXT:    not a0, a0
+; RV64IILLEGALI32-NEXT:    and a0, a0, a1
+; RV64IILLEGALI32-NEXT:    srli a1, a0, 1
+; RV64IILLEGALI32-NEXT:    lui a2, 5
+; RV64IILLEGALI32-NEXT:    addiw a2, a2, 1365
+; RV64IILLEGALI32-NEXT:    and a1, a1, a2
+; RV64IILLEGALI32-NEXT:    sub a0, a0, a1
+; RV64IILLEGALI32-NEXT:    lui a1, 3
+; RV64IILLEGALI32-NEXT:    addiw a1, a1, 819
+; RV64IILLEGALI32-NEXT:    and a2, a0, a1
+; RV64IILLEGALI32-NEXT:    srli a0, a0, 2
+; RV64IILLEGALI32-NEXT:    and a0, a0, a1
+; RV64IILLEGALI32-NEXT:    add a0, a2, a0
+; RV64IILLEGALI32-NEXT:    srli a1, a0, 4
+; RV64IILLEGALI32-NEXT:    add a0, a0, a1
+; RV64IILLEGALI32-NEXT:    andi a1, a0, 15
+; RV64IILLEGALI32-NEXT:    slli a0, a0, 52
+; RV64IILLEGALI32-NEXT:    srli a0, a0, 60
+; RV64IILLEGALI32-NEXT:    add a0, a1, a0
+; RV64IILLEGALI32-NEXT:    li a1, 4
+; RV64IILLEGALI32-NEXT:    sll a0, a1, a0
+; RV64IILLEGALI32-NEXT:    ret
+;
+; RV64ZBBILLEGALI32-LABEL: shl_cttz_constant_i16:
+; RV64ZBBILLEGALI32:       # %bb.0: # %entry
+; RV64ZBBILLEGALI32-NEXT:    ctz a0, a0
+; RV64ZBBILLEGALI32-NEXT:    li a1, 4
+; RV64ZBBILLEGALI32-NEXT:    sll a0, a1, a0
+; RV64ZBBILLEGALI32-NEXT:    ret
+;
+; RV64ILEGALI32-LABEL: shl_cttz_constant_i16:
+; RV64ILEGALI32:       # %bb.0: # %entry
+; RV64ILEGALI32-NEXT:    addi a1, a0, -1
+; RV64ILEGALI32-NEXT:    not a0, a0
+; RV64ILEGALI32-NEXT:    and a0, a0, a1
+; RV64ILEGALI32-NEXT:    srliw a1, a0, 1
+; RV64ILEGALI32-NEXT:    lui a2, 5
+; RV64ILEGALI32-NEXT:    addi a2, a2, 1365
+; RV64ILEGALI32-NEXT:    and a1, a1, a2
+; RV64ILEGALI32-NEXT:    subw a0, a0, a1
+; RV64ILEGALI32-NEXT:    lui a1, 3
+; RV64ILEGALI32-NEXT:    addi a1, a1, 819
+; RV64ILEGALI32-NEXT:    and a2, a0, a1
+; RV64ILEGALI32-NEXT:    srliw a0, a0, 2
+; RV64ILEGALI32-NEXT:    and a0, a0, a1
+; RV64ILEGALI32-NEXT:    add a0, a2, a0
+; RV64ILEGALI32-NEXT:    srliw a1, a0, 4
+; RV64ILEGALI32-NEXT:    add a0, a0, a1
+; RV64ILEGALI32-NEXT:    andi a1, a0, 15
+; RV64ILEGALI32-NEXT:    slli a0, a0, 52
+; RV64ILEGALI32-NEXT:    srli a0, a0, 60
+; RV64ILEGALI32-NEXT:    add a0, a1, a0
+; RV64ILEGALI32-NEXT:    li a1, 4
+; RV64ILEGALI32-NEXT:    sllw a0, a1, a0
+; RV64ILEGALI32-NEXT:    ret
+;
+; RV64ZBBLEGALI32-LABEL: shl_cttz_constant_i16:
+; RV64ZBBLEGALI32:       # %bb.0: # %entry
+; RV64ZBBLEGALI32-NEXT:    ctzw a0, a0
+; RV64ZBBLEGALI32-NEXT:    li a1, 4
+; RV64ZBBLEGALI32-NEXT:    sllw a0, a1, a0
+; RV64ZBBLEGALI32-NEXT:    ret
+entry:
+  %cttz = call i16 @llvm.cttz.i16(i16 %y, i1 true)
+  %res = shl i16 4, %cttz
+  ret i16 %res
+}
+
+define i32 @shl_cttz_i32(i32 %x, i32 %y) {
+; RV32I-LABEL: shl_cttz_i32:
+; RV32I:       # %bb.0: # %entry
+; RV32I-NEXT:    neg a2, a1
+; RV32I-NEXT:    and a1, a1, a2
+; RV32I-NEXT:    mul a0, a1, a0
+; RV32I-NEXT:    ret
+;
+; RV32ZBB-LABEL: shl_cttz_i32:
+; RV32ZBB:       # %bb.0: # %entry
+; RV32ZBB-NEXT:    ctz a1, a1
+; RV32ZBB-NEXT:    sll a0, a0, a1
+; RV32ZBB-NEXT:    ret
+;
+; RV64I-LABEL: shl_cttz_i32:
+; RV64I:       # %bb.0: # %entry
+; RV64I-NEXT:    negw a2, a1
+; RV64I-NEXT:    and a1, a1, a2
+; RV64I-NEXT:    lui a2, 30667
+; RV64I-NEXT:    addi a2, a2, 1329
+; RV64I-NEXT:    mul a1, a1, a2
+; RV64I-NEXT:    srliw a1, a1, 27
+; RV64I-NEXT:    lui a2, %hi(.LCPI4_0)
+; RV64I-NEXT:    addi a2, a2, %lo(.LCPI4_0)
+; RV64I-NEXT:    add a1, a2, a1
+; RV64I-NEXT:    lbu a1, 0(a1)
+; RV64I-NEXT:    sllw a0, a0, a1
+; RV64I-NEXT:    ret
+;
+; RV64ZBB-LABEL: shl_cttz_i32:
+; RV64ZBB:       # %bb.0: # %entry
+; RV64ZBB-NEXT:    ctzw a1, a1
+; RV64ZBB-NEXT:    sllw a0, a0, a1
+; RV64ZBB-NEXT:    ret
+entry:
+  %cttz = call i32 @llvm.cttz.i32(i32 %y, i1 true)
+  %res = shl i32 %x, %cttz
+  ret i32 %res
+}
+
+define i32 @shl_cttz_i32_zero_is_defined(i32 %x, i32 %y) {
+; RV32I-LABEL: shl_cttz_i32_zero_is_defined:
+; RV32I:       # %bb.0: # %entry
+; RV32I-NEXT:    beqz a1, .LBB5_2
+; RV32I-NEXT:  # %bb.1: # %cond.false
+; RV32I-NEXT:    neg a2, a1
+; RV32I-NEXT:    and a1, a1, a2
+; RV32I-NEXT:    lui a2, 30667
+; RV32I-NEXT:    addi a2, a2, 1329
+; RV32I-NEXT:    mul a1, a1, a2
+; RV32I-NEXT:    srli a1, a1, 27
+; RV32I-NEXT:    lui a2, %hi(.LCPI5_0)
+; RV32I-NEXT:    addi a2, a2, %lo(.LCPI5_0)
+; RV32I-NEXT:    add a1, a2, a1
+; RV32I-NEXT:    lbu a1, 0(a1)
+; RV32I-NEXT:    sll a0, a0, a1
+; RV32I-NEXT:    ret
+; RV32I-NEXT:  .LBB5_2:
+; RV32I-NEXT:    li a1, 32
+; RV32I-NEXT:    sll a0, a0, a1
+; RV32I-NEXT:    ret
+;
+; RV32ZBB-LABEL: shl_cttz_i32_zero_is_defined:
+; RV32ZBB:       # %bb.0: # %entry
+; RV32ZBB-NEXT:    ctz a1, a1
+; RV32ZBB-NEXT:    sll a0, a0, a1
+; RV32ZBB-NEXT:    ret
+;
+; RV64I-LABEL: shl_cttz_i32_zero_is_defined:
+; RV64I:       # %bb.0: # %entry
+; RV64I-NEXT:    sext.w a2, a1
+; RV64I-NEXT:    beqz a2, .LBB5_2
+; RV64I-NEXT:  # %bb.1: # %cond.false
+; RV64I-NEXT:    negw a2, a1
+; RV64I-NEXT:    and a1, a1, a2
+; RV64I-NEXT:    lui a2, 30667
+; RV64I-NEXT:    addi a2, a2, 1329
+; RV64I-NEXT:    mul a1, a1, a2
+; RV64I-NEXT:    srliw a1, a1, 27
+; RV64I-NEXT:    lui a2, %hi(.LCPI5_0)
+; RV64I-NEXT:    addi a2, a2, %lo(.LCPI5_0)
+; RV64I-NEXT:    add a1, a2, a1
+; RV64I-NEXT:    lbu a1, 0(a1)
+; RV64I-NEXT:    sllw a0, a0, a1
+; RV64I-NEXT:    ret
+; RV64I-NEXT:  .LBB5_2:
+; RV64I-NEXT:    li a1, 32
+; RV64I-NEXT:    sllw a0, a0, a1
+; RV64I-NEXT:    ret
+;
+; RV64ZBB-LABEL: shl_cttz_i32_zero_is_defined:
+; RV64ZBB:       # %bb.0: # %entry
+; RV64ZBB-NEXT:    ctzw a1, a1
+; RV64ZBB-NEXT:    sllw a0, a0, a1
+; RV64ZBB-NEXT:    ret
+entry:
+  %cttz = call i32 @llvm.cttz.i32(i32 %y, i1 false)
+  %res = shl i32 %x, %cttz
+  ret i32 %res
+}
+
+define i32 @shl_cttz_constant_i32(i32 %y) {
+; RV32I-LABEL: shl_cttz_constant_i32:
+; RV32I:       # %bb.0: # %entry
+; RV32I-NEXT:    neg a1, a0
+; RV32I-NEXT:    and a0, a0, a1
+; RV32I-NEXT:    slli a0, a0, 2
+; RV32I-NEXT:    ret
+;
+; RV32ZBB-LABEL: shl_cttz_constant_i32:
+; RV32ZBB:       # %bb.0: # %entry
+; RV32ZBB-NEXT:    ctz a0, a0
+; RV32ZBB-NEXT:    li a1, 4
+; RV32ZBB-NEXT:    sll a0, a1, a0
+; RV32ZBB-NEXT:    ret
+;
+; RV64I-LABEL: shl_cttz_constant_i32:
+; RV64I:       # %bb.0: # %entry
+; RV64I-NEXT:    negw a1, a0
+; RV64I-NEXT:    and a0, a0, a1
+; RV64I-NEXT:    lui a1, 30667
+; RV64I-NEXT:    addi a1, a1, 1329
+; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    srliw a0, a0, 27
+; RV64I-NEXT:    lui a1, %hi(.LCPI6_0)
+; RV64I-NEXT:    addi a1, a1, %lo(.LCPI6_0)
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    lbu a0, 0(a0)
+; RV64I-NEXT:    li a1, 4
+; RV64I-NEXT:    sllw a0, a1, a0
+; RV64I-NEXT:    ret
+;
+; RV64ZBB-LABEL: shl_cttz_constant_i32:
+; RV64ZBB:       # %bb.0: # %entry
+; RV64ZBB-NEXT:    ctzw a0, a0
+; RV64ZBB-NEXT:    li a1, 4
+; RV64ZBB-NEXT:    sllw a0, a1, a0
+; RV64ZBB-NEXT:    ret
+entry:
+  %cttz = call i32 @llvm.cttz.i32(i32 %y, i1 true)
+  %res = shl i32 4, %cttz
+  ret i32 %res
+}
+
+define i32 @shl_cttz_multiuse_i32(i32 %x, i32 %y) {
+; RV32I-LABEL: shl_cttz_multiuse_i32:
+; RV32I:       # %bb.0: # %entry
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    .cfi_def_cfa_offset 16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    sw s0, 8(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    sw s1, 4(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    .cfi_offset ra, -4
+; RV32I-NEXT:    .cfi_offset s0, -8
+; RV32I-NEXT:    .cfi_offset s1, -12
+; RV32I-NEXT:    neg a2, a1
+; RV32I-NEXT:    and a1, a1, a2
+; RV32I-NEXT:    lui a2, 30667
+; RV32I-NEXT:    addi a2, a2, 1329
+; RV32I-NEXT:    mul a1, a1, a2
+; RV32I-NEXT:    srli a1, a1, 27
+; RV32I-NEXT:    lui a2, %hi(.LCPI7_0)
+; RV32I-NEXT:    addi a2, a2, %lo(.LCPI7_0)
+; RV32I-NEXT:    add a1, a2, a1
+; RV32I-NEXT:    lbu s0, 0(a1)
+; RV32I-NEXT:    mv s1, a0
+; RV32I-NEXT:    mv a0, s0
+; RV32I-NEXT:    call use32
+; RV32I-NEXT:    sll a0, s1, s0
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    lw s0, 8(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    lw s1, 4(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret...
[truncated]

dtcxzyw (Member, Author) commented Mar 22, 2024

> Please move this into DAGCombine (and GISel if you want to handle both)

Done (only for DAGCombine).

// fold (shl X, cttz(Y)) -> (mul (Y & -Y), X) if cttz is unsupported on the
// target.
if ((N1.getOpcode() == ISD::CTTZ || N1.getOpcode() == ISD::CTTZ_ZERO_UNDEF) &&
    N1.hasOneUse() && !TLI.isOperationLegalOrCustom(ISD::CTTZ, VT) &&

Contributor:

You check cttz||cttz_zero_undef but hardcode the opcode in the legality check. Should you check for getOpcode's legality instead?

Member (Author):

I hardcode the opcode to avoid introducing regressions on rv64+zbb :(
Do you have a better solution?

Contributor:

Right condition might be isLegalOrCustom(CTTZ||CTTZ_ZERO_UNDEF)

Member (Author):

> Right condition might be isLegalOrCustom(CTTZ||CTTZ_ZERO_UNDEF)

Unfortunately it doesn't work :(

Member (Author):

Is it suitable to add a TLI hook?

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (outdated review thread)
@dtcxzyw dtcxzyw force-pushed the perf/shl-cttz-to-mul-lsb branch from 41b7713 to 728dad3 Compare April 11, 2024 14:31
arsenm (Contributor) left a comment

Can you also port the same to globalisel?

llvm/test/CodeGen/RISCV/shl-cttz.ll (outdated review thread)

dtcxzyw (Member, Author) commented May 27, 2024

Ping

arsenm (Contributor) commented May 28, 2024

> Ping

This is already approved?

RKSimon (Collaborator) left a comment

LGTM

@dtcxzyw dtcxzyw merged commit 9e8ecce into llvm:main May 29, 2024
4 checks passed
@dtcxzyw dtcxzyw deleted the perf/shl-cttz-to-mul-lsb branch May 29, 2024 10:26
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
…is unsupported (llvm#85066)

This patch folds `shl X, cttz(Y)` to `mul (Y & -Y), X` if cttz is
unsupported by the target.
Alive2: https://alive2.llvm.org/ce/z/AtLN5Y
Fixes llvm#84763.
nathanchance (Member) commented

This patch causes a crash when building the Linux kernel for PowerPC.

A reduced C reproducer from cvise:

struct {
  short active_links;
} *iwl_mvm_exit_esr_vif;
short iwl_mvm_exit_esr_new_active_links;
void iwl_mvm_exit_esr(int link_to_keep) {
  int __trans_tmp_10;
  if (({
        int __ret_warn_on =
            iwl_mvm_exit_esr_vif->active_links & 1UL << link_to_keep;
        __asm__("");
        __builtin_expect(__ret_warn_on, 0);
      })) {
    long word = iwl_mvm_exit_esr_vif->active_links;
    __trans_tmp_10 = __builtin_ctzl(word);
    link_to_keep = __trans_tmp_10;
  }
  iwl_mvm_exit_esr_new_active_links = 1UL << link_to_keep;
}
$ clang --target=powerpc64le-linux-gnu -O2 -c -o /dev/null link.i
clang: /home/nathan/tmp/cvise.5XHgoAhXPL/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:6878: SDValue llvm::SelectionDAG::getNode(unsigned int, const SDLoc &, EVT, SDValue, SDValue, const SDNodeFlags): Assertion `N1.getValueType() == N2.getValueType() && N1.getValueType() == VT && "Binary operator types must match!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: clang --target=powerpc64le-linux-gnu -O2 -c -o /dev/null link.i
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'link.i'.
4.      Running pass 'PowerPC DAG->DAG Pattern Instruction Selection' on function '@iwl_mvm_exit_esr'
 #0 0x00000000049d6f1c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x49d6f1c)
 #1 0x00000000049d4e64 llvm::sys::RunSignalHandlers() (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x49d4e64)
 #2 0x000000000495fc50 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x0000ffffa59ea810 (linux-vdso.so.1+0x810)
 #4 0x0000ffffa52b85e0 __pthread_kill_implementation (/lib64/libc.so.6+0x985e0)
 #5 0x0000ffffa52659c0 gsignal (/lib64/libc.so.6+0x459c0)
 #6 0x0000ffffa5250288 abort (/lib64/libc.so.6+0x30288)
 #7 0x0000ffffa525e3c0 __assert_fail_base (/lib64/libc.so.6+0x3e3c0)
 #8 0x0000ffffa525e434 (/lib64/libc.so.6+0x3e434)
 #9 0x0000000005a020c0 llvm::SelectionDAG::getNode(unsigned int, llvm::SDLoc const&, llvm::EVT, llvm::SDValue, llvm::SDValue, llvm::SDNodeFlags) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x5a020c0)
#10 0x00000000059e2980 llvm::SelectionDAG::getNegative(llvm::SDValue, llvm::SDLoc const&, llvm::EVT) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x59e2980)
#11 0x00000000058da048 (anonymous namespace)::DAGCombiner::visitSHL(llvm::SDNode*) DAGCombiner.cpp:0:0
#12 0x00000000058cbf14 (anonymous namespace)::DAGCombiner::combine(llvm::SDNode*) DAGCombiner.cpp:0:0
#13 0x00000000058c92c8 llvm::SelectionDAG::Combine(llvm::CombineLevel, llvm::AAResults*, llvm::CodeGenOptLevel) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x58c92c8)
#14 0x0000000005a3761c llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x5a3761c)
#15 0x0000000005a35d1c llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x5a35d1c)
#16 0x0000000005a32cb0 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x5a32cb0)
#17 0x00000000035e4b74 (anonymous namespace)::PPCDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) PPCISelDAGToDAG.cpp:0:0
#18 0x0000000003febc68 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x3febc68)
#19 0x0000000004527398 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x4527398)
#20 0x000000000452ee20 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x452ee20)
#21 0x0000000004527cb8 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x4527cb8)
#22 0x0000000005123024 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x5123024)
#23 0x0000000005143aec clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x5143aec)
#24 0x00000000062f18cc clang::ParseAST(clang::Sema&, bool, bool) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x62f18cc)
#25 0x00000000054db43c clang::FrontendAction::Execute() (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x54db43c)
#26 0x0000000005462164 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x5462164)
#27 0x00000000055a62a8 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x55a62a8)
#28 0x0000000002d186d0 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x2d186d0)
#29 0x0000000002d15528 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#30 0x00000000053081b4 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0>(long) Job.cpp:0:0
#31 0x000000000495f9b8 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x495f9b8)
#32 0x0000000005307964 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x5307964)
#33 0x00000000052cf6a8 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x52cf6a8)
#34 0x00000000052cf8f4 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x52cf8f4)
#35 0x00000000052e86b4 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x52e86b4)
#36 0x0000000002d148a0 clang_main(int, char**, llvm::ToolContext const&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x2d148a0)
#37 0x0000000002d229d8 main (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x2d229d8)
#38 0x0000ffffa5250b1c __libc_start_call_main (/lib64/libc.so.6+0x30b1c)
#39 0x0000ffffa5250bfc __libc_start_main@GLIBC_2.17 (/lib64/libc.so.6+0x30bfc)
#40 0x0000000002d130f0 _start (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/clang-19+0x2d130f0)
clang: error: clang frontend command failed with exit code 134 (use -v to see invocation)
…

A reduced LLVM IR reproducer from llvm-reduce:

target datalayout = "e-m:e-Fn32-i64:64-n32:64-S128-v256:256:256-v512:512:512"
target triple = "powerpc64le-unknown-linux-gnu"

define void @iwl_mvm_exit_esr(i16 %0) {
entry:
  %1 = tail call i16 @llvm.cttz.i16(i16 %0, i1 false)
  %2 = zext i16 %1 to i64
  %.pre9 = shl i64 1, %2
  %conv7 = trunc i64 %.pre9 to i16
  store i16 %conv7, ptr null, align 2
  ret void
}

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare i16 @llvm.cttz.i16(i16, i1 immarg) #0

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
$ llc -o /dev/null reduced.ll 
llc: /home/nathan/tmp/cvise.5XHgoAhXPL/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:6878: SDValue llvm::SelectionDAG::getNode(unsigned int, const SDLoc &, EVT, SDValue, SDValue, const SDNodeFlags): Assertion `N1.getValueType() == N2.getValueType() && N1.getValueType() == VT && "Binary operator types must match!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: llc -o /dev/null reduced.ll
1.      Running pass 'Function Pass Manager' on module 'reduced.ll'.
2.      Running pass 'PowerPC DAG->DAG Pattern Instruction Selection' on function '@iwl_mvm_exit_esr'
 #0 0x00000000048c153c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x48c153c)
 #1 0x00000000048bf3e4 llvm::sys::RunSignalHandlers() (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x48bf3e4)
 #2 0x00000000048c1c58 SignalHandler(int) Signals.cpp:0:0
 #3 0x0000ffff81a05810 (linux-vdso.so.1+0x810)
 #4 0x0000ffff812b85e0 __pthread_kill_implementation (/lib64/libc.so.6+0x985e0)
 #5 0x0000ffff812659c0 gsignal (/lib64/libc.so.6+0x459c0)
 #6 0x0000ffff81250288 abort (/lib64/libc.so.6+0x30288)
 #7 0x0000ffff8125e3c0 __assert_fail_base (/lib64/libc.so.6+0x3e3c0)
 #8 0x0000ffff8125e434 (/lib64/libc.so.6+0x3e434)
 #9 0x00000000046cb000 llvm::SelectionDAG::getNode(unsigned int, llvm::SDLoc const&, llvm::EVT, llvm::SDValue, llvm::SDValue, llvm::SDNodeFlags) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x46cb000)
#10 0x00000000046ab460 llvm::SelectionDAG::getNegative(llvm::SDValue, llvm::SDLoc const&, llvm::EVT) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x46ab460)
#11 0x0000000004571a8c (anonymous namespace)::DAGCombiner::visitSHL(llvm::SDNode*) DAGCombiner.cpp:0:0
#12 0x0000000004563958 (anonymous namespace)::DAGCombiner::combine(llvm::SDNode*) DAGCombiner.cpp:0:0
#13 0x0000000004560d0c llvm::SelectionDAG::Combine(llvm::CombineLevel, llvm::AAResults*, llvm::CodeGenOptLevel) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x4560d0c)
#14 0x0000000004702a64 llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x4702a64)
#15 0x0000000004701164 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x4701164)
#16 0x00000000046fe0f8 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x46fe0f8)
#17 0x0000000002ebf9d8 (anonymous namespace)::PPCDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) PPCISelDAGToDAG.cpp:0:0
#18 0x0000000003a85f44 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x3a85f44)
#19 0x0000000003f9cc50 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x3f9cc50)
#20 0x0000000003fa4794 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x3fa4794)
#21 0x0000000003f9d600 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x3f9d600)
#22 0x0000000002665bb8 main (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x2665bb8)
#23 0x0000ffff81250b1c __libc_start_call_main (/lib64/libc.so.6+0x30b1c)
#24 0x0000ffff81250bfc __libc_start_main@GLIBC_2.17 (/lib64/libc.so.6+0x30bfc)
#25 0x00000000026602b0 _start (/home/nathan/tmp/cvise.5XHgoAhXPL/install/llvm-bad/bin/llc+0x26602b0)

@dtcxzyw
Member Author

dtcxzyw commented May 31, 2024

I will have a look.

@dtcxzyw
Member Author

dtcxzyw commented May 31, 2024

Same problem as #92753. I will post a fix later :)

@dtcxzyw
Member Author

dtcxzyw commented May 31, 2024

@nathanchance Should be fixed by #94008.

dtcxzyw added a commit that referenced this pull request Jun 1, 2024
… X)` (#94008)

Proof: https://alive2.llvm.org/ce/z/J7GBMU

As in #92753, the LHS and RHS of a shift node
may have different types.
+ When VT is smaller than ShiftVT, it is safe to use trunc.
+ When VT is larger than ShiftVT, it is safe to use zext iff
`is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`). See
also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the Alive2
proofs.
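
To make the zext case concrete, here is a minimal sketch in plain C++ that models the two forms with fixed-width integers rather than SelectionDAG nodes. It assumes a 16-bit ShiftVT and a 64-bit VT, loosely following the `i16` cttz and `i64` shift in the reduced IR. The helper names (`cttz16`, `lowestSetBit16`, `shlByCttz`, `mulByLowBit`) are made up for illustration and are not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical model of llvm.cttz.i16 with is_zero_poison = false:
// counts trailing zero bits and returns 16 when the input is zero.
constexpr unsigned cttz16(std::uint16_t y) {
  unsigned n = 0;
  while (n < 16 && ((y >> n) & 1u) == 0)
    ++n;
  return n;
}

// Y & -Y evaluated in the narrow (16-bit) type: isolates the lowest set bit,
// or yields 0 when Y is 0.
constexpr std::uint16_t lowestSetBit16(std::uint16_t y) {
  return static_cast<std::uint16_t>(y & (0u - y));
}

// Source form: shl X, cttz(Y), with X in the wide (64-bit) type.
// The shift amount is at most 16, so the 64-bit shift is always defined.
constexpr std::uint64_t shlByCttz(std::uint64_t x, std::uint16_t y) {
  return x << cttz16(y);
}

// Target form: mul (zext (Y & -Y)), X, with the mask zero-extended to 64 bits.
constexpr std::uint64_t mulByLowBit(std::uint64_t x, std::uint16_t y) {
  return static_cast<std::uint64_t>(lowestSetBit16(y)) * x;
}
```

For nonzero Y the two forms agree: `shlByCttz(3, 8)` and `mulByLowBit(3, 8)` both give 24. For Y == 0 the non-poison cttz returns 16, so the shift yields `X << 16` while the multiply yields 0. That mismatch is why the zext form is only valid when a zero input is poison.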

Fixes issue
#85066 (comment).
Labels
llvm:SelectionDAG SelectionDAGISel as well
Development

Successfully merging this pull request may close these issues.

Multiply by a power of 2 and ctz+shift should often be interchangeable
8 participants