[DAGCombine] Transform `shl X, cttz(Y)` to `mul (Y & -Y), X` if cttz is unsupported #85066
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter.
Does this need to happen in CGP for some reason, or would DAGCombine also work?
I implement this in CGP to avoid duplicating the logic in GISel.
CGP should only be used for transforms that require cross-block reasoning, which does not seem to be the case here. Aspirationally GlobalISel does not need CGP at all, because it can perform those optimizations itself. (Realistically, we are far from that...)
I agree this should be done in DAGCombiner.
llvm/lib/CodeGen/CodeGenPrepare.cpp
Outdated
// shl X, cttz(Y) -> mul (Y & -Y), X if cttz is unsupported on the target.
Value *Y;
if (match(I->getOperand(1),
          m_OneUse(m_Intrinsic<Intrinsic::cttz>(m_Value(Y))))) {
You can match an intrinsic without specifying a match for all operands? That's surprising.
@arsenm Any comments?
Yes, this is straightforward combining. The downside is then you have to do it twice, in the DAG and GISel
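For reference, a hypothetical IR-level sketch of the rewrite being discussed (`shl X, cttz(Y)` -> `mul (Y & -Y), X`) — this is not the code from this PR; the helper name `foldShlByCttz` and the omission of the target-legality query are assumptions made purely for illustration:

```cpp
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/PatternMatch.h"

using namespace llvm;
using namespace llvm::PatternMatch;

// Rewrite `shl X, cttz(Y)` into `mul (Y & -Y), X` at the IR level.
// Y & -Y isolates the lowest set bit of Y, i.e. 1 << cttz(Y) for Y != 0.
static bool foldShlByCttz(BinaryOperator *I) {
  Value *Y;
  if (I->getOpcode() != Instruction::Shl ||
      !match(I->getOperand(1),
             m_OneUse(m_Intrinsic<Intrinsic::cttz>(m_Value(Y)))))
    return false;

  IRBuilder<> Builder(I);
  Value *NegY = Builder.CreateNeg(Y);
  Value *LowBit = Builder.CreateAnd(Y, NegY);
  Value *Mul = Builder.CreateMul(LowBit, I->getOperand(0));
  I->replaceAllUsesWith(Mul);
  I->eraseFromParent();
  return true;
}
```

A real implementation would additionally consult TargetLowering (as the DAGCombine version below does) so the rewrite only fires when cttz is not natively supported but mul is.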
Please move this into DAGCombine (and GISel if you want to handle both)
@llvm/pr-subscribers-llvm-selectiondag
Author: Yingwei Zheng (dtcxzyw)
Changes: This patch folds `shl X, cttz(Y)` to `mul (Y & -Y), X` if cttz is unsupported by the target.
Patch is 28.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/85066.diff
2 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index dcd0310734ad72..a77054d1e33d61 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -9962,6 +9962,18 @@ SDValue DAGCombiner::visitSHL(SDNode *N) {
if (SDValue NewSHL = visitShiftByConstant(N))
return NewSHL;
+ // fold (shl X, cttz(Y)) -> (mul (Y & -Y), X) if cttz is unsupported on the
+ // target.
+ if ((N1.getOpcode() == ISD::CTTZ || N1.getOpcode() == ISD::CTTZ_ZERO_UNDEF) &&
+ N1.hasOneUse() && !TLI.isOperationLegalOrCustom(ISD::CTTZ, VT) &&
+ TLI.isOperationLegalOrCustom(ISD::MUL, VT)) {
+ SDValue Y = N1.getOperand(0);
+ SDLoc DL(N);
+ SDValue NegY = DAG.getNode(ISD::SUB, DL, VT, DAG.getConstant(0, DL, VT), Y);
+ SDValue And = DAG.getNode(ISD::AND, DL, VT, Y, NegY);
+ return DAG.getNode(ISD::MUL, DL, VT, And, N0);
+ }
+
if (SimplifyDemandedBits(SDValue(N, 0)))
return SDValue(N, 0);
diff --git a/llvm/test/CodeGen/RISCV/shl-cttz.ll b/llvm/test/CodeGen/RISCV/shl-cttz.ll
new file mode 100644
index 00000000000000..e3ed16d4971410
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/shl-cttz.ll
@@ -0,0 +1,807 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc -mtriple=riscv32 -mattr=+m -verify-machineinstrs < %s \
+; RUN: | FileCheck %s -check-prefix=RV32I
+; RUN: llc -mtriple=riscv32 -mattr=+m,+zbb -verify-machineinstrs < %s \
+; RUN: | FileCheck %s -check-prefix=RV32ZBB
+; RUN: llc -mtriple=riscv64 -mattr=+m -verify-machineinstrs < %s \
+; RUN: | FileCheck %s -check-prefixes=RV64I,RV64IILLEGALI32
+; RUN: llc -mtriple=riscv64 -mattr=+m,+zbb -verify-machineinstrs < %s \
+; RUN: | FileCheck %s -check-prefixes=RV64ZBB,RV64ZBBILLEGALI32
+; RUN: llc -mtriple=riscv64 -mattr=+m -riscv-experimental-rv64-legal-i32 -verify-machineinstrs < %s \
+; RUN: | FileCheck %s -check-prefixes=RV64I,RV64ILEGALI32
+; RUN: llc -mtriple=riscv64 -mattr=+m,+zbb -riscv-experimental-rv64-legal-i32 -verify-machineinstrs < %s \
+; RUN: | FileCheck %s -check-prefixes=RV64ZBB,RV64ZBBLEGALI32
+
+define i8 @shl_cttz_i8(i8 %x, i8 %y) {
+; RV32I-LABEL: shl_cttz_i8:
+; RV32I: # %bb.0: # %entry
+; RV32I-NEXT: addi a2, a1, -1
+; RV32I-NEXT: not a1, a1
+; RV32I-NEXT: and a1, a1, a2
+; RV32I-NEXT: srli a2, a1, 1
+; RV32I-NEXT: andi a2, a2, 85
+; RV32I-NEXT: sub a1, a1, a2
+; RV32I-NEXT: andi a2, a1, 51
+; RV32I-NEXT: srli a1, a1, 2
+; RV32I-NEXT: andi a1, a1, 51
+; RV32I-NEXT: add a1, a2, a1
+; RV32I-NEXT: srli a2, a1, 4
+; RV32I-NEXT: add a1, a1, a2
+; RV32I-NEXT: andi a1, a1, 15
+; RV32I-NEXT: sll a0, a0, a1
+; RV32I-NEXT: ret
+;
+; RV32ZBB-LABEL: shl_cttz_i8:
+; RV32ZBB: # %bb.0: # %entry
+; RV32ZBB-NEXT: ctz a1, a1
+; RV32ZBB-NEXT: sll a0, a0, a1
+; RV32ZBB-NEXT: ret
+;
+; RV64IILLEGALI32-LABEL: shl_cttz_i8:
+; RV64IILLEGALI32: # %bb.0: # %entry
+; RV64IILLEGALI32-NEXT: addi a2, a1, -1
+; RV64IILLEGALI32-NEXT: not a1, a1
+; RV64IILLEGALI32-NEXT: and a1, a1, a2
+; RV64IILLEGALI32-NEXT: srli a2, a1, 1
+; RV64IILLEGALI32-NEXT: andi a2, a2, 85
+; RV64IILLEGALI32-NEXT: subw a1, a1, a2
+; RV64IILLEGALI32-NEXT: andi a2, a1, 51
+; RV64IILLEGALI32-NEXT: srli a1, a1, 2
+; RV64IILLEGALI32-NEXT: andi a1, a1, 51
+; RV64IILLEGALI32-NEXT: add a1, a2, a1
+; RV64IILLEGALI32-NEXT: srli a2, a1, 4
+; RV64IILLEGALI32-NEXT: add a1, a1, a2
+; RV64IILLEGALI32-NEXT: andi a1, a1, 15
+; RV64IILLEGALI32-NEXT: sll a0, a0, a1
+; RV64IILLEGALI32-NEXT: ret
+;
+; RV64ZBBILLEGALI32-LABEL: shl_cttz_i8:
+; RV64ZBBILLEGALI32: # %bb.0: # %entry
+; RV64ZBBILLEGALI32-NEXT: ctz a1, a1
+; RV64ZBBILLEGALI32-NEXT: sll a0, a0, a1
+; RV64ZBBILLEGALI32-NEXT: ret
+;
+; RV64ILEGALI32-LABEL: shl_cttz_i8:
+; RV64ILEGALI32: # %bb.0: # %entry
+; RV64ILEGALI32-NEXT: addi a2, a1, -1
+; RV64ILEGALI32-NEXT: not a1, a1
+; RV64ILEGALI32-NEXT: and a1, a1, a2
+; RV64ILEGALI32-NEXT: srliw a2, a1, 1
+; RV64ILEGALI32-NEXT: andi a2, a2, 85
+; RV64ILEGALI32-NEXT: subw a1, a1, a2
+; RV64ILEGALI32-NEXT: andi a2, a1, 51
+; RV64ILEGALI32-NEXT: srliw a1, a1, 2
+; RV64ILEGALI32-NEXT: andi a1, a1, 51
+; RV64ILEGALI32-NEXT: add a1, a2, a1
+; RV64ILEGALI32-NEXT: srliw a2, a1, 4
+; RV64ILEGALI32-NEXT: add a1, a1, a2
+; RV64ILEGALI32-NEXT: andi a1, a1, 15
+; RV64ILEGALI32-NEXT: sllw a0, a0, a1
+; RV64ILEGALI32-NEXT: ret
+;
+; RV64ZBBLEGALI32-LABEL: shl_cttz_i8:
+; RV64ZBBLEGALI32: # %bb.0: # %entry
+; RV64ZBBLEGALI32-NEXT: ctzw a1, a1
+; RV64ZBBLEGALI32-NEXT: sllw a0, a0, a1
+; RV64ZBBLEGALI32-NEXT: ret
+entry:
+ %cttz = call i8 @llvm.cttz.i8(i8 %y, i1 true)
+ %res = shl i8 %x, %cttz
+ ret i8 %res
+}
+
+define i8 @shl_cttz_constant_i8(i8 %y) {
+; RV32I-LABEL: shl_cttz_constant_i8:
+; RV32I: # %bb.0: # %entry
+; RV32I-NEXT: addi a1, a0, -1
+; RV32I-NEXT: not a0, a0
+; RV32I-NEXT: and a0, a0, a1
+; RV32I-NEXT: srli a1, a0, 1
+; RV32I-NEXT: andi a1, a1, 85
+; RV32I-NEXT: sub a0, a0, a1
+; RV32I-NEXT: andi a1, a0, 51
+; RV32I-NEXT: srli a0, a0, 2
+; RV32I-NEXT: andi a0, a0, 51
+; RV32I-NEXT: add a0, a1, a0
+; RV32I-NEXT: srli a1, a0, 4
+; RV32I-NEXT: add a0, a0, a1
+; RV32I-NEXT: andi a0, a0, 15
+; RV32I-NEXT: li a1, 4
+; RV32I-NEXT: sll a0, a1, a0
+; RV32I-NEXT: ret
+;
+; RV32ZBB-LABEL: shl_cttz_constant_i8:
+; RV32ZBB: # %bb.0: # %entry
+; RV32ZBB-NEXT: ctz a0, a0
+; RV32ZBB-NEXT: li a1, 4
+; RV32ZBB-NEXT: sll a0, a1, a0
+; RV32ZBB-NEXT: ret
+;
+; RV64IILLEGALI32-LABEL: shl_cttz_constant_i8:
+; RV64IILLEGALI32: # %bb.0: # %entry
+; RV64IILLEGALI32-NEXT: addi a1, a0, -1
+; RV64IILLEGALI32-NEXT: not a0, a0
+; RV64IILLEGALI32-NEXT: and a0, a0, a1
+; RV64IILLEGALI32-NEXT: srli a1, a0, 1
+; RV64IILLEGALI32-NEXT: andi a1, a1, 85
+; RV64IILLEGALI32-NEXT: subw a0, a0, a1
+; RV64IILLEGALI32-NEXT: andi a1, a0, 51
+; RV64IILLEGALI32-NEXT: srli a0, a0, 2
+; RV64IILLEGALI32-NEXT: andi a0, a0, 51
+; RV64IILLEGALI32-NEXT: add a0, a1, a0
+; RV64IILLEGALI32-NEXT: srli a1, a0, 4
+; RV64IILLEGALI32-NEXT: add a0, a0, a1
+; RV64IILLEGALI32-NEXT: andi a0, a0, 15
+; RV64IILLEGALI32-NEXT: li a1, 4
+; RV64IILLEGALI32-NEXT: sll a0, a1, a0
+; RV64IILLEGALI32-NEXT: ret
+;
+; RV64ZBBILLEGALI32-LABEL: shl_cttz_constant_i8:
+; RV64ZBBILLEGALI32: # %bb.0: # %entry
+; RV64ZBBILLEGALI32-NEXT: ctz a0, a0
+; RV64ZBBILLEGALI32-NEXT: li a1, 4
+; RV64ZBBILLEGALI32-NEXT: sll a0, a1, a0
+; RV64ZBBILLEGALI32-NEXT: ret
+;
+; RV64ILEGALI32-LABEL: shl_cttz_constant_i8:
+; RV64ILEGALI32: # %bb.0: # %entry
+; RV64ILEGALI32-NEXT: addi a1, a0, -1
+; RV64ILEGALI32-NEXT: not a0, a0
+; RV64ILEGALI32-NEXT: and a0, a0, a1
+; RV64ILEGALI32-NEXT: srliw a1, a0, 1
+; RV64ILEGALI32-NEXT: andi a1, a1, 85
+; RV64ILEGALI32-NEXT: subw a0, a0, a1
+; RV64ILEGALI32-NEXT: andi a1, a0, 51
+; RV64ILEGALI32-NEXT: srliw a0, a0, 2
+; RV64ILEGALI32-NEXT: andi a0, a0, 51
+; RV64ILEGALI32-NEXT: add a0, a1, a0
+; RV64ILEGALI32-NEXT: srliw a1, a0, 4
+; RV64ILEGALI32-NEXT: add a0, a0, a1
+; RV64ILEGALI32-NEXT: andi a0, a0, 15
+; RV64ILEGALI32-NEXT: li a1, 4
+; RV64ILEGALI32-NEXT: sllw a0, a1, a0
+; RV64ILEGALI32-NEXT: ret
+;
+; RV64ZBBLEGALI32-LABEL: shl_cttz_constant_i8:
+; RV64ZBBLEGALI32: # %bb.0: # %entry
+; RV64ZBBLEGALI32-NEXT: ctzw a0, a0
+; RV64ZBBLEGALI32-NEXT: li a1, 4
+; RV64ZBBLEGALI32-NEXT: sllw a0, a1, a0
+; RV64ZBBLEGALI32-NEXT: ret
+entry:
+ %cttz = call i8 @llvm.cttz.i8(i8 %y, i1 true)
+ %res = shl i8 4, %cttz
+ ret i8 %res
+}
+
+define i16 @shl_cttz_i16(i16 %x, i16 %y) {
+; RV32I-LABEL: shl_cttz_i16:
+; RV32I: # %bb.0: # %entry
+; RV32I-NEXT: addi a2, a1, -1
+; RV32I-NEXT: not a1, a1
+; RV32I-NEXT: and a1, a1, a2
+; RV32I-NEXT: srli a2, a1, 1
+; RV32I-NEXT: lui a3, 5
+; RV32I-NEXT: addi a3, a3, 1365
+; RV32I-NEXT: and a2, a2, a3
+; RV32I-NEXT: sub a1, a1, a2
+; RV32I-NEXT: lui a2, 3
+; RV32I-NEXT: addi a2, a2, 819
+; RV32I-NEXT: and a3, a1, a2
+; RV32I-NEXT: srli a1, a1, 2
+; RV32I-NEXT: and a1, a1, a2
+; RV32I-NEXT: add a1, a3, a1
+; RV32I-NEXT: srli a2, a1, 4
+; RV32I-NEXT: add a1, a1, a2
+; RV32I-NEXT: andi a2, a1, 15
+; RV32I-NEXT: slli a1, a1, 20
+; RV32I-NEXT: srli a1, a1, 28
+; RV32I-NEXT: add a1, a2, a1
+; RV32I-NEXT: sll a0, a0, a1
+; RV32I-NEXT: ret
+;
+; RV32ZBB-LABEL: shl_cttz_i16:
+; RV32ZBB: # %bb.0: # %entry
+; RV32ZBB-NEXT: ctz a1, a1
+; RV32ZBB-NEXT: sll a0, a0, a1
+; RV32ZBB-NEXT: ret
+;
+; RV64IILLEGALI32-LABEL: shl_cttz_i16:
+; RV64IILLEGALI32: # %bb.0: # %entry
+; RV64IILLEGALI32-NEXT: addi a2, a1, -1
+; RV64IILLEGALI32-NEXT: not a1, a1
+; RV64IILLEGALI32-NEXT: and a1, a1, a2
+; RV64IILLEGALI32-NEXT: srli a2, a1, 1
+; RV64IILLEGALI32-NEXT: lui a3, 5
+; RV64IILLEGALI32-NEXT: addiw a3, a3, 1365
+; RV64IILLEGALI32-NEXT: and a2, a2, a3
+; RV64IILLEGALI32-NEXT: sub a1, a1, a2
+; RV64IILLEGALI32-NEXT: lui a2, 3
+; RV64IILLEGALI32-NEXT: addiw a2, a2, 819
+; RV64IILLEGALI32-NEXT: and a3, a1, a2
+; RV64IILLEGALI32-NEXT: srli a1, a1, 2
+; RV64IILLEGALI32-NEXT: and a1, a1, a2
+; RV64IILLEGALI32-NEXT: add a1, a3, a1
+; RV64IILLEGALI32-NEXT: srli a2, a1, 4
+; RV64IILLEGALI32-NEXT: add a1, a1, a2
+; RV64IILLEGALI32-NEXT: andi a2, a1, 15
+; RV64IILLEGALI32-NEXT: slli a1, a1, 52
+; RV64IILLEGALI32-NEXT: srli a1, a1, 60
+; RV64IILLEGALI32-NEXT: add a1, a2, a1
+; RV64IILLEGALI32-NEXT: sll a0, a0, a1
+; RV64IILLEGALI32-NEXT: ret
+;
+; RV64ZBBILLEGALI32-LABEL: shl_cttz_i16:
+; RV64ZBBILLEGALI32: # %bb.0: # %entry
+; RV64ZBBILLEGALI32-NEXT: ctz a1, a1
+; RV64ZBBILLEGALI32-NEXT: sll a0, a0, a1
+; RV64ZBBILLEGALI32-NEXT: ret
+;
+; RV64ILEGALI32-LABEL: shl_cttz_i16:
+; RV64ILEGALI32: # %bb.0: # %entry
+; RV64ILEGALI32-NEXT: addi a2, a1, -1
+; RV64ILEGALI32-NEXT: not a1, a1
+; RV64ILEGALI32-NEXT: and a1, a1, a2
+; RV64ILEGALI32-NEXT: srliw a2, a1, 1
+; RV64ILEGALI32-NEXT: lui a3, 5
+; RV64ILEGALI32-NEXT: addi a3, a3, 1365
+; RV64ILEGALI32-NEXT: and a2, a2, a3
+; RV64ILEGALI32-NEXT: subw a1, a1, a2
+; RV64ILEGALI32-NEXT: lui a2, 3
+; RV64ILEGALI32-NEXT: addi a2, a2, 819
+; RV64ILEGALI32-NEXT: and a3, a1, a2
+; RV64ILEGALI32-NEXT: srliw a1, a1, 2
+; RV64ILEGALI32-NEXT: and a1, a1, a2
+; RV64ILEGALI32-NEXT: add a1, a3, a1
+; RV64ILEGALI32-NEXT: srliw a2, a1, 4
+; RV64ILEGALI32-NEXT: add a1, a1, a2
+; RV64ILEGALI32-NEXT: andi a2, a1, 15
+; RV64ILEGALI32-NEXT: slli a1, a1, 52
+; RV64ILEGALI32-NEXT: srli a1, a1, 60
+; RV64ILEGALI32-NEXT: add a1, a2, a1
+; RV64ILEGALI32-NEXT: sllw a0, a0, a1
+; RV64ILEGALI32-NEXT: ret
+;
+; RV64ZBBLEGALI32-LABEL: shl_cttz_i16:
+; RV64ZBBLEGALI32: # %bb.0: # %entry
+; RV64ZBBLEGALI32-NEXT: ctzw a1, a1
+; RV64ZBBLEGALI32-NEXT: sllw a0, a0, a1
+; RV64ZBBLEGALI32-NEXT: ret
+entry:
+ %cttz = call i16 @llvm.cttz.i16(i16 %y, i1 true)
+ %res = shl i16 %x, %cttz
+ ret i16 %res
+}
+
+define i16 @shl_cttz_constant_i16(i16 %y) {
+; RV32I-LABEL: shl_cttz_constant_i16:
+; RV32I: # %bb.0: # %entry
+; RV32I-NEXT: addi a1, a0, -1
+; RV32I-NEXT: not a0, a0
+; RV32I-NEXT: and a0, a0, a1
+; RV32I-NEXT: srli a1, a0, 1
+; RV32I-NEXT: lui a2, 5
+; RV32I-NEXT: addi a2, a2, 1365
+; RV32I-NEXT: and a1, a1, a2
+; RV32I-NEXT: sub a0, a0, a1
+; RV32I-NEXT: lui a1, 3
+; RV32I-NEXT: addi a1, a1, 819
+; RV32I-NEXT: and a2, a0, a1
+; RV32I-NEXT: srli a0, a0, 2
+; RV32I-NEXT: and a0, a0, a1
+; RV32I-NEXT: add a0, a2, a0
+; RV32I-NEXT: srli a1, a0, 4
+; RV32I-NEXT: add a0, a0, a1
+; RV32I-NEXT: andi a1, a0, 15
+; RV32I-NEXT: slli a0, a0, 20
+; RV32I-NEXT: srli a0, a0, 28
+; RV32I-NEXT: add a0, a1, a0
+; RV32I-NEXT: li a1, 4
+; RV32I-NEXT: sll a0, a1, a0
+; RV32I-NEXT: ret
+;
+; RV32ZBB-LABEL: shl_cttz_constant_i16:
+; RV32ZBB: # %bb.0: # %entry
+; RV32ZBB-NEXT: ctz a0, a0
+; RV32ZBB-NEXT: li a1, 4
+; RV32ZBB-NEXT: sll a0, a1, a0
+; RV32ZBB-NEXT: ret
+;
+; RV64IILLEGALI32-LABEL: shl_cttz_constant_i16:
+; RV64IILLEGALI32: # %bb.0: # %entry
+; RV64IILLEGALI32-NEXT: addi a1, a0, -1
+; RV64IILLEGALI32-NEXT: not a0, a0
+; RV64IILLEGALI32-NEXT: and a0, a0, a1
+; RV64IILLEGALI32-NEXT: srli a1, a0, 1
+; RV64IILLEGALI32-NEXT: lui a2, 5
+; RV64IILLEGALI32-NEXT: addiw a2, a2, 1365
+; RV64IILLEGALI32-NEXT: and a1, a1, a2
+; RV64IILLEGALI32-NEXT: sub a0, a0, a1
+; RV64IILLEGALI32-NEXT: lui a1, 3
+; RV64IILLEGALI32-NEXT: addiw a1, a1, 819
+; RV64IILLEGALI32-NEXT: and a2, a0, a1
+; RV64IILLEGALI32-NEXT: srli a0, a0, 2
+; RV64IILLEGALI32-NEXT: and a0, a0, a1
+; RV64IILLEGALI32-NEXT: add a0, a2, a0
+; RV64IILLEGALI32-NEXT: srli a1, a0, 4
+; RV64IILLEGALI32-NEXT: add a0, a0, a1
+; RV64IILLEGALI32-NEXT: andi a1, a0, 15
+; RV64IILLEGALI32-NEXT: slli a0, a0, 52
+; RV64IILLEGALI32-NEXT: srli a0, a0, 60
+; RV64IILLEGALI32-NEXT: add a0, a1, a0
+; RV64IILLEGALI32-NEXT: li a1, 4
+; RV64IILLEGALI32-NEXT: sll a0, a1, a0
+; RV64IILLEGALI32-NEXT: ret
+;
+; RV64ZBBILLEGALI32-LABEL: shl_cttz_constant_i16:
+; RV64ZBBILLEGALI32: # %bb.0: # %entry
+; RV64ZBBILLEGALI32-NEXT: ctz a0, a0
+; RV64ZBBILLEGALI32-NEXT: li a1, 4
+; RV64ZBBILLEGALI32-NEXT: sll a0, a1, a0
+; RV64ZBBILLEGALI32-NEXT: ret
+;
+; RV64ILEGALI32-LABEL: shl_cttz_constant_i16:
+; RV64ILEGALI32: # %bb.0: # %entry
+; RV64ILEGALI32-NEXT: addi a1, a0, -1
+; RV64ILEGALI32-NEXT: not a0, a0
+; RV64ILEGALI32-NEXT: and a0, a0, a1
+; RV64ILEGALI32-NEXT: srliw a1, a0, 1
+; RV64ILEGALI32-NEXT: lui a2, 5
+; RV64ILEGALI32-NEXT: addi a2, a2, 1365
+; RV64ILEGALI32-NEXT: and a1, a1, a2
+; RV64ILEGALI32-NEXT: subw a0, a0, a1
+; RV64ILEGALI32-NEXT: lui a1, 3
+; RV64ILEGALI32-NEXT: addi a1, a1, 819
+; RV64ILEGALI32-NEXT: and a2, a0, a1
+; RV64ILEGALI32-NEXT: srliw a0, a0, 2
+; RV64ILEGALI32-NEXT: and a0, a0, a1
+; RV64ILEGALI32-NEXT: add a0, a2, a0
+; RV64ILEGALI32-NEXT: srliw a1, a0, 4
+; RV64ILEGALI32-NEXT: add a0, a0, a1
+; RV64ILEGALI32-NEXT: andi a1, a0, 15
+; RV64ILEGALI32-NEXT: slli a0, a0, 52
+; RV64ILEGALI32-NEXT: srli a0, a0, 60
+; RV64ILEGALI32-NEXT: add a0, a1, a0
+; RV64ILEGALI32-NEXT: li a1, 4
+; RV64ILEGALI32-NEXT: sllw a0, a1, a0
+; RV64ILEGALI32-NEXT: ret
+;
+; RV64ZBBLEGALI32-LABEL: shl_cttz_constant_i16:
+; RV64ZBBLEGALI32: # %bb.0: # %entry
+; RV64ZBBLEGALI32-NEXT: ctzw a0, a0
+; RV64ZBBLEGALI32-NEXT: li a1, 4
+; RV64ZBBLEGALI32-NEXT: sllw a0, a1, a0
+; RV64ZBBLEGALI32-NEXT: ret
+entry:
+ %cttz = call i16 @llvm.cttz.i16(i16 %y, i1 true)
+ %res = shl i16 4, %cttz
+ ret i16 %res
+}
+
+define i32 @shl_cttz_i32(i32 %x, i32 %y) {
+; RV32I-LABEL: shl_cttz_i32:
+; RV32I: # %bb.0: # %entry
+; RV32I-NEXT: neg a2, a1
+; RV32I-NEXT: and a1, a1, a2
+; RV32I-NEXT: mul a0, a1, a0
+; RV32I-NEXT: ret
+;
+; RV32ZBB-LABEL: shl_cttz_i32:
+; RV32ZBB: # %bb.0: # %entry
+; RV32ZBB-NEXT: ctz a1, a1
+; RV32ZBB-NEXT: sll a0, a0, a1
+; RV32ZBB-NEXT: ret
+;
+; RV64I-LABEL: shl_cttz_i32:
+; RV64I: # %bb.0: # %entry
+; RV64I-NEXT: negw a2, a1
+; RV64I-NEXT: and a1, a1, a2
+; RV64I-NEXT: lui a2, 30667
+; RV64I-NEXT: addi a2, a2, 1329
+; RV64I-NEXT: mul a1, a1, a2
+; RV64I-NEXT: srliw a1, a1, 27
+; RV64I-NEXT: lui a2, %hi(.LCPI4_0)
+; RV64I-NEXT: addi a2, a2, %lo(.LCPI4_0)
+; RV64I-NEXT: add a1, a2, a1
+; RV64I-NEXT: lbu a1, 0(a1)
+; RV64I-NEXT: sllw a0, a0, a1
+; RV64I-NEXT: ret
+;
+; RV64ZBB-LABEL: shl_cttz_i32:
+; RV64ZBB: # %bb.0: # %entry
+; RV64ZBB-NEXT: ctzw a1, a1
+; RV64ZBB-NEXT: sllw a0, a0, a1
+; RV64ZBB-NEXT: ret
+entry:
+ %cttz = call i32 @llvm.cttz.i32(i32 %y, i1 true)
+ %res = shl i32 %x, %cttz
+ ret i32 %res
+}
+
+define i32 @shl_cttz_i32_zero_is_defined(i32 %x, i32 %y) {
+; RV32I-LABEL: shl_cttz_i32_zero_is_defined:
+; RV32I: # %bb.0: # %entry
+; RV32I-NEXT: beqz a1, .LBB5_2
+; RV32I-NEXT: # %bb.1: # %cond.false
+; RV32I-NEXT: neg a2, a1
+; RV32I-NEXT: and a1, a1, a2
+; RV32I-NEXT: lui a2, 30667
+; RV32I-NEXT: addi a2, a2, 1329
+; RV32I-NEXT: mul a1, a1, a2
+; RV32I-NEXT: srli a1, a1, 27
+; RV32I-NEXT: lui a2, %hi(.LCPI5_0)
+; RV32I-NEXT: addi a2, a2, %lo(.LCPI5_0)
+; RV32I-NEXT: add a1, a2, a1
+; RV32I-NEXT: lbu a1, 0(a1)
+; RV32I-NEXT: sll a0, a0, a1
+; RV32I-NEXT: ret
+; RV32I-NEXT: .LBB5_2:
+; RV32I-NEXT: li a1, 32
+; RV32I-NEXT: sll a0, a0, a1
+; RV32I-NEXT: ret
+;
+; RV32ZBB-LABEL: shl_cttz_i32_zero_is_defined:
+; RV32ZBB: # %bb.0: # %entry
+; RV32ZBB-NEXT: ctz a1, a1
+; RV32ZBB-NEXT: sll a0, a0, a1
+; RV32ZBB-NEXT: ret
+;
+; RV64I-LABEL: shl_cttz_i32_zero_is_defined:
+; RV64I: # %bb.0: # %entry
+; RV64I-NEXT: sext.w a2, a1
+; RV64I-NEXT: beqz a2, .LBB5_2
+; RV64I-NEXT: # %bb.1: # %cond.false
+; RV64I-NEXT: negw a2, a1
+; RV64I-NEXT: and a1, a1, a2
+; RV64I-NEXT: lui a2, 30667
+; RV64I-NEXT: addi a2, a2, 1329
+; RV64I-NEXT: mul a1, a1, a2
+; RV64I-NEXT: srliw a1, a1, 27
+; RV64I-NEXT: lui a2, %hi(.LCPI5_0)
+; RV64I-NEXT: addi a2, a2, %lo(.LCPI5_0)
+; RV64I-NEXT: add a1, a2, a1
+; RV64I-NEXT: lbu a1, 0(a1)
+; RV64I-NEXT: sllw a0, a0, a1
+; RV64I-NEXT: ret
+; RV64I-NEXT: .LBB5_2:
+; RV64I-NEXT: li a1, 32
+; RV64I-NEXT: sllw a0, a0, a1
+; RV64I-NEXT: ret
+;
+; RV64ZBB-LABEL: shl_cttz_i32_zero_is_defined:
+; RV64ZBB: # %bb.0: # %entry
+; RV64ZBB-NEXT: ctzw a1, a1
+; RV64ZBB-NEXT: sllw a0, a0, a1
+; RV64ZBB-NEXT: ret
+entry:
+ %cttz = call i32 @llvm.cttz.i32(i32 %y, i1 false)
+ %res = shl i32 %x, %cttz
+ ret i32 %res
+}
+
+define i32 @shl_cttz_constant_i32(i32 %y) {
+; RV32I-LABEL: shl_cttz_constant_i32:
+; RV32I: # %bb.0: # %entry
+; RV32I-NEXT: neg a1, a0
+; RV32I-NEXT: and a0, a0, a1
+; RV32I-NEXT: slli a0, a0, 2
+; RV32I-NEXT: ret
+;
+; RV32ZBB-LABEL: shl_cttz_constant_i32:
+; RV32ZBB: # %bb.0: # %entry
+; RV32ZBB-NEXT: ctz a0, a0
+; RV32ZBB-NEXT: li a1, 4
+; RV32ZBB-NEXT: sll a0, a1, a0
+; RV32ZBB-NEXT: ret
+;
+; RV64I-LABEL: shl_cttz_constant_i32:
+; RV64I: # %bb.0: # %entry
+; RV64I-NEXT: negw a1, a0
+; RV64I-NEXT: and a0, a0, a1
+; RV64I-NEXT: lui a1, 30667
+; RV64I-NEXT: addi a1, a1, 1329
+; RV64I-NEXT: mul a0, a0, a1
+; RV64I-NEXT: srliw a0, a0, 27
+; RV64I-NEXT: lui a1, %hi(.LCPI6_0)
+; RV64I-NEXT: addi a1, a1, %lo(.LCPI6_0)
+; RV64I-NEXT: add a0, a1, a0
+; RV64I-NEXT: lbu a0, 0(a0)
+; RV64I-NEXT: li a1, 4
+; RV64I-NEXT: sllw a0, a1, a0
+; RV64I-NEXT: ret
+;
+; RV64ZBB-LABEL: shl_cttz_constant_i32:
+; RV64ZBB: # %bb.0: # %entry
+; RV64ZBB-NEXT: ctzw a0, a0
+; RV64ZBB-NEXT: li a1, 4
+; RV64ZBB-NEXT: sllw a0, a1, a0
+; RV64ZBB-NEXT: ret
+entry:
+ %cttz = call i32 @llvm.cttz.i32(i32 %y, i1 true)
+ %res = shl i32 4, %cttz
+ ret i32 %res
+}
+
+define i32 @shl_cttz_multiuse_i32(i32 %x, i32 %y) {
+; RV32I-LABEL: shl_cttz_multiuse_i32:
+; RV32I: # %bb.0: # %entry
+; RV32I-NEXT: addi sp, sp, -16
+; RV32I-NEXT: .cfi_def_cfa_offset 16
+; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT: sw s0, 8(sp) # 4-byte Folded Spill
+; RV32I-NEXT: sw s1, 4(sp) # 4-byte Folded Spill
+; RV32I-NEXT: .cfi_offset ra, -4
+; RV32I-NEXT: .cfi_offset s0, -8
+; RV32I-NEXT: .cfi_offset s1, -12
+; RV32I-NEXT: neg a2, a1
+; RV32I-NEXT: and a1, a1, a2
+; RV32I-NEXT: lui a2, 30667
+; RV32I-NEXT: addi a2, a2, 1329
+; RV32I-NEXT: mul a1, a1, a2
+; RV32I-NEXT: srli a1, a1, 27
+; RV32I-NEXT: lui a2, %hi(.LCPI7_0)
+; RV32I-NEXT: addi a2, a2, %lo(.LCPI7_0)
+; RV32I-NEXT: add a1, a2, a1
+; RV32I-NEXT: lbu s0, 0(a1)
+; RV32I-NEXT: mv s1, a0
+; RV32I-NEXT: mv a0, s0
+; RV32I-NEXT: call use32
+; RV32I-NEXT: sll a0, s1, s0
+; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT: lw s0, 8(sp) # 4-byte Folded Reload
+; RV32I-NEXT: lw s1, 4(sp) # 4-byte Folded Reload
+; RV32I-NEXT: addi sp, sp, 16
+; RV32I-NEXT: ret...
[truncated]
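As a sanity check on the identity behind the combine (a standalone sketch, independent of the patch and of LLVM itself), the following program verifies that `x << cttz(y)` and `x * (y & -y)` agree for nonzero `y`:

```cpp
// shl X, cttz(Y) == mul (Y & -Y), X  for Y != 0, because Y & -Y isolates the
// lowest set bit of Y, i.e. 1 << cttz(Y), and the multiply wraps the same way
// the shift does.
#include <cstdint>
#include <cstdio>
#include <random>

int main() {
  std::mt19937_64 Rng(0);
  for (int i = 0; i < 1000000; ++i) {
    uint32_t X = static_cast<uint32_t>(Rng());
    uint32_t Y = static_cast<uint32_t>(Rng());
    if (Y == 0)
      continue; // cttz with is_zero_poison=true does not cover Y == 0.
    uint32_t Shl = X << __builtin_ctz(Y);
    uint32_t Mul = (Y & -Y) * X;
    if (Shl != Mul) {
      std::printf("mismatch: X=%u Y=%u\n", X, Y);
      return 1;
    }
  }
  std::puts("identity holds for all sampled values");
  return 0;
}
```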
Done (only for DAGCombine).
// fold (shl X, cttz(Y)) -> (mul (Y & -Y), X) if cttz is unsupported on the
// target.
if ((N1.getOpcode() == ISD::CTTZ || N1.getOpcode() == ISD::CTTZ_ZERO_UNDEF) &&
    N1.hasOneUse() && !TLI.isOperationLegalOrCustom(ISD::CTTZ, VT) &&
You check cttz||cttz_zero_undef but hardcode the opcode in the legality check. Should you check for getOpcode's legality instead?
I hardcode the opcode to avoid introducing regressions on rv64+zbb :(
Do you have a better solution?
Right condition might be isLegalOrCustom(CTTZ||CTTZ_ZERO_UNDEF)
Right condition might be isLegalOrCustom(CTTZ||CTTZ_ZERO_UNDEF)
Unfortunately it doesn't work :(
Is it suitable to add a TLI hook?
Can you also port the same to globalisel?
Ping
This is already approved?
LGTM
…is unsupported (llvm#85066)
This patch folds `shl X, cttz(Y)` to `mul (Y & -Y), X` if cttz is unsupported by the target.
Alive2: https://alive2.llvm.org/ce/z/AtLN5Y
Fixes llvm#84763.
This patch causes a crash when building the Linux kernel for PowerPC. A reduced C reproducer:

struct {
  short active_links;
} *iwl_mvm_exit_esr_vif;
short iwl_mvm_exit_esr_new_active_links;
void iwl_mvm_exit_esr(int link_to_keep) {
  int __trans_tmp_10;
  if (({
        int __ret_warn_on =
            iwl_mvm_exit_esr_vif->active_links & 1UL << link_to_keep;
        __asm__("");
        __builtin_expect(__ret_warn_on, 0);
      })) {
    long word = iwl_mvm_exit_esr_vif->active_links;
    __trans_tmp_10 = __builtin_ctzl(word);
    link_to_keep = __trans_tmp_10;
  }
  iwl_mvm_exit_esr_new_active_links = 1UL << link_to_keep;
}
A reduced LLVM IR reproducer:

target datalayout = "e-m:e-Fn32-i64:64-n32:64-S128-v256:256:256-v512:512:512"
target triple = "powerpc64le-unknown-linux-gnu"

define void @iwl_mvm_exit_esr(i16 %0) {
entry:
  %1 = tail call i16 @llvm.cttz.i16(i16 %0, i1 false)
  %2 = zext i16 %1 to i64
  %.pre9 = shl i64 1, %2
  %conv7 = trunc i64 %.pre9 to i16
  store i16 %conv7, ptr null, align 2
  ret void
}

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare i16 @llvm.cttz.i16(i16, i1 immarg) #0

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
I will have a look.
Same problem as #92753. I will post a fix later :)
@nathanchance Should be fixed by #94008.
… X)` (#94008)
Proof: https://alive2.llvm.org/ce/z/J7GBMU
Same as #92753, the types of LHS and RHS in shift nodes may differ.
+ When VT is smaller than ShiftVT, it is safe to use trunc.
+ When VT is larger than ShiftVT, it is safe to use zext iff `is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`).
See also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2 proofs.
Fixes issue #85066 (comment).
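To make the `zext` condition above concrete, here is a small standalone demonstration (hypothetical values mirroring the shape of the reproducer, not code from the fix): with an `i16` `Y` equal to zero and a cttz that is defined at zero, the wider shift is still well defined, but zero-extending `Y & -Y` yields 0, so the multiply form would compute the wrong value:

```cpp
// With Y == 0 and cttz(0) defined as the bit width (16), the i64 shift
// produces 1 << 16 == 65536, but zext(Y & -Y) is 0, so the mul form disagrees.
// This is why zext is only safe when the cttz zero input is poison.
#include <cstdint>
#include <cstdio>

int main() {
  uint16_t Y = 0;
  uint64_t X = 1;
  unsigned Ctz = (Y == 0) ? 16u : static_cast<unsigned>(__builtin_ctz(Y));
  uint64_t ShlForm = X << Ctz;
  uint64_t MulForm = static_cast<uint64_t>(static_cast<uint16_t>(Y & -Y)) * X;
  std::printf("shl form = %llu, mul form = %llu\n",
              static_cast<unsigned long long>(ShlForm),
              static_cast<unsigned long long>(MulForm));
  return 0;
}
```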
This patch folds `shl X, cttz(Y)` to `mul (Y & -Y), X` if cttz is unsupported by the target.
Alive2: https://alive2.llvm.org/ce/z/AtLN5Y
Fixes #84763.