Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DAGCombine] Fix type mismatch in (shl X, cttz(Y)) -> (mul (Y & -Y), X) #94008

Merged
merged 2 commits into from
Jun 1, 2024

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented May 31, 2024

Proof: https://alive2.llvm.org/ce/z/J7GBMU

Same as #92753, the types of LHS and RHS in shift nodes may differ.

  • When VT is smaller than ShiftVT, it is safe to use trunc.
  • When VT is larger than ShiftVT, it is safe to use zext iff is_zero_poison is true (i.e., opcode == ISD::CTTZ_ZERO_UNDEF). See also the counterexample src_shl_cttz2 -> tgt_shl_cttz2 in the alive2 proofs.

Fixes issue #85066 (comment).

@llvmbot
Copy link
Member

llvmbot commented May 31, 2024

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-powerpc

Author: Yingwei Zheng (dtcxzyw)

Changes

Proof: https://alive2.llvm.org/ce/z/J7GBMU

Same as #92753, the types of LHS and RHS in shift nodes may differ.

  • When VT is smaller than ShiftVT, it is safe to use trunc.
  • When VT is larger than ShiftVT, it is safe to use zext iff is_zero_poison is true (i.e., opcode == ISD::CTTZ_ZERO_UNDEF). See also the counterexample src_shl_cttz2 -> tgt_shl_cttz2 in the alive2 proofs.

Fixes issue #85066 (comment).


Full diff: https://github.com/llvm/llvm-project/pull/94008.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+7-4)
  • (added) llvm/test/CodeGen/PowerPC/pr85066.ll (+45)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 42e861e61201c..2084f9727f9bb 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -10109,13 +10109,16 @@ SDValue DAGCombiner::visitSHL(SDNode *N) {
 
   // fold (shl X, cttz(Y)) -> (mul (Y & -Y), X) if cttz is unsupported on the
   // target.
-  if ((N1.getOpcode() == ISD::CTTZ || N1.getOpcode() == ISD::CTTZ_ZERO_UNDEF) &&
-      N1.hasOneUse() && !TLI.isOperationLegalOrCustom(ISD::CTTZ, VT) &&
+  if (((N1.getOpcode() == ISD::CTTZ &&
+        VT.getScalarSizeInBits() >= ShiftVT.getScalarSizeInBits()) ||
+       N1.getOpcode() == ISD::CTTZ_ZERO_UNDEF) &&
+      N1.hasOneUse() && !TLI.isOperationLegalOrCustom(ISD::CTTZ, ShiftVT) &&
       TLI.isOperationLegalOrCustom(ISD::MUL, VT)) {
     SDValue Y = N1.getOperand(0);
     SDLoc DL(N);
-    SDValue NegY = DAG.getNegative(Y, DL, VT);
-    SDValue And = DAG.getNode(ISD::AND, DL, VT, Y, NegY);
+    SDValue NegY = DAG.getNegative(Y, DL, ShiftVT);
+    SDValue And =
+        DAG.getZExtOrTrunc(DAG.getNode(ISD::AND, DL, ShiftVT, Y, NegY), DL, VT);
     return DAG.getNode(ISD::MUL, DL, VT, And, N0);
   }
 
diff --git a/llvm/test/CodeGen/PowerPC/pr85066.ll b/llvm/test/CodeGen/PowerPC/pr85066.ll
new file mode 100644
index 0000000000000..9b47d6e6508fa
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/pr85066.ll
@@ -0,0 +1,45 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=powerpc64le -verify-machineinstrs < %s | FileCheck %s
+
+; Tests from pr85066
+define i64 @test_shl_zext_cttz(i16 %x) {
+; CHECK-LABEL: test_shl_zext_cttz:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    oris 3, 3, 1
+; CHECK-NEXT:    neg 4, 3
+; CHECK-NEXT:    and 3, 3, 4
+; CHECK-NEXT:    clrldi 3, 3, 32
+; CHECK-NEXT:    blr
+entry:
+  %cttz = tail call i16 @llvm.cttz.i16(i16 %x, i1 false)
+  %zext = zext i16 %cttz to i64
+  %res = shl i64 1, %zext
+  ret i64 %res
+}
+
+define i64 @test_shl_zext_cttz_zero_is_poison(i16 %x) {
+; CHECK-LABEL: test_shl_zext_cttz_zero_is_poison:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    neg 4, 3
+; CHECK-NEXT:    and 3, 3, 4
+; CHECK-NEXT:    clrldi 3, 3, 32
+; CHECK-NEXT:    blr
+entry:
+  %cttz = tail call i16 @llvm.cttz.i16(i16 %x, i1 true)
+  %zext = zext i16 %cttz to i64
+  %res = shl i64 1, %zext
+  ret i64 %res
+}
+
+define i16 @test_shl_trunc_cttz(i32 %x) {
+; CHECK-LABEL: test_shl_trunc_cttz:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    neg 4, 3
+; CHECK-NEXT:    and 3, 3, 4
+; CHECK-NEXT:    blr
+entry:
+  %cttz = tail call i32 @llvm.cttz.i32(i32 %x, i1 false)
+  %trunc = trunc i32 %cttz to i16
+  %res = shl i16 1, %trunc
+  ret i16 %res
+}

llvm/test/CodeGen/PowerPC/pr85066.ll Outdated Show resolved Hide resolved
@dtcxzyw dtcxzyw merged commit 47fd32f into llvm:main Jun 1, 2024
7 checks passed
@dtcxzyw dtcxzyw deleted the fix-pr85066 branch June 1, 2024 11:04
dtcxzyw added a commit that referenced this pull request Jun 8, 2024
The pr description in #94008 mismatches with the code.
> + When VT is smaller than ShiftVT, it is safe to use trunc.
> + When VT is larger than ShiftVT, it is safe to use zext iff
`is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`). See
also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2
    proofs.

Closes #94824.
nekoshirro pushed a commit to nekoshirro/Alchemist-LLVM that referenced this pull request Jun 9, 2024
The pr description in llvm#94008 mismatches with the code.
> + When VT is smaller than ShiftVT, it is safe to use trunc.
> + When VT is larger than ShiftVT, it is safe to use zext iff
`is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`). See
also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2
    proofs.

Closes llvm#94824.

Signed-off-by: Hafidz Muzakky <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:PowerPC llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants