[LoopVectorize] Enable more early exit vectorisation tests #117008

Merged: 3 commits, Dec 18, 2024

@david-arm david-arm commented Nov 20, 2024

PR #112138 introduced initial support for dispatching to
multiple exit blocks via split middle blocks. This patch
fixes a few issues so that we can enable more tests to use
the new enable-early-exit-vectorization flag. Fixes are:

  1. The code to bail out for any loop live-out values happens
    too late. This is because collectUsersInExitBlocks ignores
    induction variables, which get dealt with in fixupIVUsers.
    I've moved the check much earlier in processLoop by looking
    for outside users of loop-defined values.
  2. We shouldn't yet be interleaving when vectorising loops
    with uncountable early exits, since we've not added support
    for this yet.
  3. Similarly, we also shouldn't be creating vector epilogues.
  4. Similarly, we shouldn't enable tail-folding.
  5. The existing implementation doesn't yet support loops
    that require scalar epilogues, although I plan to add that
    as part of PR #88385.
  6. The new split middle blocks weren't being added to the
    parent loop.
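
For readers unfamiliar with the terminology, the two kinds of loop that fix 1 distinguishes can be sketched in plain C++. This is an illustration only and is not taken from the patch or its tests: the function names, the `Buf` type and the array size are assumptions. `anyMismatch` leaves the loop early on a data-dependent condition but passes only constants out of the loop, whereas `firstMismatch` uses the induction variable after the loop, which is the kind of live-out that the new check in processLoop is described as rejecting. Whether either loop is actually vectorised also depends on legality and cost checks not shown here.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t N = 1024;
using Buf = std::array<unsigned char, N>;

// Early exit without live-outs: no value computed inside the loop is
// used after it, only the constants true/false reach the caller.
bool anyMismatch(const Buf &A, const Buf &B) {
  for (std::size_t I = 0; I < N; ++I) {
    if (A[I] != B[I])
      return true; // data-dependent (uncountable) early exit
  }
  return false; // countable exit taken from the loop latch
}

// Early exit with a live-out: I is defined by the loop and read
// outside it, so the loop has an outside user of a loop-defined value.
std::size_t firstMismatch(const Buf &A, const Buf &B) {
  std::size_t I = 0;
  for (; I < N; ++I) {
    if (A[I] != B[I])
      break; // same early exit, but I escapes the loop
  }
  return I;
}
```

Both functions read from fixed-size arrays so that every access up to `N` is known to be in bounds, which is the sort of property an early exit vectoriser needs before it can load past the exiting element.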

llvmbot commented Nov 20, 2024

@llvm/pr-subscribers-vectorizers

Author: David Sherwood (david-arm)

Changes

PR #112138 introduced initial support for dispatching to
multiple exit blocks via split middle blocks. This patch
fixes a few issues so that we can enable more tests to use
the new enable-early-exit-vectorization flag. Fixes are:

  1. The code to bail out for any loop live-out values happens
    too late. This is because collectUsersInExitBlocks ignores
    induction variables, which get dealt with in fixupIVUsers.
    I've moved the check much earlier in processLoop by looking
    for outside users of loop-defined values.
  2. We shouldn't yet be interleaving when vectorising loops
    with uncountable early exits, since we've not added support
    for this yet.
  3. Similarly, we also shouldn't be creating vector epilogues.
  4. Similarly, we shouldn't enable tail-folding.
  5. The existing implementation doesn't yet support loops
    that require scalar epilogues, although I plan to add that
    as part of PR #88385.
  6. The new split middle blocks weren't being added to the
    parent loop.
  7. VPIRInstruction::execute assumed that the VPIRBasicBlock
    predecessors correspond like-for-like with the predecessors
    of the scalar exit block prior to vectorisation. For example,
    collectUsersInExitBlocks adds the operands to the
    VPIRInstruction in the order returned by predecessors(ExitBB),
    whereas VPIRInstruction::execute processes the operands in
    order of the VPIRBasicBlock predecessors. There is no
    guarantee that the two orders match, and in some cases (such as
    the yacr2 test in the LLVM test suite) they don't. I've fixed
    this by maintaining the old behaviour when there is a single
    operand, and when there are 2 or more operands we use the
    same ordering as the BasicBlock predecessors.
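
The ordering problem in fix 7 can be illustrated with a small stand-alone model. This is a sketch, not LLVM code: `ExitPhi`, `incomingByPosition`, `incomingByBlock` and the block names are hypothetical stand-ins, and blocks are represented by plain strings. It shows why reading operands by position against a differently ordered predecessor list picks the wrong value. The keyed lookup is just one way to make the toy model robust; the patch itself is described as keeping the operand order aligned with the BasicBlock predecessors instead.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an exit-block phi. Operands are stored in the order
// in which the scalar exit block's predecessors were visited.
struct ExitPhi {
  std::vector<std::string> ScalarPreds; // order the operands were added in
  std::vector<int> Operands;            // Operands[i] belongs to ScalarPreds[i]
};

// Positional lookup: only correct if the caller's predecessor index
// happens to match the order in which the operands were added.
int incomingByPosition(const ExitPhi &Phi, std::size_t PredIdx) {
  return Phi.Operands[PredIdx];
}

// Keyed lookup: finds the operand by predecessor identity, so it does
// not depend on how the caller orders its own predecessor list.
int incomingByBlock(const ExitPhi &Phi, const std::string &Pred) {
  for (std::size_t I = 0; I < Phi.ScalarPreds.size(); ++I)
    if (Phi.ScalarPreds[I] == Pred)
      return Phi.Operands[I];
  return -1; // Pred is not an incoming block in this toy model
}
```

With operands added for `{"loop.latch", "loop.early"}` and a second predecessor list ordered `{"loop.early", "loop.latch"}`, index 0 of the second list refers to `loop.early`, but the positional lookup returns the value that was recorded for `loop.latch`.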

Patch is 72.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/117008.diff

19 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp (+7)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+151-74)
  • (modified) llvm/lib/Transforms/Vectorize/VPlan.cpp (+21-11)
  • (modified) llvm/lib/Transforms/Vectorize/VPlan.h (+27-2)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanCFG.h (+9)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+31-5)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+65)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.h (+11)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll (+55-5)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/multi-exit-cost.ll (+9-9)
  • (modified) llvm/test/Transforms/LoopVectorize/early_exit_legality.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/multi_early_exit.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/multi_early_exit_live_outs.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/single_early_exit.ll (+68-8)
  • (modified) llvm/test/Transforms/LoopVectorize/single_early_exit_live_outs.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/single_early_exit_unsafe_ptrs.ll (+128-1)
  • (added) llvm/test/Transforms/LoopVectorize/single_early_exit_with_outer_loop.ll (+87)
  • (added) llvm/test/Transforms/LoopVectorize/uncountable-early-exit-vplan.ll (+171)
  • (modified) llvm/test/Transforms/LoopVectorize/unsupported_early_exit.ll (+1-1)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
index f1568781252c06..13af2fde44459b 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
@@ -1375,6 +1375,13 @@ bool LoopVectorizationLegality::isFixedOrderRecurrence(
 }
 
 bool LoopVectorizationLegality::blockNeedsPredication(BasicBlock *BB) const {
+  // The only block currently permitted after the early exiting block is the
+  // loop latch, so only that block needs predication.
+  // FIXME: Once we support instructions in the loop that cannot be executed
+  // speculatively, such as stores, we will also need to predicate all blocks
+  // leading up to the early exit too.
+  if (hasUncountableEarlyExit() && BB == TheLoop->getLoopLatch())
+    return true;
   return LoopAccessInfo::blockNeedsPredication(BB, TheLoop, DT);
 }
 
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 5073622a095537..80818a162d121c 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -385,6 +385,11 @@ static cl::opt<bool> UseWiderVFIfCallVariantsPresent(
     cl::Hidden,
     cl::desc("Try wider VFs if they enable the use of vector variants"));
 
+static cl::opt<bool> EnableEarlyExitVectorization(
+    "enable-early-exit-vectorization", cl::init(false), cl::Hidden,
+    cl::desc(
+        "Enable vectorization of early exit loops with uncountable exits."));
+
 // Likelyhood of bypassing the vectorized loop because assumptions about SCEV
 // variables not overflowing do not hold. See `emitSCEVChecks`.
 static constexpr uint32_t SCEVCheckBypassWeights[] = {1, 127};
@@ -1350,9 +1355,10 @@ class LoopVectorizationCostModel {
       LLVM_DEBUG(dbgs() << "LV: Loop does not require scalar epilogue\n");
       return false;
     }
-    // If we might exit from anywhere but the latch, must run the exiting
-    // iteration in scalar form.
-    if (TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
+    // If we might exit from anywhere but the latch and early exit vectorization
+    // is disabled, we must run the exiting iteration in scalar form.
+    if (TheLoop->getExitingBlock() != TheLoop->getLoopLatch() &&
+        !(EnableEarlyExitVectorization && Legal->hasUncountableEarlyExit())) {
       LLVM_DEBUG(dbgs() << "LV: Loop requires scalar epilogue: not exiting "
                            "from latch block\n");
       return true;
@@ -2568,9 +2574,9 @@ BasicBlock *InnerLoopVectorizer::emitMemRuntimeChecks(BasicBlock *Bypass) {
 void InnerLoopVectorizer::createVectorLoopSkeleton(StringRef Prefix) {
   LoopVectorPreHeader = OrigLoop->getLoopPreheader();
   assert(LoopVectorPreHeader && "Invalid loop structure");
-  assert((OrigLoop->getUniqueExitBlock() ||
+  assert((OrigLoop->getUniqueLatchExitBlock() ||
           Cost->requiresScalarEpilogue(VF.isVector())) &&
-         "multiple exit loop without required epilogue?");
+         "loops not exiting via the latch without required epilogue?");
 
   LoopMiddleBlock =
       SplitBlock(LoopVectorPreHeader, LoopVectorPreHeader->getTerminator(), DT,
@@ -2753,8 +2759,6 @@ void InnerLoopVectorizer::fixupIVUsers(PHINode *OrigPhi,
   // value (the value that feeds into the phi from the loop latch).
   // We allow both, but they, obviously, have different values.
 
-  assert(OrigLoop->getUniqueExitBlock() && "Expected a single exit block");
-
   DenseMap<Value *, Value *> MissingVals;
 
   Value *EndValue = cast<PHINode>(OrigPhi->getIncomingValueForBlock(
@@ -2808,6 +2812,8 @@ void InnerLoopVectorizer::fixupIVUsers(PHINode *OrigPhi,
     }
   }
 
+  assert((MissingVals.empty() || OrigLoop->getUniqueExitBlock()) &&
+         "Expected a single exit block for escaping values");
   for (auto &I : MissingVals) {
     PHINode *PHI = cast<PHINode>(I.first);
     // One corner case we have to handle is two IVs "chasing" each-other,
@@ -2939,6 +2945,21 @@ void InnerLoopVectorizer::fixVectorizedLoop(VPTransformState &State) {
   PSE.getSE()->forgetLoop(OrigLoop);
   PSE.getSE()->forgetBlockAndLoopDispositions();
 
+  // When dealing with uncountable early exits we create middle.split blocks
+  // between the vector loop region and the exit block. These blocks need
+  // adding to any outer loop.
+  VPRegionBlock *VectorRegion = State.Plan->getVectorLoopRegion();
+  Loop *OuterLoop = OrigLoop->getParentLoop();
+  if (Legal->hasUncountableEarlyExit() && OuterLoop) {
+    VPBasicBlock *MiddleVPBB = State.Plan->getMiddleBlock();
+    VPBlockBase *PredVPBB = MiddleVPBB->getSinglePredecessor();
+    while (PredVPBB && PredVPBB != VectorRegion) {
+      BasicBlock *MiddleSplitBB = State.CFG.VPBB2IRBB[cast<VPBasicBlock>(PredVPBB)];
+      OuterLoop->addBasicBlockToLoop(MiddleSplitBB, *LI);
+      PredVPBB = PredVPBB->getSinglePredecessor();
+    }
+  }
+
   // After vectorization, the exit blocks of the original loop will have
   // additional predecessors. Invalidate SCEVs for the exit phis in case SE
   // looked through single-entry phis.
@@ -2969,7 +2990,6 @@ void InnerLoopVectorizer::fixVectorizedLoop(VPTransformState &State) {
   for (Instruction *PI : PredicatedInstructions)
     sinkScalarOperands(&*PI);
 
-  VPRegionBlock *VectorRegion = State.Plan->getVectorLoopRegion();
   VPBasicBlock *HeaderVPBB = VectorRegion->getEntryBasicBlock();
   BasicBlock *HeaderBB = State.CFG.VPBB2IRBB[HeaderVPBB];
 
@@ -3591,7 +3611,8 @@ void LoopVectorizationCostModel::collectLoopUniforms(ElementCount VF) {
   TheLoop->getExitingBlocks(Exiting);
   for (BasicBlock *E : Exiting) {
     auto *Cmp = dyn_cast<Instruction>(E->getTerminator()->getOperand(0));
-    if (Cmp && TheLoop->contains(Cmp) && Cmp->hasOneUse())
+    if (Cmp && TheLoop->contains(Cmp) && Cmp->hasOneUse() &&
+        (TheLoop->getLoopLatch() == E || !Legal->hasUncountableEarlyExit()))
       AddToWorklistIfAllowed(Cmp);
   }
 
@@ -4044,7 +4065,8 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
   // a bottom-test and a single exiting block. We'd have to handle the fact
   // that not every instruction executes on the last iteration.  This will
   // require a lane mask which varies through the vector loop body.  (TODO)
-  if (TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
+  if (Legal->hasUncountableEarlyExit() ||
+      TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
     // If there was a tail-folding hint/switch, but we can't fold the tail by
     // masking, fallback to a vectorization with a scalar epilogue.
     if (ScalarEpilogueStatus == CM_ScalarEpilogueNotNeededUsePredicate) {
@@ -4663,7 +4685,9 @@ bool LoopVectorizationPlanner::isCandidateForEpilogueVectorization(
   // Epilogue vectorization code has not been auditted to ensure it handles
   // non-latch exits properly.  It may be fine, but it needs auditted and
   // tested.
-  if (OrigLoop->getExitingBlock() != OrigLoop->getLoopLatch())
+  // TODO: Add support for loops with an early exit.
+  if (Legal->hasUncountableEarlyExit() ||
+      OrigLoop->getExitingBlock() != OrigLoop->getLoopLatch())
     return false;
 
   return true;
@@ -4913,6 +4937,10 @@ LoopVectorizationCostModel::selectInterleaveCount(ElementCount VF,
   if (!Legal->isSafeForAnyVectorWidth())
     return 1;
 
+  // We don't attempt to perform interleaving for early exit loops.
+  if (Legal->hasUncountableEarlyExit())
+    return 1;
+
   auto BestKnownTC = getSmallBestKnownTC(PSE, TheLoop);
   const bool HasReductions = !Legal->getReductionVars().empty();
 
@@ -7746,11 +7774,14 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
 
   // 2.5 Collect reduction resume values.
   auto *ExitVPBB = BestVPlan.getMiddleBlock();
-  if (VectorizingEpilogue)
+  if (VectorizingEpilogue) {
+    assert(!ILV.Legal->hasUncountableEarlyExit() &&
+           "Epilogue vectorisation not yet supported with early exits");
     for (VPRecipeBase &R : *ExitVPBB) {
       fixReductionScalarResumeWhenVectorizingEpilog(
           &R, State, State.CFG.VPBB2IRBB[ExitVPBB]);
     }
+  }
 
   // 2.6. Maintain Loop Hints
   // Keep all loop hints from the original loop on the vector loop (we'll
@@ -7775,6 +7806,7 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
     LoopVectorizeHints Hints(L, true, *ORE);
     Hints.setAlreadyVectorized();
   }
+
   TargetTransformInfo::UnrollingPreferences UP;
   TTI.getUnrollingPreferences(L, *PSE.getSE(), UP, ORE);
   if (!UP.UnrollVectorizedLoop || CanonicalIVStartValue)
@@ -7787,15 +7819,17 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
   ILV.printDebugTracesAtEnd();
 
   // 4. Adjust branch weight of the branch in the middle block.
-  auto *MiddleTerm =
-      cast<BranchInst>(State.CFG.VPBB2IRBB[ExitVPBB]->getTerminator());
-  if (MiddleTerm->isConditional() &&
-      hasBranchWeightMD(*OrigLoop->getLoopLatch()->getTerminator())) {
-    // Assume that `Count % VectorTripCount` is equally distributed.
-    unsigned TripCount = BestVPlan.getUF() * State.VF.getKnownMinValue();
-    assert(TripCount > 0 && "trip count should not be zero");
-    const uint32_t Weights[] = {1, TripCount - 1};
-    setBranchWeights(*MiddleTerm, Weights, /*IsExpected=*/false);
+  if (ExitVPBB) {
+    auto *MiddleTerm =
+        cast<BranchInst>(State.CFG.VPBB2IRBB[ExitVPBB]->getTerminator());
+    if (MiddleTerm->isConditional() &&
+        hasBranchWeightMD(*OrigLoop->getLoopLatch()->getTerminator())) {
+      // Assume that `Count % VectorTripCount` is equally distributed.
+      unsigned TripCount = BestVPlan.getUF() * State.VF.getKnownMinValue();
+      assert(TripCount > 0 && "trip count should not be zero");
+      const uint32_t Weights[] = {1, TripCount - 1};
+      setBranchWeights(*MiddleTerm, Weights, /*IsExpected=*/false);
+    }
   }
 
   return State.ExpandedSCEVs;
@@ -8180,7 +8214,7 @@ VPValue *VPRecipeBuilder::createEdgeMask(BasicBlock *Src, BasicBlock *Dst) {
   // If source is an exiting block, we know the exit edge is dynamically dead
   // in the vector loop, and thus we don't need to restrict the mask.  Avoid
   // adding uses of an otherwise potentially dead instruction.
-  if (OrigLoop->isLoopExiting(Src))
+  if (!Legal->hasUncountableEarlyExit() && OrigLoop->isLoopExiting(Src))
     return EdgeMaskCache[Edge] = SrcMask;
 
   VPValue *EdgeMask = getVPValueOrAddLiveIn(BI->getCondition());
@@ -8863,47 +8897,46 @@ static void addScalarResumePhis(VPRecipeBuilder &Builder, VPlan &Plan) {
   }
 }
 
-// Collect VPIRInstructions for phis in the original exit block that are modeled
+// Collect VPIRInstructions for phis in the exit blocks that are modeled
 // in VPlan and add the exiting VPValue as operand. Some exiting values are not
 // modeled explicitly yet and won't be included. Those are un-truncated
 // VPWidenIntOrFpInductionRecipe, VPWidenPointerInductionRecipe and induction
 // increments.
-static SetVector<VPIRInstruction *> collectUsersInExitBlock(
+static SetVector<VPIRInstruction *> collectUsersInExitBlocks(
     Loop *OrigLoop, VPRecipeBuilder &Builder, VPlan &Plan,
     const MapVector<PHINode *, InductionDescriptor> &Inductions) {
-  auto *MiddleVPBB = Plan.getMiddleBlock();
-  // No edge from the middle block to the unique exit block has been inserted
-  // and there is nothing to fix from vector loop; phis should have incoming
-  // from scalar loop only.
-  if (MiddleVPBB->getNumSuccessors() != 2)
-    return {};
   SetVector<VPIRInstruction *> ExitUsersToFix;
-  VPBasicBlock *ExitVPBB = cast<VPIRBasicBlock>(MiddleVPBB->getSuccessors()[0]);
-  BasicBlock *ExitingBB = OrigLoop->getExitingBlock();
-  for (VPRecipeBase &R : *ExitVPBB) {
-    auto *ExitIRI = dyn_cast<VPIRInstruction>(&R);
-    if (!ExitIRI)
-      continue;
-    auto *ExitPhi = dyn_cast<PHINode>(&ExitIRI->getInstruction());
-    if (!ExitPhi)
-      break;
-    Value *IncomingValue = ExitPhi->getIncomingValueForBlock(ExitingBB);
-    VPValue *V = Builder.getVPValueOrAddLiveIn(IncomingValue);
-    // Exit values for inductions are computed and updated outside of VPlan and
-    // independent of induction recipes.
-    // TODO: Compute induction exit values in VPlan.
-    if ((isa<VPWidenIntOrFpInductionRecipe>(V) &&
-         !cast<VPWidenIntOrFpInductionRecipe>(V)->getTruncInst()) ||
-        isa<VPWidenPointerInductionRecipe>(V) ||
-        (isa<Instruction>(IncomingValue) &&
-         OrigLoop->contains(cast<Instruction>(IncomingValue)) &&
-         any_of(IncomingValue->users(), [&Inductions](User *U) {
-           auto *P = dyn_cast<PHINode>(U);
-           return P && Inductions.contains(P);
-         })))
-      continue;
-    ExitUsersToFix.insert(ExitIRI);
-    ExitIRI->addOperand(V);
+  for (VPIRBasicBlock *ExitVPBB : Plan.getExitBlocks()) {
+    BasicBlock *ExitBB = ExitVPBB->getIRBasicBlock();
+    for (VPRecipeBase &R : *ExitVPBB) {
+      auto *ExitIRI = dyn_cast<VPIRInstruction>(&R);
+      if (!ExitIRI)
+        continue;
+      auto *ExitPhi = dyn_cast<PHINode>(&ExitIRI->getInstruction());
+      if (!ExitPhi)
+        break;
+      for (BasicBlock *ExitingBB : predecessors(ExitBB)) {
+        if (!OrigLoop->contains(ExitingBB))
+          continue;
+        Value *IncomingValue = ExitPhi->getIncomingValueForBlock(ExitingBB);
+        VPValue *V = Builder.getVPValueOrAddLiveIn(IncomingValue);
+        // Exit values for inductions are computed and updated outside of VPlan
+        // and independent of induction recipes.
+        // TODO: Compute induction exit values in VPlan.
+        if ((isa<VPWidenIntOrFpInductionRecipe>(V) &&
+             !cast<VPWidenIntOrFpInductionRecipe>(V)->getTruncInst()) ||
+            isa<VPWidenPointerInductionRecipe>(V) ||
+            (isa<Instruction>(IncomingValue) &&
+             OrigLoop->contains(cast<Instruction>(IncomingValue)) &&
+             any_of(IncomingValue->users(), [&Inductions](User *U) {
+               auto *P = dyn_cast<PHINode>(U);
+               return P && Inductions.contains(P);
+             })))
+          continue;
+        ExitUsersToFix.insert(ExitIRI);
+        ExitIRI->addOperand(V);
+      }
+    }
   }
   return ExitUsersToFix;
 }
@@ -8911,28 +8944,31 @@ static SetVector<VPIRInstruction *> collectUsersInExitBlock(
 // Add exit values to \p Plan. Extracts are added for each entry in \p
 // ExitUsersToFix if needed and their operands are updated.
 static void
-addUsersInExitBlock(VPlan &Plan,
-                    const SetVector<VPIRInstruction *> &ExitUsersToFix) {
+addUsersInExitBlocks(VPlan &Plan,
+                     const SetVector<VPIRInstruction *> &ExitUsersToFix) {
   if (ExitUsersToFix.empty())
     return;
 
-  auto *MiddleVPBB = Plan.getMiddleBlock();
-  VPBuilder B(MiddleVPBB, MiddleVPBB->getFirstNonPhi());
-
   // Introduce extract for exiting values and update the VPIRInstructions
   // modeling the corresponding LCSSA phis.
   for (VPIRInstruction *ExitIRI : ExitUsersToFix) {
+
     VPValue *V = ExitIRI->getOperand(0);
     // Pass live-in values used by exit phis directly through to their users in
     // the exit block.
     if (V->isLiveIn())
       continue;
 
-    LLVMContext &Ctx = ExitIRI->getInstruction().getContext();
-    VPValue *Ext = B.createNaryOp(VPInstruction::ExtractFromEnd,
-                                  {V, Plan.getOrAddLiveIn(ConstantInt::get(
-                                          IntegerType::get(Ctx, 32), 1))});
-    ExitIRI->setOperand(0, Ext);
+    for (VPBlockBase *PredVPB : ExitIRI->getParent()->getPredecessors()) {
+      auto *PredVPBB = cast<VPBasicBlock>(PredVPB);
+      VPBuilder B(PredVPBB, PredVPBB->getFirstNonPhi());
+
+      LLVMContext &Ctx = ExitIRI->getInstruction().getContext();
+      VPValue *Ext = B.createNaryOp(VPInstruction::ExtractFromEnd,
+                                    {V, Plan.getOrAddLiveIn(ConstantInt::get(
+                                            IntegerType::get(Ctx, 32), 1))});
+      ExitIRI->setOperand(0, Ext);
+    }
   }
 }
 
@@ -9204,11 +9240,17 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) {
          "VPBasicBlock");
   RecipeBuilder.fixHeaderPhis();
 
+  if (Legal->hasUncountableEarlyExit()) {
+    VPlanTransforms::handleUncountableEarlyExit(
+        *Plan, *PSE.getSE(), OrigLoop, Legal->getUncountableExitingBlocks(),
+        RecipeBuilder);
+  }
   addScalarResumePhis(RecipeBuilder, *Plan);
-  SetVector<VPIRInstruction *> ExitUsersToFix = collectUsersInExitBlock(
+  SetVector<VPIRInstruction *> ExitUsersToFix = collectUsersInExitBlocks(
       OrigLoop, RecipeBuilder, *Plan, Legal->getInductionVars());
   addExitUsersForFirstOrderRecurrences(*Plan, ExitUsersToFix);
-  addUsersInExitBlock(*Plan, ExitUsersToFix);
+  addUsersInExitBlocks(*Plan, ExitUsersToFix);
+
   // ---------------------------------------------------------------------------
   // Transform initial VPlan: Apply previously taken decisions, in order, to
   // bring the VPlan to its final state.
@@ -9968,12 +10010,31 @@ bool LoopVectorizePass::processLoop(Loop *L) {
   }
 
   if (LVL.hasUncountableEarlyExit()) {
-    reportVectorizationFailure("Auto-vectorization of loops with uncountable "
-                               "early exit is not yet supported",
-                               "Auto-vectorization of loops with uncountable "
-                               "early exit is not yet supported",
-                               "UncountableEarlyExitLoopsUnsupported", ORE, L);
-    return false;
+    if (!EnableEarlyExitVectorization) {
+      reportVectorizationFailure("Auto-vectorization of loops with uncountable "
+                                 "early exit is disabled",
+                                 "Auto-vectorization of loops with uncountable "
+                                 "early exit is disabled",
+                                 "UncountableEarlyExitLoopsDisabled", ORE,
+                                 L);
+      return false;
+    }
+    for (BasicBlock *BB : L->blocks()) {
+      for (Instruction &I : *BB) {
+        for (User *U : I.users()) {
+          Instruction *UI = cast<Instruction>(U);
+          if (!L->contains(UI)) {
+            reportVectorizationFailure(
+                "Auto-vectorization of loops with uncountable "
+                "early exit and live-outs is not yet supported",
+                "Auto-vectorization of loop with uncountable "
+                "early exit and live-outs is not yet supported",
+                "UncountableEarlyExitLoopLiveOutsUnsupported", ORE, L);
+            return false;
+          }
+        }
+      }
+    }
   }
 
   // Entrance to the VPlan-native vectorization path. Outer loops are processed
@@ -9990,6 +10051,22 @@ bool LoopVectorizePass::processLoop(Loop *L) {
   InterleavedAccessInfo IAI(PSE, L, DT, LI, LVL.getLAI());
   bool UseInterleaved = TTI->enableInterleavedAccessVectorization();
 
+  if (LVL.hasUncountableEarlyExit()) {
+    BasicBlock *LoopLatch = L->getLoopLatch();
+    if (IAI.requiresScalarEpilogue() ||
+        llvm::any_of(LVL.getCountableExitingBlocks(), [LoopLatch](BasicBlock *BB) {
+          return BB != LoopLatch;
+        })) {
+      reportVectorizationFailure("Auto-vectorization of early exit loops "
+                                 "requiring a scalar epilogue is unsupported",
+                                 "Auto-vectorization of early exit loops "
+                                 "requiring a scalar epilogue is unsupported",
+                                 "UncountableEarlyExitUnsupported", ORE,
+                                 L);
+      return false;
+    }
+  }
+
   // If an override option has been passed in for interleaved accesses, use it.
   if (EnableInterleavedMemAccesses.getNumOccurrences() > 0)
     UseInterleaved = EnableInterleavedMemAccesses;
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index 8b1a4aeb88f81f..63c04bbb11e505 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -870,15 +870,9 @@ VPlanPtr VPlan::createInitialVPlan(Type *InductionTy,
   auto Plan = std::make_unique<VPlan>(Entry, VecPreheader, ScalarHeader);
 
   // Create SCEV and VPValue for the trip count.
-
-  // Currently only loops with countable exits are vectorized, but cal...
[truncated]

llvmbot commented Nov 20, 2024

@llvm/pr-subscribers-llvm-transforms

Author: David Sherwood (david-arm)

+      OuterLoop->addBasicBlockToLoop(MiddleSplitBB, *LI);
+      PredVPBB = PredVPBB->getSinglePredecessor();
+    }
+  }
+
   // After vectorization, the exit blocks of the original loop will have
   // additional predecessors. Invalidate SCEVs for the exit phis in case SE
   // looked through single-entry phis.
@@ -2969,7 +2990,6 @@ void InnerLoopVectorizer::fixVectorizedLoop(VPTransformState &State) {
   for (Instruction *PI : PredicatedInstructions)
     sinkScalarOperands(&*PI);
 
-  VPRegionBlock *VectorRegion = State.Plan->getVectorLoopRegion();
   VPBasicBlock *HeaderVPBB = VectorRegion->getEntryBasicBlock();
   BasicBlock *HeaderBB = State.CFG.VPBB2IRBB[HeaderVPBB];
 
@@ -3591,7 +3611,8 @@ void LoopVectorizationCostModel::collectLoopUniforms(ElementCount VF) {
   TheLoop->getExitingBlocks(Exiting);
   for (BasicBlock *E : Exiting) {
     auto *Cmp = dyn_cast<Instruction>(E->getTerminator()->getOperand(0));
-    if (Cmp && TheLoop->contains(Cmp) && Cmp->hasOneUse())
+    if (Cmp && TheLoop->contains(Cmp) && Cmp->hasOneUse() &&
+        (TheLoop->getLoopLatch() == E || !Legal->hasUncountableEarlyExit()))
       AddToWorklistIfAllowed(Cmp);
   }
 
@@ -4044,7 +4065,8 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
   // a bottom-test and a single exiting block. We'd have to handle the fact
   // that not every instruction executes on the last iteration.  This will
   // require a lane mask which varies through the vector loop body.  (TODO)
-  if (TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
+  if (Legal->hasUncountableEarlyExit() ||
+      TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
     // If there was a tail-folding hint/switch, but we can't fold the tail by
     // masking, fallback to a vectorization with a scalar epilogue.
     if (ScalarEpilogueStatus == CM_ScalarEpilogueNotNeededUsePredicate) {
@@ -4663,7 +4685,9 @@ bool LoopVectorizationPlanner::isCandidateForEpilogueVectorization(
   // Epilogue vectorization code has not been auditted to ensure it handles
   // non-latch exits properly.  It may be fine, but it needs auditted and
   // tested.
-  if (OrigLoop->getExitingBlock() != OrigLoop->getLoopLatch())
+  // TODO: Add support for loops with an early exit.
+  if (Legal->hasUncountableEarlyExit() ||
+      OrigLoop->getExitingBlock() != OrigLoop->getLoopLatch())
     return false;
 
   return true;
@@ -4913,6 +4937,10 @@ LoopVectorizationCostModel::selectInterleaveCount(ElementCount VF,
   if (!Legal->isSafeForAnyVectorWidth())
     return 1;
 
+  // We don't attempt to perform interleaving for early exit loops.
+  if (Legal->hasUncountableEarlyExit())
+    return 1;
+
   auto BestKnownTC = getSmallBestKnownTC(PSE, TheLoop);
   const bool HasReductions = !Legal->getReductionVars().empty();
 
@@ -7746,11 +7774,14 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
 
   // 2.5 Collect reduction resume values.
   auto *ExitVPBB = BestVPlan.getMiddleBlock();
-  if (VectorizingEpilogue)
+  if (VectorizingEpilogue) {
+    assert(!ILV.Legal->hasUncountableEarlyExit() &&
+           "Epilogue vectorisation not yet supported with early exits");
     for (VPRecipeBase &R : *ExitVPBB) {
       fixReductionScalarResumeWhenVectorizingEpilog(
           &R, State, State.CFG.VPBB2IRBB[ExitVPBB]);
     }
+  }
 
   // 2.6. Maintain Loop Hints
   // Keep all loop hints from the original loop on the vector loop (we'll
@@ -7775,6 +7806,7 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
     LoopVectorizeHints Hints(L, true, *ORE);
     Hints.setAlreadyVectorized();
   }
+
   TargetTransformInfo::UnrollingPreferences UP;
   TTI.getUnrollingPreferences(L, *PSE.getSE(), UP, ORE);
   if (!UP.UnrollVectorizedLoop || CanonicalIVStartValue)
@@ -7787,15 +7819,17 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
   ILV.printDebugTracesAtEnd();
 
   // 4. Adjust branch weight of the branch in the middle block.
-  auto *MiddleTerm =
-      cast<BranchInst>(State.CFG.VPBB2IRBB[ExitVPBB]->getTerminator());
-  if (MiddleTerm->isConditional() &&
-      hasBranchWeightMD(*OrigLoop->getLoopLatch()->getTerminator())) {
-    // Assume that `Count % VectorTripCount` is equally distributed.
-    unsigned TripCount = BestVPlan.getUF() * State.VF.getKnownMinValue();
-    assert(TripCount > 0 && "trip count should not be zero");
-    const uint32_t Weights[] = {1, TripCount - 1};
-    setBranchWeights(*MiddleTerm, Weights, /*IsExpected=*/false);
+  if (ExitVPBB) {
+    auto *MiddleTerm =
+        cast<BranchInst>(State.CFG.VPBB2IRBB[ExitVPBB]->getTerminator());
+    if (MiddleTerm->isConditional() &&
+        hasBranchWeightMD(*OrigLoop->getLoopLatch()->getTerminator())) {
+      // Assume that `Count % VectorTripCount` is equally distributed.
+      unsigned TripCount = BestVPlan.getUF() * State.VF.getKnownMinValue();
+      assert(TripCount > 0 && "trip count should not be zero");
+      const uint32_t Weights[] = {1, TripCount - 1};
+      setBranchWeights(*MiddleTerm, Weights, /*IsExpected=*/false);
+    }
   }
 
   return State.ExpandedSCEVs;
@@ -8180,7 +8214,7 @@ VPValue *VPRecipeBuilder::createEdgeMask(BasicBlock *Src, BasicBlock *Dst) {
   // If source is an exiting block, we know the exit edge is dynamically dead
   // in the vector loop, and thus we don't need to restrict the mask.  Avoid
   // adding uses of an otherwise potentially dead instruction.
-  if (OrigLoop->isLoopExiting(Src))
+  if (!Legal->hasUncountableEarlyExit() && OrigLoop->isLoopExiting(Src))
     return EdgeMaskCache[Edge] = SrcMask;
 
   VPValue *EdgeMask = getVPValueOrAddLiveIn(BI->getCondition());
@@ -8863,47 +8897,46 @@ static void addScalarResumePhis(VPRecipeBuilder &Builder, VPlan &Plan) {
   }
 }
 
-// Collect VPIRInstructions for phis in the original exit block that are modeled
+// Collect VPIRInstructions for phis in the exit blocks that are modeled
 // in VPlan and add the exiting VPValue as operand. Some exiting values are not
 // modeled explicitly yet and won't be included. Those are un-truncated
 // VPWidenIntOrFpInductionRecipe, VPWidenPointerInductionRecipe and induction
 // increments.
-static SetVector<VPIRInstruction *> collectUsersInExitBlock(
+static SetVector<VPIRInstruction *> collectUsersInExitBlocks(
     Loop *OrigLoop, VPRecipeBuilder &Builder, VPlan &Plan,
     const MapVector<PHINode *, InductionDescriptor> &Inductions) {
-  auto *MiddleVPBB = Plan.getMiddleBlock();
-  // No edge from the middle block to the unique exit block has been inserted
-  // and there is nothing to fix from vector loop; phis should have incoming
-  // from scalar loop only.
-  if (MiddleVPBB->getNumSuccessors() != 2)
-    return {};
   SetVector<VPIRInstruction *> ExitUsersToFix;
-  VPBasicBlock *ExitVPBB = cast<VPIRBasicBlock>(MiddleVPBB->getSuccessors()[0]);
-  BasicBlock *ExitingBB = OrigLoop->getExitingBlock();
-  for (VPRecipeBase &R : *ExitVPBB) {
-    auto *ExitIRI = dyn_cast<VPIRInstruction>(&R);
-    if (!ExitIRI)
-      continue;
-    auto *ExitPhi = dyn_cast<PHINode>(&ExitIRI->getInstruction());
-    if (!ExitPhi)
-      break;
-    Value *IncomingValue = ExitPhi->getIncomingValueForBlock(ExitingBB);
-    VPValue *V = Builder.getVPValueOrAddLiveIn(IncomingValue);
-    // Exit values for inductions are computed and updated outside of VPlan and
-    // independent of induction recipes.
-    // TODO: Compute induction exit values in VPlan.
-    if ((isa<VPWidenIntOrFpInductionRecipe>(V) &&
-         !cast<VPWidenIntOrFpInductionRecipe>(V)->getTruncInst()) ||
-        isa<VPWidenPointerInductionRecipe>(V) ||
-        (isa<Instruction>(IncomingValue) &&
-         OrigLoop->contains(cast<Instruction>(IncomingValue)) &&
-         any_of(IncomingValue->users(), [&Inductions](User *U) {
-           auto *P = dyn_cast<PHINode>(U);
-           return P && Inductions.contains(P);
-         })))
-      continue;
-    ExitUsersToFix.insert(ExitIRI);
-    ExitIRI->addOperand(V);
+  for (VPIRBasicBlock *ExitVPBB : Plan.getExitBlocks()) {
+    BasicBlock *ExitBB = ExitVPBB->getIRBasicBlock();
+    for (VPRecipeBase &R : *ExitVPBB) {
+      auto *ExitIRI = dyn_cast<VPIRInstruction>(&R);
+      if (!ExitIRI)
+        continue;
+      auto *ExitPhi = dyn_cast<PHINode>(&ExitIRI->getInstruction());
+      if (!ExitPhi)
+        break;
+      for (BasicBlock *ExitingBB : predecessors(ExitBB)) {
+        if (!OrigLoop->contains(ExitingBB))
+          continue;
+        Value *IncomingValue = ExitPhi->getIncomingValueForBlock(ExitingBB);
+        VPValue *V = Builder.getVPValueOrAddLiveIn(IncomingValue);
+        // Exit values for inductions are computed and updated outside of VPlan
+        // and independent of induction recipes.
+        // TODO: Compute induction exit values in VPlan.
+        if ((isa<VPWidenIntOrFpInductionRecipe>(V) &&
+             !cast<VPWidenIntOrFpInductionRecipe>(V)->getTruncInst()) ||
+            isa<VPWidenPointerInductionRecipe>(V) ||
+            (isa<Instruction>(IncomingValue) &&
+             OrigLoop->contains(cast<Instruction>(IncomingValue)) &&
+             any_of(IncomingValue->users(), [&Inductions](User *U) {
+               auto *P = dyn_cast<PHINode>(U);
+               return P && Inductions.contains(P);
+             })))
+          continue;
+        ExitUsersToFix.insert(ExitIRI);
+        ExitIRI->addOperand(V);
+      }
+    }
   }
   return ExitUsersToFix;
 }
@@ -8911,28 +8944,31 @@ static SetVector<VPIRInstruction *> collectUsersInExitBlock(
 // Add exit values to \p Plan. Extracts are added for each entry in \p
 // ExitUsersToFix if needed and their operands are updated.
 static void
-addUsersInExitBlock(VPlan &Plan,
-                    const SetVector<VPIRInstruction *> &ExitUsersToFix) {
+addUsersInExitBlocks(VPlan &Plan,
+                     const SetVector<VPIRInstruction *> &ExitUsersToFix) {
   if (ExitUsersToFix.empty())
     return;
 
-  auto *MiddleVPBB = Plan.getMiddleBlock();
-  VPBuilder B(MiddleVPBB, MiddleVPBB->getFirstNonPhi());
-
   // Introduce extract for exiting values and update the VPIRInstructions
   // modeling the corresponding LCSSA phis.
   for (VPIRInstruction *ExitIRI : ExitUsersToFix) {
+
     VPValue *V = ExitIRI->getOperand(0);
     // Pass live-in values used by exit phis directly through to their users in
     // the exit block.
     if (V->isLiveIn())
       continue;
 
-    LLVMContext &Ctx = ExitIRI->getInstruction().getContext();
-    VPValue *Ext = B.createNaryOp(VPInstruction::ExtractFromEnd,
-                                  {V, Plan.getOrAddLiveIn(ConstantInt::get(
-                                          IntegerType::get(Ctx, 32), 1))});
-    ExitIRI->setOperand(0, Ext);
+    for (VPBlockBase *PredVPB : ExitIRI->getParent()->getPredecessors()) {
+      auto *PredVPBB = cast<VPBasicBlock>(PredVPB);
+      VPBuilder B(PredVPBB, PredVPBB->getFirstNonPhi());
+
+      LLVMContext &Ctx = ExitIRI->getInstruction().getContext();
+      VPValue *Ext = B.createNaryOp(VPInstruction::ExtractFromEnd,
+                                    {V, Plan.getOrAddLiveIn(ConstantInt::get(
+                                            IntegerType::get(Ctx, 32), 1))});
+      ExitIRI->setOperand(0, Ext);
+    }
   }
 }
 
@@ -9204,11 +9240,17 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) {
          "VPBasicBlock");
   RecipeBuilder.fixHeaderPhis();
 
+  if (Legal->hasUncountableEarlyExit()) {
+    VPlanTransforms::handleUncountableEarlyExit(
+        *Plan, *PSE.getSE(), OrigLoop, Legal->getUncountableExitingBlocks(),
+        RecipeBuilder);
+  }
   addScalarResumePhis(RecipeBuilder, *Plan);
-  SetVector<VPIRInstruction *> ExitUsersToFix = collectUsersInExitBlock(
+  SetVector<VPIRInstruction *> ExitUsersToFix = collectUsersInExitBlocks(
       OrigLoop, RecipeBuilder, *Plan, Legal->getInductionVars());
   addExitUsersForFirstOrderRecurrences(*Plan, ExitUsersToFix);
-  addUsersInExitBlock(*Plan, ExitUsersToFix);
+  addUsersInExitBlocks(*Plan, ExitUsersToFix);
+
   // ---------------------------------------------------------------------------
   // Transform initial VPlan: Apply previously taken decisions, in order, to
   // bring the VPlan to its final state.
@@ -9968,12 +10010,31 @@ bool LoopVectorizePass::processLoop(Loop *L) {
   }
 
   if (LVL.hasUncountableEarlyExit()) {
-    reportVectorizationFailure("Auto-vectorization of loops with uncountable "
-                               "early exit is not yet supported",
-                               "Auto-vectorization of loops with uncountable "
-                               "early exit is not yet supported",
-                               "UncountableEarlyExitLoopsUnsupported", ORE, L);
-    return false;
+    if (!EnableEarlyExitVectorization) {
+      reportVectorizationFailure("Auto-vectorization of loops with uncountable "
+                                 "early exit is disabled",
+                                 "Auto-vectorization of loops with uncountable "
+                                 "early exit is disabled",
+                                 "UncountableEarlyExitLoopsDisabled", ORE,
+                                 L);
+      return false;
+    }
+    for (BasicBlock *BB : L->blocks()) {
+      for (Instruction &I : *BB) {
+        for (User *U : I.users()) {
+          Instruction *UI = cast<Instruction>(U);
+          if (!L->contains(UI)) {
+            reportVectorizationFailure(
+                "Auto-vectorization of loops with uncountable "
+                "early exit and live-outs is not yet supported",
+                "Auto-vectorization of loop with uncountable "
+                "early exit and live-outs is not yet supported",
+                "UncountableEarlyExitLoopLiveOutsUnsupported", ORE, L);
+            return false;
+          }
+        }
+      }
+    }
   }
 
   // Entrance to the VPlan-native vectorization path. Outer loops are processed
@@ -9990,6 +10051,22 @@ bool LoopVectorizePass::processLoop(Loop *L) {
   InterleavedAccessInfo IAI(PSE, L, DT, LI, LVL.getLAI());
   bool UseInterleaved = TTI->enableInterleavedAccessVectorization();
 
+  if (LVL.hasUncountableEarlyExit()) {
+    BasicBlock *LoopLatch = L->getLoopLatch();
+    if (IAI.requiresScalarEpilogue() ||
+        llvm::any_of(LVL.getCountableExitingBlocks(), [LoopLatch](BasicBlock *BB) {
+          return BB != LoopLatch;
+        })) {
+      reportVectorizationFailure("Auto-vectorization of early exit loops "
+                                 "requiring a scalar epilogue is unsupported",
+                                 "Auto-vectorization of early exit loops "
+                                 "requiring a scalar epilogue is unsupported",
+                                 "UncountableEarlyExitUnsupported", ORE,
+                                 L);
+      return false;
+    }
+  }
+
   // If an override option has been passed in for interleaved accesses, use it.
   if (EnableInterleavedMemAccesses.getNumOccurrences() > 0)
     UseInterleaved = EnableInterleavedMemAccesses;
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index 8b1a4aeb88f81f..63c04bbb11e505 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -870,15 +870,9 @@ VPlanPtr VPlan::createInitialVPlan(Type *InductionTy,
   auto Plan = std::make_unique<VPlan>(Entry, VecPreheader, ScalarHeader);
 
   // Create SCEV and VPValue for the trip count.
-
-  // Currently only loops with countable exits are vectorized, but cal...
[truncated]

github-actions bot commented Nov 20, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Comment on lines 2 to 3
; RUN: opt -S < %s -p loop-vectorize -enable-early-exit-vectorization -enable-early-exit-vectorization \
; RUN: | FileCheck %s --check-prefix=MAY_FAULT
Collaborator

Given there is only a single RUN line, why change the CHECK prefix? As it stands it's just making the change look larger than necessary.

Contributor Author

Done!

Comment on lines 10222 to 10225
"Auto-vectorization of loops with uncountable "
"early exit and live-outs is not yet supported",
"Auto-vectorization of loop with uncountable "
"early exit and live-outs is not yet supported",
Collaborator

Personally I'd just present the facts and drop the "yet".

Contributor Author

Done!

declare void @init_mem(ptr, i64);


define void @early_exit_in_outer_loop1() {
Collaborator

Please add some minimum commentary to highlight the difference between the loop1 and loop2 variants.

Contributor Author

Done!

return false;
}

// Needed to prevent InnerLoopVectorizer::fixupIVUsers from crashing.
Collaborator

Can you elaborate on this comment? What scenario causes fixupIVUsers to crash? I'm not after huge detail, but if there is a specific property of a user that is not supported then it would be nice to mention it here.

Contributor Author

Done!

Comment on lines 5022 to 5024
// We don't attempt to perform interleaving for early exit loops.
if (Legal->hasUncountableEarlyExit())
return 1;
Contributor

Could you add more details on why not? Is all that would be needed to deal with multiple parts in AnyOf & co?

Contributor Author

Yes, it's because we'd have to support multiple parts in AnyOf and I think we probably need to investigate the best approach and check that the generated code isn't horrible.
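
To make the interleaving point concrete, here is a minimal, self-contained sketch in plain C++ (not LLVM code) of a search loop processed in vector-width chunks, where an any-of reduction over the compare results decides whether to leave the loop. The names `VF`, `anyOf` and `findFirst`, and the `UF` parameter standing in for the interleave count, are all assumptions made for illustration. With `UF > 1` every iteration has several per-part results to combine, which is roughly the extra work the reply refers to. The sketch resolves the exact element with a scalar loop purely for simplicity; that is not necessarily how the vectoriser handles it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vectorization factor: lanes handled per vector part.
constexpr std::size_t VF = 4;

// Models an any-of reduction over the compare results of one vector part.
static bool anyOf(const std::array<bool, VF> &Mask) {
  for (bool Lane : Mask)
    if (Lane)
      return true;
  return false;
}

// Returns the index of the first element equal to Key, or Data.size().
// UF is the interleave count (number of vector parts per iteration).
std::size_t findFirst(const std::vector<int> &Data, int Key, std::size_t UF) {
  assert(UF > 0 && "interleave count must be non-zero");
  const std::size_t Step = VF * UF;
  std::size_t I = 0;
  // "Vector loop": runs only while a full Step of elements remains.
  for (; I + Step <= Data.size(); I += Step) {
    bool Exit = false;
    for (std::size_t Part = 0; Part < UF; ++Part) {
      std::array<bool, VF> Mask{};
      for (std::size_t Lane = 0; Lane < VF; ++Lane)
        Mask[Lane] = Data[I + Part * VF + Lane] == Key;
      // With UF > 1 the per-part results have to be OR-ed together.
      Exit = Exit || anyOf(Mask);
    }
    if (Exit)
      break; // The early exit is taken somewhere in this chunk.
  }
  // Scalar loop: finds the exact element, or finishes the remainder.
  for (; I < Data.size(); ++I)
    if (Data[I] == Key)
      return I;
  return Data.size();
}
```

Calling `findFirst` with `UF` set to 1 or 2 should return the same index; only the amount of work per vector iteration changes.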

if (LVL.hasUncountableEarlyExit()) {
BasicBlock *LoopLatch = L->getLoopLatch();
if (IAI.requiresScalarEpilogue() ||
llvm::any_of(LVL.getCountableExitingBlocks(),
Contributor

Suggested change
- llvm::any_of(LVL.getCountableExitingBlocks(),
+ any_of(LVL.getCountableExitingBlocks(),

Contributor Author

Done!

Comment on lines 10222 to 10225
"Auto-vectorization of loops with uncountable "
"early exit and live-outs is not yet supported",
"Auto-vectorization of loop with uncountable "
"early exit and live-outs is not yet supported",
Contributor

Might be worth adding a variant that takes a single message instead of duplicating?

Contributor Author

Good suggestion! I've added one and I've started using it in this patch, but as a follow-on I'm happy to clean up any other cases in the vectoriser or legality code that take duplicated strings.
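
A rough sketch of what such a single-message variant could look like, using toy types rather than LLVM's remark machinery. `Report` and `reportFailure` are hypothetical names chosen for this illustration; the overload actually added to the patch is not shown in this excerpt and may differ.

```cpp
#include <string>
#include <string_view>

// Toy record of what would be reported; not LLVM's remark machinery.
struct Report {
  std::string DebugMsg;
  std::string RemarkMsg;
  std::string Tag;
};

// Two-message form, modelled on the calls quoted in the diff.
Report reportFailure(std::string_view DebugMsg, std::string_view RemarkMsg,
                     std::string_view Tag) {
  return {std::string(DebugMsg), std::string(RemarkMsg), std::string(Tag)};
}

// Single-message convenience overload: reuses Msg for both outputs.
Report reportFailure(std::string_view Msg, std::string_view Tag) {
  return reportFailure(Msg, Msg, Tag);
}
```

The forwarding overload keeps one place where the message is written, so the debug text and the remark text cannot drift apart.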

BasicBlock *BypassBlock = ILV.getAdditionalBypassBlock();
for (VPRecipeBase &R : *ExitVPBB) {
fixReductionScalarResumeWhenVectorizingEpilog(
&R, State, State.CFG.VPBB2IRBB[ExitVPBB], BypassBlock);
}

Contributor

unrelated?

Contributor Author

Done

@@ -4123,7 +4138,8 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
// a bottom-test and a single exiting block. We'd have to handle the fact
// that not every instruction executes on the last iteration. This will
// require a lane mask which varies through the vector loop body. (TODO)
- if (TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
+ if (Legal->hasUncountableEarlyExit() ||
Contributor

Is this needed? If there's an uncountable early exit, there won't be a single exiting block and the check below will be true (as there must be a latch)

Contributor Author

You're right, the latch should always be non-null.
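
The redundancy argument can be checked on a toy model. The sketch below uses stand-in types (`Block`, `ToyLoop`, `exitsAwayFromLatch`), all hypothetical and not LLVM's classes, and assumes only the documented contract of `Loop::getExitingBlock()`: it returns the single exiting block, or null when there is more than one. A loop with an uncountable early exit has at least two exiting blocks, so the comparison against a non-null latch is already true.

```cpp
#include <vector>

// Toy stand-ins for illustration only; these are not LLVM's classes.
struct Block {
  int Id;
};

struct ToyLoop {
  Block *Latch;                       // Assumed non-null here.
  std::vector<Block *> ExitingBlocks; // Blocks with an edge leaving the loop.

  // Mirrors the contract of Loop::getExitingBlock(): the single exiting
  // block, or null when there is more than one.
  Block *getExitingBlock() const {
    return ExitingBlocks.size() == 1 ? ExitingBlocks.front() : nullptr;
  }
  Block *getLoopLatch() const { return Latch; }
};

// The pre-existing condition from computeMaxVF, evaluated on the toy loop.
bool exitsAwayFromLatch(const ToyLoop &L) {
  return L.getExitingBlock() != L.getLoopLatch();
}
```

With an early exiting block in addition to the latch, `exitsAwayFromLatch` returns true without any separate early exit check.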

@@ -4753,7 +4769,9 @@ bool LoopVectorizationPlanner::isCandidateForEpilogueVectorization(
// Epilogue vectorization code has not been auditted to ensure it handles
// non-latch exits properly. It may be fine, but it needs auditted and
// tested.
- if (OrigLoop->getExitingBlock() != OrigLoop->getLoopLatch())
+ // TODO: Add support for loops with an early exit.
+ if (Legal->hasUncountableEarlyExit() ||
Contributor

Is this needed? If there's an uncountable early exit, there won't be a single exiting block and the check below will be true (as there must be a latch)

Contributor Author

You're right, the latch should always be non-null.

Comment on lines 10216 to 10218
for (BasicBlock *BB : L->blocks()) {
for (Instruction &I : *BB) {
for (User *U : I.users()) {
Collaborator

If the concern only relates to induction variables can you make use of LoopVectorizationLegality::getInductionVars()? If not and you need to walk across all the BasicBlocks, would it be sufficient to just iterate through a block's phi instructions?

Contributor Author

Good suggestion. Done!
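
For readers following the thread, this is a minimal sketch of the live-out test as it appears in the quoted three-level loop, written against toy types rather than LLVM IR. `Inst`, `ToyLoopBody` and `hasLiveOut` are hypothetical names; the narrower version adopted after this review comment is not shown in this excerpt and may look different.

```cpp
#include <unordered_set>
#include <vector>

// Toy stand-ins for illustration only; these are not LLVM's classes.
struct Inst {
  std::vector<const Inst *> Users; // Instructions that use this value.
};

// The set of instructions that belong to the loop.
using ToyLoopBody = std::unordered_set<const Inst *>;

// Returns true if any value defined in the loop has a user outside it.
bool hasLiveOut(const ToyLoopBody &Loop) {
  for (const Inst *I : Loop)
    for (const Inst *U : I->Users)
      if (Loop.count(U) == 0)
        return true;
  return false;
}
```

Restricting the walk to induction variables or header phis, as suggested, would visit fewer values but relies on every other live-out being caught elsewhere.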

Contributor

I was a bit surprised to see those changes were needed given the existing checks in VPlan. I had a look to see why we were missing cases. Looks like we weren't handling exit phis with multiple operands properly, which should be fixable in #120260

Contributor Author

This is an issue I believe I described in a previous vectoriser call, and at the time I mentioned I had a fix for this already. I just haven't had a chance to land the fix because it depended upon this patch landing first. In the spirit of landing small, incremental patches I decided it was best to run all the vectoriser tests with the new flag to get the code defended, then follow on with incremental fixes and additional functionality. This patch holds up several other patches and I would prefer not to make this patch dependent on #120260.

Contributor

Do you have any concerns regarding #120260 or will the future patches rely on the IR based checks?

It is also possible to share multiple stacked PRs to give a preview of what changes are in the pipeline (one option is to have a PR just include all commits or do stacked PRs via user branches https://discourse.llvm.org/t/update-on-github-pull-requests/71540/146?u=fhahn)

Collaborator

@paulwalker-arm paulwalker-arm left a comment

This works for me. Without looking deeper I cannot confirm if #120260 is valid but I figure there's no harm in landing all the fixes (and tests) in this PR and then #120260 can remove the one it improves upon when ready.

@david-arm david-arm merged commit 13107cb into llvm:main Dec 18, 2024
8 checks passed
@fhahn
Contributor

fhahn commented Dec 18, 2024

> This works for me. Without looking deeper I cannot confirm if #120260 is valid but I figure there's no harm in landing all the fixes (and tests) in this PR and then #120260 can remove the one it improves upon when ready.

Yeah sounds good to me, I guess we can remove the extra checks again as part of #120260

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime running on omp-vega20-0 while building llvm at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/12513

Here is the relevant piece of the build log for the reference
Step 7 (Add check check-offload) failure: test (failure)
...
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug47654.cpp (984 of 993)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug50022.cpp (985 of 993)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/test_libc.cpp (986 of 993)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/wtime.c (987 of 993)
PASS: libomptarget :: x86_64-unknown-linux-gnu :: offloading/bug49021.cpp (988 of 993)
PASS: libomptarget :: x86_64-unknown-linux-gnu :: offloading/std_complex_arithmetic.cpp (989 of 993)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/complex_reduction.cpp (990 of 993)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug49021.cpp (991 of 993)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/std_complex_arithmetic.cpp (992 of 993)
TIMEOUT: libomptarget :: amdgcn-amd-amdhsa :: offloading/ctor_dtor.cpp (993 of 993)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: offloading/ctor_dtor.cpp' FAILED ********************
Exit Code: -9
Timeout: Reached timeout of 100 seconds

Command Output (stdout):
--
# RUN: at line 1
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a && /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# note: command had no output on stdout or stderr
# RUN: at line 2
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a && /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -9
# error: command reached timeout: True
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -9
# error: command reached timeout: True

--

********************
Slowest Tests:
--------------------------------------------------------------------------
100.05s: libomptarget :: amdgcn-amd-amdhsa :: offloading/ctor_dtor.cpp
16.93s: libomptarget :: amdgcn-amd-amdhsa :: offloading/bug49021.cpp
13.34s: libomptarget :: amdgcn-amd-amdhsa :: offloading/parallel_target_teams_reduction_min.cpp
12.98s: libomptarget :: amdgcn-amd-amdhsa :: offloading/parallel_target_teams_reduction_max.cpp
11.37s: libomptarget :: amdgcn-amd-amdhsa :: offloading/complex_reduction.cpp
9.92s: libomptarget :: amdgcn-amd-amdhsa :: jit/empty_kernel_lvl2.c
9.20s: libomptarget :: x86_64-unknown-linux-gnu :: offloading/bug49021.cpp

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-debian running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/10908

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt -S < /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
             .
             .
           175:  %10 = getelementptr inbounds i32, ptr %9, i32 0 
           176:  %wide.load = load <4 x i32>, ptr %10, align 4 
           177:  %11 = getelementptr inbounds i32, ptr %p2, i64 %8 
           178:  %12 = getelementptr inbounds i32, ptr %11, i32 0 
           179:  %wide.load2 = load <4 x i32>, ptr %12, align 4 
           180:  %13 = icmp eq <4 x i32> %wide.load, %wide.load2 
next:304'0                                                      X error: no match found
next:304'1                                                        with "TMP13" equal to "%13"
           181:  %index.next = add nuw i64 %index, 4 
next:304'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder clang-aarch64-quick running on linaro-clang-aarch64-quick while building llvm at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/65/builds/9553

Here is the relevant piece of the build log for the reference
Step 5 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/bin/opt -S < /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
             .
             .
           175:  %10 = getelementptr inbounds i32, ptr %9, i32 0 
           176:  %wide.load = load <4 x i32>, ptr %10, align 4 
           177:  %11 = getelementptr inbounds i32, ptr %p2, i64 %8 
           178:  %12 = getelementptr inbounds i32, ptr %11, i32 0 
           179:  %wide.load2 = load <4 x i32>, ptr %12, align 4 
           180:  %13 = icmp eq <4 x i32> %wide.load, %wide.load2 
next:304'0                                                      X error: no match found
next:304'1                                                        with "TMP13" equal to "%13"
           181:  %index.next = add nuw i64 %index, 4 
next:304'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder ml-opt-devrel-x86-64 running on ml-opt-devrel-x86-64-b2 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/175/builds/10400

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/ml-opt-devrel-x86-64-b1/build/bin/opt -S < /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /b/ml-opt-devrel-x86-64-b1/build/bin/FileCheck /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /b/ml-opt-devrel-x86-64-b1/build/bin/FileCheck /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /b/ml-opt-devrel-x86-64-b1/build/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
/b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
             .
             .
           175:  %10 = getelementptr inbounds i32, ptr %9, i32 0 
           176:  %wide.load = load <4 x i32>, ptr %10, align 4 
           177:  %11 = getelementptr inbounds i32, ptr %p2, i64 %8 
           178:  %12 = getelementptr inbounds i32, ptr %11, i32 0 
           179:  %wide.load2 = load <4 x i32>, ptr %12, align 4 
           180:  %13 = icmp eq <4 x i32> %wide.load, %wide.load2 
next:304'0                                                      X error: no match found
next:304'1                                                        with "TMP13" equal to "%13"
           181:  %index.next = add nuw i64 %index, 4 
next:304'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder ml-opt-rel-x86-64 running on ml-opt-rel-x86-64-b2 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/185/builds/10400

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/ml-opt-rel-x86-64-b1/build/bin/opt -S < /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /b/ml-opt-rel-x86-64-b1/build/bin/FileCheck /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /b/ml-opt-rel-x86-64-b1/build/bin/FileCheck /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /b/ml-opt-rel-x86-64-b1/build/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
             .
             .
           175:  %10 = getelementptr inbounds i32, ptr %9, i32 0 
           176:  %wide.load = load <4 x i32>, ptr %10, align 4 
           177:  %11 = getelementptr inbounds i32, ptr %p2, i64 %8 
           178:  %12 = getelementptr inbounds i32, ptr %11, i32 0 
           179:  %wide.load2 = load <4 x i32>, ptr %12, align 4 
           180:  %13 = icmp eq <4 x i32> %wide.load, %wide.load2 
next:304'0                                                      X error: no match found
next:304'1                                                        with "TMP13" equal to "%13"
           181:  %index.next = add nuw i64 %index, 4 
next:304'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder ml-opt-dev-x86-64 running on ml-opt-dev-x86-64-b2 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/137/builds/10531

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/ml-opt-dev-x86-64-b1/build/bin/opt -S < /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /b/ml-opt-dev-x86-64-b1/build/bin/FileCheck /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /b/ml-opt-dev-x86-64-b1/build/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /b/ml-opt-dev-x86-64-b1/build/bin/FileCheck /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
             .
             .
           175:  %10 = getelementptr inbounds i32, ptr %9, i32 0 
           176:  %wide.load = load <4 x i32>, ptr %10, align 4 
           177:  %11 = getelementptr inbounds i32, ptr %p2, i64 %8 
           178:  %12 = getelementptr inbounds i32, ptr %11, i32 0 
           179:  %wide.load2 = load <4 x i32>, ptr %12, align 4 
           180:  %13 = icmp eq <4 x i32> %wide.load, %wide.load2 
next:304'0                                                      X error: no match found
next:304'1                                                        with "TMP13" equal to "%13"
           181:  %index.next = add nuw i64 %index, 4 
next:304'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

@david-arm
Contributor Author

Hi, looks like a merge issue as the tests were passing before I merged. I'll revert the patch and reapply.

@fhahn
Contributor

fhahn commented Dec 18, 2024

Looks like the test updates were not in sync with current main, pushed a fix 3e02038 to avoid the test failure

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder clang-x86_64-debian-fast running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/56/builds/14787

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt -S < /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
             .
             .
           175:  %10 = getelementptr inbounds i32, ptr %9, i32 0 
           176:  %wide.load = load <4 x i32>, ptr %10, align 4 
           177:  %11 = getelementptr inbounds i32, ptr %p2, i64 %8 
           178:  %12 = getelementptr inbounds i32, ptr %11, i32 0 
           179:  %wide.load2 = load <4 x i32>, ptr %12, align 4 
           180:  %13 = icmp eq <4 x i32> %wide.load, %wide.load2 
next:304'0                                                      X error: no match found
next:304'1                                                        with "TMP13" equal to "%13"
           181:  %index.next = add nuw i64 %index, 4 
next:304'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

@fhahn
Contributor

fhahn commented Dec 18, 2024

@david-arm it looks like the patch hadn't been rebased in a while, so some of the check lines were stale and just needed a simple update. Currently the only way to avoid this is to update the branch before merging and wait for the precommit tests to complete again; otherwise they won't catch failures caused by changes on main since the last run.
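
For anyone reading the logs above, here is a rough toy model of why a stale `CHECK-NEXT` line fails. It is only a sketch: the real FileCheck does much more (variable capture, searching ahead, diagnostics), and `check_next` is a hypothetical helper, not FileCheck's API. The three output lines are copied from lines 180-182 of `<stdin>` in the failure log; the `FRESH` pattern is an assumption made for illustration and is not the check line from the actual fix.

```python
import re

# Three consecutive lines of opt output, copied from the failure log
# (lines 180-182 of <stdin>).
OUTPUT = [
    "%13 = icmp eq <4 x i32> %wide.load, %wide.load2",
    "%index.next = add nuw i64 %index, 4",
    "%14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)",
]


def check_next(lines, prev_index, pattern):
    """Toy CHECK-NEXT: the pattern must match the line right after prev_index.

    Returns the index of the matching line, or None when the directive fails.
    """
    nxt = prev_index + 1
    if nxt < len(lines) and re.search(pattern, lines[nxt]):
        return nxt
    return None


# The stale check line, with [[TMP13]] already bound to %13 and
# [[TMP14:%.*]] written as a plain regex.
STALE = r"%.* = xor <4 x i1> %13, splat \(i1 true\)"
# Hypothetical pattern describing what actually follows the icmp on current main.
FRESH = r"%index\.next = add nuw i64 %.*, 4"

stale_result = check_next(OUTPUT, 0, STALE)
fresh_result = check_next(OUTPUT, 0, FRESH)
print(stale_result, fresh_result)  # None 1
```

The stale pattern still expects an `xor` directly after the `icmp`, which the output on current main no longer contains, so the directive fails even though the vectoriser output itself is fine.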

@david-arm
Contributor Author

> Looks like the test updates were not in sync with current main, pushed a fix 3e02038 to avoid the test failure

That's fast @fhahn - thank you! I was just about to look into it. I do normally rebase downstream and run make check-all, but for some reason I forgot today. Does rebasing the patch in GitHub trigger new runs though? I don't entirely trust it to do the right thing.

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder llvm-x86_64-debian-dylib running on gribozavr4 while building llvm at step 7 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/60/builds/15495

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/llvm-x86_64-debian-dylib/build/bin/opt -S < /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /b/1/llvm-x86_64-debian-dylib/build/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
             .
             .
           175:  %10 = getelementptr inbounds i32, ptr %9, i32 0 
           176:  %wide.load = load <4 x i32>, ptr %10, align 4 
           177:  %11 = getelementptr inbounds i32, ptr %p2, i64 %8 
           178:  %12 = getelementptr inbounds i32, ptr %11, i32 0 
           179:  %wide.load2 = load <4 x i32>, ptr %12, align 4 
           180:  %13 = icmp eq <4 x i32> %wide.load, %wide.load2 
next:304'0                                                      X error: no match found
next:304'1                                                        with "TMP13" equal to "%13"
           181:  %index.next = add nuw i64 %index, 4 
next:304'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-5 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/11593

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/opt -S < /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             1: ; ModuleID = '<stdin>'
             2: source_filename = "<stdin>"
             3: target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
             4: target triple = "aarch64-unknown-linux-gnu"
             5:
             6: %my.struct = type { i8, i8 }
             7:
             8: declare void @init_mem(ptr, i64)
             9:
            10: ; Function Attrs: vscale_range(1,16)
            11: define i64 @same_exit_block_pre_inc_use1() #0 {
label:9'0       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
label:9'1       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

@fhahn
Contributor

fhahn commented Dec 18, 2024

@david-arm I just noticed it failing locally. Pressing the Update Branch button should merge in the changes from current main and then trigger a new run of the pre-commit checks.

@david-arm
Contributor Author

> @david-arm I just noticed it failing locally. Pressing the Update Branch button should merge in the changes from current main and then trigger a new run of the pre-commit checks.

Well anyway, apologies for my silly mistake. I seem to remember in the past there were times I rebased and it didn't trigger a new run, but maybe I was doing something wrong. Even so, I "normally" prefer to be paranoid and run "make check-all" downstream, except for today when I didn't. :)

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-fast running on sanitizer-buildbot4 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/169/builds/6556

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 87942 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll (71784 of 87942)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/opt -S < /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
Step 10 (stage2/asan_ubsan check) failure: stage2/asan_ubsan check (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 87942 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll (71784 of 87942)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/opt -S < /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
Step 13 (stage2/msan check) failure: stage2/msan check (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 87940 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll (66553 of 87940)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/opt -S < /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder sanitizer-aarch64-linux-bootstrap-asan running on sanitizer-buildbot7 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/24/builds/3370

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 85554 tests, 72 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70..
FAIL: LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll (67742 of 85554)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/opt -S < /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
Step 11 (stage2/asan check) failure: stage2/asan check (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 85554 tests, 72 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70..
FAIL: LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll (67742 of 85554)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/opt -S < /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
Step 13 (stage3/asan check) failure: stage3/asan check (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build2_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build2_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build2_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build2_asan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build2_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build2_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build2_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 82717 tests, 72 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 
FAIL: LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll (67748 of 82717)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build2_asan/bin/opt -S < /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build2_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build2_asan/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build2_asan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder lld-x86_64-ubuntu-fast running on as-builder-4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/33/builds/8496

Here is the relevant piece of the build log for reference:
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/opt -S < /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
+ /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
/home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
             .
             .
           175:  %10 = getelementptr inbounds i32, ptr %9, i32 0 
           176:  %wide.load = load <4 x i32>, ptr %10, align 4 
           177:  %11 = getelementptr inbounds i32, ptr %p2, i64 %8 
           178:  %12 = getelementptr inbounds i32, ptr %11, i32 0 
           179:  %wide.load2 = load <4 x i32>, ptr %12, align 4 
           180:  %13 = icmp eq <4 x i32> %wide.load, %wide.load2 
next:304'0                                                      X error: no match found
next:304'1                                                        with "TMP13" equal to "%13"
           181:  %index.next = add nuw i64 %index, 4 
next:304'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

@llvm-ci
Collaborator

llvm-ci commented Dec 18, 2024

LLVM Buildbot has detected a new failure on builder premerge-monolithic-linux running on premerge-linux-1 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/17922

Here is the relevant piece of the build log for reference:
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/LoopVectorize/AArch64/simple_early_exit.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /build/buildbot/premerge-monolithic-linux/build/bin/opt -S < /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll -p loop-vectorize -enable-early-exit-vectorization | /build/buildbot/premerge-monolithic-linux/build/bin/FileCheck /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /build/buildbot/premerge-monolithic-linux/build/bin/FileCheck /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll --check-prefixes=CHECK
+ /build/buildbot/premerge-monolithic-linux/build/bin/opt -S -p loop-vectorize -enable-early-exit-vectorization
/build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:304:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)
              ^
<stdin>:180:49: note: scanning from here
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:180:49: note: with "TMP13" equal to "%13"
 %13 = icmp eq <4 x i32> %wide.load, %wide.load2
                                                ^
<stdin>:182:33: note: possible intended match here
 %14 = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %13)
                                ^
/build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:12: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
           ^
/build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll:422:35: error: undefined variable: LOOP0
; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
                                  ^
<stdin>:267:2: note: possible intended match here
!0 = distinct !{!0, !1, !2}
 ^

Input file: <stdin>
Check file: /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
             .
             .
           175:  %10 = getelementptr inbounds i32, ptr %9, i32 0 
           176:  %wide.load = load <4 x i32>, ptr %10, align 4 
           177:  %11 = getelementptr inbounds i32, ptr %p2, i64 %8 
           178:  %12 = getelementptr inbounds i32, ptr %11, i32 0 
           179:  %wide.load2 = load <4 x i32>, ptr %12, align 4 
           180:  %13 = icmp eq <4 x i32> %wide.load, %wide.load2 
next:304'0                                                      X error: no match found
next:304'1                                                        with "TMP13" equal to "%13"
           181:  %index.next = add nuw i64 %index, 4 
next:304'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
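
For readers unfamiliar with the failure mode in the logs above, the following is a rough Python sketch of how a `CHECK-NEXT` pattern with `[[VAR]]` uses and `[[VAR:regex]]` definitions could be evaluated against the line that follows the previous match. It is a simplified toy model, not FileCheck's implementation: the helper name `check_next` and the line named `expected_shape` are hypothetical, and real FileCheck also normalises whitespace and searches beyond a single line before reporting an error.

```python
import re

# Matches FileCheck-style [[NAME]] uses and [[NAME:regex]] definitions.
VAR = re.compile(r"\[\[(\w+)(?::(.*?))?\]\]")

def check_next(pattern, next_line, variables):
    """Toy CHECK-NEXT: return the updated variables if next_line matches, else None."""
    regex, pos = "", 0
    for m in VAR.finditer(pattern):
        regex += re.escape(pattern[pos:m.start()])  # literal text between variables
        name, definition = m.group(1), m.group(2)
        if definition is None:
            regex += re.escape(variables[name])     # use: substitute the captured value
        else:
            regex += f"(?P<{name}>{definition})"    # definition: capture a new value
        pos = m.end()
    regex += re.escape(pattern[pos:])               # trailing literal text
    found = re.search(regex, next_line)
    return None if found is None else {**variables, **found.groupdict()}

# Pattern from line 304 of the test file, with TMP13 already bound as in the log.
pattern = "[[TMP14:%.*]] = xor <4 x i1> [[TMP13]], splat (i1 true)"
captured = {"TMP13": "%13"}

# Line 181 of the opt output shown in the log: there is no xor, so the check fails.
actual = " %index.next = add nuw i64 %index, 4"
# Hypothetical line (not present in the log) of the shape the test expected.
expected_shape = " %14 = xor <4 x i1> %13, splat (i1 true)"

print(check_next(pattern, actual, captured))
print(check_next(pattern, expected_shape, captured))
```

In this model the first call returns `None`, mirroring the "expected string not found" error, while the second binds `TMP14` to `%14`. A use of a variable that was never defined, such as `[[LOOP0]]` in the log, would raise a `KeyError` in this sketch.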

fhahn added a commit that referenced this pull request Dec 31, 2024
This ensures that all blocks created during VPlan execution are properly
added to an enclosing loop, if present.

Split off from #108378 and also
needed once more of the skeleton blocks are created directly via VPlan.

This also allows removing the custom logic for early-exit loop
vectorization added as part of
#117008.
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 10, 2025
…xecute (NFCI).

This ensures that all blocks created during VPlan execution are properly
added to an enclosing loop, if present.

Split off from llvm/llvm-project#108378 and also
needed once more of the skeleton blocks are created directly via VPlan.

This also allows removing the custom logic for early-exit loop
vectorization added as part of
llvm/llvm-project#117008.