Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VPlan] Dispatch to multiple exit blocks via middle blocks. #112138

Merged
merged 29 commits into from
Dec 11, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
245b56a
[VPlan] Support VPIRBBs and VPIRInst phis with multiple predecessors.
fhahn Oct 1, 2024
47258de
[VPlan] Dispatch to multiple exit blocks via middle blocks.
fhahn Sep 18, 2024
9265fb1
Merge remote-tracking branch 'origin/main' into vplan-branch-on-multi…
fhahn Oct 31, 2024
3831acb
!fixup address first set of comments, thanks!
fhahn Oct 31, 2024
e64888a
Merge remote-tracking branch 'origin/main' into vplan-branch-on-multi…
fhahn Nov 4, 2024
64db0ee
!fixup clean up merge failures
fhahn Nov 4, 2024
3259e66
Merge remote-tracking branch 'origin/main' into vplan-branch-on-multi…
fhahn Nov 5, 2024
0f8aedf
!fixup address latest comments, thanks!
fhahn Nov 5, 2024
9212f96
[VPlan] Generalize collectUsersInExitBlocks for multiple exit bbs.
fhahn Nov 5, 2024
5cb0851
!fixup address comments
fhahn Nov 5, 2024
e849195
!fixup address more comments.
fhahn Nov 5, 2024
7b98d34
Merge remote-tracking branch 'origin/main' into vplan-branch-on-multi…
fhahn Nov 6, 2024
c53eca6
!fixup remove left over canVectorizeEarlyExit
fhahn Nov 6, 2024
43a8ef7
Merge remote-tracking branch 'origin/main' into vplan-branch-on-multi…
fhahn Nov 23, 2024
e26af8e
!fixup address latest comments, thanks!
fhahn Nov 23, 2024
06c3d39
[VPlan] Print incoming VPBB for phi VPIRInstruction (NFC).
fhahn Nov 23, 2024
552bd91
!fixup update recipe printing
fhahn Nov 23, 2024
2042a43
Merge remote-tracking branch 'origin/main' into vplan-branch-on-multi…
fhahn Nov 23, 2024
00dea4a
!fixup add dbg message
fhahn Nov 23, 2024
7b8866d
!fixup fix formatting
fhahn Nov 24, 2024
4d5608f
Merge remote-tracking branch 'origin/main' into vplan-branch-on-multi…
fhahn Dec 8, 2024
b9ee739
!restore test checks.
fhahn Dec 8, 2024
43d5590
Merge remote-tracking branch 'origin/main' into vplan-branch-on-multi…
fhahn Dec 10, 2024
cba7dce
Merge remote-tracking branch 'origin/main' into vplan-branch-on-multi…
fhahn Dec 10, 2024
95f4276
!fixup address most comment, a few more pending
fhahn Dec 10, 2024
c3d3b39
!fixup address remaining comments, thanks!
fhahn Dec 10, 2024
a875249
!fixup update doc
fhahn Dec 10, 2024
65d0288
Merge remote-tracking branch 'origin/main' into vplan-branch-on-multi…
fhahn Dec 11, 2024
8d04383
!fixup address latest comments, thanks!
fhahn Dec 11, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -287,6 +287,9 @@ class LoopVectorizationLegality {
/// we can use in-order reductions.
bool canVectorizeFPMath(bool EnableStrictReductions);

/// Returns true if the loop has an early exit that we can vectorize.
bool canVectorizeEarlyExit() const;

/// Return true if we can vectorize this loop while folding its tail by
/// masking.
bool canFoldTailByMasking() const;
Expand Down
29 changes: 29 additions & 0 deletions llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,10 @@ AllowStridedPointerIVs("lv-strided-pointer-ivs", cl::init(false), cl::Hidden,
cl::desc("Enable recognition of non-constant strided "
"pointer induction variables."));

static cl::opt<bool>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a description here, such as what I had in https://github.com/llvm/llvm-project/pull/88385/files for the same flag? Or is the idea to try to not expose this too much at this stage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to move it to LoopVectorize.cpp, thanks. Originally this was only used in combination with the new helper introduced to LVL, but that changed after using the existing checks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the flag still seems to be in the old place. Perhaps the patch hasn't updated correctly?

EnableEarlyExitVectorization("enable-early-exit-vectorization",
cl::init(false), cl::Hidden, cl::desc(""));

namespace llvm {
cl::opt<bool>
HintsAllowReordering("hints-allow-reordering", cl::init(true), cl::Hidden,
Expand Down Expand Up @@ -1378,6 +1382,10 @@ bool LoopVectorizationLegality::isFixedOrderRecurrence(
}

bool LoopVectorizationLegality::blockNeedsPredication(BasicBlock *BB) const {
// When vectorizing early exits, create predicates for all blocks, except the
// header.
if (canVectorizeEarlyExit() && BB != TheLoop->getHeader())
return true;
return LoopAccessInfo::blockNeedsPredication(BB, TheLoop, DT);
}

Expand Down Expand Up @@ -1514,6 +1522,27 @@ bool LoopVectorizationLegality::canVectorizeWithIfConvert() {
return true;
}

bool LoopVectorizationLegality::canVectorizeEarlyExit() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels a little odd to have both canVectorizeEarlyExit and isVectorizableEarlyExitLoop in the same class. Would it make sense to move the hints check into isVectorizableEarlyExitLoop to bypass legality checks? You could even get the benefit of existing code to build up a list of countable and uncountable exits that can be used later on when splitting the middle.block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally I tried to limit the scope to just support vectorizing loops with multiple countable exits, but this probably made things a bit more complicated for not too much gain. Updated to use the existing isVectorizableEarlyExitLoop

I removed the new codegen tests (only kept the VPlan version). Are there any existing tests already for which adding the flag would be sufficient now that this is using the existing checks?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can also able enable early exit autovec with Transforms/LoopVectorize/single_early_exit.ll

// Currently only allow vectorizing loops with early exits, if early-exit
// vectorization is explicitly enabled and the loop has metadata to force
// vectorization.
if (!EnableEarlyExitVectorization)
return false;

SmallVector<BasicBlock *> Exiting;
TheLoop->getExitingBlocks(Exiting);
if (Exiting.size() == 1)
return false;

LoopVectorizeHints Hints(TheLoop, true, *ORE);
if (Hints.getForce() == LoopVectorizeHints::FK_Undefined)
return false;

Function *Fn = TheLoop->getHeader()->getParent();
return Hints.allowVectorization(Fn, TheLoop,
true /*VectorizeOnlyWhenForced*/);
}

// Helper function to canVectorizeLoopNestCFG.
bool LoopVectorizationLegality::canVectorizeLoopCFG(Loop *Lp,
bool UseVPlanNativePath) {
Expand Down
82 changes: 51 additions & 31 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1363,9 +1363,11 @@ class LoopVectorizationCostModel {
// If we might exit from anywhere but the latch, must run the exiting
// iteration in scalar form.
if (TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
LLVM_DEBUG(
dbgs() << "LV: Loop requires scalar epilogue: multiple exits\n");
return true;
if (!Legal->canVectorizeEarlyExit()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can fold the conditions together into a single if statement:

  if (!Legal->canVectorizeEarlyExit() && TheLoop->getExitingBlock() != TheLoop->getLoopLatch()) {
    ...
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

folded thanks!

LLVM_DEBUG(
dbgs() << "LV: Loop requires scalar epilogue: multiple exits\n");
return true;
}
}
if (IsVectorizing && InterleaveInfo.requiresScalarEpilogue()) {
LLVM_DEBUG(dbgs() << "LV: Loop requires scalar epilogue: "
Expand Down Expand Up @@ -2575,7 +2577,8 @@ void InnerLoopVectorizer::createVectorLoopSkeleton(StringRef Prefix) {
LoopVectorPreHeader = OrigLoop->getLoopPreheader();
assert(LoopVectorPreHeader && "Invalid loop structure");
LoopExitBlock = OrigLoop->getUniqueExitBlock(); // may be nullptr
assert((LoopExitBlock || Cost->requiresScalarEpilogue(VF.isVector())) &&
assert((LoopExitBlock || Cost->requiresScalarEpilogue(VF.isVector()) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can completely remove the need for the LoopExitBlock variable - see #108231, since it's only ever used in one place. And even then what's actually meant by LoopExitBlock in createEpilogueVectorizedLoopSkeleton is the exit block from the latch.

Then in #88385 I think we can replace this assert with:

  assert((OrigLoop->getUniqueLatchExitBlock() ||
          Cost->requiresScalarEpilogue(VF.isVector())) &&
         "multiple exit loop without required epilogue?");

because even if canVectorizeEarlyExit returns true I think we still require an exit from the latch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

Legal->canVectorizeEarlyExit()) &&
"multiple exit loop without required epilogue?");

LoopMiddleBlock =
Expand Down Expand Up @@ -2758,8 +2761,6 @@ void InnerLoopVectorizer::fixupIVUsers(PHINode *OrigPhi,
// value (the value that feeds into the phi from the loop latch).
// We allow both, but they, obviously, have different values.

assert(OrigLoop->getUniqueExitBlock() && "Expected a single exit block");

DenseMap<Value *, Value *> MissingVals;

// An external user of the last iteration's value should see the value that
Expand Down Expand Up @@ -2819,6 +2820,9 @@ void InnerLoopVectorizer::fixupIVUsers(PHINode *OrigPhi,
if (PHI->getBasicBlockIndex(MiddleBlock) == -1)
PHI->addIncoming(I.second, MiddleBlock);
}

assert((MissingVals.empty() || OrigLoop->getUniqueExitBlock()) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the assert would be more accurate with the statement Expected a single exit block for escaping values?

Also, might be worth moving the assert to before the for (auto &I : MissingVals) line because that's the point at which we start adjusting the original scalar code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

"Expected a single exit block");
}

namespace {
Expand Down Expand Up @@ -3599,7 +3603,8 @@ void LoopVectorizationCostModel::collectLoopUniforms(ElementCount VF) {
TheLoop->getExitingBlocks(Exiting);
for (BasicBlock *E : Exiting) {
auto *Cmp = dyn_cast<Instruction>(E->getTerminator()->getOperand(0));
if (Cmp && TheLoop->contains(Cmp) && Cmp->hasOneUse())
if (Cmp && TheLoop->contains(Cmp) && Cmp->hasOneUse() &&
(TheLoop->getLoopLatch() == E || !Legal->canVectorizeEarlyExit()))
AddToWorklistIfAllowed(Cmp);
}

Expand Down Expand Up @@ -7692,12 +7697,15 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
BestVPlan.execute(&State);

// 2.5 Collect reduction resume values.
auto *ExitVPBB =
cast<VPBasicBlock>(BestVPlan.getVectorLoopRegion()->getSingleSuccessor());
for (VPRecipeBase &R : *ExitVPBB) {
createAndCollectMergePhiForReduction(
dyn_cast<VPInstruction>(&R), State, OrigLoop,
State.CFG.VPBB2IRBB[ExitVPBB], ExpandedSCEVs);
VPBasicBlock *ExitVPBB = nullptr;
if (BestVPlan.getVectorLoopRegion()->getSingleSuccessor()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could rewrite this as:

if (auto *LoopSucc = BestVPlan.getVectorLoopRegion()->getSingleSuccessor()) {
  ExitVPBB = cast<VPBasicBlock>(LoopSucc);
  ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact there can only be one successor in this patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, something left over from earlier versions. Removed, thanks!

ExitVPBB = cast<VPBasicBlock>(
BestVPlan.getVectorLoopRegion()->getSingleSuccessor());
for (VPRecipeBase &R : *ExitVPBB) {
createAndCollectMergePhiForReduction(
dyn_cast<VPInstruction>(&R), State, OrigLoop,
State.CFG.VPBB2IRBB[ExitVPBB], ExpandedSCEVs);
}
}

// 2.6. Maintain Loop Hints
Expand All @@ -7723,6 +7731,7 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
LoopVectorizeHints Hints(L, true, *ORE);
Hints.setAlreadyVectorized();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: stray white space

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, thanks!

TargetTransformInfo::UnrollingPreferences UP;
TTI.getUnrollingPreferences(L, *PSE.getSE(), UP, ORE);
if (!UP.UnrollVectorizedLoop || CanonicalIVStartValue)
Expand All @@ -7735,15 +7744,17 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
ILV.printDebugTracesAtEnd();

// 4. Adjust branch weight of the branch in the middle block.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... "if the latter exists"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

auto *MiddleTerm =
cast<BranchInst>(State.CFG.VPBB2IRBB[ExitVPBB]->getTerminator());
if (MiddleTerm->isConditional() &&
hasBranchWeightMD(*OrigLoop->getLoopLatch()->getTerminator())) {
// Assume that `Count % VectorTripCount` is equally distributed.
unsigned TripCount = BestVPlan.getUF() * State.VF.getKnownMinValue();
assert(TripCount > 0 && "trip count should not be zero");
const uint32_t Weights[] = {1, TripCount - 1};
setBranchWeights(*MiddleTerm, Weights, /*IsExpected=*/false);
if (ExitVPBB) {
auto *MiddleTerm =
cast<BranchInst>(State.CFG.VPBB2IRBB[ExitVPBB]->getTerminator());
if (MiddleTerm->isConditional() &&
hasBranchWeightMD(*OrigLoop->getLoopLatch()->getTerminator())) {
// Assume that `Count % VectorTripCount` is equally distributed.
unsigned TripCount = BestVPlan.getUF() * State.VF.getKnownMinValue();
assert(TripCount > 0 && "trip count should not be zero");
const uint32_t Weights[] = {1, TripCount - 1};
setBranchWeights(*MiddleTerm, Weights, /*IsExpected=*/false);
}
}

return State.ExpandedSCEVs;
Expand Down Expand Up @@ -8128,7 +8139,7 @@ VPValue *VPRecipeBuilder::createEdgeMask(BasicBlock *Src, BasicBlock *Dst) {
// If source is an exiting block, we know the exit edge is dynamically dead
// in the vector loop, and thus we don't need to restrict the mask. Avoid
// adding uses of an otherwise potentially dead instruction.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

if (OrigLoop->isLoopExiting(Src))
if (!Legal->canVectorizeEarlyExit() && OrigLoop->isLoopExiting(Src))
return EdgeMaskCache[Edge] = SrcMask;

VPValue *EdgeMask = getVPValueOrAddLiveIn(BI->getCondition());
Expand Down Expand Up @@ -8778,6 +8789,8 @@ static void addCanonicalIVRecipes(VPlan &Plan, Type *IdxTy, bool HasNUW,
static SetVector<VPIRInstruction *> collectUsersInExitBlock(
Loop *OrigLoop, VPRecipeBuilder &Builder, VPlan &Plan,
const MapVector<PHINode *, InductionDescriptor> &Inductions) {
if (!Plan.getVectorLoopRegion()->getSingleSuccessor())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will only ever be one successor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, thanks!

return {};
auto *MiddleVPBB =
cast<VPBasicBlock>(Plan.getVectorLoopRegion()->getSingleSuccessor());
// No edge from the middle block to the unique exit block has been inserted
Expand Down Expand Up @@ -8863,6 +8876,8 @@ static void addLiveOutsForFirstOrderRecurrences(
// TODO: Should be replaced by
// Plan->getScalarLoopRegion()->getSinglePredecessor() in the future once the
// scalar region is modeled as well.
if (!VectorRegion->getSingleSuccessor())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one successor permitted. Can delete this I think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

return;
auto *MiddleVPBB = cast<VPBasicBlock>(VectorRegion->getSingleSuccessor());
VPBasicBlock *ScalarPHVPBB = nullptr;
if (MiddleVPBB->getNumSuccessors() == 2) {
Expand Down Expand Up @@ -9146,10 +9161,15 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) {
"VPBasicBlock");
RecipeBuilder.fixHeaderPhis();

SetVector<VPIRInstruction *> ExitUsersToFix = collectUsersInExitBlock(
OrigLoop, RecipeBuilder, *Plan, Legal->getInductionVars());
addLiveOutsForFirstOrderRecurrences(*Plan, ExitUsersToFix);
addUsersInExitBlock(*Plan, ExitUsersToFix);
if (Legal->canVectorizeEarlyExit()) {
VPlanTransforms::convertToMultiCond(*Plan, *PSE.getSE(), OrigLoop,
RecipeBuilder);
} else {
SetVector<VPIRInstruction *> ExitUsersToFix = collectUsersInExitBlock(
OrigLoop, RecipeBuilder, *Plan, Legal->getInductionVars());
addLiveOutsForFirstOrderRecurrences(*Plan, ExitUsersToFix);
addUsersInExitBlock(*Plan, ExitUsersToFix);
}

// ---------------------------------------------------------------------------
// Transform initial VPlan: Apply previously taken decisions, in order, to
Expand Down Expand Up @@ -9277,8 +9297,6 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
using namespace VPlanPatternMatch;
VPRegionBlock *VectorLoopRegion = Plan->getVectorLoopRegion();
VPBasicBlock *Header = VectorLoopRegion->getEntryBasicBlock();
VPBasicBlock *MiddleVPBB =
cast<VPBasicBlock>(VectorLoopRegion->getSingleSuccessor());
for (VPRecipeBase &R : Header->phis()) {
auto *PhiR = dyn_cast<VPReductionPHIRecipe>(&R);
if (!PhiR || !PhiR->isInLoop() || (MinVF.isScalar() && !PhiR->isOrdered()))
Expand All @@ -9297,8 +9315,6 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
for (VPUser *U : Cur->users()) {
auto *UserRecipe = cast<VPSingleDefRecipe>(U);
if (!UserRecipe->getParent()->getEnclosingLoopRegion()) {
assert(UserRecipe->getParent() == MiddleVPBB &&
"U must be either in the loop region or the middle block.");
continue;
}
Worklist.insert(UserRecipe);
Expand Down Expand Up @@ -9403,6 +9419,10 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
}
VPBasicBlock *LatchVPBB = VectorLoopRegion->getExitingBasicBlock();
Builder.setInsertPoint(&*LatchVPBB->begin());
if (!VectorLoopRegion->getSingleSuccessor())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, only one successor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, thanks!

return;
VPBasicBlock *MiddleVPBB =
cast<VPBasicBlock>(VectorLoopRegion->getSingleSuccessor());
VPBasicBlock::iterator IP = MiddleVPBB->getFirstNonPhi();
for (VPRecipeBase &R :
Plan->getVectorLoopRegion()->getEntryBasicBlock()->phis()) {
Expand Down
39 changes: 32 additions & 7 deletions llvm/lib/Transforms/Vectorize/VPlan.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -474,6 +474,14 @@ void VPIRBasicBlock::execute(VPTransformState *State) {
// backedges. A backward successor is set when the branch is created.
const auto &PredVPSuccessors = PredVPBB->getHierarchicalSuccessors();
unsigned idx = PredVPSuccessors.front() == this ? 0 : 1;
if (TermBr->getSuccessor(idx) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code was for a different version of the patch using BranchOnMultiCond - can be deleted I think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, removed, thanks!

PredVPBlock == getPlan()->getVectorLoopRegion() &&
PredVPBlock->getNumSuccessors()) {
// Update PRedBB and TermBr for BranchOnMultiCond in predecessor.
PredBB = TermBr->getSuccessor(1);
TermBr = cast<BranchInst>(PredBB->getTerminator());
idx = 0;
}
assert(!TermBr->getSuccessor(idx) &&
"Trying to reset an existing successor block.");
TermBr->setSuccessor(idx, IRBB);
Expand Down Expand Up @@ -908,8 +916,8 @@ VPlanPtr VPlan::createInitialVPlan(Type *InductionTy,
VPBasicBlock *MiddleVPBB = new VPBasicBlock("middle.block");
VPBlockUtils::insertBlockAfter(MiddleVPBB, TopRegion);

VPBasicBlock *ScalarPH = new VPBasicBlock("scalar.ph");
if (!RequiresScalarEpilogueCheck) {
VPBasicBlock *ScalarPH = new VPBasicBlock("scalar.ph");
VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH);
return Plan;
}
Expand All @@ -923,10 +931,14 @@ VPlanPtr VPlan::createInitialVPlan(Type *InductionTy,
// we unconditionally branch to the scalar preheader. Do nothing.
// 3) Otherwise, construct a runtime check.
BasicBlock *IRExitBlock = TheLoop->getUniqueExitBlock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should just be able to do

  BasicBlock *IRExitBlock = TheLoop->getUniqueLatchExitBlock();

here and remove the logic below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

auto *VPExitBlock = VPIRBasicBlock::fromBasicBlock(IRExitBlock);
// The connection order corresponds to the operands of the conditional branch.
VPBlockUtils::insertBlockAfter(VPExitBlock, MiddleVPBB);
VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH);
if (IRExitBlock) {
auto *VPExitBlock = VPIRBasicBlock::fromBasicBlock(IRExitBlock);
// The connection order corresponds to the operands of the conditional
// branch.
VPBlockUtils::insertBlockAfter(VPExitBlock, MiddleVPBB);
VPBasicBlock *ScalarPH = new VPBasicBlock("scalar.ph");
VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH);
}

auto *ScalarLatchTerm = TheLoop->getLoopLatch()->getTerminator();
// Here we use the same DebugLoc as the scalar loop latch terminator instead
Expand Down Expand Up @@ -1031,7 +1043,9 @@ void VPlan::execute(VPTransformState *State) {
// VPlan execution rather than earlier during VPlan construction.
BasicBlock *MiddleBB = State->CFG.ExitBB;
VPBasicBlock *MiddleVPBB =
cast<VPBasicBlock>(getVectorLoopRegion()->getSingleSuccessor());
getVectorLoopRegion()->getNumSuccessors() == 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One successor only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, thanks!

? cast<VPBasicBlock>(getVectorLoopRegion()->getSuccessors()[0])
: cast<VPBasicBlock>(getVectorLoopRegion()->getSuccessors()[1]);
// Find the VPBB for the scalar preheader, relying on the current structure
// when creating the middle block and its successrs: if there's a single
// predecessor, it must be the scalar preheader. Otherwise, the second
Expand All @@ -1044,6 +1058,10 @@ void VPlan::execute(VPTransformState *State) {
MiddleSuccs.size() == 1 ? MiddleSuccs[0] : MiddleSuccs[1]);
assert(!isa<VPIRBasicBlock>(ScalarPhVPBB) &&
"scalar preheader cannot be wrapped already");
if (ScalarPhVPBB->getNumSuccessors() != 0) {
ScalarPhVPBB = cast<VPBasicBlock>(ScalarPhVPBB->getSuccessors()[1]);
MiddleVPBB = cast<VPBasicBlock>(MiddleVPBB->getSuccessors()[1]);
}
replaceVPBBWithIRVPBB(ScalarPhVPBB, ScalarPh);
replaceVPBBWithIRVPBB(MiddleVPBB, MiddleBB);

Expand All @@ -1065,6 +1083,10 @@ void VPlan::execute(VPTransformState *State) {
VPBasicBlock *LatchVPBB = getVectorLoopRegion()->getExitingBasicBlock();
BasicBlock *VectorLatchBB = State->CFG.VPBB2IRBB[LatchVPBB];

if (!getVectorLoopRegion()->getSingleSuccessor())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One successor only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, thanks!

VectorLatchBB =
cast<BranchInst>(VectorLatchBB->getTerminator())->getSuccessor(1);

// Fix the latch value of canonical, reduction and first-order recurrences
// phis in the vector loop.
VPBasicBlock *Header = getVectorLoopRegion()->getEntryBasicBlock();
Expand All @@ -1091,7 +1113,10 @@ void VPlan::execute(VPTransformState *State) {
// Move the last step to the end of the latch block. This ensures
// consistent placement of all induction updates.
Instruction *Inc = cast<Instruction>(Phi->getIncomingValue(1));
Inc->moveBefore(VectorLatchBB->getTerminator()->getPrevNode());
if (VectorLatchBB->getTerminator() == &*VectorLatchBB->getFirstNonPHI())
Inc->moveBefore(VectorLatchBB->getTerminator());
else
Inc->moveBefore(VectorLatchBB->getTerminator()->getPrevNode());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth extending the comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change is not needed in the latest version, restored the original code.


// Use the steps for the last part as backedge value for the induction.
if (auto *IV = dyn_cast<VPWidenIntOrFpInductionRecipe>(&R))
Expand Down
1 change: 1 addition & 0 deletions llvm/lib/Transforms/Vectorize/VPlan.h
Original file line number Diff line number Diff line change
Expand Up @@ -1274,6 +1274,7 @@ class VPInstruction : public VPRecipeWithIRFlags,
// operand). Only generates scalar values (either for the first lane only or
// for all lanes, depending on its uses).
PtrAdd,
AnyOf,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps worth adding a simple comment here? Something along the lines of:

  // Returns a scalar boolean value, which is true if any lane in the vector
  // predicate input is true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, thanks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some explanation how AnyOf relates (or should relate) to ComputeReductionResult?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think AnyOf also needs adding to the switch statement in VPRecipeBase::mayWriteToMemory and return false?

};

private:
Expand Down
Loading
Loading