-
Notifications
You must be signed in to change notification settings - Fork 13.3k
[VPlan] Add exit phi operands during initial construction (NFC). #136455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add incoming exit phi operands during the initial VPlan construction. This ensures all users are added to the initial VPlan and is also needed in preparation to retaining exiting edges during initial construction.
@llvm/pr-subscribers-vectorizers Author: Florian Hahn (fhahn) ChangesAdd incoming exit phi operands during the initial VPlan construction. This ensures all users are added to the initial VPlan and is also needed in preparation to retaining exiting edges during initial construction. Full diff: https://github.com/llvm/llvm-project/pull/136455.diff 4 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 7a5f618d09e95..f9a7769b76b94 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -9392,11 +9392,7 @@ collectUsersInExitBlocks(Loop *OrigLoop, VPRecipeBuilder &Builder,
continue;
}
- PHINode &ExitPhi = ExitIRI->getIRPhi();
- BasicBlock *ExitingBB = OrigLoop->getLoopLatch();
- Value *IncomingValue = ExitPhi.getIncomingValueForBlock(ExitingBB);
- VPValue *V = Builder.getVPValueOrAddLiveIn(IncomingValue);
- ExitIRI->addOperand(V);
+ VPValue *V = ExitIRI->getOperand(0);
if (V->isLiveIn())
continue;
assert(V->getDefiningRecipe()->getParent()->getEnclosingLoopRegion() &&
diff --git a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
index 9fcccfcf8117f..95f0a91113fb9 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
@@ -352,6 +352,19 @@ std::unique_ptr<VPlan> PlainCFGBuilder::buildPlainCFG(
Plan->getEntry()->setOneSuccessor(getOrCreateVPBB(TheLoop->getHeader()));
Plan->getEntry()->setPlan(&*Plan);
+ // Add incoming operands for the VPIRInstructions wrapping the exit phis.
+ for (auto *EB : Plan->getExitBlocks()) {
+ for (VPRecipeBase &R : *EB) {
+ auto *PhiR = dyn_cast<VPIRPhi>(&R);
+ if (!PhiR)
+ break;
+ PHINode &Phi = PhiR->getIRPhi();
+ for (BasicBlock *Pred : predecessors(EB->getIRBasicBlock()))
+ PhiR->addOperand(
+ getOrCreateVPOperand(Phi.getIncomingValueForBlock(Pred)));
+ }
+ }
+
for (const auto &[IRBB, VPB] : BB2VPBB)
VPB2IRBB[VPB] = IRBB;
@@ -464,6 +477,12 @@ void VPlanTransforms::createLoopRegions(VPlan &Plan, Type *InductionTy,
VPBlockUtils::connectBlocks(ScalarPH, Plan.getScalarHeader());
if (!RequiresScalarEpilogueCheck) {
VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH);
+ // The exit blocks are dead, remove any recipes to make sure no users remain
+ // that may pessimize transforms.
+ for (auto *EB : Plan.getExitBlocks()) {
+ for (VPRecipeBase &R : make_early_inc_range(*EB))
+ R.eraseFromParent();
+ }
return;
}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 2cc558f49ccce..9e68ab55783ce 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1139,7 +1139,7 @@ InstructionCost VPIRInstruction::computeCost(ElementCount VF,
void VPIRInstruction::extractLastLaneOfOperand(VPBuilder &Builder) {
assert(isa<PHINode>(getInstruction()) &&
"can only add exiting operands to phi nodes");
- assert(getNumOperands() == 1 && "must have a single operand");
+ assert(getNumOperands() > 0 && "must have at least one operand");
VPValue *Exiting = getOperand(0);
if (!Exiting->isLiveIn()) {
LLVMContext &Ctx = getInstruction().getContext();
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index b80fe18d1bd66..9f5d76c2c856d 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -2501,20 +2501,24 @@ void VPlanTransforms::handleUncountableEarlyExit(
if (!ExitIRI)
break;
- PHINode &ExitPhi = ExitIRI->getIRPhi();
- VPValue *IncomingFromEarlyExit = RecipeBuilder.getVPValueOrAddLiveIn(
- ExitPhi.getIncomingValueForBlock(UncountableExitingBlock));
-
+ unsigned EarlyExitIdx = 0;
if (OrigLoop->getUniqueExitBlock()) {
+ // After the transform, the first incoming value is coming from the
+ // orignial loop latch, while the second operand is from the early exit.
+ // Sawp the phi operands, if the first predecessor in the original IR is
+ // not the loop latch.
+ if (*pred_begin(VPEarlyExitBlock->getIRBasicBlock()) !=
+ OrigLoop->getLoopLatch())
+ ExitIRI->swapOperands();
+
+ EarlyExitIdx = 1;
// If there's a unique exit block, VPEarlyExitBlock has 2 predecessors
// (MiddleVPBB and NewMiddle). Add the incoming value from MiddleVPBB
// which is coming from the original latch.
- VPValue *IncomingFromLatch = RecipeBuilder.getVPValueOrAddLiveIn(
- ExitPhi.getIncomingValueForBlock(OrigLoop->getLoopLatch()));
- ExitIRI->addOperand(IncomingFromLatch);
ExitIRI->extractLastLaneOfOperand(MiddleBuilder);
}
+ VPValue *IncomingFromEarlyExit = ExitIRI->getOperand(EarlyExitIdx);
auto IsVector = [](ElementCount VF) { return VF.isVector(); };
// When the VFs are vectors, need to add `extract` to get the incoming value
// from early exit. When the range contains scalar VF, limit the range to
@@ -2522,14 +2526,16 @@ void VPlanTransforms::handleUncountableEarlyExit(
// and vector VFs.
if (!IncomingFromEarlyExit->isLiveIn() &&
LoopVectorizationPlanner::getDecisionAndClampRange(IsVector, Range)) {
+ // Add the incoming value from the early exit.
VPValue *FirstActiveLane = EarlyExitB.createNaryOp(
VPInstruction::FirstActiveLane, {EarlyExitTakenCond}, nullptr,
"first.active.lane");
- IncomingFromEarlyExit = EarlyExitB.createNaryOp(
- Instruction::ExtractElement, {IncomingFromEarlyExit, FirstActiveLane},
- nullptr, "early.exit.value");
+ ExitIRI->setOperand(
+ EarlyExitIdx,
+ EarlyExitB.createNaryOp(Instruction::ExtractElement,
+ {IncomingFromEarlyExit, FirstActiveLane},
+ nullptr, "early.exit.value"));
}
- ExitIRI->addOperand(IncomingFromEarlyExit);
}
MiddleBuilder.createNaryOp(VPInstruction::BranchOnCond, {IsEarlyExitTaken});
|
@llvm/pr-subscribers-llvm-transforms Author: Florian Hahn (fhahn) ChangesAdd incoming exit phi operands during the initial VPlan construction. This ensures all users are added to the initial VPlan and is also needed in preparation to retaining exiting edges during initial construction. Full diff: https://github.com/llvm/llvm-project/pull/136455.diff 4 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 7a5f618d09e95..f9a7769b76b94 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -9392,11 +9392,7 @@ collectUsersInExitBlocks(Loop *OrigLoop, VPRecipeBuilder &Builder,
continue;
}
- PHINode &ExitPhi = ExitIRI->getIRPhi();
- BasicBlock *ExitingBB = OrigLoop->getLoopLatch();
- Value *IncomingValue = ExitPhi.getIncomingValueForBlock(ExitingBB);
- VPValue *V = Builder.getVPValueOrAddLiveIn(IncomingValue);
- ExitIRI->addOperand(V);
+ VPValue *V = ExitIRI->getOperand(0);
if (V->isLiveIn())
continue;
assert(V->getDefiningRecipe()->getParent()->getEnclosingLoopRegion() &&
diff --git a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
index 9fcccfcf8117f..95f0a91113fb9 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
@@ -352,6 +352,19 @@ std::unique_ptr<VPlan> PlainCFGBuilder::buildPlainCFG(
Plan->getEntry()->setOneSuccessor(getOrCreateVPBB(TheLoop->getHeader()));
Plan->getEntry()->setPlan(&*Plan);
+ // Add incoming operands for the VPIRInstructions wrapping the exit phis.
+ for (auto *EB : Plan->getExitBlocks()) {
+ for (VPRecipeBase &R : *EB) {
+ auto *PhiR = dyn_cast<VPIRPhi>(&R);
+ if (!PhiR)
+ break;
+ PHINode &Phi = PhiR->getIRPhi();
+ for (BasicBlock *Pred : predecessors(EB->getIRBasicBlock()))
+ PhiR->addOperand(
+ getOrCreateVPOperand(Phi.getIncomingValueForBlock(Pred)));
+ }
+ }
+
for (const auto &[IRBB, VPB] : BB2VPBB)
VPB2IRBB[VPB] = IRBB;
@@ -464,6 +477,12 @@ void VPlanTransforms::createLoopRegions(VPlan &Plan, Type *InductionTy,
VPBlockUtils::connectBlocks(ScalarPH, Plan.getScalarHeader());
if (!RequiresScalarEpilogueCheck) {
VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH);
+ // The exit blocks are dead, remove any recipes to make sure no users remain
+ // that may pessimize transforms.
+ for (auto *EB : Plan.getExitBlocks()) {
+ for (VPRecipeBase &R : make_early_inc_range(*EB))
+ R.eraseFromParent();
+ }
return;
}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 2cc558f49ccce..9e68ab55783ce 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1139,7 +1139,7 @@ InstructionCost VPIRInstruction::computeCost(ElementCount VF,
void VPIRInstruction::extractLastLaneOfOperand(VPBuilder &Builder) {
assert(isa<PHINode>(getInstruction()) &&
"can only add exiting operands to phi nodes");
- assert(getNumOperands() == 1 && "must have a single operand");
+ assert(getNumOperands() > 0 && "must have at least one operand");
VPValue *Exiting = getOperand(0);
if (!Exiting->isLiveIn()) {
LLVMContext &Ctx = getInstruction().getContext();
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index b80fe18d1bd66..9f5d76c2c856d 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -2501,20 +2501,24 @@ void VPlanTransforms::handleUncountableEarlyExit(
if (!ExitIRI)
break;
- PHINode &ExitPhi = ExitIRI->getIRPhi();
- VPValue *IncomingFromEarlyExit = RecipeBuilder.getVPValueOrAddLiveIn(
- ExitPhi.getIncomingValueForBlock(UncountableExitingBlock));
-
+ unsigned EarlyExitIdx = 0;
if (OrigLoop->getUniqueExitBlock()) {
+ // After the transform, the first incoming value is coming from the
+ // orignial loop latch, while the second operand is from the early exit.
+ // Sawp the phi operands, if the first predecessor in the original IR is
+ // not the loop latch.
+ if (*pred_begin(VPEarlyExitBlock->getIRBasicBlock()) !=
+ OrigLoop->getLoopLatch())
+ ExitIRI->swapOperands();
+
+ EarlyExitIdx = 1;
// If there's a unique exit block, VPEarlyExitBlock has 2 predecessors
// (MiddleVPBB and NewMiddle). Add the incoming value from MiddleVPBB
// which is coming from the original latch.
- VPValue *IncomingFromLatch = RecipeBuilder.getVPValueOrAddLiveIn(
- ExitPhi.getIncomingValueForBlock(OrigLoop->getLoopLatch()));
- ExitIRI->addOperand(IncomingFromLatch);
ExitIRI->extractLastLaneOfOperand(MiddleBuilder);
}
+ VPValue *IncomingFromEarlyExit = ExitIRI->getOperand(EarlyExitIdx);
auto IsVector = [](ElementCount VF) { return VF.isVector(); };
// When the VFs are vectors, need to add `extract` to get the incoming value
// from early exit. When the range contains scalar VF, limit the range to
@@ -2522,14 +2526,16 @@ void VPlanTransforms::handleUncountableEarlyExit(
// and vector VFs.
if (!IncomingFromEarlyExit->isLiveIn() &&
LoopVectorizationPlanner::getDecisionAndClampRange(IsVector, Range)) {
+ // Add the incoming value from the early exit.
VPValue *FirstActiveLane = EarlyExitB.createNaryOp(
VPInstruction::FirstActiveLane, {EarlyExitTakenCond}, nullptr,
"first.active.lane");
- IncomingFromEarlyExit = EarlyExitB.createNaryOp(
- Instruction::ExtractElement, {IncomingFromEarlyExit, FirstActiveLane},
- nullptr, "early.exit.value");
+ ExitIRI->setOperand(
+ EarlyExitIdx,
+ EarlyExitB.createNaryOp(Instruction::ExtractElement,
+ {IncomingFromEarlyExit, FirstActiveLane},
+ nullptr, "early.exit.value"));
}
- ExitIRI->addOperand(IncomingFromEarlyExit);
}
MiddleBuilder.createNaryOp(VPInstruction::BranchOnCond, {IsEarlyExitTaken});
|
@@ -9392,11 +9392,7 @@ collectUsersInExitBlocks(Loop *OrigLoop, VPRecipeBuilder &Builder, | |||
continue; | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above "and add the exiting VPValue as operand" is now obsolete. Worth noting that users of multiple (early) exits are excluded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, aslo dropped .. modeled in VPlan, as all are now modeled in VPlan.
@@ -9392,11 +9392,7 @@ collectUsersInExitBlocks(Loop *OrigLoop, VPRecipeBuilder &Builder, | |||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Independent: can R iterate over ExitVPBB->phis() above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 8c83355, thanks
Value *IncomingValue = ExitPhi.getIncomingValueForBlock(ExitingBB); | ||
VPValue *V = Builder.getVPValueOrAddLiveIn(IncomingValue); | ||
ExitIRI->addOperand(V); | ||
VPValue *V = ExitIRI->getOperand(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth asserting that ExitIRI has a single operand? Expected to match its single middle-block predecessor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks
@@ -352,6 +352,19 @@ std::unique_ptr<VPlan> PlainCFGBuilder::buildPlainCFG( | |||
Plan->getEntry()->setOneSuccessor(getOrCreateVPBB(TheLoop->getHeader())); | |||
Plan->getEntry()->setPlan(&*Plan); | |||
|
|||
// Add incoming operands for the VPIRInstructions wrapping the exit phis. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
based on their IR predecessors, as they still have no VPlan predecessors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this will be the next change ;)
@@ -352,6 +352,19 @@ std::unique_ptr<VPlan> PlainCFGBuilder::buildPlainCFG( | |||
Plan->getEntry()->setOneSuccessor(getOrCreateVPBB(TheLoop->getHeader())); | |||
Plan->getEntry()->setPlan(&*Plan); | |||
|
|||
// Add incoming operands for the VPIRInstructions wrapping the exit phis. | |||
for (auto *EB : Plan->getExitBlocks()) { | |||
for (VPRecipeBase &R : *EB) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can R iterate over EB->phis()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks
// If there's a unique exit block, VPEarlyExitBlock has 2 predecessors | ||
// (MiddleVPBB and NewMiddle). Add the incoming value from MiddleVPBB | ||
// which is coming from the original latch. | ||
VPValue *IncomingFromLatch = RecipeBuilder.getVPValueOrAddLiveIn( | ||
ExitPhi.getIncomingValueForBlock(OrigLoop->getLoopLatch())); | ||
ExitIRI->addOperand(IncomingFromLatch); | ||
ExitIRI->extractLastLaneOfOperand(MiddleBuilder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Independent: the extract from the latch created for (first) operand uses extractLastLaneOfOperand(), the extract from early exit replaced below explicitly. Can this be done more consistently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, we could as follow-up add the extracts from last lane for all exit phis during construction, then have the extract from the early exit updated here?
OrigLoop->getLoopLatch()) | ||
ExitIRI->swapOperands(); | ||
|
||
EarlyExitIdx = 1; | ||
// If there's a unique exit block, VPEarlyExitBlock has 2 predecessors | ||
// (MiddleVPBB and NewMiddle). Add the incoming value from MiddleVPBB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Independent: worth indicating that early.exit and latch.exit may be the same block, in https://llvm.org/docs/Vectorizers.html#early-exit-vectorization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do separately, thanks
VPValue *IncomingFromEarlyExit = RecipeBuilder.getVPValueOrAddLiveIn( | ||
ExitPhi.getIncomingValueForBlock(UncountableExitingBlock)); | ||
|
||
unsigned EarlyExitIdx = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsigned EarlyExitIdx = 0; | |
// By default, assume early exit operand is first, e.g., when the two exit blocks are distinct - VPEarlyExitBlock has a single predecessor. | |
unsigned EarlyExitIdx = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, perhaps its better instead to set EarlyExitIdx to be the last operand, consistently?
If VPEarlyExitBlock has two predecessors, they are already ordered such that early exit is second, by construction. But its underlying IRBB may have its predecessors ordered the other way around, and it is this order which corresponds to the order of operands of VPEarlyExitBlock's phi recipes. Therefore, if early exit is the first predecessor of the underlying IRBB, we swap the operands of phi recipes, bringing them to match VPEarlyExitBlock's predecessor order with early exit being last (second).
ExitIRI->extractLastLaneOfOperand(MiddleBuilder); | ||
} | ||
|
||
VPValue *IncomingFromEarlyExit = ExitIRI->getOperand(EarlyExitIdx); | ||
auto IsVector = [](ElementCount VF) { return VF.isVector(); }; | ||
// When the VFs are vectors, need to add `extract` to get the incoming value | ||
// from early exit. When the range contains scalar VF, limit the range to | ||
// scalar VF to prevent mis-compilation for the range containing both scalar | ||
// and vector VFs. | ||
if (!IncomingFromEarlyExit->isLiveIn() && | ||
LoopVectorizationPlanner::getDecisionAndClampRange(IsVector, Range)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Independent: clamping range inside VPlanTransform? Limiting the range to scalar VF - is another VPlan constructed for the vector (sub)range?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is done early on when we are clamping the loop range for other reasons as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Independent) Still puzzled about clamping the range at this stage, when all VPlans were already created following range clampings. Rather than say asserting that the range contains either scalar or vector VF's but not both, and introduce extracts if it's the latter. Extracts added above for latch exit need not check vector VF's?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the transform isn't run when all VPlans have been created, but in tryToBuildVPlanWithVPRecipes
where we the range is also clamped in other places ( https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp#L9758)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, very well, thanks.
(Independent) May be good to indicate somehow transforms that may clamp the range - that belong to tryToBuildVPlanWithVPRecipes stage, and prevent them from operating afterwards, perhaps by disabling range clamping afterwards?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The range should only be available in tryToBuildVPlanWithVPRecipes
and not subsequent optimizations; we could seaprate things even clearer by first building all VPlans (i.e. only do tryToBuildVPlanWithVPRecipes
) and then optimize all VPlans separatel
ExitIRI->setOperand( | ||
EarlyExitIdx, | ||
EarlyExitB.createNaryOp(Instruction::ExtractElement, | ||
{IncomingFromEarlyExit, FirstActiveLane}, | ||
nullptr, "early.exit.value")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can continue to (re)set IncomingFromEarlyExit
here, followed by resetting the operand (instead of adding it):
ExitIRI->setOperand(EarlyExitIdx, IncomingFromEarlyExit);
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done ,thanks
Also handle VPIRPhi in VPRecipeBase::isPhi, to simplify existing code dealing with VPIRPhis. Suggested as part of #136455.
@@ -352,6 +352,20 @@ std::unique_ptr<VPlan> PlainCFGBuilder::buildPlainCFG( | |||
Plan->getEntry()->setOneSuccessor(getOrCreateVPBB(TheLoop->getHeader())); | |||
Plan->getEntry()->setPlan(&*Plan); | |||
|
|||
// Fix VPlan loop-closed-ssa exit phi's by add incoming operands to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Fix VPlan loop-closed-ssa exit phi's by add incoming operands to the | |
// Fix VPlan loop-closed-ssa exit phi's by adding incoming operands to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks
@@ -9388,11 +9387,8 @@ collectUsersInExitBlocks(Loop *OrigLoop, VPRecipeBuilder &Builder, | |||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Independent: is this early continue needed:
// Nothing to do for unreachable exit blocks.
if (ExitVPBB->getNumPredecessors() == 0)
continue;
given that unreachable exit blocks have been emptied of their all their recipes - including phi ones?
@@ -9388,11 +9387,8 @@ collectUsersInExitBlocks(Loop *OrigLoop, VPRecipeBuilder &Builder, | |||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Independent: loop-unswitch this condition
if (ExitVPBB->getSinglePredecessor() != Plan.getMiddleBlock()) {
assert(ExitIRI->getNumOperands() ==
ExitVPBB->getPredecessors().size() &&
"early-exit must update exit values on construction");
continue;
rather than check it for all phi's?
assert can be removed given that the number of operands of all phi recipes should be equal to the number of predecessors - check in verifier instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do, thanks
for (auto *EB : Plan.getExitBlocks()) { | ||
for (VPRecipeBase &R : make_early_inc_range(*EB)) | ||
R.eraseFromParent(); | ||
} | ||
return; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The case
// 2) If we require a scalar epilogue, there is no conditional branch as
// we unconditionally branch to the scalar preheader. Do nothing.
is handle by early return above. Better place the explanation earlier, and replace "Do nothing" with "Empty the unreachable exit blocks of their recipes". The connection from scalar loop to exit blocks is (currently) outside of VPlan's scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks. Also changed 2->1 as this is the first handled case.
for (BasicBlock *Pred : predecessors(EB->getIRBasicBlock())) | ||
PhiR->addOperand( | ||
getOrCreateVPOperand(Phi.getIncomingValueForBlock(Pred))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Order of operands set here corresponds to the order of predecessors of underlying IRBB, before EB VPBB has predecessors. This inconsistency requires attention later, when these predecessors are added, possibly in a different order. May be worth leaving a note.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks
// By default, assume early exit operand is first, e.g., when the two exit | ||
// blocks are distinct - VPEarlyExitBlock has a single predecessor. | ||
unsigned EarlyExitIdx = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, could we agree on
// By default, assume early exit operand is first, e.g., when the two exit | |
// blocks are distinct - VPEarlyExitBlock has a single predecessor. | |
unsigned EarlyExitIdx = 0; | |
// Early exit operand should always be last, i.e., 0 if VPEarlyExitBlock has a single predecessor and 1 if it has two. | |
unsigned EarlyExitIdx = ExitIRI->getNumOperands() - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks
// The incoming values currently correspond to the original IR | ||
// predecessors. After the transform, the first incoming value is coming | ||
// from the original loop latch, while the second operand is from the | ||
// early exit. Swap the phi operands, if the first predecessor in the | ||
// original IR is not the loop latch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still a seems a bit confusing. Would something like this read clearer:
If VPEarlyExitBlock has two predecessors, they are already ordered such that early exit is second (and latch exit is first), by construction. But its underlying IRBB (EarlyExitIRBB) may have its predecessors ordered the other way around, and it is the order of the latter which corresponds to the order of operands of VPEarlyExitBlock's phi recipes. Therefore, if early exit (UncountableExitingBlock) is the first predecessor of EarlyExitIRBB, we swap the operands of phi recipes, thereby bringing them to match VPEarlyExitBlock's predecessor order, with early exit being last (second). Otherwise they already match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adjusted, thanks
OrigLoop->getLoopLatch()) | ||
ExitIRI->swapOperands(); | ||
|
||
EarlyExitIdx = 1; | ||
// If there's a unique exit block, VPEarlyExitBlock has 2 predecessors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// If there's a unique exit block, VPEarlyExitBlock has 2 predecessors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks
// (MiddleVPBB and NewMiddle). Extract the last lane of the incoming value | ||
// from MiddleVPBB which is coming from the original latch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// (MiddleVPBB and NewMiddle). Extract the last lane of the incoming value | |
// from MiddleVPBB which is coming from the original latch. | |
// The first of two operands corresponds to the latch exit, via MiddleVPBB predecessor. | |
// Extract its last lane. |
(The other immediate predecessor, which corresponds to early exit, is VectorEarlyExitVPBB, rather than NewMiddle).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks
ExitIRI->extractLastLaneOfOperand(MiddleBuilder); | ||
} | ||
|
||
VPValue *IncomingFromEarlyExit = ExitIRI->getOperand(EarlyExitIdx); | ||
auto IsVector = [](ElementCount VF) { return VF.isVector(); }; | ||
// When the VFs are vectors, need to add `extract` to get the incoming value | ||
// from early exit. When the range contains scalar VF, limit the range to | ||
// scalar VF to prevent mis-compilation for the range containing both scalar | ||
// and vector VFs. | ||
if (!IncomingFromEarlyExit->isLiveIn() && | ||
LoopVectorizationPlanner::getDecisionAndClampRange(IsVector, Range)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Independent) Still puzzled about clamping the range at this stage, when all VPlans were already created following range clampings. Rather than say asserting that the range contains either scalar or vector VF's but not both, and introduce extracts if it's the latter. Extracts added above for latch exit need not check vector VF's?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks! Adding several comments and suggestions.
collectUsersInLatchExitBlock(Loop *OrigLoop, VPRecipeBuilder &Builder, | ||
VPlan &Plan) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are OrigLoop
and Builder
still needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed, dropped, thanks!
ExitIRI->getParent()->getSinglePredecessor() == MiddleVPBB && | ||
"exit values from early exits must be fixed when branch to " | ||
"early-exit is added"); | ||
ExitIRI->extractLastLaneOfOperand(B); | ||
ExitIRI->extractLastLaneOfFirstOperand(B); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Independent: addUsersInExitBlocks()
--> addExtractsToExitUsers()
? This method introduces extracts for exit users of a single (latch) exit block, which are expected to be single operand VPIRPhi's, rather than adds users in multiple/all exit blocks. Drop "Add exit values to \p Plan."?
@@ -352,6 +352,22 @@ std::unique_ptr<VPlan> PlainCFGBuilder::buildPlainCFG( | |||
Plan->getEntry()->setOneSuccessor(getOrCreateVPBB(TheLoop->getHeader())); | |||
Plan->getEntry()->setPlan(&*Plan); | |||
|
|||
// Fix VPlan loop-closed-ssa exit phi's by adding incoming operands to the | |||
// VPIRInstructions wrapping them. | |||
// Note that the operand order may need adjusting when predecessors are added, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Note that the operand order may need adjusting when predecessors are added, | |
// Note that the operand order corresponds to IR predecessor order, and may need adjusting when VPlan predecessors are added, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done thanks!
// The exit blocks are unreachable, remove their recipes to make sure no | ||
// users remain that may pessimize transforms. | ||
for (auto *EB : Plan.getExitBlocks()) { | ||
for (VPRecipeBase &R : make_early_inc_range(*EB)) | ||
R.eraseFromParent(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems so - better build the complete skeleton CFG first, then convert it to HCFG by converting its loops into regions.
VPValue *Exiting = getOperand(0); | ||
if (!Exiting->isLiveIn()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VPValue *Exiting = getOperand(0); | |
if (!Exiting->isLiveIn()) { | |
if (Exiting->isLiveIn()) | |
return; | |
VPValue *Exiting = getOperand(0); |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will adjust separately, thanks
|
||
// By default, assume early exit operand is first, e.g., when the two exit | ||
// blocks are distinct - VPEarlyExitBlock has a single predecessor. | ||
unsigned EarlyExitIdx = 0; | ||
if (OrigLoop->getUniqueExitBlock()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. It's more consistent with following explanation.
ExitIRI->extractLastLaneOfOperand(MiddleBuilder); | ||
} | ||
|
||
VPValue *IncomingFromEarlyExit = ExitIRI->getOperand(EarlyExitIdx); | ||
auto IsVector = [](ElementCount VF) { return VF.isVector(); }; | ||
// When the VFs are vectors, need to add `extract` to get the incoming value | ||
// from early exit. When the range contains scalar VF, limit the range to | ||
// scalar VF to prevent mis-compilation for the range containing both scalar | ||
// and vector VFs. | ||
if (!IncomingFromEarlyExit->isLiveIn() && | ||
LoopVectorizationPlanner::getDecisionAndClampRange(IsVector, Range)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, very well, thanks.
(Independent) May be good to indicate somehow transforms that may clamp the range - that belong to tryToBuildVPlanWithVPRecipes stage, and prevent them from operating afterwards, perhaps by disabling range clamping afterwards?
Reduce indent level, as suggested in #136455.
@@ -352,6 +352,23 @@ std::unique_ptr<VPlan> PlainCFGBuilder::buildPlainCFG( | |||
Plan->getEntry()->setOneSuccessor(getOrCreateVPBB(TheLoop->getHeader())); | |||
Plan->getEntry()->setPlan(&*Plan); | |||
|
|||
// Fix VPlan loop-closed-ssa exit phi's by adding incoming operands to the | |||
// VPIRInstructions wrapping them. | |||
// // Note that the operand order corresponds to IR predecessor order, and may |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"// //" >> "//"
As suggested in #136455, now unreachable exit blocks won't have any phi nodes.
Add incoming exit phi operands during the initial VPlan construction. This ensures all users are added to the initial VPlan and is also needed in preparation to retaining exiting edges during initial construction.