Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cloning improvements #66257

Merged
merged 3 commits into from
Mar 21, 2022
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 0 additions & 2 deletions src/coreclr/jit/compiler.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -319,7 +319,6 @@ unsigned totalLoopCount; // counts the total number of natural loops
unsigned totalUnnatLoopCount; // counts the total number of (not-necessarily natural) loops
unsigned totalUnnatLoopOverflows; // # of methods that identified more unnatural loops than we can represent
unsigned iterLoopCount; // counts the # of loops with an iterator (for like)
unsigned simpleTestLoopCount; // counts the # of loops with an iterator and a simple loop condition (iter < const)
unsigned constIterLoopCount; // counts the # of loops with a constant iterator (for like)
bool hasMethodLoops; // flag to keep track if we already counted a method as having loops
unsigned loopsThisMethod; // counts the number of loops in the current method
Expand Down Expand Up @@ -1556,7 +1555,6 @@ void Compiler::compShutdown()
fprintf(fout, "Total number of 'unnatural' loops is %5u\n", totalUnnatLoopCount);
fprintf(fout, "# of methods overflowing unnat loop limit is %5u\n", totalUnnatLoopOverflows);
fprintf(fout, "Total number of loops with an iterator is %5u\n", iterLoopCount);
fprintf(fout, "Total number of loops with a simple iterator is %5u\n", simpleTestLoopCount);
fprintf(fout, "Total number of loops with a constant iterator is %5u\n", constIterLoopCount);

fprintf(fout, "--------------------------------------------------\n");
Expand Down
45 changes: 19 additions & 26 deletions src/coreclr/jit/compiler.h
Original file line number Diff line number Diff line change
Expand Up @@ -2498,13 +2498,13 @@ enum LoopFlags : unsigned short

// LPFLG_UNUSED = 0x0001,
// LPFLG_UNUSED = 0x0002,
LPFLG_ITER = 0x0004, // loop of form: for (i = icon or lclVar; test_condition(); i++)
LPFLG_ITER = 0x0004, // loop of form: for (i = icon or expression; test_condition(); i++)
// LPFLG_UNUSED = 0x0008,

LPFLG_CONTAINS_CALL = 0x0010, // If executing the loop body *may* execute a call
LPFLG_VAR_INIT = 0x0020, // iterator is initialized with a local var (var # found in lpVarInit)
LPFLG_CONST_INIT = 0x0040, // iterator is initialized with a constant (found in lpConstInit)
LPFLG_SIMD_LIMIT = 0x0080, // iterator is compared with vector element count (found in lpConstLimit)
// LPFLG_UNUSED = 0x0020,
LPFLG_CONST_INIT = 0x0040, // iterator is initialized with a constant (found in lpConstInit)
LPFLG_SIMD_LIMIT = 0x0080, // iterator is compared with vector element count (found in lpConstLimit)

LPFLG_VAR_LIMIT = 0x0100, // iterator is compared with a local var (var # found in lpVarLimit)
LPFLG_CONST_LIMIT = 0x0200, // iterator is compared with a constant (found in lpConstLimit)
Expand Down Expand Up @@ -6929,16 +6929,11 @@ class Compiler

var_types lpIterOperType() const; // For overflow instructions

// Set to the block where we found the initialization for LPFLG_CONST_INIT or LPFLG_VAR_INIT loops.
// Set to the block where we found the initialization for LPFLG_CONST_INIT loops.
// Initially, this will be 'head', but 'head' might change if we insert a loop pre-header block.
BasicBlock* lpInitBlock;

union {
int lpConstInit; // initial constant value of iterator
// : Valid if LPFLG_CONST_INIT
unsigned lpVarInit; // initial local var number to which we initialize the iterator
// : Valid if LPFLG_VAR_INIT
};
int lpConstInit; // initial constant value of iterator : Valid if LPFLG_CONST_INIT

// The following is for LPFLG_ITER loops only (i.e. the loop condition is "i RELOP const or var")

Expand Down Expand Up @@ -12068,21 +12063,19 @@ extern Histogram bbOneBBSizeTable;

#if COUNT_LOOPS

extern unsigned totalLoopMethods; // counts the total number of methods that have natural loops
extern unsigned maxLoopsPerMethod; // counts the maximum number of loops a method has
extern unsigned totalLoopOverflows; // # of methods that identified more loops than we can represent
extern unsigned totalLoopCount; // counts the total number of natural loops
extern unsigned totalUnnatLoopCount; // counts the total number of (not-necessarily natural) loops
extern unsigned totalUnnatLoopOverflows; // # of methods that identified more unnatural loops than we can represent
extern unsigned iterLoopCount; // counts the # of loops with an iterator (for like)
extern unsigned simpleTestLoopCount; // counts the # of loops with an iterator and a simple loop condition (iter <
// const)
extern unsigned constIterLoopCount; // counts the # of loops with a constant iterator (for like)
extern bool hasMethodLoops; // flag to keep track if we already counted a method as having loops
extern unsigned loopsThisMethod; // counts the number of loops in the current method
extern bool loopOverflowThisMethod; // True if we exceeded the max # of loops in the method.
extern Histogram loopCountTable; // Histogram of loop counts
extern Histogram loopExitCountTable; // Histogram of loop exit counts
extern unsigned totalLoopMethods; // counts the total number of methods that have natural loops
extern unsigned maxLoopsPerMethod; // counts the maximum number of loops a method has
extern unsigned totalLoopOverflows; // # of methods that identified more loops than we can represent
extern unsigned totalLoopCount; // counts the total number of natural loops
extern unsigned totalUnnatLoopCount; // counts the total number of (not-necessarily natural) loops
extern unsigned totalUnnatLoopOverflows; // # of methods that identified more unnatural loops than we can represent
extern unsigned iterLoopCount; // counts the # of loops with an iterator (for like)
extern unsigned constIterLoopCount; // counts the # of loops with a constant iterator (for like)
extern bool hasMethodLoops; // flag to keep track if we already counted a method as having loops
extern unsigned loopsThisMethod; // counts the number of loops in the current method
extern bool loopOverflowThisMethod; // True if we exceeded the max # of loops in the method.
extern Histogram loopCountTable; // Histogram of loop counts
extern Histogram loopExitCountTable; // Histogram of loop exit counts

#endif // COUNT_LOOPS

Expand Down
12 changes: 2 additions & 10 deletions src/coreclr/jit/fgdiagnostic.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3781,13 +3781,10 @@ void Compiler::fgDebugCheckLoopTable()
}

// Check the flags.
// Note that the various init/limit flags are only used when LPFLG_ITER is set, but they are set first,
// Note that the various limit flags are only used when LPFLG_ITER is set, but they are set first,
// separately, and only if everything works out is LPFLG_ITER set. If LPFLG_ITER is NOT set, the
// individual flags are not un-set (arguably, they should be).

// Only one of the `init` flags can be set.
assert(genCountBits((unsigned)(loop.lpFlags & (LPFLG_VAR_INIT | LPFLG_CONST_INIT))) <= 1);

// Only one of the `limit` flags can be set. (Note that LPFLG_SIMD_LIMIT is a "sub-flag" that can be
// set when LPFLG_CONST_LIMIT is set.)
assert(genCountBits((unsigned)(loop.lpFlags & (LPFLG_VAR_LIMIT | LPFLG_CONST_LIMIT | LPFLG_ARRLEN_LIMIT))) <=
Expand All @@ -3799,14 +3796,9 @@ void Compiler::fgDebugCheckLoopTable()
assert(loop.lpFlags & LPFLG_CONST_LIMIT);
}

if (loop.lpFlags & (LPFLG_CONST_INIT | LPFLG_VAR_INIT))
if (loop.lpFlags & LPFLG_CONST_INIT)
{
assert(loop.lpInitBlock != nullptr);

if (loop.lpFlags & LPFLG_VAR_INIT)
{
assert(loop.lpVarInit < lvaCount);
}
}

if (loop.lpFlags & LPFLG_ITER)
Expand Down
29 changes: 11 additions & 18 deletions src/coreclr/jit/loopcloning.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1038,17 +1038,18 @@ bool Compiler::optDeriveLoopCloningConditions(unsigned loopNum, LoopCloneContext
if (loop->lpFlags & LPFLG_CONST_INIT)
{
// Only allowing non-negative const init at this time.
// REVIEW: why?
// This is because the variable initialized with this constant will be used as an array index,
// and array indices must be non-negative.
if (loop->lpConstInit < 0)
{
JITDUMP("> Init %d is invalid\n", loop->lpConstInit);
return false;
}
}
else if (loop->lpFlags & LPFLG_VAR_INIT)
else
{
// initVar >= 0
const unsigned initLcl = loop->lpVarInit;
// iterVar >= 0
const unsigned initLcl = loop->lpIterVar();
if (!genActualTypeIsInt(lvaGetDesc(initLcl)))
{
JITDUMP("> Init var V%02u not compatible with TYP_INT\n", initLcl);
Expand All @@ -1059,11 +1060,6 @@ bool Compiler::optDeriveLoopCloningConditions(unsigned loopNum, LoopCloneContext
LC_Expr(LC_Ident(0, LC_Ident::Const)));
context->EnsureConditions(loopNum)->Push(geZero);
}
else
{
JITDUMP("> Not variable init\n");
return false;
}

// Limit Conditions
LC_Ident ident;
Expand Down Expand Up @@ -1096,7 +1092,7 @@ bool Compiler::optDeriveLoopCloningConditions(unsigned loopNum, LoopCloneContext
ArrIndex* index = new (getAllocator(CMK_LoopClone)) ArrIndex(getAllocator(CMK_LoopClone));
if (!loop->lpArrLenLimit(this, index))
{
JITDUMP("> ArrLen not matching");
JITDUMP("> ArrLen not matching\n");
return false;
}
ident = LC_Ident(LC_Array(LC_Array::Jagged, index, LC_Array::ArrLen));
Expand Down Expand Up @@ -1823,8 +1819,7 @@ void Compiler::optCloneLoop(unsigned loopInd, LoopCloneContext* context)

// We're going to transform this loop:
//
// H --> E (or, H conditionally branches around the loop and has fall-through to F == T == E)
// F
// H --> E (or, H conditionally branches around the loop and has fall-through to T == E)
// T
// E
// B ?-> T
Expand All @@ -1833,14 +1828,12 @@ void Compiler::optCloneLoop(unsigned loopInd, LoopCloneContext* context)
// to this pair of loops:
//
// H ?-> H3 (all loop failure conditions branch to new slow path loop head)
// H2--> E (Optional; if E == T == F, let H fall through to F/T/E)
// F
// H2--> E (Optional; if T == E, let H fall through to T/E)
// T
// E
// B ?-> T
// X2--> X
// H3 --> E2 (aka slowHead. Or, H3 falls through to F2 == T2 == E2)
// F2
// H3 --> E2 (aka slowHead. Or, H3 falls through to T2 == E2)
// T2
// E2
// B2 ?-> T2
Expand Down Expand Up @@ -1908,7 +1901,7 @@ void Compiler::optCloneLoop(unsigned loopInd, LoopCloneContext* context)
BasicBlock* slowHeadPrev = newPred;

// Now we'll make "h2", after "h" to go to "e" -- unless the loop is a do-while,
// so that "h" already falls through to "e" (e == t == f).
// so that "h" already falls through to "e" (e == t).
// It might look like this code is unreachable, since "h" must be a BBJ_ALWAYS, but
// later we will change "h" to a BBJ_COND along with a set of loop conditions.
// TODO: it still might be unreachable, since cloning currently is restricted to "do-while" loop forms.
Expand Down Expand Up @@ -2100,7 +2093,7 @@ void Compiler::optCloneLoop(unsigned loopInd, LoopCloneContext* context)

BasicBlock* condLast = optInsertLoopChoiceConditions(context, loopInd, slowHead, h);

// Add the fall-through path pred (either to F/T/E for fall-through from conditions to fast path,
// Add the fall-through path pred (either to T/E for fall-through from conditions to fast path,
// or H2 if branch to E of fast path).
assert(condLast->bbJumpKind == BBJ_COND);
JITDUMP("Adding " FMT_BB " -> " FMT_BB "\n", condLast->bbNum, condLast->bbNext->bbNum);
Expand Down
36 changes: 14 additions & 22 deletions src/coreclr/jit/loopcloning.h
Original file line number Diff line number Diff line change
Expand Up @@ -138,45 +138,37 @@ exception occurs.
1. Loop detection has completed and the loop table is populated.

2. The loops that will be considered are the ones with the LPFLG_ITER flag:
"for (i = icon or lclVar; test_condition(); i++)"
"for ( ; test_condition(); i++)"

Limitations

1. For array based optimizations the loop choice condition is checked
before the loop body. This implies that the loop initializer statement
has not executed at the time of the check. So any loop cloning condition
involving the initial value of the loop counter cannot be condition checked
as it hasn't been assigned yet at the time of condition checking. Therefore
the initial value has to be statically known. This can be fixed with further
effort.

2. Loops containing nested exception handling regions are not cloned. (Cloning them
1. Loops containing nested exception handling regions are not cloned. (Cloning them
would require creating new exception handling regions for the cloned loop, which
is "hard".) There are a few other EH-related edge conditions that also cause us to
reject cloning.

3. If the loop contains RETURN blocks, and cloning those would push us over the maximum
2. If the loop contains RETURN blocks, and cloning those would push us over the maximum
number of allowed RETURN blocks in the function (either due to GC info encoding limitations
or otherwise), we reject cloning.

4. Loop increment must be `i += 1`
3. Loop increment must be `i += 1`

5. Loop test must be `i < x` where `x` is a constant, a variable, or `a.Length` for array `a`
4. Loop test must be `i < x` or `i <= x` where `x` is a constant, a variable, or `a.Length` for array `a`

(There is some implementation support for decrementing loops, but it is incomplete.
There is some implementation support for `i <= x` conditions, but it is incomplete
(Compiler::optDeriveLoopCloningConditions() only handles GT_LT conditions))
(There is some implementation support for decrementing loops, but it is incomplete.)

6. Loop must have been converted to a do-while form.
5. Loop must have been converted to a do-while form.

7. There are a few other loop well-formedness conditions.
6. There are a few other loop well-formedness conditions.

8. Multi-dimensional (non-jagged) loop index checking is only partially implemented.
7. Multi-dimensional (non-jagged) loop index checking is only partially implemented.

9. Constant initializations and constant limits must be non-negative (REVIEW: why? The
implementation does use `unsigned` to represent them.)
8. Constant initializations and constant limits must be non-negative. This is because the
iterator variable will be used as an array index, and array indices must be non-negative.
For `i = v` variable initialization, we add a dynamic check that `v >= 0`. Constant initializations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
For `i = v` variable initialization, we add a dynamic check that `v >= 0`. Constant initializations
For `i = expr` non-constant initialization, we add a dynamic check that `expr >= 0`. Constant initializations

can be checked statically.

10. The cloned loop (the slow path) is not added to the loop table, meaning certain
9. The cloned loop (the slow path) is not added to the loop table, meaning certain
downstream optimization passes do not see them. See
https://github.com/dotnet/runtime/issues/43713.

Expand Down
Loading