-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cloning improvements #66257
Cloning improvements #66257
Conversation
Tagging subscribers to this area: @JulieLeeMSFT Issue Detailsnull
|
/azp run runtime-coreclr jitstress |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr superpmi-replay |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr jitstress |
Azure Pipelines successfully started running 1 pipeline(s). |
Fix various comments
Assume that any pre-existing initialization is ok. Check it against zero if necessary. Const inits remain as before. Lots of diffs due to more cloning for cases of `for (i = expression...` where `expression` is not just a constant or local var.
8573095
to
f2ed920
Compare
/azp run runtime-coreclr jitstress |
Azure Pipelines successfully started running 1 pipeline(s). |
@AndyAyersMS PTAL Will summarize diffs later. Bottom-line: lots more cases where cloning kicks in. Some beneficial, some not. Non-beneficial cases are due to known existing issues with cloning profitability. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Couple of small suggestions.
src/coreclr/jit/loopcloning.h
Outdated
implementation does use `unsigned` to represent them.) | ||
8. Constant initializations and constant limits must be non-negative. This is because the | ||
iterator variable will be used as an array index, and array indices must be non-negative. | ||
For `i = v` variable initialization, we add a dynamic check that `v >= 0`. Constant initializations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For `i = v` variable initialization, we add a dynamic check that `v >= 0`. Constant initializations | |
For `i = expr` non-constant initialization, we add a dynamic check that `expr >= 0`. Constant initializations |
src/coreclr/jit/optimizer.cpp
Outdated
@@ -808,22 +810,27 @@ bool Compiler::optPopulateInitInfo(unsigned loopInd, BasicBlock* initBlock, GenT | |||
|
|||
// RHS can be constant or local var. | |||
// TODO-CQ: CLONE: Add arr length for descending loops. | |||
if (rhs->gtOper == GT_CNS_INT && rhs->TypeGet() == TYP_INT) | |||
if (rhs->gtOper != GT_CNS_INT || rhs->TypeGet() != TYP_INT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (rhs->gtOper != GT_CNS_INT || rhs->TypeGet() != TYP_INT) | |
if ((rhs->gtOper != GT_CNS_INT) || (rhs->TypeGet() != TYP_INT)) |
or maybe
if (rhs->gtOper != GT_CNS_INT || rhs->TypeGet() != TYP_INT) | |
if (!rhs->OperIs(GT_CNS_INT) || !rhs->TypeIs(TYP_INT)) |
As expected, there are a lot of cases where more cloning occurs. Unfortunately, not all these cases are beneficial, due to all the existing limitations of cloning or the JIT. For example, sometimes range check can get rid of bounds checks that cloning also gets rid of, meaning both "fast" and "slow" path loops end up identical. (Similarly, sometimes we won't clone a loop because we require a very specific loop condition, and but if CSE would have run first, we would then match that condition.) (Also related, cloning inserts various "cloning conditions" to branch to the "slow path", and sometimes subsequent optimization passes can get rid of some of them, like redundant branch elimination.) Note also that sometimes the JIT clones a lot of code to get rid of one bounds check. It's possible that it's beneficial, but the marginal benefit degrades as the amount of code increases. Some example:
|
Asm diffs are a regression because when cloning kicks in, it duplicates code. E.g., win-x64: aspnet.run.windows.x64.checked.mch:
Detail diffs
benchmarks.run.windows.x64.checked.mch:
Detail diffs
coreclr_tests.pmi.windows.x64.checked.mch:
Detail diffs
libraries.crossgen2.windows.x64.checked.mch:
Detail diffs
libraries.pmi.windows.x64.checked.mch:
Detail diffs
libraries_tests.pmi.windows.x64.checked.mch:
Detail diffs
|
Ubuntu x64 improvements dotnet/perf-autofiling-issues#4220 |
* Loop cloning improvements Fix various comments * Remove loop cloning var initialization condition Assume that any pre-existing initialization is ok. Check it against zero if necessary. Const inits remain as before. Lots of diffs due to more cloning for cases of `for (i = expression...` where `expression` is not just a constant or local var. * Feedback
Remove loop cloning variable initialization condition:
Assume that any pre-existing initialization is acceptable.
Check condition against zero if necessary. Const inits remain as before.
Lots of diffs due to more cloning for cases of
for (i = expression...
where
expression
is not just a constant or local var.Also, fix various comments that were no longer correct (e.g., "first" block
concept is gone)