-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hoist the nullchecks for 'this' object #68588
Conversation
Tagging subscribers to this area: @JulieLeeMSFT |
src/coreclr/jit/optcse.cpp
Outdated
@@ -3684,6 +3684,14 @@ bool Compiler::optIsCSEcandidate(GenTree* tree) | |||
case GT_RETURN: | |||
return false; // Currently the only special nodes that we hit | |||
// that we know that we don't want to CSE | |||
case GT_NULLCHECK: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NULLCHECK
does not produce values and gets the "void" VN, so, how does this work at all...?
Edit: I see how it works, this actually just unblocks hoisting, not CSE. It would be better to special-case this in IsNodeHoistable
.
Also, I don't think there is a need to check for this
here, we can hoist any NULLCHECK
s (it is always profitable, which is also where the cost function should be adjusted for these non-value nodes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optIsCSEcandidate()
has 2 purposes.
- During loop hoisting we check if it will be handled by CSE and if yes, we hoist it out of the loop. During CSE phase, when we actually have to do the expression elimination. By telling that it is CSE candidate, it helps hoisting the check out of loop (because it is an invariant). While working on Hoist the invariants out of multi-level nested loops #68061, I realized cases where we would hoist the field address calculation ahead of null-check although there might not be a visible functional difference. But this PR fixes that to some extent at least for nullchecks around
this
.
- While doing actual CSE, we use
optIsCSEcandidate()
. However, right after that, we check if has any reserved VN and if yes, skip CSEing it.
runtime/src/coreclr/jit/optcse.cpp
Lines 811 to 815 in 8f7761a
ValueNum valueVN = vnStore->VNNormalValue(tree->GetVN(VNK_Liberal)); | |
if (ValueNumStore::isReservedVN(valueVN) && (valueVN != ValueNumStore::VNForNull())) | |
{ | |
continue; | |
} |
Later, global propagation phase takes care of eliminating the null-check that is present inside the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed.
Until now, it has been the case that a "hoistable tree" is (almost, modulo one struct pessimization) the same as "CSEable tree", and both must be values, as the design of hoisting is that we rely on CSE to "clean things up".
But if we start to hoist non-value trees (which is a perfectly valid thing to do), that's not really related to CSE, because: a) non-value trees will never be actual CSE candidates, b) the "cleanup" phase is assertion propagation.
So that's why I think it is confusing to put this check inside optIsCSECandidate
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that's why I think it is confusing to put this check inside optIsCSECandidate.
Sure, I can probably put this in IsNodeHoistable()
which is explicitly for hoisting. In future, we can add more non-value trees in it that can be hoisting candidates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think based on the above discussion it would be good to add a comment in IsNodeHoistable
saying that we also can hoist null checks since assertion prop deals with the redundant check left in the loop. It sort of ties loop hoisting together with assertion prop in the same way it is tied together with CSE.
Also, as @SingleAccretion pointed out, I think you can allow this in more places than just for 'this' pointer. We hoist a tree only if its children are invariant which should be safe enough in general for GT_NULLCHECK. Of course that assumes/hopes assertion prop can pick up the additional cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried experimenting with relaxing the hoisting for other object's nullchecks and there are some wins, but I see some regressions of this sort.
Basically, we overcome the number of variables hoisted out of the loop as compared to the available registers. I have plans to revisit these heuristics (specially for arm64) in or after #68061 and I would look into "other object's nullcheck hoisting" at that time.
[000167] hoistable tree cost too low: 3 < 6 (loopVarCount 6 >= availableRegCount 6)
... not profitable to hoist
it would be good to add a comment
If there are no other changes to be done, I would rather do it in my #68061
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the new on the right? It looks like an improvement to me, both in size and perf score (the loop body is smaller and with fewer memory accesses). Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, I was hoping to also get lea
too out of the loop and it was not (because of reasons I stated above). I reran the jit-analyze
with PerfScore and I did some improvements. Let me push the changes that enables this for other objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I think there is still work to be done in hoisting, especially with regards to hoisting more invariant statements.
Failures are related to dotnet/arcade#9208 |
@dotnet/jit-contrib |
1 similar comment
@dotnet/jit-contrib |
Improvements: dotnet/perf-autofiling-issues#5074 |
Windows x64 improvements: dotnet/perf-autofiling-issues#5117 |
Marking
nullcheck
aroundthis
as hoistable-candidate should be safe IMO.