-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Keep precise argument sizes between import and morph. #43130
Conversation
9d5efd5
to
c986f65
Compare
0a0669b
to
115e44c
Compare
GenTreePutArgSmall
.
|
||
if (corType != CORINFO_TYPE_CLASS && corType != CORINFO_TYPE_BYREF && corType != CORINFO_TYPE_PTR && | ||
corType != CORINFO_TYPE_VAR && (argRealClass = info.compCompHnd->getArgClass(sig, argLst)) != nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why we did not see a warning about an assignment inside if
.
src/coreclr/src/jit/morph.cpp
Outdated
// load byte; call(int) -> CALL(PUTARG_TYPE byte(IND byte)); | ||
// load int; call(byte) -> CALL(PUTARG_TYPE int (IND int)); | ||
// etc. | ||
if ((call->callSig != nullptr) && (Target::g_tgtArgOrder == Target::ARG_ORDER_R2L)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
callSig == nullptr
means mostly a helper call, all helper calls that I have checked are working with i4 and i8 types but I expect I will have to add more PUTARG_TYPE
there once I run more tests on arm64 OSX and find bugs with missed type information.
PTAL @CarolEidt @AndyAyersMS @dotnet/jit-contrib |
I'm puzzled why we need to do things this way -- doesn't the callee sig tell us what size the callee expects? |
We get this information from callee sig, but it is available only during importation and then we lose it. In debug we save it in
if we decide to keep it in Release it will increase the size of all large nodes. |
Seems like we could deal with that easily enough, we have various extensions already hanging off of calls. For instance we could refactor Is there other benefit to having these extra IR nodes beyond being able to handle this ABI quirk? |
These extra nodes are only for the ABI quirk, I do not see benefits to other places from it. I was thinking to postpone casts creations but it was a long shot (if we pass 1 byte as 1 byte we don't need
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial comments and suggestions
{ | ||
GenTree* effectiveVal = this; | ||
for (;;) | ||
{ | ||
assert(!effectiveVal->OperIs(GT_PUTARG_TYPE)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why you added this here - that is, I can see why we wouldn't expect to encounter these under a COMMA
, but I'm not sure why this special assert is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have several similar functions like gtEffectiveVal
, gtRetExprVal
and gtSkipPutArgType
. When we call each of them we often don't expect nodes that can be skipped by two others to be on top.
For example, when we call gtEffectiveVal
we don't expect something like GT_RET_EXPR(GT_COMMA)
or GT_PUTARG_TYPE(GT_COMMA)
, similar for gtRetExprVal
it should not see GT_PUTARG_TYPE(GT_PUTARG_TYPE)
.
If we do it means the caller did not check for these nodes and we don't skip what we hoped to skip, it doesn't cause correctness issues in my understanding but introduces asm regressions when we can't parse something like we did before the change.
src/coreclr/src/jit/morph.cpp
Outdated
// After tail call transformations we could have more nodes in the list + special arguments + | ||
// setup trees. I need to find a clean way to match this list with the original signature if | ||
// is possible. If not we will be left without a check that PUTARG_TYPE survive as they should. | ||
if (nodeArgsCount == sigArgsCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be !=
? Otherwise, the loop below will never execute, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a temporary condition, the idea here is to match the signature with the call arguments as we have them in morph and check that we still have the correct types. However, it is tricky to do because we could have some nodes added to gtCallArgs
between importation and call to this function (also this function can be called several times).
So we need to skip all additional nodes that were added and I have problems with identifying them. I am planning to fix it once I pass all the other tests.
5e87bfb
to
6b0b884
Compare
0fcee3c
to
ada8c7f
Compare
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR was updated and is passing the tests now.
Thanks to @AndyAyersMS for helping me offline with the previous test failures.
The changes include #44790 that I hope to merge first.
This is the last big JIT change for arm64 apple, with it we will have precise arguments sizes from importer to codegen. The next changes will be on VM side for interop stubs.
{ | ||
InlArgInfo* inlCurArgInfo = &pInlineInfo->inlArgInfo[argNum]; | ||
|
||
inlCurArgInfo->argNode = curArgVal; // Save the original tree, with PUT_ARG and RET_EXPR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main change is that we save PUTARG_TYPE
and RET_EXPR
in inlCurArgInfo->argNode
.
.maxstack 8 | ||
IL_0000: nop | ||
IL_0001: ldarg.0 | ||
IL_0002: call void Runtime_43130::CallAMethod1(int16) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int32
is passed as int16
to Inline1
where it is passed to CallAMethod1
, before the last changes the saved node was LCL_VAR int
and we added another PUTARG_TYPE
on top of an existing one, now it is fixed.
@@ -23820,7 +23848,8 @@ Statement* Compiler::fgInlinePrependStatements(InlineInfo* inlineInfo) | |||
|
|||
GenTree* argSingleUseNode = argInfo.argBashTmpNode; | |||
|
|||
if ((argSingleUseNode != nullptr) && !(argSingleUseNode->gtFlags & GTF_VAR_CLONED) && argIsSingleDef) | |||
if ((argSingleUseNode != nullptr) && !(argSingleUseNode->gtFlags & GTF_VAR_CLONED) && argIsSingleDef && | |||
!argHasPutArg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new condition allows us to always the argument as a temp with the correct type, we need it because for a correct direct substitution we need to know if the parent is a call or not but we don't have this information.
We create the node when we see an IL opcode like ldlocal
and don't have parent information, later, when the node is used, we can't distinguish it from a regular LCL_VAR
.
The change causes a small regression on all platforms:
windows x64 crossgen: 5134 (0.02% of base)
windows x64 pmi: 14911 (0.03% of base)
windows arm64 crossgen: 5688 (0.01% of base)
linux x64 crossgen: 9036 (0.02% of base)
, etc. all lower than 0.04%.
What happens there is:
- during importation we create a
LCL_VAR tempForOurArg
, save a pointer to it; - in
fgInsertInlineeBlocks
we iterate over all such pointers and do in-place replacement if there was only one use of the lclVar, but we don't know the user, so we can't say if it is a call that need aPUTARG
or not.
I have not found a simple way to avoid it or compensate for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to have a comment in the code here explaining why argHasPutArg
disqualifies this arg from direct substitution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left you some questions about this change.
src/coreclr/src/jit/flowgraph.cpp
Outdated
@@ -5179,7 +5179,7 @@ void Compiler::fgObserveInlineConstants(OPCODE opcode, const FgStack& stack, boo | |||
// Check for the double whammy of an incoming constant argument | |||
// feeding a constant test. | |||
unsigned varNum = FgStack::SlotTypeToArgNum(slot0); | |||
if (impInlineInfo->inlArgInfo[varNum].argNode->OperIsConst()) | |||
if (impInlineInfo->inlArgInfo[varNum].argIsInvariant) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need this change for correctness? If not perhaps you should PR it separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto for the other two below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not need this for correctness. I am planning to merge them in #44790 (comment), if we decide not to merge it I will need to change this code to use argNode->gtSkipPutArgType()->gtRetExprVal()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just approved #44790.
@@ -23820,7 +23848,8 @@ Statement* Compiler::fgInlinePrependStatements(InlineInfo* inlineInfo) | |||
|
|||
GenTree* argSingleUseNode = argInfo.argBashTmpNode; | |||
|
|||
if ((argSingleUseNode != nullptr) && !(argSingleUseNode->gtFlags & GTF_VAR_CLONED) && argIsSingleDef) | |||
if ((argSingleUseNode != nullptr) && !(argSingleUseNode->gtFlags & GTF_VAR_CLONED) && argIsSingleDef && | |||
!argHasPutArg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to have a comment in the code here explaining why argHasPutArg
disqualifies this arg from direct substitution.
GenTree* argNode = inlArgInfo[argNum].argNode; | ||
const bool argHasPutArg = argNode->OperIs(GT_PUTARG_TYPE); | ||
|
||
unsigned __int64 bbFlags = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See note above; I don't believe you can set these to zero like this and pass jit stress.
5b28654
to
05d3c2e
Compare
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good to merge once you've updated for #44790 and perhaps added a comment indicating why PUTARG_TYPE
can't be directly substituted.
64db1b2
to
133ef0d
Compare
Create `GT_PUTARG_TYPE` when signature type does not match node type. Check in morph that this information has survived inlining and other phases between.
133ef0d
to
8c7ecdb
Compare
I have checked that the diffs are still as expected before the merge, PR and jitstress are clear, |
Thanks @AndyAyersMS and @CarolEidt for your help with this change. |
On arm64 OSX we need to know precise argument sizes. I have added support for that between morph and lowering in #42503, this PR supports in between importation and morph.
When do we need it:
without the change, morph was seeing the incorrect type of the nodes but for other platforms it was not important (because all arguments were passed in TARGET_POINTER_SIZE slots).
I have stopped on the solution that adds a special GT_OPER to keep this information between these phases and a debug code in morph to check that it has survived phases between them (mostly inlining). Thanks @CarolEidt for the idea.
I want to enable this code for all platforms because:
I was also considering other solutions, the main 2:
fgArgInfo
;impImplicitIorI4Cast
to insert cast not only between i4 and i8 but between all sizes, butNo asm diffs (SPMI x64/x86).Update: ci shows related failures during the check phase, I will fix them soon, but I think it should not block review.