Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update GetElement and ToScalar to lower IND, LCL_FLD, and LCL_VAR specially #86400

Merged
merged 5 commits into from
May 23, 2023

Conversation

tannergooding
Copy link
Member

This resolves #86345 by updating the lowering of GetElement/ToScalar to specially handle GT_IND, GT_LCL_FLD, and GT_LCL_VAR.

In particular we see many diffs such as:

        cmp      edx, 4
        jae      SHORT G_M33861_IG04
-       vmovups  xmm0, xmmword ptr [rcx]
-       vmovaps  xmmword ptr [rsp+20H], xmm0
-       vmovss   xmm0, dword ptr [rsp+4*rdx+20H]
+       vmovss   xmm0, dword ptr [rcx+4*rdx]

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 17, 2023
@ghost ghost assigned tannergooding May 17, 2023
@ghost
Copy link

ghost commented May 17, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

This resolves #86345 by updating the lowering of GetElement/ToScalar to specially handle GT_IND, GT_LCL_FLD, and GT_LCL_VAR.

In particular we see many diffs such as:

        cmp      edx, 4
        jae      SHORT G_M33861_IG04
-       vmovups  xmm0, xmmword ptr [rcx]
-       vmovaps  xmmword ptr [rsp+20H], xmm0
-       vmovss   xmm0, dword ptr [rsp+4*rdx+20H]
+       vmovss   xmm0, dword ptr [rcx+4*rdx]
Author: tannergooding
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@tannergooding
Copy link
Member Author

CC. @dotnet/jit-contrib

BlockRange().Remove(op2);

int32_t addOffset = (static_cast<uint8_t>(op2->AsIntCon()->IconValue()) % count);
addOffset *= static_cast<int32_t>(elemSize);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to go out of bounds here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constant is restricted to between 0..255 and then % count That means its already never going to be more than 64. Vector512<byte> is 63, Vector512<long> is 7 * 8 == 56, etc

Combined with the range check we've already emitted for non-constant indices where required means we should always end up with something that is in range.

Except maybe there's still some edge case for the existing offset being int.MaxValue though? Maybe *(basePtr + int.MaxValue).GetElement(5)....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the check to cover if (op2->OperIsConst() && (newOffset < (INT32_MAX - comp->maxSIMDStructBytes()))) to handle such an extreme edge-case

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and logged #86609 to track an implicit truncation that GenTreeAddrMode is doing

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM along with a few minor questions

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM along with a few minor questions

@tannergooding tannergooding merged commit b066a16 into dotnet:main May 23, 2023
@tannergooding tannergooding deleted the fix-86345 branch May 23, 2023 01:53
@ghost ghost locked as resolved and limited conversation to collaborators Jun 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider decomposing GetElement(memory, ...) into equivalent nodes
2 participants