Optimize IND<SIMD>(RVA) #81651

Merged
merged 9 commits into from
Feb 6, 2023

EgorBo (Member) commented Feb 5, 2023

Closes #81643

// Mainly for:
static Vector128<byte> Test1() => Vector128.Create("0123456789ABCDEF"u8);


// But various ReadOnlySpan<> patterns should work too:
static ReadOnlySpan<byte> rva => new byte[] {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};
static Vector128<byte> Test2() => Unsafe.ReadUnaligned<Vector128<byte>>(ref MemoryMarshal.GetReference(rva));

New codegen:

; Method P:Test1():System.Runtime.Intrinsics.Vector128`1[ubyte]
       vzeroupper 
       vmovups  xmm0, xmmword ptr [reloc @RWD00]
       vmovups  xmmword ptr [rcx], xmm0
       mov      rax, rcx
       ret      
; Total bytes of code: 19


; Method P:Test2():System.Runtime.Intrinsics.Vector128`1[ubyte]
       vzeroupper 
       vmovups  xmm0, xmmword ptr [reloc @RWD00]
       vmovups  xmmword ptr [rcx], xmm0
       mov      rax, rcx
       ret      
; Total bytes of code: 19

But most importantly, these are now GT_CNS_VEC nodes, so they can participate in various JIT constant folding and other optimizations.
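
For anyone who wants to try the two patterns locally, here is a minimal sketch, assuming .NET 7 or later and C# 11 (needed for the u8 literal). The class name RvaVectorDemo and the method names FromLiteral/FromRva are made up for illustration and do not come from this PR. The check only confirms that both shapes yield the expected bytes; it says nothing about the code the JIT generates.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

// Hypothetical demo type; names are illustrative, not from the PR.
public static class RvaVectorDemo
{
    // Constant byte data that the C# compiler stores as RVA static data.
    static ReadOnlySpan<byte> Rva => new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };

    // Pattern 1: build the vector directly from a UTF-8 literal.
    public static Vector128<byte> FromLiteral() => Vector128.Create("0123456789ABCDEF"u8);

    // Pattern 2: read 16 bytes from the span's backing data.
    public static Vector128<byte> FromRva() =>
        Unsafe.ReadUnaligned<Vector128<byte>>(ref MemoryMarshal.GetReference(Rva));

    public static void Main()
    {
        ReadOnlySpan<byte> text = "0123456789ABCDEF"u8;
        Vector128<byte> a = FromLiteral();
        Vector128<byte> b = FromRva();

        for (int i = 0; i < Vector128<byte>.Count; i++)
        {
            // Element i of each vector should equal byte i of its source data.
            if (a.GetElement(i) != text[i] || b.GetElement(i) != i + 1)
                throw new InvalidOperationException($"Mismatch at element {i}");
        }

        Console.WriteLine("Both patterns produced the expected vectors.");
    }
}
```

To look at the actual codegen, one option is to run this with the DOTNET_JitDisasm environment variable set to the method name on a runtime that supports it.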

@ghost ghost assigned EgorBo Feb 5, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Feb 5, 2023
ghost commented Feb 5, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.


@EgorBo EgorBo marked this pull request as ready for review February 5, 2023 09:52
EgorBo (Member, Author) commented Feb 5, 2023

@jakobbotsch @SingleAccretion PTAL (since you reviewed previous PRs in this function)

src/coreclr/jit/valuenum.cpp (outdated, resolved)
src/coreclr/jit/assertionprop.cpp (resolved)
src/coreclr/jit/valuenum.cpp (outdated, resolved)

jkotas (Member) commented Feb 5, 2023

Does this optimization need to be implemented in Mono as well to be something one can depend on?

EgorBo (Member, Author) commented Feb 5, 2023

> Does this optimization need to be implemented in Mono as well to be something one can depend on?

From my understanding, it's not necessary if implementing it there is more complicated than this (I spent about 15 minutes on it). We replace a load from the RVA data with a load from the data section. The difference for RyuJIT is that the value is now imported as GT_CNS_VEC, so it can participate in various constant foldings and is easier for CSE to reason about.

I need to check Mono's codegen, but perhaps LLVM can do the same with constant global data by itself? (I assume it depends on how the LLVM IR is formed.)
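
As a rough illustration of what "participating in foldings" could mean: in the sketch below both operands of the & are constants, so a JIT that sees the first operand as a GT_CNS_VEC node has the chance to evaluate the whole expression at compile time. Whether a particular runtime build actually folds this exact expression is an assumption, not something stated in this PR. FoldingSketch and LowNibbles are hypothetical names, and the code assumes .NET 7 or later with C# 11.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical example type; not part of the PR.
public static class FoldingSketch
{
    // Both operands are known at JIT time: a vector built from a UTF-8
    // literal and a broadcast constant. The result keeps only the low
    // four bits of each byte.
    public static Vector128<byte> LowNibbles() =>
        Vector128.Create("0123456789ABCDEF"u8) & Vector128.Create((byte)0x0F);

    public static void Main()
    {
        Vector128<byte> v = LowNibbles();

        // '9' is 0x39, so its low nibble is 9; 'A' is 0x41, so its low nibble is 1.
        if (v.GetElement(9) != 9 || v.GetElement(10) != 1)
            throw new InvalidOperationException("Unexpected low nibble value");

        Console.WriteLine(v);
    }
}
```

The result is the same whether or not the expression is folded; folding would only change how much work is left for run time.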

EgorBo (Member, Author) commented Feb 6, 2023

Failure is #75244

@EgorBo EgorBo merged commit 09a3007 into dotnet:main Feb 6, 2023
@EgorBo EgorBo deleted the opt-vector-create-rva branch February 6, 2023 11:23
@ghost ghost locked as resolved and limited conversation to collaborators Mar 8, 2023
Development

Successfully merging this pull request may close these issues.

Optimize Vector.Create("utf8-literal"u8)
5 participants