Optimize IND<SIMD>(RVA) #81651
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
Issue Details
Closes #81643
// Mainly for:
static Vector128<byte> Test1() => Vector128.Create("0123456789ABCDEF"u8);
// But various ROS<> patterns should work too:
static ReadOnlySpan<byte> rva => new byte[] {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};
static Vector128<byte> Test2() => Unsafe.ReadUnaligned<Vector128<byte>>(ref MemoryMarshal.GetReference(rva));
New codegen:
; Method P:Test1():System.Runtime.Intrinsics.Vector128`1[ubyte]
vzeroupper
vmovups xmm0, xmmword ptr [reloc @RWD00]
vmovups xmmword ptr [rcx], xmm0
mov rax, rcx
ret
; Total bytes of code: 19
; Method P:Test2():System.Runtime.Intrinsics.Vector128`1[ubyte]
vzeroupper
vmovups xmm0, xmmword ptr [reloc @RWD00]
vmovups xmmword ptr [rcx], xmm0
mov rax, rcx
ret
; Total bytes of code: 19
But most importantly, these are now GT_CNS_VEC nodes, so they can participate in various JIT constant-folding and other optimizations.
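As a rough illustration of the two source patterns above, here is a minimal, self-contained sketch (assuming .NET 7 or later and C# 11 for the `u8` literal). The class and method names (`RvaVectorSketch`, `FromUtf8Literal`, `FromRvaSpan`, `FirstLaneIsAsciiZero`) are hypothetical and only for illustration. The last method shows the kind of expression that could in principle benefit once the load is a constant vector node; whether the JIT actually folds it is an assumption, not something this PR states.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

public static class RvaVectorSketch
{
    // Pattern 1: a 16-byte UTF-8 literal passed straight to Vector128.Create.
    public static Vector128<byte> FromUtf8Literal() =>
        Vector128.Create("0123456789ABCDEF"u8);

    // Pattern 2: a ReadOnlySpan<byte> over constant data. The C# compiler emits
    // this as static data in the assembly rather than allocating an array.
    private static ReadOnlySpan<byte> Rva =>
        new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };

    public static Vector128<byte> FromRvaSpan() =>
        Unsafe.ReadUnaligned<Vector128<byte>>(ref MemoryMarshal.GetReference(Rva));

    // Hypothetical consumer: an expression over a vector whose contents are
    // known at JIT time. Folding this to a constant is an assumption here.
    public static bool FirstLaneIsAsciiZero() =>
        FromUtf8Literal().GetElement(0) == (byte)'0';

    public static void Main()
    {
        // The results are the same whether or not the optimization applies;
        // only the generated code differs.
        Console.WriteLine(FromUtf8Literal());
        Console.WriteLine(FromRvaSpan());
        Console.WriteLine(FirstLaneIsAsciiZero());
    }
}
```

Comparing the disassembly of these methods before and after the change (for example with `DOTNET_JitDisasm`) is one way to check what the JIT emits on a given runtime build.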
@jakobbotsch @SingleAccretion PTAL (since you reviewed previous PRs in this function)
Co-authored-by: SingleAccretion <[email protected]>
Does this optimization need to be implemented in Mono as well to be something one can depend on?
From my understanding, it's not necessary if it's more complicated than this (I spent like 15 minutes on this). We replace a load from the RVA data with a load from the Data Section. It's just that for RyuJIT it's now imported as […]. I need to check Mono's codegen, but perhaps LLVM can do the same with constant global data itself? (That depends on how the LLVM IR is formed, I assume.)
Failure is #75244