[mono] ReadOnlySpan<byte> hack doesn't work in Mono #37449
Tagging subscribers to this area: @lewurm
Dotnet-GitSync-Bot added the untriaged label (New issue has not been triaged by the area owner) on Jun 4, 2020
/cc @SamMonoRT
monojenkins pushed a commit to monojenkins/mono that referenced this issue on Jun 4, 2020
Partially fixes dotnet/runtime#37449

```csharp
static ReadOnlySpan<byte> Arr
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    get => new byte[] {1, 2, 3, 4};
}

[MethodImpl(MethodImplOptions.NoInlining)]
public static byte GetByte(int i) => Arr[0];
```

#### Codegen for GetByte() in LLVM-JIT mode:

```asm
0000000000000000 <gram_GetByte__int_>:
<BB>:1
   0:   48 83 ec 28             sub    $0x28,%rsp
   4:   c5 f8 57 c0             vxorps %xmm0,%xmm0,%xmm0
   8:   c5 f8 29 04 24          vmovaps %xmm0,(%rsp)
   d:   48 b8 b0 bd 55 d2 3f    movabs $0x563fd255bdb0,%rax
  14:   56 00 00
  17:   48 89 04 24             mov    %rax,(%rsp)
  1b:   c7 44 24 08 04 00 00    movl   $0x4,0x8(%rsp)
  22:   00
  23:   48 8b 04 24             mov    (%rsp),%rax
  27:   48 89 44 24 10          mov    %rax,0x10(%rsp)
  2c:   8b 44 24 08             mov    0x8(%rsp),%eax
  30:   89 44 24 18             mov    %eax,0x18(%rsp)
  34:   8b 44 24 0c             mov    0xc(%rsp),%eax
  38:   89 44 24 1c             mov    %eax,0x1c(%rsp)
  3c:   83 7c 24 18 00          cmpl   $0x0,0x18(%rsp)
  41:   74 0c                   je     4f <gram_GetByte__int_+0x4f>
  43:   48 8b 44 24 10          mov    0x10(%rsp),%rax
  48:   8a 00                   mov    (%rax),%al
  4a:   48 83 c4 28             add    $0x28,%rsp
  4e:   c3                      retq
  4f:   48 b8 d0 ae 42 d2 3f    movabs $0x563fd242aed0,%rax
  56:   56 00 00
  59:   bf 02 01 00 00          mov    $0x102,%edi
  5e:   ff 10                   callq  *(%rax)
```

Codegen after I appended `-sroa -instcombine` to the end of LLVM passes list:

```asm
0000000000000000 <gram_GetByte__int_>:
<BB>:1
   0:   48 b8 d0 ef a6 ba 31    movabs $0x5631baa6efd0,%rax
   7:   56 00 00
   a:   8a 00                   mov    (%rax),%al
   c:   c3                      retq
```

This https://godbolt.org/z/YKjzsV link explains motivation (on the left is the "final" LLVM IR our llvm-jit produces after the default optimizations).
Zoltan noticed that LLVM-AOT where we use `opt -O2` instead of custom pass order optimized that code perfectly.

The other issue remains: for some reason we don't inline `get_Arr()` without AggressiveInlining, coreclr does inline it:

```
Successfully inlined Program:get_Arr():System.ReadOnlySpan`1[Byte] (12 IL bytes) (depth 1) [below ALWAYS_INLINE size]
```
Is this the same issue as mono/mono#18572 (comment)?
Turns out this one is different: it's bad codegen from LLVM-JIT, plus the inliner refuses to inline `get_Arr()`.
EgorBo added a commit to mono/mono that referenced this issue on Jun 8, 2020
Partially fixes dotnet/runtime#37449 (commit message identical to the one above). Co-authored-by: EgorBo
ghost locked as resolved and limited conversation to collaborators on Dec 8, 2020
A while ago, the Roslyn team implemented a hack/feature: dotnet/roslyn#24621, "Refer directly to static data when ReadOnlySpan wraps arrays of bytes". It lets code access static read-only byte arrays without allocations or static initialization. The syntax is a `ReadOnlySpan<byte>` property whose getter returns `new byte[] { ... }`.
CoreCLR:
As you can see, the access is just a few `mov`s plus a bounds check (which can be eliminated in some cases).
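The syntax described above can be sketched as follows. This is a minimal illustration modeled on the snippet in the commit message in this thread; the class name `StaticDataSketch` is hypothetical, and indexing with `i` rather than a constant is an assumption made to show the bounds-checked case.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class StaticDataSketch
{
    // Roslyn compiles this getter to a span over data embedded in the
    // assembly, so no byte[] is allocated and no static constructor runs.
    private static ReadOnlySpan<byte> Arr => new byte[] { 1, 2, 3, 4 };

    // NoInlining keeps GetByte as a separate method, which makes its
    // generated code easy to inspect. The span indexer is bounds-checked.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static byte GetByte(int i) => Arr[i];
}
```

Whether the call to `get_Arr()` is inlined into `GetByte` is up to each runtime's inliner, which is the difference between CoreCLR and Mono that this issue is about.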
For some reason Mono doesn't inline the `get_Arr()` call.
Mono-LLVM:
Another example:
CoreCLR:
In this example CoreCLR didn't even emit a bounds check!
Mono still emits the `get_Arr()` call and doesn't eliminate the bounds check.
Mono:
This is quite important since this hack is heavily used across the BCL in hot methods:
Note the "// uses C# compiler's optimization for static byte[] data" comments in that code.
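For illustration, the kind of lookup table that tends to carry such a comment might look like the sketch below. It is a hypothetical example, not code taken from the BCL; the names `HexSketch`, `HexDigits`, and `ToHexChar` are made up for this purpose.

```csharp
using System;

public static class HexSketch
{
    // uses C# compiler's optimization for static byte[] data
    private static ReadOnlySpan<byte> HexDigits => new byte[]
    {
        (byte)'0', (byte)'1', (byte)'2', (byte)'3',
        (byte)'4', (byte)'5', (byte)'6', (byte)'7',
        (byte)'8', (byte)'9', (byte)'A', (byte)'B',
        (byte)'C', (byte)'D', (byte)'E', (byte)'F',
    };

    // Masking with 0xF keeps the index inside the 16-entry table, which is
    // the kind of access where a JIT may be able to drop the bounds check.
    public static char ToHexChar(int value) => (char)HexDigits[value & 0xF];
}
```

In a hot path like this, a getter that is not inlined costs a call on every lookup, which is why the missing inlining matters.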