Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize BOX+UNBOX for "T? -> T" and "T -> T?" #104931

Merged
merged 5 commits into from
Jul 17, 2024
Merged

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Jul 15, 2024

Remove heap allocations for such box+unbox patterns when nullable is involved.

TTo Test<TFrom, TTo>(TFrom o) => (TTo)(object)o;

Current codegen for Test<int?, int> and Test<int, int?>:

; Assembly listing for Test<int?, int>
G_M61296_IG01:
       push     rbx
       sub      rsp, 48
       mov      ebx, ecx
G_M61296_IG02:
       mov      rcx, 0x7FF8348774D0
       call     CORINFO_HELP_NEWSFAST     <-------------- heap allocation!
       mov      dword ptr [rax+0x08], ebx
       mov      r8, rax
       lea      rcx, [rsp+0x28]
       mov      rdx, 0x7FF834B4B750
       call     CORINFO_HELP_UNBOX_NULLABLE
       mov      rax, qword ptr [rsp+0x28]
G_M61296_IG03:
       add      rsp, 48
       pop      rbx
       ret      
; Total bytes of code 59


; Assembly listing for Test<int, int?>
G_M21085_IG01:
       sub      rsp, 40
       mov      qword ptr [rsp+0x30], rcx
G_M21085_IG02:
       lea      rdx, [rsp+0x30]
       mov      rcx, 0x7FF83614B750
       call     CORINFO_HELP_BOX_NULLABLE     <-------------- heap allocation!
       mov      eax, dword ptr [rax+0x08]
G_M21085_IG03:
       add      rsp, 40
       ret      
; Total bytes of code 37

New codegen for Test<int?, int> and Test<int, int?>:

; Assembly listing for Test<int?, int>
G_M61296_IG01:
       push     rax
G_M61296_IG02:
       mov      byte  ptr [rsp], 1
       mov      dword ptr [rsp+0x04], ecx
       mov      rax, qword ptr [rsp]
G_M61296_IG03:
       add      rsp, 8
       ret      
; Total bytes of code 18


; Assembly listing for Test<int, int?>
G_M21085_IG01:
       sub      rsp, 40
       mov      qword ptr [rsp+0x30], rcx
G_M21085_IG02:
       cmp      byte  ptr [rsp+0x30], 0
       je       SHORT G_M21085_IG04
       mov      eax, dword ptr [rsp+0x34]
G_M21085_IG03:
       add      rsp, 40
       ret      
G_M21085_IG04:
       call     CORINFO_HELP_THROWNULLREF
       int3     
; Total bytes of code 31

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 15, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@EgorBo
Copy link
Member Author

EgorBo commented Jul 16, 2024

@MihuBot

@EgorBo
Copy link
Member Author

EgorBo commented Jul 16, 2024

@MihuBot

@EgorBo
Copy link
Member Author

EgorBo commented Jul 16, 2024

@AndyAyersMS @jakobbotsch @dotnet/jit-contrib PTAL, not too many diffs (SPMI is empty due to missing contexts I add here, but there are hits in tests and a few in libs). As part of the effort to make Nullable a bit better.

I am going to re-use impStoreNullableFields for #104954 as well.

NOTE: I am using raw IND/BLK here instead of LCL_FLD because of Nullable<struct> and for simplicity

@jakobbotsch
Copy link
Member

NOTE: I am using raw IND/BLK here instead of LCL_FLD because of Nullable<struct> and for simplicity

I don't really follow -- why is that simpler?

@EgorBo
Copy link
Member Author

EgorBo commented Jul 16, 2024

NOTE: I am using raw IND/BLK here instead of LCL_FLD because of Nullable<struct> and for simplicity

I don't really follow -- why is that simpler?

because apparently it's not that simple to do LclFld store/load for a struct with a T field where T could be struct? Like when I do gtNewStoreLclFldNode(nullableObjTmp, value->TypeGet(), valueOffset, value); - how do I set layout for this value field?

@EgorBo
Copy link
Member Author

EgorBo commented Jul 16, 2024

@jakobbotsch thanks for the suggestions! I rewrote to LCL_FLD, I guess lack of ClassLayout arg in gtNewLclFldNode confused me 🙂 Can't apply your suggestions to the new version

@EgorBo EgorBo merged commit 9df4f7f into dotnet:main Jul 17, 2024
104 of 107 checks passed
@EgorBo EgorBo deleted the boxing-nullable branch July 17, 2024 07:38
ericstj added a commit to ericstj/runtime that referenced this pull request Jul 25, 2024
mmitche pushed a commit that referenced this pull request Jul 26, 2024
…05515)

* Revert "Optimize BOX+UNBOX for "T? -> T" and "T -> T?" (#104931)"

This reverts commit 9df4f7f.

* Add back eeIsSharedInst
@github-actions github-actions bot locked and limited conversation to collaborators Aug 16, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants