Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do byref liveness updates for same register GPR moves #53684

Merged
4 commits merged into from
Jun 9, 2021

Conversation

tannergooding
Copy link
Member

@tannergooding tannergooding commented Jun 3, 2021

This resolves #51728 and resolves #10821 by ensuring byref liveness updates happen for same register moves.

A summary of the issue is that for certain operations, such as new Span<T>(void* address) or ref T Unsafe.AsRef<T>(void* address), we'll have a pointer that is directly converted to a TYP_BYREF. If the register allocator determines that the pointer is "last use", it may assign the byref result to the same register, at which point we may decide to elide the same register move. This was causing us to not track the byref becoming "live". While this is likely not problematic for the direct case of POINTER to TYP_BYREF, it might be problematic for multi-step conversion, such as:

var a = new int[30];               // a is a live gcref
ref var x = ref a[0];              // x is a live byref
a = null;                          // a is a dead gcref, the backing array is kept alive due to the x byref

fixed (int* p = &x)                // p is a live pinned byref
{
    x = ref Unsafe.NullRef<int>(); // x is a dead byref, the backing array is kept alive due to the p pinned byref
    int* t = p;                    // t is a pointer, not a pinned byref
    x = ref Unsafe.AsRef<int>(t);  // x is a live byref, it still points to the pinned backing array
} // p is a dead pinned byref, x should be keeping the backing array alive

GC.Collect();
Console.WriteLine(x);

Where if t is given a register (say edx) and then x is given the same register (also edx) the JIT would elide the move and not update the liveness and so the "last tracked reference" to the created array would go dead at the end of the fixed block. While in practice, it should be kept alive by x since that's a byref and should be considered live.

If taken, the same support should likely be added for Arm32/Arm64.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 3, 2021
@tannergooding
Copy link
Member Author

CC. @BruceForstall, @AndyAyersMS

This passed all tests locally, but I imagine we will want to also run GC Stress tests if we want to take this.

@tannergooding
Copy link
Member Author

There are no assembly diffs for this change. However, there are textual changes to the dump:

The insNum value changes where-ever an eliminated instruction is inserted (example for System.Runtime.CompilerServices.CastHelpers:StelemRef(System.Array,int,System.Object)):

-IN0005:        mov      rdx, qword ptr [rcx]
+IN0006:        mov      rdx, qword ptr [rcx]
 ; ...
-Added IP mapping: 0x001E STACK_EMPTY (G_M24931_IG02,ins#6,ofs#24) label
+Added IP mapping: 0x001E STACK_EMPTY (G_M24931_IG02,ins#7,ofs#24) label

Carrying additional instrDesc increases the allocation "size" and changes the instruction groups in a few places:

 Allocations for System.Runtime.CompilerServices.CastHelpers:StelemRef(System.Array,int,System.Object) (MethodHash=1b5e9e9c)
-count:       1447, size:     112998, max =       3072
-allocateMemory:     131072, nraUsed:     118856
+count:       1451, size:     113262, max =       3072
+allocateMemory:     131072, nraUsed:     119120
; ...
-              InstDesc |       4444 |   3.93%
+              InstDesc |       4700 |   4.15%

Most importantly, the gcInfoBlockHdrSave() output changes for impacted scenarios, most frequently with the byref becoming live sooner (example for System.Text.ASCIIUtility:NarrowUtf16ToAscii(long,long,long):long):

 *************** In gcInfoBlockHdrSave()
 Set code length to 482.
 Set ReturnKind to Scalar.
 Set Outgoing stack arg area size to 32.
 Stack slot id for offset 32 (0x20) (sp) = 0.
 Register slot id for reg r15 = 1.
 Register slot id for reg rdx = 2.
 Register slot id for reg rax = 3.
 Register slot id for reg rcx = 4.
 Register slot id for reg r15 (byref) = 5.
 Register slot id for reg rbx (byref) = 6.
 Set state of slot 0 at instr offset 0x65 to Live.
 Set state of slot 0 at instr offset 0x105 to Dead.
 Set state of slot 1 at instr offset 0x5d to Live.
 Set state of slot 2 at instr offset 0x60 to Live.
 Set state of slot 3 at instr offset 0x68 to Live.
 Set state of slot 4 at instr offset 0x6f to Live.
 Set state of slot 3 at instr offset 0x77 to Dead.
 Set state of slot 4 at instr offset 0x77 to Dead.
 Set state of slot 2 at instr offset 0x77 to Dead.
 Set state of slot 1 at instr offset 0x77 to Dead.
 Set state of slot 2 at instr offset 0xaa to Live.
 Set state of slot 3 at instr offset 0xad to Live.
 Set state of slot 2 at instr offset 0xb1 to Dead.
 Set state of slot 4 at instr offset 0xb4 to Live.
 Set state of slot 2 at instr offset 0xb7 to Live.
 Set state of slot 3 at instr offset 0xbc to Dead.
 Set state of slot 4 at instr offset 0xbc to Dead.
 Set state of slot 2 at instr offset 0xbc to Dead.
+Set state of slot 5 at instr offset 0xf1 to Live.
 Set state of slot 2 at instr offset 0xfe to Live.
 Set state of slot 4 at instr offset 0x105 to Live.
 Set state of slot 4 at instr offset 0x10a to Dead.
 Set state of slot 2 at instr offset 0x10a to Dead.
-Set state of slot 5 at instr offset 0x10a to Live.
 Set state of slot 5 at instr offset 0x119 to Dead.
+Set state of slot 6 at instr offset 0x155 to Live.
 Set state of slot 2 at instr offset 0x16a to Live.
 Set state of slot 4 at instr offset 0x171 to Live.
 Set state of slot 4 at instr offset 0x176 to Dead.
 Set state of slot 2 at instr offset 0x176 to Dead.
-Set state of slot 6 at instr offset 0x176 to Live.
 Set state of slot 6 at instr offset 0x191 to Dead.
 Set state of slot 2 at instr offset 0x1ab to Live.
 Set state of slot 4 at instr offset 0x1bc to Live.
 Set state of slot 4 at instr offset 0x1c1 to Dead.
 Set state of slot 2 at instr offset 0x1c1 to Dead.
 Defining interruptible range: [0x18, 0x133).
 Defining interruptible range: [0x140, 0x1e2).

@AndyAyersMS
Copy link
Member

There are no assembly diffs for this change

Diffs via pmi or spmi? (I don't recall if SPMI diffing is sensitive to GC info changes, will have to look).

@tannergooding
Copy link
Member Author

/azp run runtime-coreclr outerloop, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@tannergooding
Copy link
Member Author

tannergooding commented Jun 3, 2021

Diffs via pmi or spmi? (I don't recall if SPMI diffing is sensitive to GC info changes, will have to look).

I used jit-diff --pmi and explicitly excluded the option that sets COMPlus_JitGCDump

Mainly to validate that there were no new instructions being emitted and rather only GC liveness updates with this change.

@AndyAyersMS
Copy link
Member

Should also resolve #10821...?

@tannergooding
Copy link
Member Author

tannergooding commented Jun 3, 2021

Yes, that's the same assert as from #51728, I'll update the top post to auto-close it

  • Running PMI diffs resulted in zero asserts for corelib, frameworks, benchmarks, and tests

@AndyAyersMS
Copy link
Member

Looks like there is a baseline of ~8 failures in gcstress, so some failures here are pre-existing issues.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Just one question/concern about possible appearances in the prolog.

src/coreclr/jit/emit.cpp Show resolved Hide resolved
@tannergooding
Copy link
Member Author

@AndyAyersMS, looks like the only failures for GC Stress are known (and for ARM64 which isn't touched by this).

This likely needs a second review before being merged, correct?

@AndyAyersMS
Copy link
Member

This likely needs a second review before being merged

Not sure it's required, but probably a good idea. @BruceForstall can you please review?

@tannergooding tannergooding marked this pull request as ready for review June 4, 2021 01:17
Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions.

In the case where the byref goes live earlier, can you show the asm code example? I.e., does it make sense that it's going live earlier? Why did it go live later in the previous case?

You should be able to do superpmi asm diffs using superpmi.py asmdiffs --gcinfo to see the effect there.

src/coreclr/jit/emit.h Show resolved Hide resolved
src/coreclr/jit/emitxarch.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/emitxarch.cpp Outdated Show resolved Hide resolved
@tannergooding
Copy link
Member Author

In the case where the byref goes live earlier, can you show the asm code example? I.e., does it make sense that it's going live earlier? Why did it go live later in the previous case?

I'll get the spmi diff here, but in the cases I looked at already they were similar to the one I describe in the top post. Most typically its a new Span(void* address, int length) where the Span has been promoted and the constructor inlined.

In this case, the move may get elided and so the ByReference<T> field of the Span doesn't get marked "live". It instead becomes live later the first time it gets moved to a new register.

@tannergooding
Copy link
Member Author

/azp run runtime-coreclr outerloop, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@tannergooding
Copy link
Member Author

You should be able to do superpmi asm diffs using superpmi.py asmdiffs --gcinfo to see the effect there.

superpmi seems to be failing with JIT failed to initialize. jit-diffs --pmi isn't seeing any such failures, so I'm unsure what might be wrong here.

@tannergooding
Copy link
Member Author

/azp run runtime-coreclr outerloop, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@BruceForstall
Copy link
Member

superpmi seems to be failing with JIT failed to initialize

superpmi.py has an algorithm for determining the baseline JIT to download from the JIT rolling build and use, but it's fragile as it depends on your branch being branched from a local 'main' branch, so it might be wrong, and in this case, it's probably picked a baseline JIT from before the last JIT/EE GUID change. You can build/download and specify your own baseline, using the -base_jit_path argument to superpmi.py asmdiffs

@BruceForstall
Copy link
Member

You can alternatively use the -base_git_hash argument (to superpmi.py asmdiffs) to specify the git hash you wish to use for your baseline, if you don't want to build it yourself, so it know which one to use (if it can't figure it out correctly automatically).

@BruceForstall
Copy link
Member

When I do spmi asm diffs on this change, I see some interesting diffs, e.g.:

  0002E6 mov      rcx, 0xD1FFAB1E
+        ; byrRegs +[rcx]
  0002F0 mov      r8d, 128
  0002F6 mov      bword ptr [rbp-60H], rcx
  0002FA mov      dword ptr [rbp-58H], r8d
  0002FE xor      ecx, ecx
+        ; byrRegs -[rcx]
  000300 mov      dword ptr [rsp+20H], ecx
  000304 mov      rcx, rsi
         ; gcrRegs +[rcx]

so a byref that is actually a constant (string?) gets loaded and stored to a byref stack location, then goes dead before the previous code brings it alive later. Not a hole, presumably, since the value is constant (the stack location is untracked, so no GC state change there).

  000025 lea      r10, [rdx+8]
+        ; byrRegs +[r10]
  000029 lea      r11, bword ptr [rcx+16]
         ; byrRegs +[r11]
  00002D lea      rsi, [rdx+16]
+        ; byrRegs +[rsi]
  000031 lea      rdi, bword ptr [rcx+24]
         ; byrRegs +[rdi]
  000035 lea      rbx, [rdx+24]
+        ; byrRegs +[rbx]
  000039 test     r8d, r8d
  00003C je       SHORT G_M2693_IG03
  00003E cmp      r8d, 1
  000042 je       SHORT G_M2693_IG05
  000044 cmp      r8d, 2
  000048 je       SHORT G_M2693_IG07
                                                ;; bbWeight=1    PerfScore 8.25
  G_M2693_IG03:        ; gcrefRegs=00000000 {}, byrefRegs=00000ECB {rax rcx rbx rsi rdi r9 r10 r11}, byref
-        ; byrRegs +[rbx rsi r10]
         ; GC ptr vars -{V05}

A case where previously the byrefs didn't get marked alive until the label, and now are marked alive as soon as the register gets loaded. Seems like this would be an existing GC hole, but this is a marshalling IL stub, so maybe there's some magic going on here I don't understand.

It looks to me like there is a bug where we now don't emit a nop after a call immediately before an epilog (required by x64 unwinding). E.g., benchmarks MC#10101 <SendUsingHttp11Async>d__74:MoveNext():this:

BEFORE:
0008AE call     System.Net.Http.HttpConnection:ReturnConnectionToPool():this
       ; gcrRegs -[rcx]
       ; gcr arg pop 0
						;; bbWeight=1    PerfScore 10.00
G_M51018_IG66:        ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
       ; byrRegs -[rsi]
0008B3 nop      //**** this 'nop' disappears
						;; bbWeight=1    PerfScore 0.25
G_M51018_IG67:        ; , funclet epilog, nogc, extend
0008B4 add      rsp, 80

AFTER:
0008AE call     System.Net.Http.HttpConnection:ReturnConnectionToPool():this
       ; gcrRegs -[rcx]
       ; gcr arg pop 0
						;; bbWeight=1    PerfScore 10.00
G_M51018_IG66:        ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, funclet epilog, nogc
       ; byrRegs -[rsi]
0008B3 add      rsp, 80

@tannergooding
Copy link
Member Author

tannergooding commented Jun 5, 2021

It looks to me like there is a bug where we now don't emit a nop after a call immediately before an epilog (required by x64 unwinding). E.g., benchmarks MC#10101 d__74:MoveNext():this:

I'll look and see if I can find out what's causing this. Perhaps there is a bug with instruction groups or something else that is causing this to not be emitted.

There are also outerloop jobs failing where they weren't before, those might have been introduced by a different PR however (looks to be failing since: https://dev.azure.com/dnceng/public/_build/results?buildId=1170978&view=results).

@tannergooding
Copy link
Member Author

I'll look and see if I can find out what's causing this. Perhaps there is a bug with instruction groups or something else that is causing this to not be emitted.

This looks to be because we only emit if the last instruction was a call: https://github.com/dotnet/runtime/blob/main/src/coreclr/jit/emitxarch.cpp#L978

Since we now track a INS_mov_eliminated, the last instruction isn't actually a call. I think this can be fixed pretty trivially by not updating emitLastIns if we are a zero size instruction.

@tannergooding tannergooding changed the title Do byref liveness updates for same register GPR moves on x86/x64 Do byref liveness updates for same register GPR moves Jun 5, 2021
@dotnet dotnet deleted a comment Jun 5, 2021
@tannergooding tannergooding force-pushed the byref-liveness branch 2 times, most recently from 32bdccf to 1622b1d Compare June 6, 2021 05:22
@tannergooding
Copy link
Member Author

/azp run runtime-coreclr outerloop, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@tannergooding
Copy link
Member Author

/azp run runtime-coreclr outerloop, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@tannergooding
Copy link
Member Author

tannergooding commented Jun 7, 2021

@BruceForstall, believe I'm back to zero diffs (modulo the GC liveness changes).

I updated this to track the "last instruction" (the last instrDesc created) vs the "last emitted instruction" (the last instruction that actually generated assembly) and to also work on Arm32.
Arm64 is mostly there, but needs a bit more investigation due to it covering additional elisions in IsRedundantMov, I think that could be a separate PR however.

@tannergooding
Copy link
Member Author

Not quite zero diffs actually, seems I'm incorrectly tracking emitLastEmittedIns across non-extended IG

@tannergooding
Copy link
Member Author

Now its at zero diffs, 🎉

> .\src\coreclr\scripts\superpmi.py asmdiffs --gcinfo -base_jit_path C:\Users\tagoo\Source\repos\runtime_base\artifacts\bin\coreclr\windows.x64.Checked\clrjit.dll
================ Logging to C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\superpmi.log
Using JIT/EE Version from jiteeversionguid.h: 1052f490-cad7-4610-99bb-6f2bd91a1d19
Using coredistools found at C:\Users\tagoo\Source\repos\runtime2\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\coredistools.dll
Found download cache directory "C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64" and --force_download not set; skipping download
SuperPMI ASM diffs
Base JIT Path: C:\Users\tagoo\Source\repos\runtime_base\artifacts\bin\coreclr\windows.x64.Checked\clrjit.dll
Diff JIT Path: C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\clrjit.dll
Using MCH files:
  C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\benchmarks.run.windows.x64.checked.mch
  C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\coreclr_tests.pmi.windows.x64.checked.mch
  C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries.crossgen.windows.x64.checked.mch
  C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries.crossgen2.windows.x64.checked.mch
  C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries.pmi.windows.x64.checked.mch
  C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries_tests.pmi.windows.x64.checked.mch
benchmarks.run.windows.x64.checked.mch
Running asm diffs of C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\benchmarks.run.windows.x64.checked.mch
Asm diffs found
Creating dasm files: C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.benchmarks.run.windows.x64.checked.8\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.benchmarks.run.windows.x64.checked.8\diff
Differences found. To replay SuperPMI use:

set COMPlus_JitDisasm=*
set COMPlus_JitUnwindDump=*
set COMPlus_JitEHDump=*
set COMPlus_NgenDisasm=*
set COMPlus_NgenUnwindDump=*
set COMPlus_NgenEHDump=*
set COMPlus_JitDiffableDasm=1
set COMPlus_JitEnableNoWayAssert=1
set COMPlus_JitNoForceFallback=1
set COMPlus_JitDisasmWithGC=1
set COMPlus_JitGCDump=*
set COMPlus_NgenGCDump=*
C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\superpmi.exe  -c ### C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\clrjit.dll C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\benchmarks.run.windows.x64.checked.mch

Generated asm is located under C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.benchmarks.run.windows.x64.checked.8\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.benchmarks.run.windows.x64.checked.8\diff
Textual differences found in generated asm.
Invoking: C:\Users\tagoo\Source\repos\jitutils\bin\jit-analyze.bat -r --base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.benchmarks.run.windows.x64.checked.8\base --diff C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.benchmarks.run.windows.x64.checked.8\diff
Found 132 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 164499
Total bytes of diff: 164499
Total bytes of delta: 0 (0.00% of base)


0 total files with Code Size differences (0 improved, 0 regressed), 169 unchanged.

0 total methods with Code Size differences (0 improved, 0 regressed), 169 unchanged.
coreclr_tests.pmi.windows.x64.checked.mch
Running asm diffs of C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\coreclr_tests.pmi.windows.x64.checked.mch
Asm diffs found
Creating dasm files: C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.coreclr_tests.pmi.windows.x64.checked.5\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.coreclr_tests.pmi.windows.x64.checked.5\diff
Differences found. To replay SuperPMI use:

set COMPlus_JitDisasm=*
set COMPlus_JitUnwindDump=*
set COMPlus_JitEHDump=*
set COMPlus_NgenDisasm=*
set COMPlus_NgenUnwindDump=*
set COMPlus_NgenEHDump=*
set COMPlus_JitDiffableDasm=1
set COMPlus_JitEnableNoWayAssert=1
set COMPlus_JitNoForceFallback=1
set COMPlus_JitDisasmWithGC=1
set COMPlus_JitGCDump=*
set COMPlus_NgenGCDump=*
C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\superpmi.exe  -c ### C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\clrjit.dll C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\coreclr_tests.pmi.windows.x64.checked.mch

Generated asm is located under C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.coreclr_tests.pmi.windows.x64.checked.5\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.coreclr_tests.pmi.windows.x64.checked.5\diff
Textual differences found in generated asm.
Invoking: C:\Users\tagoo\Source\repos\jitutils\bin\jit-analyze.bat -r --base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.coreclr_tests.pmi.windows.x64.checked.5\base --diff C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.coreclr_tests.pmi.windows.x64.checked.5\diff
Found 5091 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 1034682
Total bytes of diff: 1034682
Total bytes of delta: 0 (0.00% of base)


0 total files with Code Size differences (0 improved, 0 regressed), 5102 unchanged.

0 total methods with Code Size differences (0 improved, 0 regressed), 5102 unchanged.
libraries.crossgen.windows.x64.checked.mch
Running asm diffs of C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries.crossgen.windows.x64.checked.mch
Asm diffs found
Creating dasm files: C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen.windows.x64.checked.2\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen.windows.x64.checked.2\diff
Differences found. To replay SuperPMI use:

set COMPlus_JitDisasm=*
set COMPlus_JitUnwindDump=*
set COMPlus_JitEHDump=*
set COMPlus_NgenDisasm=*
set COMPlus_NgenUnwindDump=*
set COMPlus_NgenEHDump=*
set COMPlus_JitDiffableDasm=1
set COMPlus_JitEnableNoWayAssert=1
set COMPlus_JitNoForceFallback=1
set COMPlus_JitDisasmWithGC=1
set COMPlus_JitGCDump=*
set COMPlus_NgenGCDump=*
C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\superpmi.exe  -c ### C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\clrjit.dll C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries.crossgen.windows.x64.checked.mch

Generated asm is located under C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen.windows.x64.checked.2\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen.windows.x64.checked.2\diff
Textual differences found in generated asm.
Invoking: C:\Users\tagoo\Source\repos\jitutils\bin\jit-analyze.bat -r --base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen.windows.x64.checked.2\base --diff C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen.windows.x64.checked.2\diff
Found 134 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 154266
Total bytes of diff: 154266
Total bytes of delta: 0 (0.00% of base)


0 total files with Code Size differences (0 improved, 0 regressed), 214 unchanged.

0 total methods with Code Size differences (0 improved, 0 regressed), 214 unchanged.
libraries.crossgen2.windows.x64.checked.mch
Running asm diffs of C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries.crossgen2.windows.x64.checked.mch
Asm diffs found
Creating dasm files: C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen2.windows.x64.checked.2\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen2.windows.x64.checked.2\diff
Differences found. To replay SuperPMI use:

set COMPlus_JitDisasm=*
set COMPlus_JitUnwindDump=*
set COMPlus_JitEHDump=*
set COMPlus_NgenDisasm=*
set COMPlus_NgenUnwindDump=*
set COMPlus_NgenEHDump=*
set COMPlus_JitDiffableDasm=1
set COMPlus_JitEnableNoWayAssert=1
set COMPlus_JitNoForceFallback=1
set COMPlus_JitDisasmWithGC=1
set COMPlus_JitGCDump=*
set COMPlus_NgenGCDump=*
C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\superpmi.exe  -c ### C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\clrjit.dll C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries.crossgen2.windows.x64.checked.mch

Generated asm is located under C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen2.windows.x64.checked.2\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen2.windows.x64.checked.2\diff
Textual differences found in generated asm.
Invoking: C:\Users\tagoo\Source\repos\jitutils\bin\jit-analyze.bat -r --base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen2.windows.x64.checked.2\base --diff C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.crossgen2.windows.x64.checked.2\diff
Found 138 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 159165
Total bytes of diff: 159165
Total bytes of delta: 0 (0.00% of base)


0 total files with Code Size differences (0 improved, 0 regressed), 224 unchanged.

0 total methods with Code Size differences (0 improved, 0 regressed), 224 unchanged.
libraries.pmi.windows.x64.checked.mch
Running asm diffs of C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries.pmi.windows.x64.checked.mch
Asm diffs found
Creating dasm files: C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.pmi.windows.x64.checked.2\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.pmi.windows.x64.checked.2\diff
Differences found. To replay SuperPMI use:

set COMPlus_JitDisasm=*
set COMPlus_JitUnwindDump=*
set COMPlus_JitEHDump=*
set COMPlus_NgenDisasm=*
set COMPlus_NgenUnwindDump=*
set COMPlus_NgenEHDump=*
set COMPlus_JitDiffableDasm=1
set COMPlus_JitEnableNoWayAssert=1
set COMPlus_JitNoForceFallback=1
set COMPlus_JitDisasmWithGC=1
set COMPlus_JitGCDump=*
set COMPlus_NgenGCDump=*
C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\superpmi.exe  -c ### C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\clrjit.dll C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries.pmi.windows.x64.checked.mch

Generated asm is located under C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.pmi.windows.x64.checked.2\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.pmi.windows.x64.checked.2\diff
Textual differences found in generated asm.
Invoking: C:\Users\tagoo\Source\repos\jitutils\bin\jit-analyze.bat -r --base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.pmi.windows.x64.checked.2\base --diff C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries.pmi.windows.x64.checked.2\diff
Found 547 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 682648
Total bytes of diff: 682648
Total bytes of delta: 0 (0.00% of base)


0 total files with Code Size differences (0 improved, 0 regressed), 606 unchanged.

0 total methods with Code Size differences (0 improved, 0 regressed), 606 unchanged.
libraries_tests.pmi.windows.x64.checked.mch
Running asm diffs of C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries_tests.pmi.windows.x64.checked.mch
Asm diffs found
Creating dasm files: C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries_tests.pmi.windows.x64.checked.2\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries_tests.pmi.windows.x64.checked.2\diff
Differences found. To replay SuperPMI use:

set COMPlus_JitDisasm=*
set COMPlus_JitUnwindDump=*
set COMPlus_JitEHDump=*
set COMPlus_NgenDisasm=*
set COMPlus_NgenUnwindDump=*
set COMPlus_NgenEHDump=*
set COMPlus_JitDiffableDasm=1
set COMPlus_JitEnableNoWayAssert=1
set COMPlus_JitNoForceFallback=1
set COMPlus_JitDisasmWithGC=1
set COMPlus_JitGCDump=*
set COMPlus_NgenGCDump=*
C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\superpmi.exe  -c ### C:\Users\tagoo\Source\repos\runtime2\artifacts\bin\coreclr\windows.x64.Checked\clrjit.dll C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\mch\1052f490-cad7-4610-99bb-6f2bd91a1d19.windows.x64\libraries_tests.pmi.windows.x64.checked.mch

Generated asm is located under C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries_tests.pmi.windows.x64.checked.2\base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries_tests.pmi.windows.x64.checked.2\diff
Textual differences found in generated asm.
Invoking: C:\Users\tagoo\Source\repos\jitutils\bin\jit-analyze.bat -r --base C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries_tests.pmi.windows.x64.checked.2\base --diff C:\Users\tagoo\Source\repos\runtime2\artifacts\spmi\asm.libraries_tests.pmi.windows.x64.checked.2\diff
Found 589 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 1024377
Total bytes of diff: 1024377
Total bytes of delta: 0 (0.00% of base)


0 total files with Code Size differences (0 improved, 0 regressed), 610 unchanged.

0 total methods with Code Size differences (0 improved, 0 regressed), 610 unchanged.

@tannergooding
Copy link
Member Author

/azp run runtime-coreclr outerloop, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@tannergooding
Copy link
Member Author

As far as the GC diffs go (which are reported as textual diffs), there are quite a number.

The ones I've gone through resemble those that @BruceForstall called out here: #53684 (comment)
Generally some byref (it often is a promoted Span<T>) becomes live now where it wasn't previously. It likewise explicitly dies now when zeroed out.

Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this to zero (generated code) diffs.

// movs Rd,i8 T1_J0 00100dddiiiiiiii 2000 low imm(0-255)
// mov{s} Rd,+i8<<i4 T2_L1 11110i00010S1111 0iiiddddiiiiiiii F04F 0000 imm(i8<<i4)
// mov{s} Rd,Rm T2_C3 1110101001011111 0000dddd0000mmmm EA5F 0000
INST5(mov_eliminated, "mov_eliminated", 0, 0, IF_EN5A, 0x0000, 0x4600, 0x2000, 0xF04F0000, 0xEA5F0000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably the encodings could all be BAD_CODE, and could even move this down to the INST1 section, but this is probably ok.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can log an issue to do this in a follow up. It would be good "low hanging fruit".

// mov Vd[],Rn DV_2C 01001110000iiiii 000111nnnnnddddd 4E00 1C00 Vd[],Rn (from general)
// mov Vd,Vn[] DV_2E 01011110000iiiii 000001nnnnnddddd 5E00 0400 Vd,Vn[] (scalar by element)
// mov Vd[],Vn[] DV_2F 01101110000iiiii 0jjjj1nnnnnddddd 6E00 0400 Vd[],Vn[] (from/to elem)
INST9(mov_eliminated, "mov_eliminated", 0, IF_EN9, 0x2A0003E0, 0x11000000, 0x52800000, 0x320003E0, 0x0EA01C00, 0x0E003C00, 0x4E001C00, 0x5E000400, 0x6E000400)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here as for instrsarm.h

@@ -77,6 +77,7 @@ INST4(xor, "xor", IUM_RW, 0x000030, 0x003080,
INST4(cmp, "cmp", IUM_RD, 0x000038, 0x003880, 0x00003A, 0x00003C, INS_FLAGS_WritesAllFlags)
INST4(test, "test", IUM_RD, 0x000084, 0x0000F6, 0x000084, 0x0000A8, INS_FLAGS_WritesAllFlags | INS_FLAGS_Resets_CF_OF_Flags) // AF = ?
INST4(mov, "mov", IUM_WR, 0x000088, 0x0000C6, 0x00008A, 0x0000B0, INS_FLAGS_None)
INST4(mov_eliminated, "mov_eliminated", IUM_WR, 0x000088, 0x0000C6, 0x00008A, 0x0000B0, INS_FLAGS_None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@tannergooding
Copy link
Member Author

@BruceForstall any concerns with me setting this to auto-merge after I resolve the merge conflict here?

@BruceForstall
Copy link
Member

@BruceForstall any concerns with me setting this to auto-merge after I resolve the merge conflict here?

Fine with me

@ghost
Copy link

ghost commented Jun 8, 2021

Hello @tannergooding!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@ghost ghost merged commit 7df92fd into dotnet:main Jun 9, 2021
@tannergooding tannergooding deleted the byref-liveness branch June 9, 2021 00:49
tannergooding added a commit to tannergooding/runtime that referenced this pull request Jun 9, 2021
ghost pushed a commit that referenced this pull request Jun 9, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jul 9, 2021
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT: emitter assert running PMI PMI diffs are failing due to assert
3 participants