
JIT: Slightly refactor handling of conditions/conditional branches in the backend #82020

Merged: 26 commits into dotnet:main from jakobbotsch:cpu-flag-nodes-backend, Feb 16, 2023

Conversation

@jakobbotsch (Member) commented Feb 13, 2023

Slightly refactor how compares are represented and processed in the backend, in preparation for lifting the remaining cmov restrictions.

  • Change all compare nodes to always produce values and never set flags. That is, GT_EQ/GT_NE/GT_LT/GT_LE/GT_GE/GT_GT/GT_TEST_EQ/GT_TEST_NE are now all "normal" nodes, always.

  • Stop optimizing compares by looking for a user JTRUE node. Instead, we optimize the compare when we see it (for example, EQ/NE(AND(x, y), 0) is turned into GT_TEST_EQ/GT_TEST_NE(x, y)), but we do not change the user JTRUE node to a JCC node. We now never create SETCC nodes when we see the compare, and we always turn JTRUE nodes into JCC nodes with a proper condition.

  • Handle the transformation from JTRUE to JCC when we actually get to the JTRUE (a toy sketch of this staged flow follows this list). The transformation that now happens is:

    1. The compare node is turned into a GT_CMP (for GT_EQ through GT_GT) or a GT_TEST node (for GT_TEST_EQ/GT_TEST_NE). These nodes are always void-typed and do not produce values; they only set the CPU flags, and correspond directly to low-level target instructions (like cmp/test). GT_CMP already existed; GT_TEST is added as part of this PR.
    2. The JTRUE is turned into JCC with the condition from the compare.

    For arm64/xarch, we handle optimizing EQ/NE(node_that_sets_zero_flag, 0) at this stage too now.

  • Introduce new xarch backend-only GT_BITTEST_EQ/GT_BITTEST_NE that semantically correspond to (x & (1 << y)) != 0. The corresponding "set flags" node is GT_BT and already exists. We would previously transform (x & (1 << y)) != 0 into GT_BT + GT_SETCC (when the user was not a JTRUE). This is now transformed into GT_BITTEST_EQ/GT_BITTEST_NE instead, which are completely normal nodes. The processing of JTRUE handles the transformation into BT + JCC.

  • Change some of the emitter compare peepholes to work with this scheme. This requires some lookahead when we see them, but allows us to get rid of an "ugly" flag.

  • Allow liveness to DCE GT_CMP/GT_TEST when they have had GTF_SET_FLAGS cleared (due to flowgraph optimizations). We already allowed this for GT_EQ, so not allowing it for the new nodes showed up as regressions.
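To make the staged flow concrete, here is a minimal toy model of the JTRUE lowering described above. It is illustrative only: the shapes mirror the description, not the JIT's real GenTree/LowerJTrue API.

    // Toy model of the staged transformation: the relop stays a normal
    // value-producing node until its JTRUE user is processed; only then is
    // it turned into a void, flags-only CMP/TEST, and the JTRUE into a JCC.
    #include <cassert>
    #include <cstdio>

    enum Oper { EQ, NE, LT, LE, GE, GT, TEST_EQ, TEST_NE, CMP, TEST, JTRUE, JCC };

    struct Node
    {
        Oper  oper;
        Node* op1      = nullptr;
        bool  isVoid   = false; // models TYP_VOID
        bool  setFlags = false; // models GTF_SET_FLAGS
        Oper  cond     = EQ;    // the condition a JCC carries
    };

    void LowerJTrueToJcc(Node* jtrue)
    {
        assert(jtrue->oper == JTRUE);
        Node* relop = jtrue->op1;
        // 1. Remember the condition, then turn the relop into a flags-only
        //    node: TEST for TEST_EQ/TEST_NE, CMP for the EQ..GT family.
        jtrue->cond     = relop->oper;
        relop->oper     = (relop->oper == TEST_EQ || relop->oper == TEST_NE) ? TEST : CMP;
        relop->isVoid   = true;
        relop->setFlags = true;
        // 2. Turn the JTRUE into a JCC that consumes the flags.
        jtrue->oper = JCC;
    }

    int main()
    {
        Node ge{GE};
        Node jtrue{JTRUE, &ge};
        LowerJTrueToJcc(&jtrue);
        std::printf("%s + JCC(cond=%d)\n", ge.oper == CMP ? "CMP" : "TEST", (int)jtrue.cond);
    }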

What I was after is moving all handling of the conditions to when we process the JTRUE node, in anticipation of doing the same for GT_SELECT. For GT_SELECT we are going to have to reorder the condition with the true/false values, and handling some of these transformations as part of the compare seemed dangerous.

Some example LIR diffs:

  Generating: N033 (???,???) [000136] -----------                            IL_OFFSET void   INLRT @ 0x00A[E-] REG NA
  Generating: N035 (  1,  1) [000009] -----------                    t9 =    LCL_VAR   int    V02 loc1         u:1 ecx REG ecx $141
  Generating: N037 (  1,  1) [000011] -c---------                   t11 =    CNS_INT   int    16 REG NA $42
                                                                          ┌──▌  t9     int    
                                                                          ├──▌  t11    int    
- Generating: N039 (  3,  3) [000012] N------N-U-                         ▌  GE        void   REG NA $142
+ Generating: N039 (  3,  3) [000012] -------N-U-                         ▌  CMP       void   REG NA
  IN0005:        cmp      ecx, 16
- Generating: N041 (  5,  5) [000013] -----------                         ▌  JTRUE     void   REG NA $VN.Void
+ Generating: N041 (???,???) [000144] -----------                            JCC       void   cond=UGE REG NA
  IN0006:        jae      L_M20112_BB03
                                                                          ┌──▌  t112   ubyte  
                                                                          ├──▌  t54    int    
- Generating: N061 ( 22, 14) [000060] ---XGO-----                         ▌  BT        void   REG NA
+ Generating: N061 ( 22, 14) [000060] ---XGO-----                   t60 = ▌  BITTEST_NE int    REG eax
  IN0009:        bt       eax, esi
- Generating: N063 (???,???) [000144] -----------                  t144 =    SETCC     int    cond=C REG eax
  IN000a:        setb     al
  IN000b:        movzx    eax, al
-                                                                         ┌──▌  t144   int    
+                                                                         ┌──▌  t60   int    
  Generating: N065 ( 23, 15) [000061] ---XGO-----                         ▌  RETURN    int    REG NA <l:$1c8, c:$1c7>

@ghost ghost assigned jakobbotsch Feb 13, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 13, 2023
@ghost commented Feb 13, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.


@jakobbotsch (Member, Author):

Looks like I need to update a few peepholes, and there's an arm32 issue.

@jakobbotsch jakobbotsch marked this pull request as draft February 13, 2023 12:49
@jakobbotsch (Member, Author):

Some diffs are because we treat GT_BITTEST nodes like other compares now, so we get some of the peepholes from there for free:

@@ -18,15 +18,15 @@
 G_M16052_IG01:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, nogc <-- Prolog IG
 						;; size=0 bbWeight=1 PerfScore 0.00
 G_M16052_IG02:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
+       xor      eax, eax
        bt       rcx, rdx
        setb     al
-       movzx    rax, al
-						;; size=10 bbWeight=1 PerfScore 1.75
+						;; size=9 bbWeight=1 PerfScore 1.75
 G_M16052_IG03:        ; bbWeight=1, epilog, nogc, extend
        ret      
 						;; size=1 bbWeight=1 PerfScore 1.00
 
-; Total bytes of code 11, prolog size 0, PerfScore 3.85, instruction count 4, allocated bytes for code 11 (MethodHash=2849c14b) for method Microsoft.CodeAnalysis.BitVector:IsTrue(ulong,int):bool
+; Total bytes of code 10, prolog size 0, PerfScore 3.75, instruction count 4, allocated bytes for code 10 (MethodHash=2849c14b) for method Microsoft.CodeAnalysis.BitVector:IsTrue(ulong,int):bool
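For reference, the method in this diff is a plain bit test; a C++ rendering of its shape is below (the original is C#). The peephole places the xor ahead of the bt because xor eax, eax itself writes the flags: it must not run between the bt and the setb that consumes the carry flag, and zeroing the full register up front makes the trailing movzx unnecessary.

    // (x & (1 << y)) != 0: becomes GT_BITTEST_NE in the backend and
    // bt/setb at the machine level.
    bool IsTrue(unsigned long long word, int bit)
    {
        return (word & (1ULL << bit)) != 0;
    }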

@jakobbotsch (Member, Author):

Some other diffs are because the new "staged" approach composes better, so one optimization can now enable another:

@@ -29,14 +29,9 @@ G_M26981_IG05:        ; bbWeight=0.50, gcVars=0000000000000000 {}, gcrefRegs=000
        blsr     eax, ecx
        setne    al
        movzx    rax, al
-       mov      edx, 1
-       xor      r8d, r8d
-       test     eax, eax
-       cmove    edx, r8d
-       mov      eax, edx
        cmp      ecx, 1
        je       SHORT G_M26981_IG07
-						;; size=32 bbWeight=0.50 PerfScore 2.12
+						;; size=16 bbWeight=0.50 PerfScore 1.50
 G_M26981_IG06:        ; bbWeight=4, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref, isz
        inc      eax
        sar      ecx, 1
@@ -47,7 +42,7 @@ G_M26981_IG07:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byr
        ret      
 						;; size=1 bbWeight=0.50 PerfScore 0.50
 
-; Total bytes of code 49, prolog size 0, PerfScore 17.40, instruction count 19, allocated bytes for code 49 (MethodHash=e65b969a) for method ILCompiler.Reflection.ReadyToRun.GcInfoTypes:CeilOfLog2(int):uint
+; Total bytes of code 33, prolog size 0, PerfScore 15.18, instruction count 14, allocated bytes for code 33 (MethodHash=e65b969a) for method ILCompiler.Reflection.ReadyToRun.GcInfoTypes:CeilOfLog2(int):uint

We have the following LIR:

               [000073] -----------                            IL_OFFSET void   INLRT @ 0x00C[E-]
N001 (  1,  1) [000004] -----------                    t4 =    LCL_VAR   int    V00 arg0         u:1 $80
N002 (  1,  1) [000005] -----------                    t5 =    LCL_VAR   int    V00 arg0         u:1 $80
N003 (  1,  1) [000006] -----------                    t6 =    CNS_INT   int    -1 $41
                                                            ┌──▌  t5     int    
                                                            ├──▌  t6     int    
N004 (  3,  3) [000007] -----------                    t7 = ▌  ADD       int    $101
                                                            ┌──▌  t4     int    
                                                            ├──▌  t7     int    
N005 (  5,  5) [000008] -----------                    t8 = ▌  AND       int    $102
N006 (  1,  1) [000009] -----------                    t9 =    CNS_INT   int    0 $40
                                                            ┌──▌  t8     int    
                                                            ├──▌  t9     int    
N007 (  7,  7) [000010] J------N---                   t10 = ▌  NE        int    $103
N008 (  1,  1) [000012] -----------                   t12 =    CNS_INT   int    1 $42
N009 (  1,  1) [000035] -----------                   t35 =    CNS_INT   int    0 $40
                                                            ┌──▌  t10    int    
                                                            ├──▌  t12    int    
                                                            ├──▌  t35    int    
N010 ( 10, 10) [000067] -----------                   t67 = ▌  SELECT    int   
                                                            ┌──▌  t67    int    

We first transform the AND(V00, SUB(V00, 1)) into HWINTRINSIC(ResetLowestSetBit). Previously we then lowered NE(HWINTRINSIC(ResetLowestSetBit), 0) into HWINTRINSIC + SETCC since this intrinsic sets ZF. We then ended up with SELECT(SETCC(HWINTRINSIC(ResetLowestSetBit)), 0, 1), which the SELECT transformation I added recently does not handle.

With this PR we instead end up with SELECT(NE(HWINTRINSIC(ResetLowestSetBit), 0), 1, 0), which is handled and turned into just NE(HWINTRINSIC(ResetLowestSetBit), 0). There is an emitter peephole that then omits generating a compare for the final NE, since it knows the HW intrinsic already set the zero flag.
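A toy sketch of the fold that now fires (illustrative shapes only, not the JIT's real API): SELECT(relop, 1, 0) folds to the relop itself, because relops are now ordinary nodes producing exactly 0 or 1. Under the old scheme the condition had already become a SETCC with the 0/1 operands swapped, so the pattern never matched.

    #include <cstdio>

    struct Node
    {
        bool  isRelop = false;
        bool  isCon   = false;
        int   conVal  = 0;
        Node* op[3]   = {}; // condition, true value, false value for a SELECT
    };

    static bool IsIntCon(Node* n, int v) { return n->isCon && n->conVal == v; }

    // SELECT(cond, 1, 0) where cond is a value-producing relop folds to cond.
    static Node* TryFoldSelectOfRelop(Node* sel)
    {
        Node* cond = sel->op[0];
        if (cond->isRelop && IsIntCon(sel->op[1], 1) && IsIntCon(sel->op[2], 0))
        {
            return cond; // relops produce exactly 0 or 1, so the SELECT is redundant
        }
        return sel;
    }

    int main()
    {
        Node ne{true};
        Node one, zero;
        one.isCon  = true; one.conVal  = 1;
        zero.isCon = true; zero.conVal = 0;
        Node sel;
        sel.op[0] = &ne; sel.op[1] = &one; sel.op[2] = &zero;
        std::printf("folded to relop: %s\n", TryFoldSelectOfRelop(&sel) == &ne ? "yes" : "no");
    }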

@jakobbotsch (Member, Author) commented Feb 13, 2023:

Some of the other improvements are because we now support dead code elimination for the arithmetic + JCC pairs that we emit in some cases. In those cases, even before this change, we were emitting:

N023 ( 13, 13) [000087] ----GO-----                   t87 =XOR       int    <l:$500, c:$501>
N026 ( 17, 17) [000090] ----GO-----                            JCC       void   cond=UNE

We did not handle this in fgOptimizeBranchToNext before (we only handled JTRUE with an operand that sets the flags), so we would remove the JCC but leave the XOR with GTF_SET_FLAGS set so that it could not be DCE'd.
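In toy form, the fix looks like this (illustrative; not the actual fgOptimizeBranchToNext code). Removing a JCC that branches to the fall-through block must also clear GTF_SET_FLAGS on the node that produced the flags; otherwise liveness still sees a side effect and cannot remove the now-useless arithmetic.

    #include <cstdio>

    struct Node
    {
        bool setFlags = false; // models GTF_SET_FLAGS
        bool used     = false; // is the value consumed by anything else?
    };

    // Toy model: drop a JCC whose target is the next block and un-pin its
    // flags producer so a later liveness pass can DCE it.
    static void OptimizeBranchToNext(Node* flagsProducer, bool targetIsNextBlock)
    {
        if (!targetIsNextBlock)
        {
            return;
        }
        // The JCC is removed; the XOR (etc.) only stayed alive to feed it.
        flagsProducer->setFlags = false;
    }

    static bool CanDce(const Node& n) { return !n.setFlags && !n.used; }

    int main()
    {
        Node xorNode;
        xorNode.setFlags = true;
        OptimizeBranchToNext(&xorNode, /* targetIsNextBlock */ true);
        std::printf("XOR can now be DCE'd: %s\n", CanDce(xorNode) ? "yes" : "no");
    }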

jakobbotsch added a commit to jakobbotsch/runtime that referenced this pull request Feb 13, 2023
…o be tracked

Otherwise we may track it after lowering and invalidate the assumptions
made by the optimization.

Fix dotnet#82020
@dotnet dotnet deleted a comment from azure-pipelines bot Feb 13, 2023
@dotnet dotnet deleted a comment from azure-pipelines bot Feb 13, 2023
Comment on lines +6686 to +6689
// The emitter's general logic handles op1/op2 for bt reversed. As a
// small hack we reverse it in codegen instead of special casing the
// emitter throughout.
std::swap(op1, op2);
@jakobbotsch (Member, Author):

This predates this PR; the old genCodeForBT had this handling too.

Comment on lines -511 to -512
GTF_RELOP_SJUMP_OPT = 0x04000000, // GT_<relop> -- Swap signed jl/jge with js/jns during emitter, reuses flags
// from previous instruction.
@jakobbotsch (Member, Author):

We now handle this peephole fully when we see the flag-setting instruction, so we don't need this flag to pass the information along.
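Roughly, the peephole now works by looking ahead from the flag-setting instruction instead of having the relop carry GTF_RELOP_SJUMP_OPT forward; a toy sketch of the decision (illustrative, not the emitter's real code):

    #include <cstdio>
    #include <cstring>

    // If the previous instruction already set the flags for `reg` and is one
    // that clears OF (and/or/xor/test do), then "cmp reg, 0; jl/jge" can be
    // emitted as js/jns with the cmp dropped: jl tests SF != OF, which with
    // OF known to be 0 is just SF, i.e. js.
    static const char* PeepSignedBranch(const char* lastDefReg, const char* cmpReg, const char* cc)
    {
        if (std::strcmp(lastDefReg, cmpReg) == 0)
        {
            if (std::strcmp(cc, "jl") == 0)  return "js";  // negative
            if (std::strcmp(cc, "jge") == 0) return "jns"; // non-negative
        }
        return cc; // no match: keep the cmp and the original jcc
    }

    int main()
    {
        std::printf("%s\n", PeepSignedBranch("eax", "eax", "jl")); // prints "js"
    }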

// Arguments:
// other - The range to move from.
//
LIR::ReadOnlyRange& LIR::ReadOnlyRange::operator=(ReadOnlyRange&& other)
@jakobbotsch (Member, Author):
We already have the move constructor on LIR::ReadOnlyRange and I needed this version to write some logic cleanly.
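For reference, a plausible implementation sketch, assuming ReadOnlyRange is the simple [m_firstNode, m_lastNode] pair that its move constructor suggests (the real code may differ, e.g. in debug-only validation):

    // Move-assignment: steal the endpoints and leave `other` empty,
    // mirroring the existing move constructor.
    LIR::ReadOnlyRange& LIR::ReadOnlyRange::operator=(ReadOnlyRange&& other)
    {
        m_firstNode = other.m_firstNode;
        m_lastNode  = other.m_lastNode;

        other.m_firstNode = nullptr;
        other.m_lastNode  = nullptr;

        return *this;
    }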

@@ -2122,7 +2122,6 @@ void Compiler::fgComputeLifeLIR(VARSET_TP& life, BasicBlock* block, VARSET_VALAR
case GT_STORE_BLK:
case GT_STORE_DYN_BLK:
case GT_JCMP:
case GT_CMP:
@jakobbotsch (Member, Author) commented Feb 13, 2023:

This is what now allows us to do DCE for GT_CMP nodes if they don't have GTF_SET_FLAGS set. Even before this PR we allowed that for the GT_EQ family of nodes, which were exactly the same as GT_CMP when void typed with GTF_SET_FLAGS set.
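In toy form, the rule this enables (illustrative; the real logic lives in fgComputeLifeLIR's switch): a flags-only compare is removable by liveness exactly when nothing consumes its flags anymore.

    #include <cstdio>

    enum Oper { GT_CMP, GT_TEST, OTHER };
    const unsigned GTF_SET_FLAGS = 0x1;

    struct Node
    {
        Oper     oper;
        unsigned flags;
    };

    // A GT_CMP/GT_TEST whose GTF_SET_FLAGS has been cleared (e.g. after its
    // consuming JCC was optimized away) produces no value and has no
    // remaining side effect, so liveness may delete it.
    static bool IsRemovableFlagsNode(const Node& n)
    {
        return (n.oper == GT_CMP || n.oper == GT_TEST) && ((n.flags & GTF_SET_FLAGS) == 0);
    }

    int main()
    {
        Node deadCmp{GT_CMP, 0};
        std::printf("removable: %s\n", IsRemovableFlagsNode(deadCmp) ? "yes" : "no");
    }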

@@ -1673,7 +1673,7 @@ int LinearScan::ComputeOperandDstCount(GenTree* operand)
        // Stores and void-typed operands may be encountered when processing call nodes, which contain
        // pointers to argument setup stores.
        assert(operand->OperIsStore() || operand->OperIsBlkOp() || operand->OperIsPutArgStk() ||
-              operand->OperIsCompare() || operand->OperIs(GT_CMP) || operand->TypeGet() == TYP_VOID);
+              operand->TypeIs(TYP_VOID));
@jakobbotsch (Member, Author) commented Feb 13, 2023:

This OperIs(GT_CMP) was seemingly only here because of code in LowerModPow2 that created a non-void typed GT_CMP, but GT_CMP does not produce a value so that wasn't expected.
The OperIsCompare() check here is unnecessary as those nodes always produce a value now.

@jakobbotsch (Member, Author):

/azp run runtime-coreclr libraries-jitstress

@azure-pipelines:

Azure Pipelines successfully started running 1 pipeline(s).

relop->gtType = TYP_VOID;
relop->gtFlags |= GTF_SET_FLAGS;

if (relop->OperIs(GT_EQ, GT_NE, GT_LT, GT_LE, GT_GE, GT_GT))
Contributor:

With this code, a JTRUE(LT) is always turned into a JCC(CMP) with the JCC containing a condition code within gtCondition. In codegen for JCC, we don't need to check the child for the condition. I'm happy with that.

In my and-chains patch, I let optimise bools create a chain of the form NE(AND(LT,EQ),0). LowerConstCompare() turns it into TEST_NE(LT,EQ), and then the new function lowerTestCompare() turns it into CCMP_LT(EQ). A JTRUE node is allowed to have either a standard compare child or a CCMP child, e.g. JTRUE(CCMP_LT(EQ)).

When combining both worlds, the lowering stage gets a little complicated, but everything else feels simple:

The CCMP_LT has a gtCondition, plus there is also a CCMP node. lowerTestCompare() turns TEST_NE(LT,EQ) into CCMP_LT(CMP). Then LowerJTrue() would turn it into CCMP(CMP), giving us JCC(CCMP(CMP)).

@jakobbotsch (Member, Author):

Right. After this PR the plan is that we then introduce GT_SELECTCC which is to GT_SELECT as GT_JCC is to GT_JTRUE (with the additional caveat that we never see GT_JTRUE in codegen, because we have an invariant that there is nothing between the relop and GT_JTRUE).

When lowering GT_SELECT and GT_JTRUE we will invoke a common function to transform compare chains into CMP + CCMP + ... , followed up by the transformation into GT_SELECTCC and GT_JCC. For GT_SELECT this function also needs to verify that reordering the compare to happen before GT_SELECT is safe (if it isn't we'll leave GT_SELECT as GT_SELECT instead of transforming into GT_SELECTCC). This should put everything on the same plan and I think it will unify the last of the handling.

@jakobbotsch (Member, Author):

FWIW, it sounds like CCMP_LT would not be necessary with generic CCMP. I.e. we can transform TEST_NE(LT,EQ) into SETCC(CCMP(CMP(x, y))) and teach LowerJTrue to pass through SETCC nodes.
I actually considered using this approach instead of having GT_TEST_NE/GT_TEST_EQ/GT_BITTEST_NE/GT_BITTEST_EQ but it creates an additional node and will require some bespoke interference logic for GT_SELECT (but I think that logic will be necessary on arm64 in that case regardless).
I can try to see what it would look like in a follow-up.

Contributor:

> we can transform TEST_NE(LT,EQ) into SETCC(CCMP(CMP(x, y)))

That's better. Reduces the number of node types.

> When lowering GT_SELECT and GT_JTRUE we will invoke a common function to transform compare chains into CMP + CCMP + ...

If the chain parsing is also done at the CMP and AND nodes, then we should be able to check only the parent and child at each stage, so there is probably no need to recursively go through the chain.

Contributor:

So, my question then is what to do with my and-chains patch? I should wait until this patch is merged before continuing with it. Should I also wait for GT_SELECTCC or any other patches? I don't think I need to; they should be mostly independent, but they will create merge conflicts.

@jakobbotsch (Member, Author):

I think waiting for this one is a good idea; I hope we can merge it by today or tomorrow. I'm not sure how much it will conflict with your PR.
I don't think you need to wait for GT_SELECTCC either. I will probably move the xarch backend to GT_SELECTCC as the first step and leave arm64 alone, so they shouldn't conflict.

Contributor:

> For GT_SELECT this function also needs to verify that reordering the compare to happen before GT_SELECT is safe (if it isn't we'll leave GT_SELECT as GT_SELECT instead of transforming into GT_SELECTCC). This should put everything on the same plan and I think it will unify the last of the handling.

This will drop all the containment within SELECT? I.e. SELECT will always take its condition from a register, but SELECTCC will require the previous node to be a compare (like JCC) and will always rely on the condition flags being set.

@jakobbotsch (Member, Author):

Yes, exactly.

@jakobbotsch (Member, Author) commented Feb 14, 2023:

cc @dotnet/jit-contrib PTAL @BruceForstall

Diffs

This turned out to be a bit more than a 'slight' refactoring, but I think it's an overall simplification of how conditions are handled, and it should allow more principled handling for SELECT and for the work that @a74nh is doing in the future.

Note that I ran outerloop, jitstress, libraries-jitstress and Fuzzlyn on this (I deleted some of the comments, if you're wondering where I did that).

@BruceForstall (Member) left a comment:

LGTM. Looks like a nice simplification.

@jakobbotsch jakobbotsch merged commit f2f734a into dotnet:main Feb 16, 2023
@jakobbotsch jakobbotsch deleted the cpu-flag-nodes-backend branch February 16, 2023 09:36
@jakobbotsch (Member, Author):

cc @shushanhf -- the LA backend might need some fixups for this.

@shushanhf (Contributor):

> cc @shushanhf -- the LA backend might need some fixups for this.

Thanks very much.
I will do it for LA.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2023