JIT ARM64-SVE: Add AW_2A to AZ_2A, BM_1A, BN_1A #99211

amanasifkhalid · 2024-03-03T21:30:30Z

Part of #94549. Adds the following encodings:

SVE_AW_2A
SVE_AX_1A
SVE_AY_2A
SVE_AZ_2A
SVE_BM_1A
SVE_BN_1A

cstool output:

xar   z0.b, z0.b, z1.b, #1
xar   z2.b, z2.b, z3.b, #8
xar   z4.h, z4.h, z5.h, #2
xar   z6.h, z6.h, z7.h, #16
xar   z8.s, z8.s, z9.s, #3
xar   z10.s, z10.s, z11.s, #32
xar   z12.d, z12.d, z13.d, #4
xar   z14.d, z14.d, z15.d, #64
index z0.b, #-0x10, #0xF
index z1.h, #0xF, #-0x10
index z2.s, #0, #0
index z3.d, #-5, #5
index z0.b, #-0x10, w0
index z1.h, #0, w1
index z2.s, #5, w2
index z3.d, #10, x3
index z4.b, #-0x10, wzr
index z5.d, #15, xzr
index z0.b, w0, #-0x10
index z1.h, w1, #0
index z2.s, w2, #5
index z3.d, x3, #10
index z4.b, wzr, #-0x10
index z5.d, xzr, #15
decb  x0, pow2
decd  x1, vl16, mul #3
dech  x2, vl32, mul #5
decw  x3, vl64, mul #7
incb  x4, vl128, mul #9
incd  x5, mul3, mul #10
inch  x6, mul4, mul #13
incw  x7, all, mul #16
decd  z0.d, pow2
dech  z1.h, vl2, mul #2
decw  z2.s, vl3, mul #4
incd  z3.d, vl4, mul #8
inch  z4.h, vl5, mul #12
incw  z5.s, vl6, mul #16

JitDisasm output:

xar     z0.b, z0.b, z1.b, #1
xar     z2.b, z2.b, z3.b, #8
xar     z4.h, z4.h, z5.h, #2
xar     z6.h, z6.h, z7.h, #16
xar     z8.s, z8.s, z9.s, #3
xar     z10.s, z10.s, z11.s, #32
xar     z12.d, z12.d, z13.d, #4
xar     z14.d, z14.d, z15.d, #64
index   z0.b, #-16, #15
index   z1.h, #15, #-16
index   z2.s, #0, #0
index   z3.d, #-5, #5
index   z0.b, #-16, w0
index   z1.h, #0, w1
index   z2.s, #5, w2
index   z3.d, #10, x3
index   z4.b, #-16, wzr
index   z5.d, #15, xzr
index   z0.b, w0, #-16
index   z1.h, w1, #0
index   z2.s, w2, #5
index   z3.d, x3, #10
index   z4.b, wzr, #-16
index   z5.d, xzr, #15
decb    x0, pow2
decd    x1, vl16, mul #3
dech    x2, vl32, mul #5
decw    x3, vl64, mul #7
incb    x4, vl128, mul #9
incd    x5, mul3, mul #10
inch    x6, mul4, mul #13
incw    x7, all, mul #16
decd    z0.d, pow2
dech    z1.h, vl2, mul #2
decw    z2.s, vl3, mul #4
incd    z3.d, vl4, mul #8
inch    z4.h, vl5, mul #12
incw    z5.s, vl6, mul #16
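
The two listings agree; they differ only in how immediates are printed (cstool uses hex for the `index` operands, JitDisasm uses decimal). The `index` test values -16 and 15 are the limits of a signed 5-bit field. A minimal sketch of that range check and its two's-complement encoding follows; the helper names are hypothetical and are not taken from the JIT sources.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the JIT): does the value fit in a signed 5-bit field?
constexpr bool fitsSimm5(int64_t imm)
{
    return (imm >= -16) && (imm <= 15);
}

// Hypothetical helper: two's-complement encoding of a signed 5-bit immediate.
inline uint32_t encodeSimm5(int64_t imm)
{
    assert(fitsSimm5(imm));
    return static_cast<uint32_t>(imm) & 0x1F;
}

// Hypothetical helper: sign-extend a 5-bit field back to the value a disassembler prints.
inline int64_t decodeSimm5(uint32_t field)
{
    const int64_t value = field & 0x1F;
    return (value & 0x10) ? (value - 32) : value;
}
```

With these helpers, `#-0x10` and `#-16` both encode to the field value 0x10, and `#0xF` and `#15` both encode to 0x0F.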

cc @dotnet/arm64-contrib

ghost · 2024-03-03T21:30:40Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	amanasifkhalid
Assignees:	amanasifkhalid
Labels:	`area-CodeGen-coreclr`
Milestone:	-

TIHan · 2024-03-03T22:14:32Z

src/coreclr/jit/emitarm64.cpp

@@ -9300,6 +9378,7 @@ void emitter::emitIns_R_I_I(instruction ins,

    id->idIns(ins);
    id->idInsFmt(fmt);
+    id->idInsOpt(opt);


Interesting that we weren't setting the opt before. Makes sense that we need to. I assume it doesn't cause any problems for the existing instructions.

Yeah, the existing instructions don't seem to check opt anywhere else -- not even to assert that it's INS_OPTS_NONE. That's the default value in emitIns_R_I_I, so I imagine always initializing idInsOpt is fine; I didn't notice any issues.

TIHan · 2024-03-03T22:17:24Z

src/coreclr/jit/emitarm64.cpp

@@ -24094,8 +24345,9 @@ BYTE* emitter::emitOutput_InstrSve(BYTE* dst, instrDesc* id)
            dst += emitOutput_Instr(dst, code);
            break;

-        // Immediate and patterm to general purpose.


"patterm" xD

TIHan · 2024-03-03T22:23:34Z

src/coreclr/jit/emitarm64.cpp

@@ -24628,7 +24935,7 @@ BYTE* emitter::emitOutput_InstrSve(BYTE* dst, instrDesc* id)
            code = emitInsCodeSve(ins, fmt);
            code |= insEncodeReg_V_4_to_0(id->idReg1());                                           // ddddd
            code |= insEncodeReg_V_9_to_5(id->idReg2());                                           // nnnnn
-            code |= insEncodeSveElemsize_tszh_22_tszl_20_to_19(optGetSveElemsize(id->idInsOpt())); // xx
+            code |= insEncodeSveElemsize_tszh_23_tszl_20_to_19(optGetSveElemsize(id->idInsOpt())); // xx


The existing uses of insEncodeSveElemsize_tszh_22_tszl_20_to_19 only assumed 1/2/4-byte sizes, not 8, so we should probably put an assert before each of them, e.g. assert(optGetSveElemsize(id->idInsOpt()) != EA_8BYTE). Alternatively, we could have two functions for tszh:tszl: one that allows 1/2/4-byte sizes and another that allows 1/2/4/8-byte sizes.

I prefer the assert approach; I'll add those in.

+1 for asserts. The size restriction is (usually) a property of the instruction operation, whereas the helper function only cares about bit encodings.
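
The split agreed on in this thread (a helper that only maps sizes to bits, plus a caller-side assert for instructions without an 8-byte form) could be sketched as below. This is an illustration only: the type and function names are hypothetical, and the bit positions (a one-hot size marker with tszh in bits 23-22 and tszl in bits 20-19) are assumed from the helper's name rather than checked against the JIT sources.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the JIT's element sizes, in bytes.
enum ElemSize : uint32_t
{
    Size1 = 1,
    Size2 = 2,
    Size4 = 4,
    Size8 = 8
};

// Hypothetical encoder: cares only about bit positions and accepts every size.
// Assumed layout: one-hot marker, tszh in bits 23-22, tszl in bits 20-19.
inline uint32_t encodeTszhTszl(ElemSize size)
{
    switch (size)
    {
        case Size1: return 1u << 19; // tszl = 01
        case Size2: return 1u << 20; // tszl = 10
        case Size4: return 1u << 22; // tszh = 01
        case Size8: return 1u << 23; // tszh = 10
    }
    assert(!"unexpected element size");
    return 0;
}

// Hypothetical caller for an instruction that has no 8-byte form:
// the restriction lives here, next to the instruction, not in the encoder.
inline uint32_t encodeNarrowOnly(ElemSize size)
{
    assert(size != Size8);
    return encodeTszhTszl(size);
}
```

Keeping the assert at the call site means the encoder stays reusable for instructions such as `xar` that do accept `.d` elements.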

TIHan · 2024-03-03T22:24:01Z

src/coreclr/jit/emitarm64.cpp

@@ -24638,7 +24945,7 @@ BYTE* emitter::emitOutput_InstrSve(BYTE* dst, instrDesc* id)
            code |= insEncodeReg_V_4_to_0(id->idReg1());                                           // ddddd
            code |= insEncodeReg_V_9_to_5(id->idReg2());                                           // nnnnn
            code |= insEncodeUimm5_20_to_16(emitGetInsSC(id));                                     // iii
-            code |= insEncodeSveElemsize_tszh_22_tszl_20_to_19(optGetSveElemsize(id->idInsOpt())); // xx
+            code |= insEncodeSveElemsize_tszh_23_tszl_20_to_19(optGetSveElemsize(id->idInsOpt())); // xx


Same as comment above regarding tszh:tszl.

TIHan · 2024-03-03T22:24:08Z

src/coreclr/jit/emitarm64.cpp

@@ -24648,7 +24955,7 @@ BYTE* emitter::emitOutput_InstrSve(BYTE* dst, instrDesc* id)
            code |= insEncodeReg_V_4_to_0(id->idReg1());                                           // ddddd
            code |= insEncodeReg_V_9_to_5(id->idReg2());                                           // nnnnn
            code |= insEncodeUimm5_20_to_16(insGetImmDiff(emitGetInsSC(id), id->idInsOpt()));      // iii
-            code |= insEncodeSveElemsize_tszh_22_tszl_20_to_19(optGetSveElemsize(id->idInsOpt())); // xx
+            code |= insEncodeSveElemsize_tszh_23_tszl_20_to_19(optGetSveElemsize(id->idInsOpt())); // xx


Same as comment above regarding tszh:tszl.

TIHan · 2024-03-03T22:32:19Z

src/coreclr/jit/emitarm64.cpp

@@ -21738,6 +21922,10 @@ void emitter::emitIns_Call(EmitCallType          callType,
            assert(isValidUimm5From1(imm));
            return (32 - imm);

+        case INS_OPTS_SCALABLE_D:


Do the existing uses of insGetImmDiff assume only B, H, S? If so, we should either put asserts before calling insGetImmDiff to ensure that we don't accidentally pass D, or just have two different functions. This is similar to my suggestion for insEncodeSveElemsize_tszh_22_tszl_20_to_19.
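
The hunk above shows the 32-bit case returning `32 - imm` after a range check. A rough sketch of the same idea generalized over element width, with a caller-side guard for the B/H/S-only uses, might look like this. The names are hypothetical, and it assumes the new D case follows the same "element bits minus immediate" pattern, which the hunk does not show.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: element width in bits minus the immediate,
// for immediates in the range 1..elemBits.
inline int64_t immDiff(int64_t imm, unsigned elemBits)
{
    assert(elemBits == 8 || elemBits == 16 || elemBits == 32 || elemBits == 64);
    assert(imm >= 1 && imm <= static_cast<int64_t>(elemBits));
    return static_cast<int64_t>(elemBits) - imm;
}

// Hypothetical wrapper for call sites that must never see 64-bit (D) elements.
inline int64_t immDiffNarrowOnly(int64_t imm, unsigned elemBits)
{
    assert(elemBits != 64);
    return immDiff(imm, elemBits);
}
```

Under these assumptions the `xar` test cases at the top of each range (for example `#64` on `.d` elements) map to a difference of 0, and `#1` on `.b` elements maps to 7.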

TIHan

LGTM, just some suggestions for insGetImmDiff and insEncodeSveElemsize_tszh_23_tszl_20_to_19 if they are applicable.

a74nh

Assuming all the other review comments are fixed up, LGTM.

amanasifkhalid · 2024-03-04T14:38:40Z

Thanks for the reviews!

amanasifkhalid added 7 commits March 1, 2024 18:08

Add AW_2A

decd34b

Add AX_1A

2fe31c7

Add AY_2A

862e547

Add AZ_2A

d7c6cb0

Fix tests

b200400

BM_1A

9778bd9

Add BN_1A

2b3dfee

ghost assigned amanasifkhalid Mar 3, 2024

dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Mar 3, 2024

amanasifkhalid mentioned this pull request Mar 3, 2024

Arm64: Implement SVE encodings #94549

Closed

TIHan reviewed Mar 3, 2024


TIHan approved these changes Mar 3, 2024


build-analysis bot mentioned this pull request Mar 4, 2024

System.Net.Quic.Tests.QuicStreamTests.ReadsClosedFinishes_ConnectionClose CI test failure #99142

Closed

Add asserts

c22a6d9

a74nh approved these changes Mar 4, 2024


amanasifkhalid merged commit 962d15c into dotnet:main Mar 4, 2024
129 checks passed

amanasifkhalid deleted the aw-2a branch March 4, 2024 14:40

github-actions bot locked and limited conversation to collaborators Apr 4, 2024
