[Arm64] ASIMD By Element Intrinsics #36916

echesakov · 2020-05-23T03:01:20Z

Fixes #33683
Fixes #24794 - (by element intrinsics is the remaining part)
Fixes #36300
Fixes #36298
Fixes #33490
Fixes #35037 - (InsertSelectedScalar is the remaining part)

Dotnet-GitSync-Bot · 2020-05-23T03:01:23Z

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

ghost · 2020-05-23T03:01:40Z

Tagging subscribers to this area: @tannergooding
Notify danmosemsft if you want to be subscribed.

echesakov · 2020-06-11T01:47:09Z

The API part of this PR is ready for review.

This PR includes most of the "by element" intrinsics (excluding the ones that have not been approved yet) and some load/store intrinsics that operate on a single SIMD element. I combined these into one PR because the JIT changes needed to support them are tightly coupled, and I did not want to spread them across many PRs.

@CarolEidt @tannergooding @TamarChristinaArm Can you please take a look while I am working on finishing the JIT part of the changes?

echesakov · 2020-06-15T20:14:46Z

@CarolEidt @tannergooding This is ready for review now. Please take a look when you have time. I rebased on top of the changes in #37859

tannergooding · 2020-06-15T20:21:16Z

I'll start giving it a look this afternoon. Thanks!

tannergooding · 2020-06-15T23:53:13Z

src/coreclr/src/jit/emitarm64.cpp

@@ -817,7 +817,7 @@ void emitter::emitInsSanityCheck(instrDesc* id)
            assert(isVectorRegister(id->idReg2()));
            assert(isVectorRegister(id->idReg3()));
            elemsize = optGetElemsize(id->idInsOpt());
-            assert(isValidVectorIndex(id->idOpSize(), elemsize, emitGetInsSC(id)));
+            assert(isValidVectorIndex(EA_16BYTE, elemsize, emitGetInsSC(id)));


Was this changed to account for the simd8 result with simd16 selection?

The size of the destination/source registers shouldn't matter here. For example,

FMLA Vd.T, Vn.T, Vm.Ts[index]

index is always encoded as H:L (2 bits) when T is either 2S or 4S (and Ts is S).

In other words, the range of valid values for index is computed based on the assumption that Vm is 16 bytes.
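
To make the point above concrete, here is a minimal standalone sketch of a range check that depends only on the element size, under the assumption that the indexed operand Vm is always treated as a 16-byte register. The helper name `isValidIndexedElement` and the plain integer sizes are hypothetical and are not the JIT's `isValidVectorIndex` or `emitAttr`.

```cpp
#include <cstddef>

// Hypothetical helper, not the JIT's isValidVectorIndex: the indexed
// operand Vm is assumed to be a full 16-byte register regardless of
// whether the Vd/Vn arrangement is 8 or 16 bytes wide.
constexpr std::size_t kIndexedRegBytes = 16;

constexpr bool isValidIndexedElement(std::size_t elemBytes, std::size_t index)
{
    // Valid indices are 0 .. (number of lanes of this size in 16 bytes) - 1.
    return (elemBytes != 0) && (index < kIndexedRegBytes / elemBytes);
}

// S elements (4 bytes): indices 0..3, matching the 2-bit H:L encoding,
// for both the 2S and the 4S arrangement of Vd/Vn.
static_assert(isValidIndexedElement(4, 3), "index 3 fits in H:L");
static_assert(!isValidIndexedElement(4, 4), "index 4 does not");
```

Passing the destination size instead of the fixed 16 bytes would reject indices 2 and 3 for a 2S destination, which is the situation the changed assert avoids.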

tannergooding · 2020-06-15T23:56:51Z

src/coreclr/src/jit/gentree.h

@@ -4728,8 +4728,8 @@ struct GenTreeJitIntrinsic : public GenTreeOp
    ClassLayout* m_layout;

    union {
-        var_types      gtOtherBaseType; // For AVX2 Gather* intrinsics
-        regNumberSmall gtOtherReg;      // For intrinsics that return 2 registers
+        var_types gtAuxiliaryType; // For intrinsics than need another type (e.g. Avx2.Gather* or SIMD (by element))


Why rename to AuxiliaryType but not AuxiliaryReg?

Since I didn't touch gtOtherReg in other files.

The intent of the renaming was to get rid of BaseType: for SIMD By Element intrinsics I use this field to encode the SIMD type of an indexed element, while in some other cases we do use it to keep the "other" base type (e.g. in the case of wide/long intrinsics). I could have renamed it to gtOtherType instead.

Do you want me to rename the gtOtherReg -> gtAuxiliaryReg?

Was mostly just interested. This makes sense, thanks!

tannergooding · 2020-06-16T00:06:09Z

src/coreclr/src/jit/hwintrinsic.h

+    static bool SIMDScalar(NamedIntrinsic id)
+    {
+        const HWIntrinsicFlag flags = lookupFlags(id);
+        return (flags & HW_Flag_SIMDScalar) != 0;


Why a flag, rather than a category?

Because many intrinsics combine SIMDScalar with some other category:

SIMD ShiftLeftByImmediate + SIMDScalar

SIMD ShiftRightByImmediate + SIMDScalar

SIMD ByIndexedElement + SIMDScalar

and I didn't want to have all possible combinations of those.

In addition, on Arm64 the SIMDScalar flag is going to be used only in CodeGen to set INS_OPTS_NONE (i.e. to switch from the vector variant to the scalar variant of a given instruction), while a category value is used in multiple phases (e.g. ByIndexedElement is used in LSRA, Lower and Importer) for making various decisions (what registers can be used for an indexed element, how to perform containment analysis etc.).
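
The trade-off described above can be sketched with a simplified model. Everything here is hypothetical: the enum values, the `IntrinsicInfo` struct and the `isSIMDScalar` helper only mirror the names used in the discussion and are not the JIT's actual definitions.

```cpp
#include <cstdint>

// Simplified, hypothetical model: each intrinsic has exactly one
// category, plus a set of independent flag bits.
enum class Category : std::uint8_t
{
    SIMD,
    ShiftLeftByImmediate,
    ShiftRightByImmediate,
    SIMDByIndexedElement,
};

enum Flag : std::uint32_t
{
    Flag_None       = 0,
    Flag_SIMDScalar = 1u << 0, // orthogonal to the category
};

struct IntrinsicInfo
{
    Category      category; // drives decisions in several phases
    std::uint32_t flags;    // any combination of Flag bits
};

constexpr bool isSIMDScalar(const IntrinsicInfo& info)
{
    return (info.flags & Flag_SIMDScalar) != 0;
}

// A scalar by-element intrinsic keeps its category and adds the flag,
// so no combined "ByIndexedElementScalar" category is needed.
constexpr IntrinsicInfo scalarByElement{Category::SIMDByIndexedElement, Flag_SIMDScalar};
constexpr IntrinsicInfo vectorByElement{Category::SIMDByIndexedElement, Flag_None};

static_assert(isSIMDScalar(scalarByElement), "flag set");
static_assert(!isSIMDScalar(vectorByElement), "flag clear");
static_assert(scalarByElement.category == vectorByElement.category, "same category");
```

With a flag, three categories and one scalar bit cover all six combinations listed above, whereas a category-only design would need a separate category for each combination.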

Thanks! Makes sense to me.

tannergooding · 2020-06-16T00:08:57Z

src/coreclr/src/jit/hwintrinsiclistarm64.h

-HARDWARE_INTRINSIC(AdvSimd,       Xor,                                                        -1,      2,     {INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor},               HW_Category_SimpleSIMD,   HW_Flag_Commutative)
-HARDWARE_INTRINSIC(AdvSimd,       ZeroExtendWideningLower,                                     8,      1,     {INS_uxtl,           INS_uxtl,           INS_uxtl,           INS_uxtl,           INS_uxtl,           INS_uxtl,           INS_invalid,        INS_invalid,        INS_invalid,        INS_invalid},           HW_Category_SimpleSIMD,   HW_Flag_BaseTypeFromFirstArg)
-HARDWARE_INTRINSIC(AdvSimd,       ZeroExtendWideningUpper,                                     16,     1,     {INS_uxtl2,          INS_uxtl2,          INS_uxtl2,          INS_uxtl2,          INS_uxtl2,          INS_uxtl2,          INS_invalid,        INS_invalid,        INS_invalid,        INS_invalid},           HW_Category_SimpleSIMD,   HW_Flag_BaseTypeFromFirstArg)
+HARDWARE_INTRINSIC(AdvSimd,       Abs,                                                        -1,      1,     {INS_abs,            INS_invalid,        INS_abs,            INS_invalid,        INS_abs,            INS_invalid,        INS_invalid,        INS_invalid,        INS_fabs,           INS_invalid},     HW_Category_SIMD,                  HW_Flag_BaseTypeFromFirstArg)


Git stops reporting changes for this file after this line. Is it largely just whitespace and flags/category changes to match the previous?

Unfortunately, I can only suggest reviewing this commit by commit.
Most of the changes come from this commit, so to answer your question: yes, this is to account for the flags/category change. I also sorted the values in the flags column in alphabetical order.

It also seems that you added HW_Flag_NoFloatingPointUsed in some places. I confess that I probably haven't carefully scrutinized the changes in this file.

Yes, I added this flag to the intrinsics that operate on general-purpose registers (e.g. crc32h, rbit, clz)

CarolEidt

LGTM overall - a couple of minor suggestions and request for comment.

CarolEidt · 2020-06-16T16:03:56Z

src/coreclr/src/jit/compiler.h

@@ -3782,7 +3782,7 @@ class Compiler
    GenTree* getArgForHWIntrinsic(var_types argType, CORINFO_CLASS_HANDLE argClass, bool expectAddr = false);
    GenTree* impNonConstFallback(NamedIntrinsic intrinsic, var_types simdType, var_types baseType);
    GenTree* addRangeCheckIfNeeded(
-        NamedIntrinsic intrinsic, GenTree* lastOp, bool mustExpand, int immLowerBound, int immUpperBound);
+        NamedIntrinsic intrinsic, GenTree* immOp, bool mustExpand, int immLowerBound, int immUpperBound);


It's minor, but thanks for renaming this - even if/when it was the last op, semantically it's more important that it's an immediate.

CarolEidt · 2020-06-16T16:08:44Z

src/coreclr/src/jit/emitarm64.cpp

@@ -5827,7 +5827,6 @@ void emitter::emitIns_R_R_R(
                    elemsize = EA_1BYTE;
                    opt      = optMakeArrangement(size, elemsize);
                }
-                assert(isValidArrangement(size, opt));


I'm curious why these asserts were removed - are they invalid or somehow redundant?

Good point! For some reason I mistakenly thought that they were redundant - I will revert this commit.

CarolEidt · 2020-06-16T16:13:20Z

src/coreclr/src/jit/emitarm64.cpp

@@ -6368,6 +6366,11 @@ void emitter::emitIns_R_R_R_I(instruction ins,
            assert((opt == INS_OPTS_4H) || (opt == INS_OPTS_2S));
            elemsize = optGetElemsize(opt);
            assert(isValidVectorIndex(EA_16BYTE, elemsize, imm));
+            // Restricted to V0-V15 when element size is H
+            if ((elemsize == EA_2BYTE) && (reg3 >= REG_V16))


To make the conditions consistent, perhaps this could be

Suggested change

-            if ((elemsize == EA_2BYTE) && (reg3 >= REG_V16))
+            if ((elemsize == EA_2BYTE) && ((genRegMask(reg3) & RBM_ASIMD_INDEXED_H_ELEMENT_ALLOWED_REGS) == 0))
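
As a rough illustration of the mask-based form of this check, the sketch below models the V0-V15 restriction for H elements with a bit mask. The names `RegMask`, `regMask`, `kIndexedHAllowedRegs` and `isAllowedIndexedHReg` are hypothetical stand-ins for the JIT's `genRegMask` and `RBM_ASIMD_INDEXED_H_ELEMENT_ALLOWED_REGS`, and vector registers are simply numbered 0..31 here.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the JIT's register-mask machinery; vector
// registers V0..V31 are numbered 0..31 purely for illustration.
using RegMask = std::uint64_t;

constexpr RegMask regMask(unsigned vreg)
{
    return RegMask{1} << vreg;
}

// V0-V15: the registers allowed as the indexed operand for H elements.
constexpr RegMask kIndexedHAllowedRegs = (RegMask{1} << 16) - 1;

constexpr bool isAllowedIndexedHReg(unsigned vreg)
{
    return (regMask(vreg) & kIndexedHAllowedRegs) != 0;
}

static_assert(isAllowedIndexedHReg(15), "V15 is allowed");
static_assert(!isAllowedIndexedHReg(16), "V16 is not");
```

In this model the mask test accepts the same registers as a `vreg < 16` comparison. The benefit of the mask form, as suggested in the review, is that the emitter check and any other user of the restriction can refer to one shared definition.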

CarolEidt · 2020-06-16T16:18:14Z

src/coreclr/src/jit/gentree.h

@@ -4728,8 +4728,8 @@ struct GenTreeJitIntrinsic : public GenTreeOp
    ClassLayout* m_layout;

    union {
-        var_types      gtOtherBaseType; // For AVX2 Gather* intrinsics
-        regNumberSmall gtOtherReg;      // For intrinsics that return 2 registers
+        var_types gtAuxiliaryType; // For intrinsics than need another type (e.g. Avx2.Gather* or SIMD (by element))


I like the renaming. I think it's reasonable to keep the reg as gtOtherReg, as that naming is used on other GenTree nodes. In any case, if we didn't change it to gtAuxiliaryType we might want to name it gtOtherType (though I'm happy with this change as-is).

CarolEidt · 2020-06-16T16:21:13Z

src/coreclr/src/jit/hwintrinsic.cpp

@@ -671,6 +680,50 @@ static bool isSupportedBaseType(NamedIntrinsic intrinsic, var_types baseType)
    return false;
 }

+struct HWIntrinsicSignatureReader final


This needs a comment

Added the comment.

CarolEidt · 2020-06-16T16:30:13Z

src/coreclr/src/jit/hwintrinsic.cpp

    {
        assert(sig->numArgs == 3);
        immOp = impStackTop(1).val;
        assert(HWIntrinsicInfo::isImmOp(intrinsic, immOp));
    }
+    else if (intrinsic == NI_AdvSimd_Arm64_InsertSelectedScalar)


Since this method is already rather large, you might consider extracting this code block, and the one for HW_Category_SIMDByIndexedElement into separate methods.

Actually, I was thinking about splitting the method into multiple methods - or even having a helper class that solely does hardware intrinsic importation - but this is a larger refactoring than I want for this PR and would likely affect the x86/x64 side of the intrinsics, so I decided to defer such a change until all the Arm64 intrinsics are implemented.

CarolEidt · 2020-06-16T16:32:18Z

src/coreclr/src/jit/hwintrinsic.cpp

+#ifdef TARGET_XARCH
+                if ((intrinsic == NI_SSE42_Crc32) || (intrinsic == NI_SSE42_X64_Crc32))
+                {
+                    // TODO - currently we use the BaseType to bring the type of the second argument


Nit: This should probably be TODO-ARM64-Cleanup

Actually, this should be TODO-XArch-Cleanup - I updated the comment

CarolEidt · 2020-06-16T18:32:40Z

src/coreclr/src/jit/hwintrinsiclistarm64.h

-HARDWARE_INTRINSIC(AdvSimd,       Xor,                                                        -1,      2,     {INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor,            INS_eor},               HW_Category_SimpleSIMD,   HW_Flag_Commutative)
-HARDWARE_INTRINSIC(AdvSimd,       ZeroExtendWideningLower,                                     8,      1,     {INS_uxtl,           INS_uxtl,           INS_uxtl,           INS_uxtl,           INS_uxtl,           INS_uxtl,           INS_invalid,        INS_invalid,        INS_invalid,        INS_invalid},           HW_Category_SimpleSIMD,   HW_Flag_BaseTypeFromFirstArg)
-HARDWARE_INTRINSIC(AdvSimd,       ZeroExtendWideningUpper,                                     16,     1,     {INS_uxtl2,          INS_uxtl2,          INS_uxtl2,          INS_uxtl2,          INS_uxtl2,          INS_uxtl2,          INS_invalid,        INS_invalid,        INS_invalid,        INS_invalid},           HW_Category_SimpleSIMD,   HW_Flag_BaseTypeFromFirstArg)
+HARDWARE_INTRINSIC(AdvSimd,       Abs,                                                        -1,      1,     {INS_abs,            INS_invalid,        INS_abs,            INS_invalid,        INS_abs,            INS_invalid,        INS_invalid,        INS_invalid,        INS_fabs,           INS_invalid},     HW_Category_SIMD,                  HW_Flag_BaseTypeFromFirstArg)


It also seems that you added HW_Flag_NoFloatingPointUsed in some places. I confess that I probably haven't carefully scrutinized the changes in this file.

CarolEidt

LGTM - thanks!

Dotnet-GitSync-Bot added the area-CodeGen-coreclr and new-api-needs-documentation labels May 23, 2020

echesakov added the arch-arm64 and area-System.Runtime.Intrinsics labels and removed the area-CodeGen-coreclr and new-api-needs-documentation labels May 23, 2020

echesakov force-pushed the Arm64-ASIMD-BySelectedScalar-ByScalar branch 2 times, most recently from 522dbaa to f745a7e Compare June 11, 2020 01:37

echesakov changed the title ~~[Arm64] ASIMD By Element Arithmetic Intrinsics~~ [Arm64] ASIMD By Element Intrinsics Jun 11, 2020

jaredpar mentioned this pull request Jun 11, 2020

OSX machines are de-provisioned during CI / PR runs leading to failures #34472

Closed

echesakov mentioned this pull request Jun 11, 2020

Implement Shift and Inserts scalar and SIMD intrinsics. #36818

Merged

echesakov force-pushed the Arm64-ASIMD-BySelectedScalar-ByScalar branch from f745a7e to ac6d6c0 Compare June 13, 2020 02:13

echesakov added 15 commits June 15, 2020 13:00

Don't use fully qualified name for DuplicateSelectedScalarToVector128 return value type in AdvSimd.cs

66f2c16

Rename acc->addend in FusedMultiplyAdd* in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

e1f7713

Rename acc->minuend in FusedMultiplySubtract* in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

58cc8ad

Add FusedMultiplyAddByScalar and FusedMultiplyAddBySelectedScalar in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

1c815af

Add FusedMultiplySubtractByScalar and FusedMultiplyBySelectedScalar in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

ee63943

Rename acc->addend in MultiplyAdd* in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

ae0226d

Rename acc->minuend in MultiplySubtract* in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

b2375a3

Re-order MultiplyScalar in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

dce2801

Add MultiplyScalarBySelectedScalar and MultiplyExtendedScalarBySelectedScalar in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

afd0116

Add MultiplyByScalar and MultiplyBySelectedScalar in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

7319bff

Add MultiplyExtendedBySelectedScalar and MultiplyExtendedByScalar in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

4007bcc

Add MultiplyAddByScalar and MultiplyAddBySelectedScalar in Arm32 in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

9221b42

Add MultiplySubtractByScalar and MultiplySubtractBySelectedScalar in Arm32 in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

44dd74b

Add MultiplyBySelectedScalarWideningLower in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

f331f34

Add MultiplyBySelectedScalarWideningLowerAndAdd in AdvSimd.cs AdvSimd.PlatformNotSupported.cs

e275811

echesakov added 2 commits June 15, 2020 13:09

Add InsertSelectedScalar in hwintrinsiccodegenarm64.cpp

716c20c

Support SIMD By Element intrinsics in HWIntrinsicImmOpHelper in hwintrinsiccodegenarm64.cpp

283e686

echesakov force-pushed the Arm64-ASIMD-BySelectedScalar-ByScalar branch from a6cc4cd to 283e686 Compare June 15, 2020 20:09

echesakov marked this pull request as ready for review June 15, 2020 20:13

tannergooding reviewed Jun 15, 2020

tannergooding reviewed Jun 16, 2020

CarolEidt suggested changes Jun 16, 2020

echesakov added 8 commits June 16, 2020 12:29

Remove SIMDScalar from helpers in hwintrinsiclistarm64.h

bd40e7c

Add SIMDScalar to AdvSimd.Arm64.DuplicateToVector64 in hwintrinsiclistarm64.h

ee7db3c

Address Carol's feedback regarding assertions in emitarm64.cpp

b01be48

Add comment for HWIntrinsicSignatureReader in hwintrinsic.cpp

bb35382

Fix comment in hwintrinsic.cpp

a4b6601

Fix TODO-comment in hwintrinsic.cpp

950613f

Don't use fully qualified name for DuplicateSelectedScalarToVector128 return value type in AdvSimd.PlatformNotSupported.cs

c312d27

Revert "Remove redundant asserts in emitarm64.cpp"

af9f146

This reverts commit 2dc7cd8.

CarolEidt approved these changes Jun 16, 2020

echesakov closed this Jun 17, 2020

echesakov reopened this Jun 17, 2020

echesakov closed this Jun 17, 2020

echesakov reopened this Jun 17, 2020

echesakov requested a review from tannergooding June 17, 2020 17:07

tannergooding approved these changes Jun 17, 2020

echesakov merged commit a23d3b2 into dotnet:master Jun 17, 2020

ghost locked as resolved and limited conversation to collaborators Dec 9, 2020

echesakov deleted the Arm64-ASIMD-BySelectedScalar-ByScalar branch April 13, 2021 20:42
