Add all integer overloads for AlignRight/BlendVariable and unsigned overloads for MultiplyLow #19420

fiigii · 2018-08-10T22:46:18Z

This PR adds all integer overloads for AVX2/SSSE3 AlignRight and AVX2/SSE4.1 BlendVariable, which originally had only byte/sbyte overloads. We make this change because these intrinsics are very commonly used in integer SIMD programming, so the change will significantly improve the user experience by avoiding StaticCast.

Note that although integer BlendVariable operates on a mask of 8-bit elements, it usually consumes a mask generated directly by another intrinsic (such as Compare). A generated mask has all bits set in each selected element, so it works equally well with other (wider) base types. If users want to create the mask manually, the immediate version (Blend) is usually the better choice.
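To illustrate why a generated mask carries over to wider element types, here is a minimal scalar sketch in C. It models the documented byte-wise behavior of PBLENDVB and a 32-bit greater-than compare; it does not use the real intrinsics. The function names (`blendv_epi8_model`, `cmpgt_epi32_model`, `max_epi32_via_blend`) are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of PBLENDVB: for each of the 16 bytes, take b when the
   top bit of the mask byte is set, otherwise take a. */
void blendv_epi8_model(const uint8_t a[16], const uint8_t b[16],
                       const uint8_t mask[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (mask[i] & 0x80) ? b[i] : a[i];
}

/* Scalar model of a 32-bit signed greater-than compare: each lane of the
   result is all ones when x > y and all zeros otherwise. */
void cmpgt_epi32_model(const int32_t x[4], const int32_t y[4], uint32_t mask[4])
{
    for (int i = 0; i < 4; i++)
        mask[i] = (x[i] > y[i]) ? 0xFFFFFFFFu : 0u;
}

/* Hypothetical helper: per-lane maximum of two 4 x int32 vectors, built
   from the byte-wise blend and the 32-bit compare above. */
void max_epi32_via_blend(const int32_t x[4], const int32_t y[4], int32_t out[4])
{
    uint32_t mask[4];
    uint8_t xb[16], yb[16], mb[16], ob[16];

    cmpgt_epi32_model(x, y, mask);      /* lanes where x > y are all ones */
    memcpy(xb, x, 16);
    memcpy(yb, y, 16);
    memcpy(mb, mask, 16);
    blendv_epi8_model(yb, xb, mb, ob);  /* pick x where the mask is set, else y */
    memcpy(out, ob, 16);
}
```

Because every byte inside a selected 32-bit lane has its top bit set, the byte-wise blend moves whole lanes, which is the property the wider overloads rely on.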

Additionally, there is some discussion (https://github.com/dotnet/coreclr/issues/18910#issuecomment-404963232) requesting the same change for AVX2/SSSE3 Shuffle (e.g., Vector128<byte> Shuffle(Vector128<byte> value, Vector128<byte> mask)). However, Shuffle is different: it usually needs a specially built mask, rather than a generated one, to indicate the selected elements.
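The difference can be sketched with a scalar model of the byte shuffle (PSHUFB), again in portable C rather than with the real intrinsics. Each control byte is an index, not a selection flag, so a compare-generated mask of 0xFF/0x00 bytes is not useful as a control. The names `shuffle_epi8_model` and `build_lane_control_epi32` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of PSHUFB: a control byte with its top bit set yields zero;
   otherwise its low four bits index into the source vector. */
void shuffle_epi8_model(const uint8_t src[16], const uint8_t ctrl[16],
                        uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (ctrl[i] & 0x80) ? 0 : src[ctrl[i] & 0x0F];
}

/* Hypothetical helper: build the byte-level control that moves whole
   32-bit lanes. lane_order[k] is the source lane for destination lane k. */
void build_lane_control_epi32(const int lane_order[4], uint8_t ctrl[16])
{
    for (int k = 0; k < 4; k++) {
        assert(lane_order[k] >= 0 && lane_order[k] < 4);
        for (int j = 0; j < 4; j++)
            ctrl[4 * k + j] = (uint8_t)(4 * lane_order[k] + j);
    }
}
```

Even a simple 32-bit lane permutation needs four distinct byte indices per lane, which is why wider Shuffle overloads would not remove the need to build the control by hand.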

This PR also temporarily disables the tests of SSE4.1 BlendVariable.

Close https://github.com/dotnet/coreclr/issues/18910

fiigii · 2018-08-13T18:13:02Z

@CarolEidt @tannergooding @eerhardt

tannergooding · 2018-08-13T18:19:58Z

CC. @terrajobst as well.

This probably deserves some more API design discussion. Effectively, this is just avoiding the need to insert StaticCast and could probably be solved trivially with helper methods if a user needed them.

fiigii · 2018-09-19T19:11:59Z

This change was approved in the API review meeting. Can we approve it here? @tannergooding @CarolEidt @eerhardt

tannergooding · 2018-09-19T19:22:08Z

src/System.Private.CoreLib/shared/System/Runtime/Intrinsics/X86/Avx2.PlatformNotSupported.cs

+
+        /// <summary>
+        /// __m256i _mm256_alignr_epi8 (__m256i a, __m256i b, const int count)
+        ///   VPALIGNR ymm, ymm, ymm/m256, imm8


nit: It might be nice to have a comment on these explaining that the actual instruction operates on bytes and that you should adjust the mask appropriately

Good point, will do.
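As a rough illustration of the byte-granularity point, the following scalar C sketch models the 128-bit form of PALIGNR and shows how an element count has to be scaled to a byte count. It is a model under stated assumptions, not the JIT or library implementation; `alignr_epi8_model` and `alignr_epi32_elements` are hypothetical names. The 256-bit AVX2 form applies the same operation independently to each 128-bit lane.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of 128-bit PALIGNR: concatenate a (high half) and b (low
   half) into 32 bytes, shift right by `count` bytes, keep the low 16.
   Bytes shifted in from beyond the concatenation are zero. */
void alignr_epi8_model(const uint8_t a[16], const uint8_t b[16],
                       unsigned count, uint8_t out[16])
{
    uint8_t tmp[32];
    memcpy(tmp, b, 16);
    memcpy(tmp + 16, a, 16);
    for (unsigned i = 0; i < 16; i++)
        out[i] = (i + count < 32) ? tmp[i + count] : 0;
}

/* Hypothetical helper: shift by whole 32-bit elements. The instruction
   counts bytes, so the element count is scaled by the element size. */
void alignr_epi32_elements(const uint32_t a[4], const uint32_t b[4],
                           unsigned elements, uint32_t out[4])
{
    uint8_t ab[16], bb[16], ob[16];
    memcpy(ab, a, 16);
    memcpy(bb, b, 16);
    alignr_epi8_model(ab, bb, elements * (unsigned)sizeof(uint32_t), ob);
    memcpy(out, ob, 16);
}
```

Passing the element count directly as the byte count would shift by a fraction of an element, which is the mistake the suggested doc comment is meant to prevent.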

tannergooding · 2018-09-19T19:23:10Z

src/System.Private.CoreLib/shared/System/Runtime/Intrinsics/X86/Sse41.PlatformNotSupported.cs

@@ -44,23 +44,12 @@ public abstract class Sse41 : Ssse3
        /// <summary>
        /// __m128i _mm_blendv_epi8 (__m128i a, __m128i b, __m128i mask)
        ///   PBLENDVB xmm, xmm/m128, xmm
-        /// </summary>
-        public static Vector128<sbyte> BlendVariable(Vector128<sbyte> left, Vector128<sbyte> right, Vector128<sbyte> mask) { throw new PlatformNotSupportedException(); }


I thought the API review determined that we should just explode these, rather than making them generic?

Yes, since AVX2 BlendVariable is for integer vectors and the floating-point versions are in AVX.

Right. But the diff looks like you are making them generic, instead of exploding them...

Maybe, I'm missing something?

Ah, you are talking about the SSE4.1 versions. SSE4.1 supports both integer and floating-point vectors, so after exploding the integer types, it will support all 10 base types. And we had decided that such APIs should be generic.

I believe the last API review, and the ARM Review before that, determined that we should just always explode and be explicit.

This also makes it clear, when/if support for new types (like Half, Quad, etc.) is added in the future, that they aren't supported in the older ISAs.

CC. @terrajobst for confirmation.

I see, but that is another story, and we have had some other generic APIs. Shall we change them at all?

Btw, I believe that Intel has different Half roadmaps from ARM.

> Shall we change them at all?

I believe so, but that is likely for a separate PR (as is moving the helper functions to the Vector64, Vector128, Vector256 types). I pinged Immo asking for the meeting notes.

> I believe that Intel has different Half roadmaps from ARM.

Right, but I believe this was one of the "generalized" API design decisions that was made for intrinsics at the last two meetings. Which was basically: match the underlying instruction/hardware as closely as possible and always explode to avoid ambiguities from possible future expansions.

Got it, thanks.

> I believe so, but that is likely for a separate PR

Let me put these changes together and open a new issue.

tannergooding

LGTM

fiigii · 2018-09-19T23:32:30Z

Add unsigned overloads for MultiplyLow in this PR to solve https://github.com/dotnet/coreclr/issues/18905

fiigii · 2018-09-20T08:05:26Z

@dotnet-bot test this please

tannergooding · 2018-09-20T20:43:28Z

Merging. Two of the pending jobs actually completed. The other failed with an out-of-disk-space issue, but a similar job (testing additional functionality) succeeded.

tannergooding · 2018-09-20T20:43:53Z

Thanks @fiigii

fiigii mentioned this pull request Sep 19, 2018

Add pointer overloads for Avx2.BroadcastScalarToVector128 #20055

Merged

tannergooding reviewed Sep 19, 2018

FeiPengIntel added 2 commits September 19, 2018 15:49

Add all integer overloads for Avx2/SSE4.1 BlendVariable

16c6eaa

Add all integer overloads for AVX2/SSSE3 AlignRight

eb381e6

fiigii force-pushed the intoverload branch from 458978d to eb381e6 Compare September 19, 2018 22:58

tannergooding approved these changes Sep 19, 2018

Add unsigned overloads for MultiplyLow

dab7534

fiigii changed the title ~~Add all integer overloads for AVX2/SSSE3 AlignRight and Avx2/SSE4.1 BlendVariable~~ Add all integer overloads for AlignRight/BlendVariable and unsigned overloads for MultiplyLow Sep 19, 2018

fiigii mentioned this pull request Sep 20, 2018

Add missing overloads of hardware intrinsic dotnet/corefx#32370

Merged

tannergooding merged commit fcebb9b into dotnet:master Sep 20, 2018

fiigii deleted the intoverload branch September 20, 2018 20:45

fiigii mentioned this pull request Oct 2, 2018

Implement the remaining AVX2 intrinsic #20210

Merged

fiigii mentioned this pull request Jan 31, 2020

Intel hardware intrinsic API change dotnet/runtime#11124

Closed

AndyAyersMS mentioned this pull request Jan 31, 2020

PMI assert in system.private.corelib dotnet/runtime#11200

Closed

fiigii mentioned this pull request Jan 31, 2020

Intel hardware intrinsic API change dotnet/runtime#27584

Closed
