Add vectorized paths for Span<T>.Reverse #64412
Merged
+384
−24
14 commits
904bb85 (alexcovington) Adding vectorized path for Span<byte>.Reverse that uses SSSE3 and AVX…
088771f (alexcovington) Added vectorized paths for Span<T>.Reverse for primitive types that a…
fd882e9 (alexcovington) Apply suggestions from code review
74deff7 (alexcovington) Added vectorized paths for Span.Reverse to Array.Reverse. Added expli…
2899789 (alexcovington) Consolidate fall back case into single method, use one wrapper for bo…
a6c8101 (alexcovington) Remove redundant AggressiveInlining, add AggressiveInlining to single…
3aa7cf1 (alexcovington) Simplify method names, add comments
86caa86 (alexcovington) Just overload Reverse
7c4cd52 (alexcovington) Use Unsafe.Subtract where it is semantically more intuitive, camelCas…
3c3f140 (alexcovington) Camel case formatting, add condition check for Array.Reverse to avoid…
94f45cc (alexcovington) Rework loops to use new LoadUnsafe/StoreUnsafe vector APIs. Use Permu…
39e906f (alexcovington) Improve readability of code
1ed5bef (alexcovington) Fix formatting, fix typos in comments
80ae8ab (alexcovington) Use temporary variable for generic case instead for better IL
@@ -2238,5 +2238,96 @@ private static uint FindFirstMatchedLane(Vector128<byte> compareResult)
            // Find the first lane that is set inside compareResult.
            return (uint)BitOperations.TrailingZeroCount(selectedLanes) >> 2;
        }

        public static void Reverse(ref byte buf, nuint length)
        {
            if (Avx2.IsSupported && (nuint)Vector256<byte>.Count * 2 <= length)
            {
                Vector256<byte> reverseMask = Vector256.Create(
                    (byte)15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, // first 128-bit lane
                    15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0); // second 128-bit lane
                nuint numElements = (nuint)Vector256<byte>.Count;
                nuint numIters = (length / numElements) / 2;
                for (nuint i = 0; i < numIters; i++)
                {
                    nuint firstOffset = i * numElements;
                    nuint lastOffset = length - ((1 + i) * numElements);

                    // Load in values from beginning and end of the array.
                    Vector256<byte> tempFirst = Vector256.LoadUnsafe(ref buf, firstOffset);
                    Vector256<byte> tempLast = Vector256.LoadUnsafe(ref buf, lastOffset);

                    // Avx2 operates on two 128-bit lanes rather than the full 256-bit vector.
                    // Perform a shuffle to reverse each 128-bit lane, then permute to finish reversing the vector:
                    //     +-------------------------------------------------------------------------------+
                    //     | A1 | B1 | C1 | D1 | E1 | F1 | G1 | H1 | I1 | J1 | K1 | L1 | M1 | N1 | O1 | P1 |
                    //     +-------------------------------------------------------------------------------+
                    //     | A2 | B2 | C2 | D2 | E2 | F2 | G2 | H2 | I2 | J2 | K2 | L2 | M2 | N2 | O2 | P2 |
                    //     +-------------------------------------------------------------------------------+
                    //         Shuffle --->
                    //     +-------------------------------------------------------------------------------+
                    //     | P1 | O1 | N1 | M1 | L1 | K1 | J1 | I1 | H1 | G1 | F1 | E1 | D1 | C1 | B1 | A1 |
                    //     +-------------------------------------------------------------------------------+
                    //     | P2 | O2 | N2 | M2 | L2 | K2 | J2 | I2 | H2 | G2 | F2 | E2 | D2 | C2 | B2 | A2 |
                    //     +-------------------------------------------------------------------------------+
                    //         Permute --->
                    //     +-------------------------------------------------------------------------------+
                    //     | P2 | O2 | N2 | M2 | L2 | K2 | J2 | I2 | H2 | G2 | F2 | E2 | D2 | C2 | B2 | A2 |
                    //     +-------------------------------------------------------------------------------+
                    //     | P1 | O1 | N1 | M1 | L1 | K1 | J1 | I1 | H1 | G1 | F1 | E1 | D1 | C1 | B1 | A1 |
                    //     +-------------------------------------------------------------------------------+
                    tempFirst = Avx2.Shuffle(tempFirst, reverseMask);
                    tempFirst = Avx2.Permute2x128(tempFirst, tempFirst, 0b00_01);
                    tempLast = Avx2.Shuffle(tempLast, reverseMask);
                    tempLast = Avx2.Permute2x128(tempLast, tempLast, 0b00_01);

                    // Store the reversed vectors
                    tempLast.StoreUnsafe(ref buf, firstOffset);
                    tempFirst.StoreUnsafe(ref buf, lastOffset);
                }
                buf = ref Unsafe.Add(ref buf, numIters * numElements);
                length -= numIters * numElements * 2;
            }
            else if (Ssse3.IsSupported && (nuint)Vector128<byte>.Count * 2 <= length)
            {
                Vector128<byte> reverseMask = Vector128.Create((byte)15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0);
                nuint numElements = (nuint)Vector128<byte>.Count;
                nuint numIters = (length / numElements) / 2;
                for (nuint i = 0; i < numIters; i++)
                {
                    nuint firstOffset = i * numElements;
                    nuint lastOffset = length - ((1 + i) * numElements);

                    // Load in values from beginning and end of the array.
                    Vector128<byte> tempFirst = Vector128.LoadUnsafe(ref buf, firstOffset);
                    Vector128<byte> tempLast = Vector128.LoadUnsafe(ref buf, lastOffset);

                    // Shuffle to reverse each vector:
                    //     +---------------------------------------------------------------+
                    //     | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P |
                    //     +---------------------------------------------------------------+
                    //          --->
                    //     +---------------------------------------------------------------+
                    //     | P | O | N | M | L | K | J | I | H | G | F | E | D | C | B | A |
                    //     +---------------------------------------------------------------+
                    tempFirst = Ssse3.Shuffle(tempFirst, reverseMask);
                    tempLast = Ssse3.Shuffle(tempLast, reverseMask);

                    // Store the reversed vectors
                    tempLast.StoreUnsafe(ref buf, firstOffset);
                    tempFirst.StoreUnsafe(ref buf, lastOffset);
                }
                buf = ref Unsafe.Add(ref buf, numIters * numElements);
                length -= numIters * numElements * 2;
            }

            // Store any remaining values one-by-one
            for (nuint i = 0; i < (length / 2); i++)
            {
                ref byte first = ref Unsafe.Add(ref buf, i);
                ref byte last = ref Unsafe.Add(ref buf, length - 1 - i);
                (last, first) = (first, last);
            }
        }
    }
}

Review comment on the AVX2 lane explanation in the diff above: thank you for adding a great explanation here 👍
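
The two-ended block swap in the diff can be tried outside the runtime with a small standalone sketch. The version below is not the PR's code: it swaps the ISA-specific `Ssse3.Shuffle` for the cross-platform `Vector128.Shuffle`, and it assumes .NET 7 or later, where `Vector128.LoadUnsafe`, `StoreUnsafe`, `Vector128.Shuffle` and `Vector128.IsHardwareAccelerated` are available. The `ReverseSketch` class and `ReverseBytes` method are hypothetical names used only for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

// Hypothetical illustration class; not part of dotnet/runtime.
public static class ReverseSketch
{
    // Reverses the bytes of `span` in place.
    public static void ReverseBytes(Span<byte> span)
    {
        ref byte buf = ref MemoryMarshal.GetReference(span);
        nuint length = (nuint)span.Length;
        nuint numElements = (nuint)Vector128<byte>.Count; // 16 bytes per vector

        // Vectorize only when one block from each end fits without overlapping.
        if (Vector128.IsHardwareAccelerated && numElements * 2 <= length)
        {
            // Index i of the result takes element (15 - i) of the input.
            Vector128<byte> reverseMask = Vector128.Create(
                (byte)15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0);

            nuint numIters = (length / numElements) / 2;
            for (nuint i = 0; i < numIters; i++)
            {
                nuint firstOffset = i * numElements;
                nuint lastOffset = length - ((1 + i) * numElements);

                // Load one block from the front and one from the back.
                Vector128<byte> tempFirst = Vector128.LoadUnsafe(ref buf, firstOffset);
                Vector128<byte> tempLast = Vector128.LoadUnsafe(ref buf, lastOffset);

                // Reverse each block, then store it at the opposite end.
                tempFirst = Vector128.Shuffle(tempFirst, reverseMask);
                tempLast = Vector128.Shuffle(tempLast, reverseMask);
                tempLast.StoreUnsafe(ref buf, firstOffset);
                tempFirst.StoreUnsafe(ref buf, lastOffset);
            }

            // Narrow the window to the unprocessed middle of the buffer.
            buf = ref Unsafe.Add(ref buf, numIters * numElements);
            length -= numIters * numElements * 2;
        }

        // Swap whatever is left in the middle one element at a time.
        for (nuint i = 0; i < length / 2; i++)
        {
            ref byte first = ref Unsafe.Add(ref buf, i);
            ref byte last = ref Unsafe.Add(ref buf, length - 1 - i);
            (last, first) = (first, last);
        }
    }
}
```

Calling `ReverseSketch.ReverseBytes(data)` on a 37-byte array would run one vector iteration (bytes 0 to 15 swapped with bytes 21 to 36) and leave the middle 5 bytes to the scalar loop. Because the front and back blocks never overlap, no special handling is needed for lengths that are not a multiple of the vector width.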
Before we merge this change, we need to update our tests, as some of them are using Array.Reverse to get the expected output for Span.Reverse: https://github.com/dotnet/runtime/blob/c663fa429a88d3089b740a89181c2b6f033b2839/src/libraries/System.Memory/tests/Span/Reverse.cs#L134-L137
I am going to send a PR for that in a few minutes.
Done: #68493
Good catch, I didn't notice that. Sounds like a good idea to wait on this PR until the tests are updated, just to be sure.
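
The thread above asks that the tests stop using Array.Reverse to produce the expected output for Span.Reverse. A plausible reason, given the commit that adds the vectorized paths to Array.Reverse as well, is that a bug in the shared code would show up in both the expected and the actual values and go unnoticed. The sketch below shows one way to build the expected output independently. It is not the test from the follow-up PR; `ReverseTestSketch`, `ReverseByIndexing` and `SpanReverseMatchesReference` are hypothetical names, and the fill pattern is arbitrary.

```csharp
using System;

// Hypothetical illustration class; not taken from the dotnet/runtime test suite.
public static class ReverseTestSketch
{
    // Builds the expected output by plain indexing, without calling
    // Array.Reverse or Span.Reverse.
    public static T[] ReverseByIndexing<T>(T[] input)
    {
        var expected = new T[input.Length];
        for (int i = 0; i < input.Length; i++)
        {
            expected[i] = input[input.Length - 1 - i];
        }
        return expected;
    }

    // Returns true when Span<byte>.Reverse agrees with the independent reference
    // for a buffer of the given length.
    public static bool SpanReverseMatchesReference(int length)
    {
        byte[] actual = new byte[length];
        for (int i = 0; i < length; i++)
        {
            // Arbitrary non-symmetric pattern so a wrong element order is visible.
            actual[i] = unchecked((byte)(i * 31 + 7));
        }

        byte[] expected = ReverseByIndexing(actual);
        actual.AsSpan().Reverse();
        return actual.AsSpan().SequenceEqual(expected.AsSpan());
    }
}
```

Lengths worth checking are those around the vector widths used in the diff, for example 0, 1, 31, 32, 33, 63, 64 and 65, so that the vectorized loops and the scalar remainder loop are both exercised.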