Add vectorized paths for Span<T>.Reverse #64412

Merged on Apr 26, 2022 (14 commits)

alexcovington
Contributor

Adds vectorized paths to Span<T>.Reverse for supported element types. Falls back to the previous scalar behavior if T is not a value type or is too large to fit in a vector.
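
As a rough illustration of the idea (not the code merged in this PR), a vectorized reverse can load one vector from each end of the span, reverse the lanes of each, and store them at the opposite ends, leaving the middle leftover to a scalar loop. The sketch below assumes .NET 7 or later, handles only `int`, and uses only the 128-bit path; the class name `VectorReverseSketch` is hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

// Hypothetical helper for illustration only; not the implementation from this PR.
public static class VectorReverseSketch
{
    public static void Reverse(Span<int> span)
    {
        int lo = 0;            // first unprocessed index at the front
        int hi = span.Length;  // one past the last unprocessed index at the back

        if (Vector128.IsHardwareAccelerated)
        {
            int n = Vector128<int>.Count; // 4 ints per 128-bit vector
            Vector128<int> reverseLanes = Vector128.Create(3, 2, 1, 0);
            ref int start = ref MemoryMarshal.GetReference(span);

            // Both loads happen before either store, and the loop condition
            // guarantees the two vectors never overlap.
            while (hi - lo >= 2 * n)
            {
                Vector128<int> front = Vector128.LoadUnsafe(ref start, (nuint)lo);
                Vector128<int> back = Vector128.LoadUnsafe(ref start, (nuint)(hi - n));
                Vector128.Shuffle(back, reverseLanes).StoreUnsafe(ref start, (nuint)lo);
                Vector128.Shuffle(front, reverseLanes).StoreUnsafe(ref start, (nuint)(hi - n));
                lo += n;
                hi -= n;
            }
        }

        // Scalar fallback: short spans and the leftover middle elements.
        for (int i = lo, j = hi - 1; i < j; i++, j--)
        {
            (span[i], span[j]) = (span[j], span[i]);
        }
    }
}
```

Calling `VectorReverseSketch.Reverse(array)` on a 34-element `int[]` would take four vectorized iterations (32 elements) and finish the remaining two elements in the scalar loop.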

Compared against 65a5d0e.

Using this microbenchmark to compare performance, modified to use more buffer sizes and types:

Microbenchmark changes
diff --git a/src/benchmarks/micro/libraries/System.Memory/Span.cs b/src/benchmarks/micro/libraries/System.Memory/Span.cs
index e696e141..75d28d7d 100644
--- a/src/benchmarks/micro/libraries/System.Memory/Span.cs
+++ b/src/benchmarks/micro/libraries/System.Memory/Span.cs
@@ -14,11 +14,20 @@ namespace System.Memory
     [GenericTypeArguments(typeof(byte))]
     [GenericTypeArguments(typeof(char))]
     [GenericTypeArguments(typeof(int))]
+    [GenericTypeArguments(typeof(long))]
+    [GenericTypeArguments(typeof(float))]
+    [GenericTypeArguments(typeof(double))]
     [BenchmarkCategory(Categories.Runtime, Categories.Libraries, Categories.Span)]
     public class Span<T>
         where T : struct, IComparable<T>, IEquatable<T>
     {
-        [Params(Utils.DefaultCollectionSize)]
+        [Params(
+            8 /* No vectorization */,
+            34 /* SSSE3 with leftover */,
+            68 /* AVX2 path with leftover bytes */,
+            Utils.DefaultCollectionSize,
+            Utils.DefaultCollectionSize * 2
+            )]
         public int Size;

         private T[] _array, _same, _emptyWithSingleValue;

Performance results:

$ py .\scripts\benchmarks_ci.py -c Release -f net7.0 --filter *Span*Reverse* --corerun C:\Users\acovingt\source\repos\runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe --bdn-artifacts C:\Users\acovingt\Documents\vectorize-span-reverse-base
$ py .\scripts\benchmarks_ci.py -c Release -f net7.0 --filter *Span*Reverse* --corerun C:\Users\acovingt\source\repos\runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe --bdn-artifacts C:\Users\acovingt\Documents\vectorize-span-reverse-diff
$ cd .\src\tools\ResultsComparer\ 
$ dotnet run -- --base C:\Users\acovingt\Documents\vectorize-span-reverse-base\ --diff C:\Users\acovingt\Documents\vectorize-span-reverse-diff\ --threshold 3% --noise 5ns
summary:
better: 23, geomean: 3.946
total diff: 23

No Slower results for the provided threshold = 3% and noise filter = 5ns.

| Faster                                         | base/diff | Base Median (ns) | Diff Median (ns) | Modality|
| ---------------------------------------------- | ---------:| ----------------:| ----------------:| -------- |
| System.Memory.Span<Byte>.Reverse(Size: 1024)   |     33.45 |           616.32 |            18.42 |         |
| System.Memory.Span<Byte>.Reverse(Size: 512)    |     28.16 |           306.00 |            10.87 |         |
| System.Memory.Span<Char>.Reverse(Size: 512)    |     16.26 |           300.26 |            18.47 |         |
| System.Memory.Span<Char>.Reverse(Size: 1024)   |     14.25 |           609.56 |            42.77 |         |
| System.Memory.Span<Byte>.Reverse(Size: 68)     |      5.75 |            33.59 |             5.84 |         |
| System.Memory.Span<Char>.Reverse(Size: 68)     |      5.54 |            39.56 |             7.14 | bimodal |
| System.Memory.Span<Single>.Reverse(Size: 512)  |      4.57 |           152.19 |            33.33 |         |
| System.Memory.Span<Char>.Reverse(Size: 34)     |      4.13 |            21.56 |             5.22 | bimodal |
| System.Memory.Span<Single>.Reverse(Size: 1024) |      3.93 |           303.01 |            77.20 |         |
| System.Memory.Span<Int32>.Reverse(Size: 1024)  |      3.88 |           307.61 |            79.20 |         |
| System.Memory.Span<Int32>.Reverse(Size: 512)   |      3.67 |           154.75 |            42.20 |         |
| System.Memory.Span<Byte>.Reverse(Size: 34)     |      3.03 |            15.56 |             5.14 | several?|
| System.Memory.Span<Int32>.Reverse(Size: 68)    |      2.59 |            23.28 |             8.98 |         |
| System.Memory.Span<Single>.Reverse(Size: 68)   |      2.45 |            19.77 |             8.06 |         |
| System.Memory.Span<Single>.Reverse(Size: 34)   |      2.21 |            12.41 |             5.62 | several?|
| System.Memory.Span<Int64>.Reverse(Size: 1024)  |      2.01 |           302.07 |           150.24 |         |
| System.Memory.Span<Int32>.Reverse(Size: 34)    |      1.99 |            15.11 |             7.61 |         |
| System.Memory.Span<Int64>.Reverse(Size: 512)   |      1.98 |           153.95 |            77.64 |         |
| System.Memory.Span<Double>.Reverse(Size: 512)  |      1.97 |           152.21 |            77.18 |         |
| System.Memory.Span<Double>.Reverse(Size: 1024) |      1.93 |           301.68 |           156.33 |         |
| System.Memory.Span<Int64>.Reverse(Size: 68)    |      1.91 |            23.54 |            12.35 |         |
| System.Memory.Span<Int64>.Reverse(Size: 34)    |      1.76 |            14.90 |             8.49 |         |
| System.Memory.Span<Double>.Reverse(Size: 68)   |      1.64 |            19.80 |            12.08 |         |

Please let me know if I can provide any other information. Thanks!

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Jan 27, 2022
@ghost

ghost commented Jan 27, 2022

Tagging subscribers to this area: @dotnet/area-system-memory
See info in area-owners.md if you want to be subscribed.

Author: alexcovington
Assignees: -
Labels:

area-System.Memory, community-contribution

Milestone: -

@stephentoub
Member

Thanks for sharing the perf tests. The results only show down to a size of 32 elements, and all of them show improvements. Is there a smaller size at which this is actually a regression?

@stephentoub
Member

Array.Reverse<T> has its own almost identical implementation that won't benefit from these improvements. Can we change Array to delegate to the same underlying implementation being added here so that both arrays and spans benefit equally?

public static void Reverse<T>(T[] array)
{
    if (array == null)
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);
    Reverse(array, 0, array.Length);
}

public static void Reverse<T>(T[] array, int index, int length)
{
    if (array == null)
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);
    if (index < 0)
        ThrowHelper.ThrowIndexArgumentOutOfRange_NeedNonNegNumException();
    if (length < 0)
        ThrowHelper.ThrowLengthArgumentOutOfRange_ArgumentOutOfRange_NeedNonNegNum();
    if (array.Length - index < length)
        ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_InvalidOffLen);

    if (length <= 1)
        return;

    ref T first = ref Unsafe.Add(ref MemoryMarshal.GetArrayDataReference(array), index);
    ref T last = ref Unsafe.Add(ref Unsafe.Add(ref first, length), -1);
    do
    {
        T temp = first;
        first = last;
        last = temp;
        first = ref Unsafe.Add(ref first, 1);
        last = ref Unsafe.Add(ref last, -1);
    } while (Unsafe.IsAddressLessThan(ref first, ref last));
}
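
One way the suggested delegation could look, as a minimal sketch outside the runtime: validate the arguments as before, then hand the range to the span-based `Reverse` so both entry points share one implementation. The class name `ArrayReverseSketch` and the exception messages are assumptions for illustration; the runtime itself uses its internal `ThrowHelper` and may share the code through a different internal helper.

```csharp
using System;

// Hypothetical wrapper for illustration; names and messages are assumptions.
public static class ArrayReverseSketch
{
    public static void Reverse<T>(T[] array, int index, int length)
    {
        if (array is null)
            throw new ArgumentNullException(nameof(array));
        if (index < 0)
            throw new ArgumentOutOfRangeException(nameof(index));
        if (length < 0)
            throw new ArgumentOutOfRangeException(nameof(length));
        if (array.Length - index < length)
            throw new ArgumentException("Offset and length are out of bounds for the array.");

        // Delegate to the span implementation so any fast path added there
        // also serves arrays.
        array.AsSpan(index, length).Reverse();
    }
}
```

Note that `AsSpan` throws `ArrayTypeMismatchException` for covariant arrays (for example a `string[]` passed as `object[]`), so a real implementation would need to decide how to treat that case.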

@alexcovington
Contributor Author

> Thanks for sharing the perf tests. The results only show down to a size of 32 elements, and all of them show improvements. Is there a smaller size at which this is actually a regression?

@stephentoub I hadn't noticed that the table omitted the results for spans of size 8. I loosened the filter to be more inclusive when comparing results:

PS C:\Users\acovingt\source\repos\performance\src\tools\ResultsComparer> dotnet run -- --base C:\Users\acovingt\Documents\vectorize-span-reverse-base\ --diff C:\Users\acovingt\Documents\vectorize-span-reverse-diff\ --threshold 1% --noise 1ns
summary:
better: 24, geomean: 3.806
worse: 4, geomean: 1.463
total diff: 28

| Slower                                      | diff/base | Base Median (ns) | Diff Median (ns) | Modality|
| ------------------------------------------- | ---------:| ----------------:| ----------------:| --------:|
| System.Memory.Span<Double>.Reverse(Size: 8) |      1.69 |             4.20 |             7.08 |         |
| System.Memory.Span<Single>.Reverse(Size: 8) |      1.55 |             4.21 |             6.53 |         |
| System.Memory.Span<Int32>.Reverse(Size: 8)  |      1.34 |             5.02 |             6.74 |         |
| System.Memory.Span<Byte>.Reverse(Size: 8)   |      1.30 |             5.24 |             6.82 |         |

| Faster                                         | base/diff | Base Median (ns) | Diff Median (ns) | Modality|
| ---------------------------------------------- | ---------:| ----------------:| ----------------:| -------- |
| System.Memory.Span<Byte>.Reverse(Size: 1024)   |     33.45 |           616.32 |            18.42 |         |
| System.Memory.Span<Byte>.Reverse(Size: 512)    |     28.16 |           306.00 |            10.87 |         |
| System.Memory.Span<Char>.Reverse(Size: 512)    |     16.26 |           300.26 |            18.47 |         |
| System.Memory.Span<Char>.Reverse(Size: 1024)   |     14.25 |           609.56 |            42.77 |         |
| System.Memory.Span<Byte>.Reverse(Size: 68)     |      5.75 |            33.59 |             5.84 |         |
| System.Memory.Span<Char>.Reverse(Size: 68)     |      5.54 |            39.56 |             7.14 | bimodal |
| System.Memory.Span<Single>.Reverse(Size: 512)  |      4.57 |           152.19 |            33.33 |         |
| System.Memory.Span<Char>.Reverse(Size: 34)     |      4.13 |            21.56 |             5.22 | bimodal |
| System.Memory.Span<Single>.Reverse(Size: 1024) |      3.93 |           303.01 |            77.20 |         |
| System.Memory.Span<Int32>.Reverse(Size: 1024)  |      3.88 |           307.61 |            79.20 |         |
| System.Memory.Span<Int32>.Reverse(Size: 512)   |      3.67 |           154.75 |            42.20 |         |
| System.Memory.Span<Byte>.Reverse(Size: 34)     |      3.03 |            15.56 |             5.14 | several?|
| System.Memory.Span<Int32>.Reverse(Size: 68)    |      2.59 |            23.28 |             8.98 |         |
| System.Memory.Span<Single>.Reverse(Size: 68)   |      2.45 |            19.77 |             8.06 |         |
| System.Memory.Span<Single>.Reverse(Size: 34)   |      2.21 |            12.41 |             5.62 | several?|
| System.Memory.Span<Int64>.Reverse(Size: 1024)  |      2.01 |           302.07 |           150.24 |         |
| System.Memory.Span<Int32>.Reverse(Size: 34)    |      1.99 |            15.11 |             7.61 |         |
| System.Memory.Span<Int64>.Reverse(Size: 512)   |      1.98 |           153.95 |            77.64 |         |
| System.Memory.Span<Double>.Reverse(Size: 512)  |      1.97 |           152.21 |            77.18 |         |
| System.Memory.Span<Double>.Reverse(Size: 1024) |      1.93 |           301.68 |           156.33 |         |
| System.Memory.Span<Int64>.Reverse(Size: 68)    |      1.91 |            23.54 |            12.35 |         |
| System.Memory.Span<Int64>.Reverse(Size: 34)    |      1.76 |            14.90 |             8.49 |         |
| System.Memory.Span<Double>.Reverse(Size: 34)   |      1.66 |            12.35 |             7.44 |         |
| System.Memory.Span<Double>.Reverse(Size: 68)   |      1.64 |            19.80 |            12.08 |         |

There's a regression of roughly 1.5-3 ns for the size-8 spans, possibly due to the overhead of the extra conditional checks, though it could just be noise.

Let me know if you'd like more analysis on it.
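
To see why the size-8 rows only appear with the looser filter, here is a simplified sketch of how a threshold-plus-noise filter could behave. It is an assumption about the comparer, not its actual code: the class `NoiseFilterSketch`, the method `IsReported`, and the exact formulas are hypothetical, and the sketch ignores any statistical testing the real tool may perform.

```csharp
using System;

// Hypothetical stand-in for the comparer's filtering; formulas are assumptions.
public static class NoiseFilterSketch
{
    // A pair of medians is reported only if it differs by at least noiseNs
    // in absolute terms and by at least thresholdPercent in relative terms.
    public static bool IsReported(double baseNs, double diffNs, double thresholdPercent, double noiseNs)
    {
        double absolute = Math.Abs(baseNs - diffNs);
        double relative = absolute / Math.Min(baseNs, diffNs) * 100.0;
        return absolute >= noiseNs && relative >= thresholdPercent;
    }
}
```

Under these assumptions, `IsReported(4.20, 7.08, 3, 5)` is false (the 2.88 ns gap is below the 5 ns noise filter) while `IsReported(4.20, 7.08, 1, 1)` is true. The same reasoning is consistent with `Span<Double>.Reverse(Size: 34)`, whose medians differ by 4.91 ns, showing up only in the second table.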

@alexcovington
Contributor Author

> Array.Reverse<T> has its own almost identical implementation that won't benefit from these improvements. Can we change Array to delegate to the same underlying implementation being added here so that both arrays and spans benefit equally?

I hadn't noticed this, but you're right that Array can benefit as well. I'll try adding the vectorized paths there too and send an update.

@alexcovington
Contributor Author

@stephentoub Updated PR to include the optimizations for Array.Reverse as well.

To verify the performance, I did a quick copy of the same benchmark and had it reverse the original array instead.

Microbenchmark changes
@@ -14,11 +14,20 @@ namespace System.Memory
     [GenericTypeArguments(typeof(byte))]
     [GenericTypeArguments(typeof(char))]
     [GenericTypeArguments(typeof(int))]
+    [GenericTypeArguments(typeof(long))]
+    [GenericTypeArguments(typeof(float))]
+    [GenericTypeArguments(typeof(double))]
     [BenchmarkCategory(Categories.Runtime, Categories.Libraries, Categories.Span)]
     public class Span<T>
         where T : struct, IComparable<T>, IEquatable<T>
     {
-        [Params(Utils.DefaultCollectionSize)]
+        [Params(
+            8 /* No vectorization */,
+            34 /* SSSE3 with leftover */,
+            68 /* AVX2 path with leftover bytes */,
+            Utils.DefaultCollectionSize,
+            Utils.DefaultCollectionSize * 2
+            )]
         public int Size;

         private T[] _array, _same, _emptyWithSingleValue;
@@ -41,6 +50,9 @@ namespace System.Memory
         [Benchmark]
         public void Reverse() => new System.Span<T>(_array).Reverse();

+        [Benchmark]
+        public void ReverseArray() => _array.Reverse();
+
         [Benchmark]
         public T[] ToArray() => new System.Span<T>(_array).ToArray();
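
One thing worth checking when reading the `ReverseArray` benchmark: with the C# version current at the time and `System.Linq` in scope, a call written as `_array.Reverse()` on a `T[]` receiver binds to the LINQ extension `Enumerable.Reverse`, which returns a lazy sequence and leaves the array untouched, whereas `Array.Reverse(_array)` reverses in place. The sketch below calls both explicitly to show the difference; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical demo class; contrasts the two Reverse entry points.
public static class ReverseBindingSketch
{
    // Returns true if the LINQ call modified the array.
    public static bool LinqReverseMutates()
    {
        int[] data = { 1, 2, 3 };
        _ = Enumerable.Reverse(data); // creates a lazy sequence; nothing is enumerated
        return data[0] != 1;
    }

    // Returns true if Array.Reverse modified the array.
    public static bool ArrayReverseMutates()
    {
        int[] data = { 1, 2, 3 };
        Array.Reverse(data); // in-place reversal
        return data[0] != 1;
    }
}
```

If the benchmark is meant to exercise the in-place path, `Array.Reverse(_array)` would be the unambiguous spelling.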
Performance results
BenchmarkDotNet=v0.13.1.1669-nightly, OS=Windows 10 (10.0.19044.1415/21H2/November2021Update)
AMD Ryzen 5 3600, 1 CPU, 12 logical and 6 physical cores
.NET SDK=7.0.100-preview.2.22078.1
  [Host]     : .NET 7.0.0 (7.0.22.7408), X64 RyuJIT
  Job-FHRFIJ : .NET 7.0.0 (42.42.42.42424), X64 RyuJIT
  Job-KGLSXA : .NET 7.0.0 (42.42.42.42424), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable,-bl:benchmarkdotnet.binlog  IterationTime=250.0000 ms  
MaxIterationCount=20  MinIterationCount=15  WarmupCount=1  
| Type | Method | Job | Toolchain | Size | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Gen 0 | Allocated | Alloc Ratio |
| --- | --- | --- | --- | ---:| ---:| ---:| ---:| ---:| ---:| ---:| ---:| ---:| ---:| ---:| ---:|
| Span<Byte> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 5.520 ns | 0.3235 ns | 0.3595 ns | 5.408 ns | 5.130 ns | 6.543 ns | 1.00 | 0.00 | - | - | NA |
| Span<Byte> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 5.266 ns | 0.1361 ns | 0.1456 ns | 5.237 ns | 5.047 ns | 5.545 ns | 0.96 | 0.07 | - | - | NA |
| Span<Char> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 5.577 ns | 0.2455 ns | 0.2827 ns | 5.525 ns | 5.261 ns | 6.255 ns | 1.00 | 0.00 | - | - | NA |
| Span<Char> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 5.160 ns | 0.1787 ns | 0.1987 ns | 5.107 ns | 4.780 ns | 5.524 ns | 0.93 | 0.06 | - | - | NA |
| Span<Double> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 5.019 ns | 0.2419 ns | 0.2689 ns | 5.010 ns | 4.598 ns | 5.424 ns | 1.00 | 0.00 | - | - | NA |
| Span<Double> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 3.372 ns | 0.0810 ns | 0.0676 ns | 3.369 ns | 3.237 ns | 3.494 ns | 0.67 | 0.03 | - | - | NA |
| Span<Int32> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 5.367 ns | 0.3346 ns | 0.3853 ns | 5.219 ns | 4.900 ns | 6.279 ns | 1.00 | 0.00 | - | - | NA |
| Span<Int32> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 4.585 ns | 0.1160 ns | 0.1028 ns | 4.581 ns | 4.423 ns | 4.781 ns | 0.84 | 0.07 | - | - | NA |
| Span<Int64> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 5.366 ns | 0.1760 ns | 0.1956 ns | 5.313 ns | 5.069 ns | 5.743 ns | 1.00 | 0.00 | - | - | NA |
| Span<Int64> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 3.281 ns | 0.0995 ns | 0.1065 ns | 3.258 ns | 3.104 ns | 3.492 ns | 0.61 | 0.03 | - | - | NA |
| Span<Single> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 5.096 ns | 0.3850 ns | 0.4279 ns | 4.928 ns | 4.449 ns | 6.003 ns | 1.00 | 0.00 | - | - | NA |
| Span<Single> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 5.450 ns | 0.2560 ns | 0.2846 ns | 5.336 ns | 4.920 ns | 6.008 ns | 1.08 | 0.12 | - | - | NA |
| Span<Byte> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 7.587 ns | 0.2562 ns | 0.2847 ns | 7.534 ns | 7.171 ns | 8.182 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Byte> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 8.353 ns | 0.2116 ns | 0.2352 ns | 8.403 ns | 7.905 ns | 8.828 ns | 1.10 | 0.04 | 0.0057 | 48 B | 1.00 |
| Span<Char> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 7.876 ns | 0.2675 ns | 0.2863 ns | 7.863 ns | 7.416 ns | 8.510 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Char> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 8.004 ns | 0.4029 ns | 0.4478 ns | 7.962 ns | 7.293 ns | 8.817 ns | 1.02 | 0.06 | 0.0057 | 48 B | 1.00 |
| Span<Double> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 8.090 ns | 0.3468 ns | 0.3711 ns | 8.000 ns | 7.534 ns | 8.745 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Double> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 7.939 ns | 0.3305 ns | 0.3806 ns | 7.929 ns | 7.438 ns | 8.833 ns | 0.99 | 0.08 | 0.0057 | 48 B | 1.00 |
| Span<Int32> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 8.231 ns | 0.4143 ns | 0.4771 ns | 8.007 ns | 7.617 ns | 9.212 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Int32> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 8.063 ns | 0.3655 ns | 0.4210 ns | 8.143 ns | 7.353 ns | 9.113 ns | 0.98 | 0.08 | 0.0057 | 48 B | 1.00 |
| Span<Int64> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 9.724 ns | 0.5913 ns | 0.6809 ns | 9.604 ns | 8.832 ns | 11.157 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Int64> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 7.725 ns | 0.3031 ns | 0.3369 ns | 7.730 ns | 7.026 ns | 8.475 ns | 0.80 | 0.06 | 0.0057 | 48 B | 1.00 |
| Span<Single> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 7.824 ns | 0.2631 ns | 0.2702 ns | 7.771 ns | 7.417 ns | 8.564 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Single> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 8 | 7.809 ns | 0.3318 ns | 0.3821 ns | 7.761 ns | 7.206 ns | 8.428 ns | 1.01 | 0.05 | 0.0057 | 48 B | 1.00 |
| Span<Byte> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 19.987 ns | 1.4561 ns | 1.6768 ns | 20.243 ns | 16.980 ns | 22.752 ns | 1.00 | 0.00 | - | - | NA |
| Span<Byte> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 3.772 ns | 0.2250 ns | 0.2592 ns | 3.765 ns | 3.346 ns | 4.285 ns | 0.19 | 0.01 | - | - | NA |
| Span<Char> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 18.797 ns | 0.3681 ns | 0.3443 ns | 18.846 ns | 18.237 ns | 19.413 ns | 1.00 | 0.00 | - | - | NA |
| Span<Char> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 4.134 ns | 0.2428 ns | 0.2597 ns | 4.101 ns | 3.823 ns | 4.735 ns | 0.22 | 0.02 | - | - | NA |
| Span<Double> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 14.286 ns | 0.3942 ns | 0.4540 ns | 14.150 ns | 13.689 ns | 15.292 ns | 1.00 | 0.00 | - | - | NA |
| Span<Double> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 7.089 ns | 0.3223 ns | 0.3582 ns | 7.022 ns | 6.581 ns | 7.855 ns | 0.50 | 0.03 | - | - | NA |
| Span<Int32> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 15.881 ns | 0.4342 ns | 0.5000 ns | 15.744 ns | 15.185 ns | 16.850 ns | 1.00 | 0.00 | - | - | NA |
| Span<Int32> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 4.948 ns | 0.2778 ns | 0.3199 ns | 4.954 ns | 4.458 ns | 5.643 ns | 0.31 | 0.02 | - | - | NA |
| Span<Int64> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 15.256 ns | 0.3355 ns | 0.3729 ns | 15.151 ns | 14.861 ns | 16.243 ns | 1.00 | 0.00 | - | - | NA |
| Span<Int64> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 6.799 ns | 0.2343 ns | 0.2698 ns | 6.805 ns | 6.303 ns | 7.297 ns | 0.45 | 0.02 | - | - | NA |
| Span<Single> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 14.213 ns | 0.4007 ns | 0.4453 ns | 14.075 ns | 13.660 ns | 15.193 ns | 1.00 | 0.00 | - | - | NA |
| Span<Single> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 4.810 ns | 0.2720 ns | 0.2910 ns | 4.724 ns | 4.462 ns | 5.345 ns | 0.34 | 0.02 | - | - | NA |
| Span<Byte> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 8.077 ns | 0.6692 ns | 0.7706 ns | 8.044 ns | 7.043 ns | 9.612 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Byte> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 7.788 ns | 0.4962 ns | 0.5714 ns | 7.681 ns | 7.107 ns | 8.911 ns | 0.97 | 0.08 | 0.0057 | 48 B | 1.00 |
| Span<Char> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 7.844 ns | 0.5070 ns | 0.5839 ns | 7.713 ns | 7.224 ns | 9.234 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Char> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 9.578 ns | 1.5863 ns | 1.8268 ns | 8.997 ns | 7.225 ns | 13.210 ns | 1.23 | 0.25 | 0.0057 | 48 B | 1.00 |
| Span<Double> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 7.779 ns | 0.1663 ns | 0.1474 ns | 7.751 ns | 7.554 ns | 8.004 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Double> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 7.673 ns | 0.2380 ns | 0.2646 ns | 7.632 ns | 7.274 ns | 8.135 ns | 0.98 | 0.04 | 0.0057 | 48 B | 1.00 |
| Span<Int32> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 7.794 ns | 0.2142 ns | 0.2381 ns | 7.749 ns | 7.446 ns | 8.395 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Int32> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 7.672 ns | 0.3186 ns | 0.3541 ns | 7.593 ns | 7.256 ns | 8.481 ns | 0.98 | 0.04 | 0.0057 | 48 B | 1.00 |
| Span<Int64> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 9.618 ns | 0.3395 ns | 0.3909 ns | 9.571 ns | 9.099 ns | 10.511 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Int64> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 7.367 ns | 0.3446 ns | 0.3538 ns | 7.273 ns | 6.963 ns | 8.370 ns | 0.76 | 0.06 | 0.0057 | 48 B | 1.00 |
| Span<Single> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 7.862 ns | 0.3622 ns | 0.4171 ns | 7.650 ns | 7.316 ns | 8.645 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
| Span<Single> | ReverseArray | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 34 | 7.850 ns | 0.3969 ns | 0.4411 ns | 7.840 ns | 7.281 ns | 8.807 ns | 1.00 | 0.09 | 0.0057 | 48 B | 1.00 |
| Span<Byte> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 34.032 ns | 1.1898 ns | 1.2731 ns | 33.821 ns | 32.554 ns | 36.661 ns | 1.00 | 0.00 | - | - | NA |
| Span<Byte> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 4.381 ns | 0.1225 ns | 0.1311 ns | 4.425 ns | 4.091 ns | 4.608 ns | 0.13 | 0.01 | - | - | NA |
| Span<Char> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 34.295 ns | 0.7447 ns | 0.6966 ns | 34.213 ns | 33.174 ns | 35.414 ns | 1.00 | 0.00 | - | - | NA |
| Span<Char> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 5.452 ns | 0.2268 ns | 0.2611 ns | 5.440 ns | 5.078 ns | 5.893 ns | 0.16 | 0.01 | - | - | NA |
| Span<Double> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 24.392 ns | 1.0467 ns | 1.0748 ns | 24.309 ns | 23.170 ns | 27.153 ns | 1.00 | 0.00 | - | - | NA |
| Span<Double> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 15.108 ns | 0.2478 ns | 0.2069 ns | 15.092 ns | 14.856 ns | 15.580 ns | 0.62 | 0.03 | - | - | NA |
| Span<Int32> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 24.522 ns | 0.7755 ns | 0.8930 ns | 24.090 ns | 23.431 ns | 26.373 ns | 1.00 | 0.00 | - | - | NA |
| Span<Int32> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 7.718 ns | 0.2908 ns | 0.3349 ns | 7.766 ns | 7.222 ns | 8.354 ns | 0.32 | 0.02 | - | - | NA |
| Span<Int64> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 24.862 ns | 0.7633 ns | 0.8484 ns | 24.687 ns | 23.698 ns | 26.954 ns | 1.00 | 0.00 | - | - | NA |
| Span<Int64> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 10.736 ns | 0.4633 ns | 0.5149 ns | 10.555 ns | 10.111 ns | 11.744 ns | 0.43 | 0.03 | - | - | NA |
| Span<Single> | Reverse | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 24.264 ns | 1.1811 ns | 1.3602 ns | 23.699 ns | 22.872 ns | 26.753 ns | 1.00 | 0.00 | - | - | NA |
| Span<Single> | Reverse | Job-KGLSXA | \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 8.674 ns | 0.2843 ns | 0.3160 ns | 8.662 ns | 8.160 ns | 9.195 ns | 0.36 | 0.02 | - | - | NA |
| Span<Byte> | ReverseArray | Job-FHRFIJ | \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe | 68 | 7.554 ns | 0.2177 ns | 0.2507 ns | 7.497 ns | 7.024 ns | 8.018 ns | 1.00 | 0.00 | 0.0057 | 48 B | 1.00 |
Span<Byte> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 68 7.463 ns 0.2131 ns 0.2369 ns 7.450 ns 7.114 ns 7.988 ns 0.99 0.05 0.0057 48 B 1.00
Span<Char> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 68 7.896 ns 0.3654 ns 0.3909 ns 7.884 ns 7.302 ns 8.653 ns 1.00 0.00 0.0057 48 B 1.00
Span<Char> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 68 7.856 ns 0.4322 ns 0.4978 ns 7.765 ns 7.275 ns 9.036 ns 0.99 0.06 0.0057 48 B 1.00
Span<Double> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 68 8.021 ns 0.3683 ns 0.4242 ns 8.009 ns 7.412 ns 8.776 ns 1.00 0.00 0.0057 48 B 1.00
Span<Double> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 68 7.520 ns 0.1368 ns 0.1142 ns 7.543 ns 7.334 ns 7.714 ns 0.94 0.05 0.0057 48 B 1.00
Span<Int32> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 68 7.878 ns 0.4089 ns 0.4709 ns 7.818 ns 7.351 ns 8.969 ns 1.00 0.00 0.0057 48 B 1.00
Span<Int32> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 68 7.759 ns 0.3057 ns 0.3398 ns 7.656 ns 7.306 ns 8.605 ns 0.99 0.06 0.0057 48 B 1.00
Span<Int64> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 68 9.062 ns 0.2920 ns 0.3362 ns 9.197 ns 8.153 ns 9.445 ns 1.00 0.00 0.0057 48 B 1.00
Span<Int64> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 68 7.693 ns 0.4299 ns 0.4951 ns 7.554 ns 7.178 ns 9.011 ns 0.85 0.05 0.0057 48 B 1.00
Span<Single> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 68 7.918 ns 0.2114 ns 0.2262 ns 7.910 ns 7.506 ns 8.302 ns 1.00 0.00 0.0057 48 B 1.00
Span<Single> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 68 7.989 ns 0.4591 ns 0.5287 ns 7.913 ns 7.267 ns 9.029 ns 1.01 0.08 0.0057 48 B 1.00
Span<Byte> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 308.148 ns 1.9129 ns 1.7894 ns 307.750 ns 305.312 ns 311.041 ns 1.00 0.00 - - NA
Span<Byte> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 9.791 ns 0.5977 ns 0.6883 ns 9.736 ns 8.779 ns 11.055 ns 0.03 0.00 - - NA
Span<Char> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 308.872 ns 4.4762 ns 4.1870 ns 307.695 ns 303.549 ns 317.944 ns 1.00 0.00 - - NA
Span<Char> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 24.216 ns 0.4776 ns 0.5110 ns 24.171 ns 23.394 ns 25.113 ns 0.08 0.00 - - NA
Span<Double> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 167.078 ns 6.1122 ns 6.7936 ns 167.062 ns 155.270 ns 177.899 ns 1.00 0.00 - - NA
Span<Double> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 79.162 ns 1.4420 ns 1.2783 ns 79.150 ns 76.854 ns 81.755 ns 0.47 0.02 - - NA
Span<Int32> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 159.908 ns 3.2113 ns 3.2978 ns 159.484 ns 154.711 ns 165.094 ns 1.00 0.00 - - NA
Span<Int32> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 42.780 ns 1.8013 ns 2.0744 ns 42.565 ns 37.955 ns 46.418 ns 0.27 0.02 - - NA
Span<Int64> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 159.236 ns 3.6277 ns 3.8816 ns 159.271 ns 152.683 ns 166.493 ns 1.00 0.00 - - NA
Span<Int64> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 74.144 ns 2.0413 ns 2.3508 ns 73.866 ns 70.691 ns 78.616 ns 0.46 0.02 - - NA
Span<Single> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 166.105 ns 6.8180 ns 7.8516 ns 163.366 ns 157.062 ns 182.688 ns 1.00 0.00 - - NA
Span<Single> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 41.930 ns 1.2742 ns 1.4674 ns 41.705 ns 39.787 ns 44.695 ns 0.25 0.01 - - NA
Span<Byte> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 7.418 ns 0.2660 ns 0.2846 ns 7.373 ns 6.995 ns 8.044 ns 1.00 0.00 0.0057 48 B 1.00
Span<Byte> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 7.452 ns 0.3129 ns 0.3477 ns 7.431 ns 7.026 ns 8.084 ns 1.00 0.07 0.0057 48 B 1.00
Span<Char> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 7.895 ns 0.2964 ns 0.3294 ns 7.934 ns 7.330 ns 8.437 ns 1.00 0.00 0.0057 48 B 1.00
Span<Char> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 7.862 ns 0.2800 ns 0.3224 ns 7.770 ns 7.219 ns 8.578 ns 1.00 0.07 0.0057 48 B 1.00
Span<Double> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 8.273 ns 0.3909 ns 0.4501 ns 8.225 ns 7.613 ns 9.046 ns 1.00 0.00 0.0057 48 B 1.00
Span<Double> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 8.043 ns 0.3316 ns 0.3686 ns 7.953 ns 7.540 ns 8.741 ns 0.98 0.07 0.0057 48 B 1.00
Span<Int32> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 7.767 ns 0.2597 ns 0.2887 ns 7.772 ns 7.318 ns 8.222 ns 1.00 0.00 0.0057 48 B 1.00
Span<Int32> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 7.767 ns 0.2868 ns 0.3069 ns 7.805 ns 7.165 ns 8.346 ns 1.00 0.07 0.0057 48 B 1.00
Span<Int64> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 9.678 ns 0.5640 ns 0.6269 ns 9.615 ns 8.812 ns 11.005 ns 1.00 0.00 0.0057 48 B 1.00
Span<Int64> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 7.483 ns 0.3955 ns 0.4396 ns 7.413 ns 6.938 ns 8.390 ns 0.78 0.07 0.0057 48 B 1.00
Span<Single> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 8.161 ns 0.3820 ns 0.4087 ns 8.140 ns 7.579 ns 8.966 ns 1.00 0.00 0.0057 48 B 1.00
Span<Single> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 512 7.750 ns 0.2831 ns 0.3147 ns 7.681 ns 7.376 ns 8.542 ns 0.95 0.06 0.0057 48 B 1.00
Span<Byte> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 633.223 ns 8.3046 ns 7.7682 ns 632.430 ns 620.162 ns 647.521 ns 1.00 0.00 - - NA
Span<Byte> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 21.942 ns 2.2872 ns 2.6340 ns 23.206 ns 17.887 ns 26.030 ns 0.03 0.00 - - NA
Span<Char> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 616.793 ns 10.7145 ns 10.0224 ns 615.495 ns 603.050 ns 639.831 ns 1.00 0.00 - - NA
Span<Char> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 41.262 ns 1.0656 ns 1.1845 ns 41.207 ns 39.861 ns 44.710 ns 0.07 0.00 - - NA
Span<Double> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 322.755 ns 13.7626 ns 15.8490 ns 316.026 ns 300.807 ns 353.163 ns 1.00 0.00 - - NA
Span<Double> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 160.278 ns 3.6299 ns 4.1802 ns 160.077 ns 155.045 ns 170.722 ns 0.50 0.03 - - NA
Span<Int32> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 319.603 ns 8.6402 ns 9.9500 ns 321.806 ns 302.775 ns 333.367 ns 1.00 0.00 - - NA
Span<Int32> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 75.303 ns 1.8537 ns 2.1347 ns 75.466 ns 71.836 ns 79.226 ns 0.24 0.01 - - NA
Span<Int64> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 312.095 ns 7.5790 ns 8.7280 ns 308.705 ns 302.279 ns 331.895 ns 1.00 0.00 - - NA
Span<Int64> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 162.231 ns 3.8566 ns 4.2866 ns 162.476 ns 155.171 ns 170.240 ns 0.52 0.02 - - NA
Span<Single> Reverse Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 320.737 ns 10.5037 ns 11.6748 ns 321.167 ns 302.824 ns 345.795 ns 1.00 0.00 - - NA
Span<Single> Reverse Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 79.105 ns 1.5449 ns 1.4451 ns 79.080 ns 76.844 ns 81.631 ns 0.25 0.01 - - NA
Span<Byte> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 7.848 ns 0.6288 ns 0.7241 ns 7.610 ns 6.913 ns 9.330 ns 1.00 0.00 0.0057 48 B 1.00
Span<Byte> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 7.390 ns 0.2733 ns 0.3148 ns 7.373 ns 6.872 ns 8.104 ns 0.95 0.09 0.0057 48 B 1.00
Span<Char> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 8.250 ns 0.4993 ns 0.5750 ns 8.298 ns 7.401 ns 9.308 ns 1.00 0.00 0.0057 48 B 1.00
Span<Char> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 7.871 ns 0.2963 ns 0.3412 ns 7.870 ns 7.388 ns 8.526 ns 0.96 0.07 0.0057 48 B 1.00
Span<Double> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 7.799 ns 0.3453 ns 0.3837 ns 7.745 ns 7.263 ns 8.400 ns 1.00 0.00 0.0057 48 B 1.00
Span<Double> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 7.917 ns 0.4342 ns 0.5000 ns 7.966 ns 7.250 ns 8.866 ns 1.02 0.09 0.0057 48 B 1.00
Span<Int32> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 8.124 ns 0.6008 ns 0.6919 ns 8.000 ns 7.292 ns 9.662 ns 1.00 0.00 0.0057 48 B 1.00
Span<Int32> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 7.962 ns 0.4545 ns 0.5052 ns 7.826 ns 7.385 ns 9.180 ns 0.98 0.08 0.0057 48 B 1.00
Span<Int64> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 9.337 ns 0.3945 ns 0.4385 ns 9.382 ns 8.083 ns 10.056 ns 1.00 0.00 0.0057 48 B 1.00
Span<Int64> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 7.301 ns 0.2937 ns 0.3382 ns 7.216 ns 6.798 ns 8.042 ns 0.78 0.06 0.0057 48 B 1.00
Span<Single> ReverseArray Job-FHRFIJ \runtime-master\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 10.709 ns 0.6433 ns 0.7150 ns 10.478 ns 9.719 ns 12.393 ns 1.00 0.00 0.0057 48 B 1.00
Span<Single> ReverseArray Job-KGLSXA \runtime\artifacts\bin\testhost\net7.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\corerun.exe 1024 7.895 ns 0.4347 ns 0.4831 ns 7.682 ns 7.318 ns 8.894 ns 0.74 0.05 0.0057 48 B 1.00

Please let me know if there is anything else I can look at.

return;
}
}
ReverseInner(array, index, length);
Member

Do we need to duplicate all of the above? How about instead having a single method in SpanHelpers:

public static void Reverse<T>(ref T elements, nuint length);

or something similar. That method can do all of the delegation to the other helpers (ReverseByRef, ReverseInner, etc.), and then this method can be:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static void Reverse<T>(T[] array, int index, int length)
{
    ... // argument validation
    SpanHelpers.Reverse(ref Unsafe.Add(ref MemoryMarshal.GetArrayDataReference(array), index), length);
}

and Span.Reverse can be:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static void Reverse<T>(this Span<T> span) =>
    SpanHelpers.Reverse(ref MemoryMarshal.GetReference(span), span.Length);

?
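A minimal sketch of the shape suggested above, assuming a single ref-based core routine that both entry points delegate to. The names `ReverseSketch`, `ReverseCore`, `ReverseArray`, and `ReverseSpan` are hypothetical, and the core shown here is only the scalar swap loop, not the vectorized code from this PR.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical stand-in for SpanHelpers; not the runtime's actual code.
static class ReverseSketch
{
    // Shared core: swaps elements from both ends until the refs meet.
    public static void ReverseCore<T>(ref T elements, nuint length)
    {
        if (length <= 1)
        {
            return;
        }

        ref T first = ref elements;
        ref T last = ref Unsafe.Add(ref first, (nint)length - 1);
        do
        {
            T temp = first;
            first = last;
            last = temp;
            first = ref Unsafe.Add(ref first, 1);
            last = ref Unsafe.Subtract(ref last, 1);
        } while (Unsafe.IsAddressLessThan(ref first, ref last));
    }

    // Array entry point: validate arguments, then delegate.
    public static void ReverseArray<T>(T[] array, int index, int length)
    {
        ArgumentNullException.ThrowIfNull(array);
        if (index < 0 || length < 0 || array.Length - index < length)
        {
            throw new ArgumentOutOfRangeException();
        }

        ReverseCore(ref Unsafe.Add(ref MemoryMarshal.GetArrayDataReference(array), index), (nuint)length);
    }

    // Span entry point: a span already carries a validated range.
    public static void ReverseSpan<T>(Span<T> span) =>
        ReverseCore(ref MemoryMarshal.GetReference(span), (nuint)span.Length);
}
```

With this layout, any vectorized path would only need to be added in one place (the core routine), which is the duplication concern raised in the comment.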

Contributor Author

I see, yes I agree that would be a lot cleaner! I'll go ahead and make the changes and push an update.

Unsafe.As<byte, Vector256<byte>>(ref last) = tempFirst;
first = ref Unsafe.Add(ref first, Vector256<byte>.Count);
last = ref Unsafe.Add(ref last, -Vector256<byte>.Count);
numBytesWritten += Vector256<byte>.Count * 2;
Member

Comments describing what this dense block of code is doing would be helpful.
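For readers following along, here is a commented sketch of the general technique: load one vector from each end, reverse the lanes of each, and store them at the opposite end. This is not the PR's code. It swaps in the portable `Vector128` APIs (available from .NET 7) instead of the `Vector256`/AVX2 path in the diff, and the names `VectorReverseSketch` and `ReverseBytes` are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

// Hypothetical illustration; the PR itself uses Vector256/AVX2 and SSSE3 paths.
static class VectorReverseSketch
{
    public static void ReverseBytes(Span<byte> span)
    {
        ref byte buf = ref MemoryMarshal.GetReference(span);
        int lo = 0;            // index of the first unprocessed byte
        int hi = span.Length;  // one past the last unprocessed byte

        if (Vector128.IsHardwareAccelerated)
        {
            // Lane i of the result takes lane (15 - i) of the input.
            Vector128<byte> reverseMask = Vector128.Create(
                (byte)15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0);

            // Need two non-overlapping vectors: one at the front, one at the back.
            while (hi - lo >= 2 * Vector128<byte>.Count)
            {
                int backStart = hi - Vector128<byte>.Count;

                // Load both ends before storing so neither load sees swapped data.
                Vector128<byte> front = Vector128.LoadUnsafe(ref Unsafe.Add(ref buf, lo));
                Vector128<byte> back = Vector128.LoadUnsafe(ref Unsafe.Add(ref buf, backStart));

                // Reverse each vector and write it to the opposite end.
                Vector128.Shuffle(back, reverseMask).StoreUnsafe(ref Unsafe.Add(ref buf, lo));
                Vector128.Shuffle(front, reverseMask).StoreUnsafe(ref Unsafe.Add(ref buf, backStart));

                lo += Vector128<byte>.Count;
                hi -= Vector128<byte>.Count;
            }
        }

        // Scalar loop for the leftover bytes in the middle.
        while (hi - lo > 1)
        {
            byte temp = span[lo];
            span[lo] = span[hi - 1];
            span[hi - 1] = temp;
            lo++;
            hi--;
        }
    }
}
```

The leftover bytes (fewer than two vectors' worth) are handled by the scalar loop, which mirrors the "with leftover" buffer sizes used in the benchmark parameters.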

Contributor Author

I've added comments to hopefully clear up what the operation is. Please let me know if I can clarify anything.

Member

@stephentoub stephentoub left a comment

Thanks for working on this. Other than my remaining comments, this LGTM, but @tannergooding should sign off as well.

Contributor

@deeprobin deeprobin left a comment

Do not be alarmed. I just noted a few coding style things.

Where I changed names from CamelCase to lowerCamelCase, please double-check that every usage is updated as well, rather than applying the diff 1:1.

Otherwise it looks good to me at first glance.

@saucecontrol
Member

Did you benchmark the AVX2 version against the SSSE3 version (run with DOTNET_EnableAVX2=0 to allow VEX encoding but force the SSSE3 fallback)? I would expect that with the additional permutes required for 256-bit vectors, the perf difference might not justify the extra code paths.
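As a rough illustration of why that environment variable forces the fallback, the sketch below assumes the implementation picks its path from the `IsSupported` properties of the x86 intrinsic classes. `PathProbe` and `SelectedPath` are hypothetical names used only for this example.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

// Hypothetical probe; it only reports which ISA check would succeed first.
static class PathProbe
{
    public static string SelectedPath() =>
        Avx2.IsSupported ? "AVX2"
        : Ssse3.IsSupported ? "SSSE3"
        : "scalar";
}
```

Printing `PathProbe.SelectedPath()` once normally and once with `DOTNET_EnableAVX2=0` set is a quick way to confirm which path a benchmark run is actually exercising.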

@alexcovington
Contributor Author

> Did you benchmark the AVX2 version against the SSSE3 version (run with DOTNET_EnableAVX2=0 to allow VEX encoding but force the SSSE3 fallback)? I would expect that with the additional permutes required for 256-bit vectors, the perf difference might not justify the extra code paths.

@saucecontrol Haven't tried yet, I'll give it a shot and update this thread with results once finished.

@MichalStrehovsky MichalStrehovsky removed their request for review March 21, 2022 01:30
@alexcovington alexcovington force-pushed the vectorize-span-reverse branch from c663fa4 to 7ef62ab Compare April 25, 2022 15:59
ref T last = ref Unsafe.Subtract(ref Unsafe.Add(ref first, (int)length), 1);
do
{
(last, first) = (first, last);
Member

This trick is neat, but it generates worse IL:

https://sharplab.io/#v2:EYLgtghglgdgNAFxFANgHwAIBYAEAVAUwGcEBGAHjwD4AKAJwIDN8dGo6S4cHm8cUIJAJQBYAFABvcThksEBMAAccAXlbsSAbmmy2HBKv6CE2sbKMlD8paYC+48dnzEEAJkq0eLPZ25MWAsLiUmayNIEIXD4IQoY00VwRQnZAA==

Notice that there is one local temp for the existing code and two local temps with the tuples trick.

Is the JIT able to optimize out the extra local temp in all cases, even for larger structs? We do not seem to have coverage for larger structs in dotnet/performance.
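For reference, the two swap forms being compared look like this when written as standalone helpers. This is only an illustrative sketch (the names `SwapSketch`, `SwapWithTuple`, and `SwapWithTemp` are hypothetical); both methods have the same observable behavior, and the difference discussed here is solely in the IL the compiler emits, which may vary by compiler version.

```csharp
using System;

// Hypothetical helpers showing the two source-level forms of the swap.
static class SwapSketch
{
    // Tuple form: concise, but per the comment above it yields an extra local temp in IL.
    public static void SwapWithTuple<T>(ref T first, ref T last)
    {
        (last, first) = (first, last);
    }

    // Explicit-temp form: a single local temp.
    public static void SwapWithTemp<T>(ref T first, ref T last)
    {
        T temp = first;
        first = last;
        last = temp;
    }
}
```

Pasting both methods into a tool such as SharpLab and comparing the locals in the IL view is one way to check the claim for a given compiler version and for larger struct types.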

Contributor Author

The tuples trick was mostly just for styling, but if explicitly using the temp generates better IL then we should probably go with that. I've updated the PR.

Member

@tannergooding tannergooding Apr 25, 2022

This is probably a case where either Roslyn or the JIT could be updated.

Edit: I missed that the other JIT example above was for byref which is probably something only Roslyn can fix...


At a high level....

The (last, first) = (first, last) generates:

IL_0000: ldarg.1
IL_0001: ldarg.2
IL_0002: stloc.0
IL_0003: starg.s last
IL_0005: ldloc.0
IL_0006: starg.s first
IL_0008: ret

The var temp = last; last = first; first = temp; generates:

IL_0000: ldarg.2
IL_0001: ldarg.1
IL_0002: starg.s last
IL_0004: starg.s first
IL_0006: ret


Why can't the JIT optimize this to the same code? All this code does is shuffle values around. There is no memory or computation involved.


I filed an issue in dotnet/roslyn: dotnet/roslyn#61127

Contributor

> Why can't the JIT optimize this to the same code? All this code does is shuffle values around. There is no memory or computation involved.

Likely because it can't "see" that the larger struct case is the same pattern as the smaller one. There's also the question of whether it's a worthwhile pattern to spend time trying to recognise.

@adamsitnik
Member

@alexcovington I've merged #68493, could you please sync your branch?

alexcovington and others added 14 commits April 25, 2022 11:46
…re the same size as char, int, or long that use AVX2 or SSSE3 where possible
Co-authored-by: Theodore Tsirpanis
…cit inlining and moved generic fallbacks into their own private methods.
… reversing empty or single-element array, better shuffle for int and long Reverse using bit control mask instead of vector control mask
…te4x64 and PermuteVar8x32 for Int32 and Int64 respectively to reduce total operations.
@alexcovington alexcovington force-pushed the vectorize-span-reverse branch from 17702cd to 80ae8ab Compare April 25, 2022 18:52
@alexcovington
Contributor Author

@adamsitnik I've synced the PR to include your updated tests, please let me know if there is anything else I can look at.

Member

@adamsitnik adamsitnik left a comment

LGTM, thank you @alexcovington !

@dakersnar
Contributor

dakersnar commented Jun 8, 2022

The preview 5 perf report has detected a lot of improvement from this change on a variety of benchmarks, most notably:

System.Memory.Span<Byte>.Reverse(Size: 512), System.Memory.Span<Char>.Reverse(Size: 512), System.Memory.Span<Int32>.Reverse(Size: 512), System.Tests.Perf_Array.Reverse

I've included the raw data below in "details". Notably, while the x64 configs have sped up, we do see some slowdown on Arm64 configs. This is being tracked here: #68667

x64: [image]

Arm64: [image]

I also see some speedup in System.Memory.Span<Byte>.Fill(Size: 512), System.Memory.Span<Byte>.Clear(Size: 512). Do you know if these are related?

System.Memory.Span.Reverse(Size: 512)

| Result | Ratio | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- |
| Slower | 0.74 | debian 11 | Arm64 | Unknown processor |
| Slower | 0.80 | ubuntu 18.04 | Arm64 | Unknown processor |
| Slower | 0.71 | ubuntu 20.04 | Arm64 | Unknown processor |
| Slower | 0.74 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Slower | 0.89 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Faster | 11.72 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Faster | 10.60 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Faster | 11.56 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Faster | 31.21 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Faster | 31.78 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Faster | 11.16 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Faster | 12.09 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Faster | 12.43 | Windows 11 | X64 | Intel Core i9-9900T CPU 2.10GHz |
| Faster | 9.05 | ubuntu 18.04 | X64 | Intel Xeon CPU E5530 2.40GHz |
| Faster | 7.64 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Faster | 12.95 | ubuntu 20.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Faster | 9.42 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Faster | 12.58 | macOS Big Sur 11.6.6 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.Tests.Perf_Array.Reverse

| Result | Ratio | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- |
| Slower | 0.60 | debian 11 | Arm64 | Unknown processor |
| Slower | 0.68 | ubuntu 18.04 | Arm64 | Unknown processor |
| Slower | 0.58 | ubuntu 20.04 | Arm64 | Unknown processor |
| Slower | 0.60 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Slower | 0.75 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Faster | 4.76 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Faster | 3.20 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Faster | 4.47 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Faster | 2.93 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Faster | 5.13 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Faster | 3.39 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Faster | 3.80 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Faster | 3.46 | Windows 11 | X64 | Intel Core i9-9900T CPU 2.10GHz |
| Faster | 2.06 | ubuntu 18.04 | X64 | Intel Xeon CPU E5530 2.40GHz |
| Faster | 1.57 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Faster | 4.12 | ubuntu 20.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Faster | 3.99 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Faster | 2.21 | macOS Big Sur 11.6.6 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.Memory.Span.Reverse(Size: 512)

| Result | Ratio | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- |
| Slower | 0.57 | debian 11 | Arm64 | Unknown processor |
| Slower | 0.67 | ubuntu 18.04 | Arm64 | Unknown processor |
| Slower | 0.58 | ubuntu 20.04 | Arm64 | Unknown processor |
| Slower | 0.60 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Slower | 0.77 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Faster | 4.30 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Faster | 3.64 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Faster | 3.76 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Faster | 3.41 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Faster | 4.02 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Faster | 4.05 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Faster | 3.42 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Faster | 3.99 | Windows 11 | X64 | Intel Core i9-9900T CPU 2.10GHz |
| Faster | 2.25 | ubuntu 18.04 | X64 | Intel Xeon CPU E5530 2.40GHz |
| Faster | 1.79 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Faster | 3.56 | ubuntu 20.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Faster | 4.00 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Faster | 2.84 | macOS Big Sur 11.6.6 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.Memory.Span.Reverse(Size: 512)

| Result | Ratio | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- |
| Slower | 0.57 | debian 11 | Arm64 | Unknown processor |
| Slower | 0.67 | ubuntu 18.04 | Arm64 | Unknown processor |
| Slower | 0.58 | ubuntu 20.04 | Arm64 | Unknown processor |
| Slower | 0.59 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Slower | 0.33 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Faster | 7.07 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Faster | 6.52 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Faster | 6.29 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Faster | 11.22 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Faster | 15.82 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Faster | 6.58 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Faster | 7.88 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Faster | 6.65 | Windows 11 | X64 | Intel Core i9-9900T CPU 2.10GHz |
| Faster | 4.59 | ubuntu 18.04 | X64 | Intel Xeon CPU E5530 2.40GHz |
| Faster | 2.97 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Faster | 7.16 | ubuntu 20.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Faster | 6.09 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Faster | 7.24 | macOS Big Sur 11.6.6 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

Labels
area-System.Memory, community-contribution, tenet-performance