List-like union as struct to save allocation + virtual dispatch #472

Merged 6 commits on Jan 26, 2023

Member

@atifaziz atifaziz commented Apr 27, 2018

This PR converts the list-like union from a class hierarchy to a struct, to save GC allocations and extra virtual hops. I don't expect there will ever be more than one allocation saved per list and per invocation of an operator, so the savings there are really pathetic. The implementation also feels a bit clumsy, but there is not much choice there.
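To make the idea concrete, here is a minimal sketch of a struct-based union over `IList<T>` and `IReadOnlyList<T>`. It is an illustration under assumptions, not the code from this PR: the type name `ListUnion<T>` is hypothetical, and only the `_rw`/`_rx` field naming is borrowed from the review discussion.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical illustration type; not the PR's actual ListLike<T>.
readonly struct ListUnion<T>
{
    readonly IList<T>? _rw;          // non-null when wrapping a read-write list
    readonly IReadOnlyList<T>? _rx;  // non-null when wrapping a read-only list

    public ListUnion(IList<T> list)
    {
        _rw = list ?? throw new ArgumentNullException(nameof(list));
        _rx = null;
    }

    public ListUnion(IReadOnlyList<T> list)
    {
        _rw = null;
        _rx = list ?? throw new ArgumentNullException(nameof(list));
    }

    // A default(ListUnion<T>) wraps nothing and reports a count of zero.
    public int Count => _rw?.Count ?? _rx?.Count ?? 0;

    // Dispatch on whichever field is set; no wrapper object is allocated.
    public T this[int index] =>
        _rw is { } rw ? rw[index]
      : _rx is { } rx ? rx[index]
      : throw new ArgumentOutOfRangeException(nameof(index));
}
```

Because arrays and `List<T>` implement both interfaces, a caller of this sketch has to pick a constructor explicitly, for example `new ListUnion<int>((IList<int>)xs)`.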

Member

@fsateler fsateler left a comment

I believe the performance benefit has not been demonstrated to offset the more awkward code.

MoreLinq/ListLike.cs (outdated)
T this[int index] { get; }
}
readonly IList<T> _l;
readonly IReadOnlyList<T> _r;
Member

These member names are quite cryptic.

Member Author

@atifaziz atifaziz Jan 22, 2023

Tried to address this with 064fb61. Perhaps rw and rx are still cryptic, but I wanted to keep them the same length (trying too hard?), so these are *nix-inspired (rw = read-write, rx = read-only).

@@ -21,34 +21,35 @@ namespace MoreLinq
using System.Collections.Generic;

/// <summary>
- /// Represents an list-like (indexable) data structure.
+ /// Represents a union over list types implementing either
Member

Instead of a union, what about a delegating tuple?

static class ListLike 
{
  public static (Func<int> GetCount, Func<int, T> GetItem) => (() => list.Count, (i) => list[i]);
  public static (Func<int> GetCount, Func<int, T> GetItem) => (() => list.Count, (i) => list[i]);
}

Member Author

@atifaziz atifaziz Jan 22, 2023

This isn't syntactically correct, but I think I get your idea nonetheless. This would cause two closures and indirect calls so I have my doubts it's as optimal.
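For reference, a syntactically valid reading of the delegating-tuple suggestion might look like the sketch below. The class and method names (`ListLikeDelegates`, `FromList`, `FromReadOnlyList`) are hypothetical and chosen only for illustration; the original suggestion did not name them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the "delegating tuple" idea; names are assumptions.
static class ListLikeDelegates
{
    // Wraps a read-write list as a pair of delegates capturing the list.
    public static (Func<int> GetCount, Func<int, T> GetItem) FromList<T>(IList<T> list) =>
        (() => list.Count, i => list[i]);

    // Same shape for a read-only list.
    public static (Func<int> GetCount, Func<int, T> GetItem) FromReadOnlyList<T>(IReadOnlyList<T> list) =>
        (() => list.Count, i => list[i]);
}
```

In this shape every wrapped list costs delegate allocations up front, and each `GetCount()` or `GetItem(i)` call goes through a delegate invocation before reaching the list's own interface method, which is the indirection the reply above is concerned about.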

@codecov

codecov bot commented Jan 22, 2023

Codecov Report

Merging #472 (4f6685e) into master (5b49471) will decrease coverage by 0.04%.
The diff coverage is 88.88%.

@@            Coverage Diff             @@
##           master     #472      +/-   ##
==========================================
- Coverage   92.39%   92.36%   -0.04%     
==========================================
  Files         112      112              
  Lines        3434     3443       +9     
  Branches     1021     1023       +2     
==========================================
+ Hits         3173     3180       +7     
  Misses        199      199              
- Partials       62       64       +2     
| Impacted Files             | Coverage Δ                   |
|--------------------------- |----------------------------- |
| MoreLinq/AggregateRight.cs | 100.00% <ø> (ø)              |
| MoreLinq/CountDown.cs      | 90.00% <ø> (ø)               |
| MoreLinq/ScanRight.cs      | 100.00% <ø> (ø)              |
| MoreLinq/ListLike.cs       | 86.95% <88.88%> (-5.91%) ⬇️  |

@atifaziz
Member Author

> I believe the performance benefit has not been demonstrated to offset the more awkward code.

Benchmark code:

[ParamsSource(nameof(Sources))]
public IEnumerable<int> Source { get; set; }

public IEnumerable<IEnumerable<int>> Sources
{
    get
    {
        var xs = Enumerable.Range(1, 1_000_000);
        yield return xs;
        yield return xs.ToArray();
    }
}

[Benchmark]
public void CountDown()
{
    Source.CountDown(1, (_, _) => 0).Consume();
}

Benchmarks for CountDown using last release (3.3.2):

BenchmarkDotNet=v0.13.2, OS=Windows 11 (10.0.22621.1105)
Intel Core i7-1065G7 CPU 1.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK=7.0.102
  [Host] : .NET 7.0.2 (7.0.222.60605), X86 RyuJIT AVX2
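|    Method |               Source |     Mean |    Error |   StdDev | Allocated |
|---------- |--------------------- |---------:|---------:|---------:|----------:|
| CountDown |       Int32[1000000] | 22.20 ms | 0.507 ms | 1.488 ms |     119 B |
| CountDown | Syste(...)rator [36] | 20.09 ms | 0.479 ms | 1.413 ms |     179 B |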

Notice that the run with an array as a source is slower.

Following is the benchmark for changes in this PR up to 064fb61:

BenchmarkDotNet=v0.13.2, OS=Windows 11 (10.0.22621.1105)
Intel Core i7-1065G7 CPU 1.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK=7.0.102
  [Host] : .NET 7.0.2 (7.0.222.60605), X86 RyuJIT AVX2
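|    Method |               Source |     Mean |    Error |   StdDev |   Median | Allocated |
|---------- |--------------------- |---------:|---------:|---------:|---------:|----------:|
| CountDown |       Int32[1000000] | 18.76 ms | 0.455 ms | 1.334 ms | 18.45 ms |     115 B |
| CountDown | Syste(...)rator [36] | 19.77 ms | 0.477 ms | 1.383 ms | 19.20 ms |     179 B |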

The run with an array as a source is slightly faster.


@viceroypenguin
Contributor

viceroypenguin commented Jan 22, 2023

@atifaziz If you would, please compare your latest results to SuperLinq 4.6.0? I did a comparison against latest public release (3.3.2), and I show that SuperLinq is 40% faster on array and essentially equivalent (4% faster) on enumerables.

For CountDown specifically, it appears to be faster to use the corelib Reverse() operator than to do any allocation of our own. (was thinking ScanRight)

For CountDown specifically, it appears to be faster to use the collection foreach rather than use the list-like structure.

// * Summary *

BenchmarkDotNet=v0.13.2, OS=Windows 11 (10.0.22623.1180)
Intel Core i7-1065G7 CPU 1.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK=7.0.102
  [Host] : .NET 7.0.2 (7.0.222.60605), X64 RyuJIT AVX2


|             Method |               Source |     Mean |    Error |   StdDev | Allocated |
|------------------- |--------------------- |---------:|---------:|---------:|----------:|
| SuperLinqCountDown |       Int32[1000000] | 10.30 ms | 0.187 ms | 0.166 ms |     138 B |
|  MoreLinqCountDown |       Int32[1000000] | 17.24 ms | 0.305 ms | 0.339 ms |     186 B |
| SuperLinqCountDown | Syste(...)rator [36] | 13.02 ms | 0.250 ms | 0.234 ms |     229 B |
|  MoreLinqCountDown | Syste(...)rator [36] | 13.58 ms | 0.240 ms | 0.224 ms |     237 B |

Data: https://gist.github.com/viceroypenguin/ec71e7adb8124c455e8a2925bee1b987

@viceroypenguin
Contributor

viceroypenguin commented Jan 23, 2023

I had already removed ListLike<> from SuperLinq code, which allows for the possibility of comparison. I did a further comparison for all three methods (AggregateRight, CountDown, ScanRight). I did find a performance improvement for SuperLinq in ScanRight; after fixing ScanRight, I found that in only one case does the ListLike<> behavior improve performance (tested against viceroypenguin/SuperLinq@c98576b):

// * Summary *

BenchmarkDotNet=v0.13.2, OS=Windows 11 (10.0.22623.1180)
Intel Core i7-1065G7 CPU 1.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK=7.0.102
  [Host] : .NET 7.0.2 (7.0.222.60605), X64 RyuJIT AVX2
|                  Method |               Source |      Mean |     Error |    StdDev |     Gen0 |     Gen1 |     Gen2 | Allocated |
|------------------------ |--------------------- |----------:|----------:|----------:|---------:|---------:|---------:|----------:|
|      SuperLinqCountDown |       Int32[1000000] | 10.055 ms | 0.1166 ms | 0.1090 ms |        - |        - |        - |     138 B |
|       MoreLinqCountDown |       Int32[1000000] | 16.673 ms | 0.2415 ms | 0.2259 ms |        - |        - |        - |     186 B |
|      SuperLinqScanRight |       Int32[1000000] | 12.884 ms | 0.1779 ms | 0.1577 ms | 281.2500 | 281.2500 | 281.2500 | 4000642 B |
|       MoreLinqScanRight |       Int32[1000000] | 13.821 ms | 0.1599 ms | 0.1335 ms | 375.0000 | 375.0000 | 375.0000 | 4000912 B |
| SuperLinqAggregateRight |       Int32[1000000] |  6.466 ms | 0.0895 ms | 0.0837 ms | 265.6250 | 265.6250 | 265.6250 | 4000500 B |
|  MoreLinqAggregateRight |       Int32[1000000] |  5.326 ms | 0.0596 ms | 0.0557 ms |        - |        - |        - |      38 B |
|      SuperLinqCountDown | Syste(...)rator [36] | 10.958 ms | 0.2174 ms | 0.2232 ms |        - |        - |        - |     149 B |
|       MoreLinqCountDown | Syste(...)rator [36] | 13.263 ms | 0.2488 ms | 0.2206 ms |        - |        - |        - |     237 B |
|      SuperLinqScanRight | Syste(...)rator [36] | 15.848 ms | 0.3051 ms | 0.3858 ms | 375.0000 | 375.0000 | 375.0000 | 8000878 B |
|       MoreLinqScanRight | Syste(...)rator [36] | 16.365 ms | 0.3096 ms | 0.3180 ms | 343.7500 | 343.7500 | 343.7500 | 8000949 B |
| SuperLinqAggregateRight | Syste(...)rator [36] |  6.637 ms | 0.0931 ms | 0.0871 ms | 289.0625 | 289.0625 | 289.0625 | 4000552 B |
|  MoreLinqAggregateRight | Syste(...)rator [36] |  7.955 ms | 0.1582 ms | 0.1693 ms | 328.1250 | 328.1250 | 328.1250 | 4000636 B |

Data: https://gist.github.com/viceroypenguin/ec71e7adb8124c455e8a2925bee1b987

@atifaziz
Member Author

atifaziz commented Jan 23, 2023

> @atifaziz If you would, please compare your latest results to SuperLinq 4.6.0? I did a comparison against latest public release (3.3.2), and I show that SuperLinq is 40% faster on array and essentially equivalent (4% faster) on enumerables.
>
> For CountDown specifically, it appears to be faster to use the collection foreach rather than use the list-like structure.

I shared those benchmarks to get back to the comment from @fsateler, who rightly pointed out that the performance and allocation benefits had not been demonstrated (the trade-off being code legibility). So this PR is not about CountDown, but since it uses ListLike<T> and scans the entire source, I wanted to use it as the basis of the demonstration. The (separate) problem with CountDown is that its loop is poorly written with respect to performance. It reads the list's Count property in three places, two of which run on every iteration of the loop. Each access is a virtual dispatch, so it probably adds up. If the list count is taken into a local variable like so:

diff --git a/MoreLinq/CountDown.cs b/MoreLinq/CountDown.cs
index f7b8ade1..1f5f72da 100644
--- a/MoreLinq/CountDown.cs
+++ b/MoreLinq/CountDown.cs
@@ -63,17 +63,13 @@ static partial class MoreEnumerable
                      ? IterateCollection(collectionCount)
                      : IterateSequence();
 
-            IEnumerable<TResult> IterateList(ListLike<T> list)
+            IEnumerable<TResult> IterateList(IListLike<T> list)
             {
-                var countdown = Math.Min(count, list.Count);
+                var listCount = list.Count;
+                var countdown = Math.Min(count, listCount);
 
-                for (var i = 0; i < list.Count; i++)
-                {
-                    var cd = list.Count - i <= count
-                           ? --countdown
-                           : (int?) null;
-                    yield return resultSelector(list[i], cd);
-                }
+                for (var i = 0; i < listCount; i++)
+                    yield return resultSelector(list[i], listCount - i <= count ? --countdown : null);
             }
 
             IEnumerable<TResult> IterateCollection(int i)

then it makes a stark improvement:

Method Source Mean Error StdDev Median Allocated
MoreLinqCountDown Int32[1000000] 9.781 ms 0.3512 ms 1.0245 ms 9.691 ms 96 B
SuperLinqCountDown Int32[1000000] 12.477 ms 0.2794 ms 0.8195 ms 12.274 ms 93 B
MoreLinqCountDown Syste(...)rator [36] 18.791 ms 0.4423 ms 1.2972 ms 18.507 ms 179 B
SuperLinqCountDown Syste(...)rator [36] 18.113 ms 0.4845 ms 1.4132 ms 17.483 ms 179 B

In fact, it's almost 15% faster than SuperLinq (which I'm guessing is just iterating the list via the enumerator).
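The effect of that diff can be reproduced outside MoreLINQ with a small standalone sketch. The class and method names below (`CountDownSketch`, `CountDownList`) are hypothetical, and the method takes a plain `IList<T>` instead of the PR's list-like type; the only point illustrated is reading `Count` once into a local.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical standalone sketch; not the MoreLINQ implementation.
static class CountDownSketch
{
    public static IEnumerable<TResult> CountDownList<T, TResult>(
        IList<T> list, int count, Func<T, int?, TResult> resultSelector)
    {
        var listCount = list.Count;                  // single interface call
        var countdown = Math.Min(count, listCount);

        for (var i = 0; i < listCount; i++)
        {
            // Only the last `count` elements receive a countdown value.
            yield return resultSelector(list[i], listCount - i <= count ? --countdown : (int?)null);
        }
    }
}
```

Hoisting the count this way assumes the list is not resized while it is being iterated; the loop would otherwise keep using a stale length.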

> For CountDown specifically, it appears to be faster to use the collection foreach rather than use the list-like structure.

So perhaps not? I haven't investigated in detail as to why, but if I were to take a somewhat educated guess, then I'd say iterating a sequence makes two virtual dispatches per loop iteration (MoveNext + Current) whereas with a list, there's just one, the indexer. Moreover, even if the ListLike<T> indexer is coded for maximum probability of getting inlined, it makes no discernible difference.
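The guess above can be pictured with two loops over the same data, sketched here under assumptions: the names `IterationStyles`, `SumViaEnumerator` and `SumViaIndexer` are hypothetical, and the comments count interface calls as written in the source, not what the JIT ultimately emits.

```csharp
using System.Collections.Generic;

// Hypothetical sketch contrasting the two iteration styles discussed above.
static class IterationStyles
{
    // Enumerator-based: two interface calls per element.
    public static long SumViaEnumerator(IEnumerable<int> source)
    {
        long sum = 0;
        using (var e = source.GetEnumerator())
        {
            while (e.MoveNext())   // interface call 1
                sum += e.Current;  // interface call 2
        }
        return sum;
    }

    // Indexer-based: one interface call per element once Count is hoisted.
    public static long SumViaIndexer(IList<int> list)
    {
        long sum = 0;
        var count = list.Count;
        for (var i = 0; i < count; i++)
            sum += list[i];        // single interface call
        return sum;
    }
}
```

Whether the difference is measurable depends on the runtime and on what the JIT can devirtualize, so a sketch like this is only a starting point for a benchmark, not a result.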


@atifaziz atifaziz changed the title List-like union as a struct to save on allocations List-like union as struct to save allocation + virtual dispatch Jan 24, 2023
@atifaziz atifaziz requested a review from fsateler January 24, 2023 11:39
Member Author

@atifaziz atifaziz left a comment

@fsateler I'm going to assume you don't have the time to review, so I will go ahead and merge this since there's been at least another pair of eyes (@viceroypenguin) over it. If there's any strong objection (at some future point) to some aspect or the whole refactoring then feel free to open a follow-up issue. Thanks!

@atifaziz atifaziz merged commit 9521869 into morelinq:master Jan 26, 2023
@atifaziz atifaziz deleted the list-union-struct branch January 26, 2023 17:19
Member

@fsateler fsateler left a comment

I had the review but I forgot to actually submit it :(

Nothing critical but here it is anyway.

MoreLinq/ListLike.cs (two resolved review threads)