Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4.x: Improve the performance of Range() #684

Merged
merged 3 commits into from
Jun 28, 2018

Conversation

akarnokd
Copy link
Collaborator

@akarnokd akarnokd commented Jun 28, 2018

This PR improves the performance and reduces allocations in Range() as well as splits the operator to recursive and non-recursive variants due to a workaround required in the recursive version. See further explanations in the diff.

The PR also adds a benchmark for verifying the gains:

BenchmarkDotNet=v0.10.14, OS=Windows 10.0.17134
Intel Core i7-4790 CPU 3.60GHz (Haswell), 1 CPU, 8 logical and 4 physical cores
Frequency=3513586 Hz, Resolution=284.6095 ns, Timer=TSC
  [Host]     : .NET Framework 4.6.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3110.0
  DefaultJob : .NET Framework 4.6.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3110.0

// =============================
// Before

 Method |       N |           Mean |         Error |        StdDev |      Gen 0 |   Allocated |
------- |-------- |---------------:|--------------:|--------------:|-----------:|------------:|
  Range |       1 |       1.097 us |     0.0212 us |     0.0218 us |     0.2270 |       960 B |
  Range |      10 |       4.377 us |     0.0854 us |     0.0757 us |     0.5035 |      2136 B |
  Range |     100 |      37.894 us |     0.1508 us |     0.1411 us |     3.2959 |     13936 B |
  Range |    1000 |     385.828 us |     2.6659 us |     2.4937 us |    30.7617 |    130277 B |
  Range |   10000 |   3,779.018 us |    13.7689 us |    10.7498 us |   304.6875 |   1290436 B |
  Range |  100000 |  37,165.724 us |   167.5172 us |   148.4996 us |  3062.5000 |  12899950 B |
  Range | 1000000 | 372,437.302 us | 3,172.0311 us | 2,811.9227 us | 30750.0000 | 128990179 B |

// =============================
// After

 Method |       N |             Mean |           Error |          StdDev |      Gen 0 |  Allocated |
------- |-------- |-----------------:|----------------:|----------------:|-----------:|-----------:|
  Range |       1 |         774.0 ns |        10.81 ns |        10.11 ns |     0.1392 |      584 B |
  Range |      10 |       2,648.0 ns |        50.40 ns |        47.15 ns |     0.2747 |     1160 B |
  Range |     100 |      19,796.7 ns |       203.15 ns |       190.03 ns |     1.6479 |     6920 B |
  Range |    1000 |     191,159.1 ns |       657.33 ns |       614.86 ns |    15.3809 |    64521 B |
  Range |   10000 |   1,903,824.9 ns |    12,620.89 ns |    11,805.59 ns |   152.3438 |   640541 B |
  Range |  100000 |  19,427,114.8 ns |    90,760.45 ns |    75,789.08 ns |  1500.0000 |  6400644 B |
  Range | 1000000 | 233,398,566.0 ns | 1,215,269.71 ns | 1,014,805.15 ns | 15250.0000 | 64001854 B |

public _(Range parent, IObserver<int> observer)
int _index;

IDisposable _task;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the _upstream is not accessible so it can't be used to replace a previous task.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to dispose this. I'll update the PR shortly with a fresh benchmark as well.

_start = parent._start;
_count = parent._count;
_index = start;
_end = start + count;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calculate the end index once (exclusive).

var longRunning = scheduler.AsLongRunning();
if (longRunning != null)
var first = scheduler.Schedule(this, (innerScheduler, @this) => @this.LoopRec(innerScheduler));
Disposable.TrySetSingle(ref _task, first);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the first schedule returns, a subsequent schedule might be underway with _task set on it. This will avoid overwriting that newer IDisposable.

Disposable.TrySetSingle(ref _task, first);
}

private IDisposable LoopRec(IScheduler scheduler)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to pass around an index state as it can be safely stored in a field. The inner schedulers guarantee there is only one LoopRec accessing it at a time.

else
_index = idx + 1;
ForwardOnNext(idx);
var next = scheduler.Schedule(this, (innerScheduler, @this) => @this.LoopRec(innerScheduler));
Copy link
Collaborator Author

@akarnokd akarnokd Jun 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return next and use Schedule(TState, Func<IScheduler, TState, IDisposable>) in run? There is a surprising behavior associated with the default CurrentThreadScheduler. Each schedule call will create a ScheduledItem which links to a previous ScheduledItem through its _disposable field due to recursion. This can create a very long chain of linked ScheduledItems. When the range ends, the Dispose() will then try to walk that linked list which ends up overflowing the call stack.

This approach in the PR will keep reference only to the latest scheduled task while not linking the internal ScheduledItems together. I believe the original Action<Action<TState>>-based version did this basically with the help of that particular extension method, but with more overhead.

@akarnokd
Copy link
Collaborator Author

akarnokd commented Jun 28, 2018

Improvements:

N Time Memory
1 29% 39%
10 40% 46%
100 48% 50%
1000 50% 50%
10000 50% 50%
100000 48% 50%
1000000 37% 50%

@danielcweber danielcweber merged commit b232955 into dotnet:master Jun 28, 2018
@akarnokd akarnokd deleted the RangeCleanup branch June 28, 2018 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants