-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FileStream rewrite: Use IValueTaskSource instead of TaskCompletionSource #50802
Conversation
Tagging subscribers to this area: @carlossanlop Issue DetailsThe purpose of this PR is to remove a large allocation found in The allocation improvement can be achieved by creating a new type, called The behavior was largely left untouched, except for the following differences:
Unit testsManually ran unit tests from Benchmarks
.NET 5.0 vs .NET 6.0 before this PRdotnet run --project D:\performance\src\tools\ResultsComparer\ResultsComparer.csproj -c release --base 'D:\fs50' --diff 'D:\fs60_before_pr' --threshold 0.0001% summary:
.NET 5.0 vs .NET 6.0 after this PRdotnet run --project D:\performance\src\tools\ResultsComparer\ResultsComparer.csproj -c release --base 'D:\fs50' --diff 'D:\fs60_after_pr' --threshold 0.0001%
ProfilingI reused the same code we have in our benchmarks, but I executed it inside a console app that forces the consumption of the compiled runtime code. You'll notice the top
|
Type | Allocations |
---|---|
- System.Threading.Tasks.Task<> | 64,001 |
- System.IO.MemoryFileStreamCompletionSource | 64,000 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 64,000 |
- System.Threading.OverlappedData | 64,000 |
- System.SByte[] | 2,797 |
- System.String | 2,153 |
- System.Object[] | 952 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Byte[] | 539 |
- System.Diagnostics.Tracing.ScalarTypeInfo | 489 |
- System.Func<,,> | 485 |
- System.Reflection.ParameterInfo[] | 281 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 251 |
- System.IO.AsyncWindowsFileStreamStrategy | 250 |
Allocations after this PR
Type | Allocations |
---|---|
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource | 64,000 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 64,000 |
- System.Threading.OverlappedData | 64,000 |
- System.SByte[] | 2,803 |
- System.String | 2,144 |
- System.Object[] | 952 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Byte[] | 539 |
- System.Reflection.ParameterInfo[] | 281 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 251 |
- System.IO.Strategies.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
- System.Reflection.RuntimeMethodInfo | 241 |
- System.Signature | 225 |
ReadAsync(bufferSize: 1, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)
Allocations before this PR
Type | Allocations |
---|---|
- System.Threading.Tasks.Task<> | 512,001 |
- System.IO.MemoryFileStreamCompletionSource | 512,000 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 512,000 |
- System.Threading.OverlappedData | 512,000 |
- System.SByte[] | 2,806 |
- System.String | 2,153 |
- System.Object[] | 957 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Byte[] | 539 |
- System.Diagnostics.Tracing.ScalarTypeInfo | 489 |
- System.Func<,,> | 485 |
- System.Reflection.ParameterInfo[] | 281 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 251 |
- System.IO.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
Allocations after this PR
Type | Allocations |
---|---|
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource | 512,000 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 512,000 |
- System.Threading.OverlappedData | 512,000 |
- System.SByte[] | 2,803 |
- System.String | 2,144 |
- System.Object[] | 964 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Byte[] | 539 |
- System.Reflection.ParameterInfo[] | 281 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 251 |
- System.IO.Strategies.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
- System.Reflection.RuntimeMethodInfo | 241 |
- System.Signature | 225 |
- System.RuntimeType[] | 178 |
ReadAsync(bufferSize: FourKibibytes, userBufferSize: FourKibibytes, FileOptions.Asynchronous)
Allocations before this PR
Type | Allocations |
---|---|
- System.Threading.Tasks.Task<> | 64,013 |
- System.IO.MemoryFileStreamCompletionSource | 64,000 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 64,000 |
- System.Threading.OverlappedData | 64,000 |
- System.SByte[] | 2,840 |
- System.String | 2,153 |
- System.Object[] | 955 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Byte[] | 539 |
- System.Diagnostics.Tracing.ScalarTypeInfo | 489 |
- System.Func<,,> | 485 |
- System.Reflection.ParameterInfo[] | 281 |
- System.IO.BufferedFileStreamStrategy | 252 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 251 |
- System.IO.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.SemaphoreSlim | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
Allocations after this PR
Type | Allocations |
---|---|
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource | 64,000 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 64,000 |
- System.Threading.OverlappedData | 64,000 |
- System.SByte[] | 2,836 |
- System.String | 2,144 |
- System.Object[] | 957 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Byte[] | 539 |
- System.Reflection.ParameterInfo[] | 281 |
- System.IO.Strategies.BufferedFileStreamStrategy | 252 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 251 |
- System.IO.Strategies.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.SemaphoreSlim | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
- System.Runtime.CompilerServices.StrongBox<> | 250 |
- System.Reflection.RuntimeMethodInfo | 241 |
- System.Signature | 225 |
ReadAsync(bufferSize: FourKibibytes, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)
Allocations before this PR
Type | Allocations |
---|---|
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 64,235 |
- System.Threading.Tasks.Task<> | 64,015 |
- System.IO.FileStreamCompletionSource | 64,000 |
- System.SByte[] | 2,916 |
- System.String | 2,155 |
- System.Object[] | 956 |
- System.Byte[] | 789 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Diagnostics.Tracing.ScalarTypeInfo | 489 |
- System.Func<,,> | 485 |
- System.Reflection.ParameterInfo[] | 281 |
- System.IO.BufferedFileStreamStrategy | 252 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.IO.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 250 |
- System.Threading.OverlappedData | 250 |
- System.Threading.SemaphoreSlim | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
- System.Threading.PreAllocatedOverlapped | 250 |
Allocations after this PR
Type | Allocations |
---|---|
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.FileStreamValueTaskSource | 64,000 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 63,812 |
- System.SByte[] | 2,909 |
- System.String | 2,144 |
- System.Object[] | 964 |
- System.Byte[] | 789 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Reflection.ParameterInfo[] | 281 |
- System.IO.Strategies.BufferedFileStreamStrategy | 252 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.IO.Strategies.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 250 |
- System.Threading.OverlappedData | 250 |
- System.Threading.SemaphoreSlim | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
- System.Threading.PreAllocatedOverlapped | 250 |
- System.Runtime.CompilerServices.StrongBox<> | 250 |
- System.Reflection.RuntimeMethodInfo | 241 |
- System.Signature | 225 |
WriteAsync(bufferSize: 1, userBufferSize: FourKibibytes, FileOptions.Asynchronous)
Allocations before this PR
Type | Allocations |
---|---|
- System.Threading.Tasks.Task<> | 64,001 |
- System.IO.MemoryFileStreamCompletionSource | 64,000 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 64,000 |
- System.Threading.OverlappedData | 64,000 |
- System.SByte[] | 2,798 |
- System.String | 2,153 |
- System.Object[] | 954 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Byte[] | 539 |
- System.Diagnostics.Tracing.ScalarTypeInfo | 489 |
- System.Func<,,> | 485 |
- System.Reflection.ParameterInfo[] | 281 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 251 |
- System.IO.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
Allocations after this PR
Type | Allocations |
---|---|
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource | 64,000 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 64,000 |
- System.Threading.OverlappedData | 64,000 |
- System.SByte[] | 2,790 |
- System.String | 2,144 |
- System.Object[] | 959 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Byte[] | 539 |
- System.Reflection.ParameterInfo[] | 281 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 251 |
- System.IO.Strategies.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
- System.Reflection.RuntimeMethodInfo | 241 |
- System.Signature | 225 |
- System.RuntimeType[] | 178 |
WriteAsync(bufferSize: 1, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)
Allocations before this PR
Type | Allocations |
---|---|
- System.Threading.Tasks.Task<> | 512,001 |
- System.IO.MemoryFileStreamCompletionSource | 512,000 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 512,000 |
- System.Threading.OverlappedData | 512,000 |
- System.SByte[] | 2,812 |
- System.String | 2,153 |
- System.Object[] | 959 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Byte[] | 539 |
- System.Diagnostics.Tracing.ScalarTypeInfo | 489 |
- System.Func<,,> | 485 |
- System.Reflection.ParameterInfo[] | 281 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 251 |
- System.IO.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
Allocations after this PR
Type | Allocations |
---|---|
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource | 512,000 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 512,000 |
- System.Threading.OverlappedData | 512,000 |
- System.SByte[] | 2,790 |
- System.String | 2,144 |
- System.Object[] | 958 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Byte[] | 539 |
- System.Reflection.ParameterInfo[] | 281 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 251 |
- System.IO.Strategies.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
- System.Reflection.RuntimeMethodInfo | 241 |
- System.Signature | 225 |
- System.RuntimeType[] | 178 |
WriteAsync(bufferSize: FourKibibytes, userBufferSize: FourKibibytes, FileOptions.Asynchronous)
Allocations before this PR
Type | Allocations |
---|---|
- System.Threading.Tasks.Task<> | 64,014 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 32,250 |
- System.Threading.OverlappedData | 32,250 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 32,249 |
- System.IO.MemoryFileStreamCompletionSource | 32,000 |
- System.IO.FileStreamCompletionSource | 32,000 |
- System.SByte[] | 2,915 |
- System.String | 2,155 |
- System.Object[] | 952 |
- System.Byte[] | 789 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Diagnostics.Tracing.ScalarTypeInfo | 489 |
- System.Func<,,> | 485 |
- System.Reflection.ParameterInfo[] | 281 |
- System.IO.BufferedFileStreamStrategy | 252 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.IO.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.SemaphoreSlim | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
- System.Threading.PreAllocatedOverlapped | 250 |
Allocations after this PR
Type | Allocations |
---|---|
- System.Threading.ThreadPoolBoundHandleOverlapped | 32,250 |
- System.Threading.OverlappedData | 32,250 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 32,246 |
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource | 32,000 |
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.FileStreamValueTaskSource | 32,000 |
- System.SByte[] | 2,893 |
- System.String | 2,144 |
- System.Object[] | 954 |
- System.Byte[] | 789 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Reflection.ParameterInfo[] | 281 |
- System.IO.Strategies.BufferedFileStreamStrategy | 252 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.IO.Strategies.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.SemaphoreSlim | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
- System.Threading.PreAllocatedOverlapped | 250 |
- System.Runtime.CompilerServices.StrongBox<> | 250 |
- System.Reflection.RuntimeMethodInfo | 241 |
- System.Signature | 225 |
WriteAsync(bufferSize: FourKibibytes, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)
Allocations before this PR
Type | Allocations |
---|---|
- System.Threading.Tasks.Task<> | 64,014 |
- System.IO.FileStreamCompletionSource | 64,000 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 63,963 |
- System.SByte[] | 2,915 |
- System.String | 2,155 |
- System.Object[] | 956 |
- System.Byte[] | 789 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Diagnostics.Tracing.ScalarTypeInfo | 489 |
- System.Func<,,> | 485 |
- System.Reflection.ParameterInfo[] | 281 |
- System.IO.BufferedFileStreamStrategy | 252 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.IO.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 250 |
- System.Threading.OverlappedData | 250 |
- System.Threading.SemaphoreSlim | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
- System.Threading.PreAllocatedOverlapped | 250 |
Allocations after this PR
Type | Allocations |
---|---|
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.FileStreamValueTaskSource | 64,000 |
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> | 63,830 |
- System.SByte[] | 2,912 |
- System.String | 2,144 |
- System.Object[] | 966 |
- System.Byte[] | 789 |
- System.Reflection.RuntimeParameterInfo | 650 |
- System.Reflection.ParameterInfo[] | 281 |
- System.IO.Strategies.BufferedFileStreamStrategy | 252 |
- Microsoft.Win32.SafeHandles.SafeFileHandle | 252 |
- System.IO.FileStream | 252 |
- System.IO.Strategies.AsyncWindowsFileStreamStrategy | 250 |
- System.Threading.ThreadPoolBoundHandleOverlapped | 250 |
- System.Threading.OverlappedData | 250 |
- System.Threading.SemaphoreSlim | 250 |
- System.Threading.ThreadPoolBoundHandle | 250 |
- System.Threading.PreAllocatedOverlapped | 250 |
- System.Runtime.CompilerServices.StrongBox<> | 250 |
- System.Threading.Tasks.ValueTask.ValueTaskSourceAsTask | 249 |
- System.Threading.QueueUserWorkItemCallbackDefaultContext<> | 249 |
@adamsitnik @jozkee @jeffhandley @stephentoub
Also: Don't forget to ignore whitespace changes so it's easier to review the nested type's changes 😃
Author: | carlossanlop |
---|---|
Assignees: | carlossanlop |
Labels: |
|
Milestone: | 6.0.0 |
Yes, thank you bot... 😆 |
src/libraries/System.Private.CoreLib/src/System/IO/Strategies/AsyncWindowsFileStreamStrategy.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise; LGTM, will approve if/when comments are resolved, most of them are nits and/or questions.
internal const long NoResult = 0; | ||
internal const long ResultSuccess = (long)1 << 32; | ||
internal const long ResultError = (long)2 << 32; | ||
internal const long RegisteringCancellation = (long)4 << 32; | ||
internal const long CompletedCallback = (long)8 << 32; | ||
internal const ulong ResultMask = ((ulong)uint.MaxValue) << 32; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider prefixing this constants with AsyncCompletionSource_
or something alike since these names are too ambiguous or keep them in their respective classes regardless of them being duplicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or put them into a nested static class called ReturnCodes
:
internal static partial class FileStreamHelpers
{
internal static class ReturnCodes
{
internal const long NoResult = 0;
internal const long ResultSuccess = (long)1 << 32;
internal const long ResultError = (long)2 << 32;
internal const long RegisteringCancellation = (long)4 << 32;
internal const long CompletedCallback = (long)8 << 32;
internal const ulong ResultMask = ((ulong)uint.MaxValue) << 32;
}
but it's a nit (not a blocker)
...ibraries/System.Private.CoreLib/src/System/IO/Strategies/FileStreamCompletionSource.Win32.cs
Outdated
Show resolved
Hide resolved
_strategy.CompareExchangeCurrentOverlappedOwner(this, null) == null ? | ||
_strategy._fileHandle.ThreadPoolBinding!.AllocateNativeOverlapped(preallocatedOverlapped!) : // allocated when buffer was created, and buffer is non-null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_strategy.CompareExchangeCurrentOverlappedOwner(this, null) == null ? | |
_strategy._fileHandle.ThreadPoolBinding!.AllocateNativeOverlapped(preallocatedOverlapped!) : // allocated when buffer was created, and buffer is non-null | |
strategy.CompareExchangeCurrentOverlappedOwner(this, null) == null ? | |
strategy._fileHandle.ThreadPoolBinding!.AllocateNativeOverlapped(preallocatedOverlapped!) : // allocated when buffer was created, and buffer is non-null |
src/libraries/System.Private.CoreLib/src/System/IO/Strategies/FileStreamValueTaskSource.cs
Outdated
Show resolved
Hide resolved
// be directly the AwaitableProvider that's completing (in the case where the preallocated | ||
// overlapped was already in use by another operation). | ||
object? state = ThreadPoolBoundHandle.GetNativeOverlappedState(pOverlapped); | ||
Debug.Assert(state is (AsyncWindowsFileStreamStrategy or FileStreamValueTaskSource)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug.Assert(state is (AsyncWindowsFileStreamStrategy or FileStreamValueTaskSource)); | |
Debug.Assert(state is AsyncWindowsFileStreamStrategy or FileStreamValueTaskSource); |
int errorCode = unchecked((int)(packedResult & uint.MaxValue)); | ||
if (errorCode == Interop.Errors.ERROR_OPERATION_ABORTED) | ||
{ | ||
_source.SetException(ExceptionDispatchInfo.SetCurrentStackTrace(new OperationCanceledException(cancellationToken.IsCancellationRequested ? cancellationToken : new CancellationToken(canceled: true)))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_source.SetException(ExceptionDispatchInfo.SetCurrentStackTrace(new OperationCanceledException(cancellationToken.IsCancellationRequested ? cancellationToken : new CancellationToken(canceled: true)))); | |
Exception e = new OperationCanceledException(cancellationToken.IsCancellationRequested ? cancellationToken : new CancellationToken(canceled: true)); | |
e.SetCurrentStackTrace(); | |
_source.SetException(e); |
} | ||
|
||
private void Cancel(CancellationToken token) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing comment and asserts:
Lines 211 to 215 in 57bbdee
// WARNING: This may potentially be called under a lock (during cancellation registration) | |
Debug.Assert(state is FileStreamCompletionSource, "Unknown state passed to cancellation"); | |
FileStreamCompletionSource completionSource = (FileStreamCompletionSource)state; | |
Debug.Assert(completionSource._overlapped != null && !completionSource.Task.IsCompleted, "IO should not have completed yet"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first assert is not needed because the method argument is not an object? state
.
I'll add the assert that checks for overlapped an completed status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The state
param could be added to this method and allows you to conserve the first Debug.Assert.
long packedResult = Interlocked.CompareExchange(ref _result, FileStreamHelpers.RegisteringCancellation, FileStreamHelpers.NoResult); | ||
if (packedResult == FileStreamHelpers.NoResult) | ||
{ | ||
_cancellationRegistration = cancellationToken.UnsafeRegister(static (s, token) => ((FileStreamValueTaskSource)s!).Cancel(token), this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we pass the callback on a similar fashion to FileStreamCompletionSource
Line 92 in 57bbdee
_cancellationRegistration = cancellationToken.UnsafeRegister(cancelCallback, this); |
Or why did you opted for this way instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It made more sense to pass this
as the second argument of UnsafeRegister
, which ensures it's a ValueTaskSource
instance, and saves us from the check of the object? state
parameter like in the old code. The old code also allocates an Action<object?>
instance to save a reference to the Cancel
method.
We have a similar example in System.Threading.Channels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It made more sense to pass this as the second argument of UnsafeRegister
FileStreamCompletionSource was also passing this
as 2nd argument.
The old code also allocates an Action<object?> instance to save a reference to the Cancel method.
Yes, seems that it was caching the method group to avoid allocating everytime it was passed to UnsafeRegister
.
However I think you can avoid the allocation without caching yourself by doing this:
_cancellationRegistration = cancellationToken.UnsafeRegister((s, token) => Cancel(s, token), this);
Just a suggestion, not something that you should block on.
...ries/System.Private.CoreLib/src/System/IO/Strategies/Net5CompatFileStreamStrategy.Windows.cs
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/Strategies/FileStreamValueTaskSource.cs
Outdated
Show resolved
Hide resolved
This is the change I was waiting for 👏🏾👏🏾 |
I haven't looked but I assume we'll reuse the operation objects if there's no concurrency? This is what sockets do today so in the 95% case of no overlapping writes or reads, there'll be a single reusable allocation on the FileStream |
It's not, actually. You want the next one where the IValueTaskSource instance is cached. |
...ibraries/System.Private.CoreLib/src/System/IO/Strategies/FileStreamCompletionSource.Win32.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/Strategies/AsyncWindowsFileStreamStrategy.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/Strategies/FileStreamValueTaskSource.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/Strategies/FileStreamValueTaskSource.cs
Outdated
Show resolved
Hide resolved
} | ||
// else: Some other thread stole the result, so now it is responsible to finish the callback | ||
} | ||
// else: Some other thread is registering a cancellation, so it *must* finish the callback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this logic looks more complicated than it needs to be. I realize you're just following the same flow as what was already there, so nothing needs to be changed in this PR. But we should revisit all of this synchronization later; I suspect we can do it more succinctly and efficiently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened an issue to follow up on this and on the caching of the IValueTaskSource.
#50972
src/libraries/System.Private.CoreLib/src/System/IO/Strategies/FileStreamValueTaskSource.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/Strategies/FileStreamValueTaskSource.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me, but before we merge I would like to see the following benchmark results:
.NET 6 with new strategy enabled: before and after your change. (not .NET 5 vs 6 as there are too many moving targets)
You should be able to achieve that by running the following command in the perf repo;
git clone https://github.com/dotnet/performance.git
cd performance\src\benchmarks\micro
dotnet run -c Release -f net6.0 --filter *Perf_FileStream* --coreRun $pathToCoreRunWithoutYourChanges $pathToCoreRunWithYourChanges --envVars "DOTNET_SYSTEM_IO_USENET5COMPATFILESTREAM:0"
src/libraries/System.Private.CoreLib/src/System/IO/Strategies/AsyncWindowsFileStreamStrategy.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/Strategies/AsyncWindowsFileStreamStrategy.cs
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/Strategies/WindowsFileStreamStrategy.cs
Show resolved
Hide resolved
...ries/System.Private.CoreLib/src/System/IO/Strategies/Net5CompatFileStreamStrategy.Windows.cs
Show resolved
Hide resolved
/// <see cref="MemoryHandle"/> when the operation has completed. This should only be used | ||
/// when memory doesn't wrap a byte[]. | ||
/// </summary> | ||
private sealed class MemoryAwaitableProvider : FileStreamValueTaskSource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should only be used when memory doesn't wrap a byte[].
I just realized that we don't have benchmarks for this scenario as Memory
used in our micro-benchmarks always wraps a byte array. Could you please add such benchmarks to https://github.com/dotnet/performance/blob/main/src/benchmarks/micro/libraries/System.IO.FileSystem/Perf.FileStream.cs ? One test case should be enough as this should be rare.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the benchmarks do consume this type. If you look at some of the profiling tests I executed, you'll see that MemoryValueTaskSource
shows up a few times. The profiling tests reuse the code we had in our benchmarks.
I don't mind adding an explicit one though. Dou you want me to add one anyway?
internal const long NoResult = 0; | ||
internal const long ResultSuccess = (long)1 << 32; | ||
internal const long ResultError = (long)2 << 32; | ||
internal const long RegisteringCancellation = (long)4 << 32; | ||
internal const long CompletedCallback = (long)8 << 32; | ||
internal const ulong ResultMask = ((ulong)uint.MaxValue) << 32; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or put them into a nested static class called ReturnCodes
:
internal static partial class FileStreamHelpers
{
internal static class ReturnCodes
{
internal const long NoResult = 0;
internal const long ResultSuccess = (long)1 << 32;
internal const long ResultError = (long)2 << 32;
internal const long RegisteringCancellation = (long)4 << 32;
internal const long CompletedCallback = (long)8 << 32;
internal const ulong ResultMask = ((ulong)uint.MaxValue) << 32;
}
but it's a nit (not a blocker)
src/libraries/System.Private.CoreLib/src/System.Private.CoreLib.Shared.projitems
Outdated
Show resolved
Hide resolved
...m.Private.CoreLib/src/System/IO/Strategies/AsyncWindowsFileStreamStrategy.ValueTaskSource.cs
Outdated
Show resolved
Hide resolved
...m.Private.CoreLib/src/System/IO/Strategies/AsyncWindowsFileStreamStrategy.ValueTaskSource.cs
Show resolved
Hide resolved
// Using RunContinuationsAsynchronously for compat reasons (old API used Task.Factory.StartNew for continuations) | ||
_source.RunContinuationsAsynchronously = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you run the perf. benchmarks after doing this change and did you notice any impact?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pending.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed - There is a slight improvement.
I made sure to compare apples to apples: I used my latest commit, and built it setting RunContinuationsAsynchronously
to true
, then to false
. I compared it both times against our code in .NET 5.0.
The improvement is clear when setting it to false
(or not setting it at all).
RunContinuationsAsynchronously = true
❯ dotnet run -c release --base "D:\fs50" --diff "D:\fs60_NEW_ON_IVTS_asynctrue" --threshold 0.0001%
summary:
better: 24, geomean: 3.402
worse: 19, geomean: 2.244
total diff: 43
Slower | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Perf_FileStream.CopyToFile(fileSize: 1048576, options: None) | 5.49 | 882252.57 | 4840275.00 | |
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: None) | 5.12 | 1003671.88 | 5136350.00 | bimodal |
Perf_FileStream.WriteByte(fileSize: 1024, options: None) | 4.44 | 257218.23 | 1141827.01 | |
Perf_FileStream.WriteByte(fileSize: 1024, options: Asynchronous) | 4.08 | 286859.38 | 1171600.96 | |
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 4096, options: None) | 3.25 | 1795459.38 | 5832233.33 | |
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 512, options: None) | 3.11 | 1844293.36 | 5741847.92 | several? |
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: Asynchronous) | 3.11 | 1868294.53 | 5807825.00 | |
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: None) | 3.08 | 2209384.38 | 6796443.75 | |
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: Asynchronous) | 2.46 | 557062.05 | 1371566.15 | |
Perf_FileStream.CopyToFile(fileSize: 1024, options: None) | 2.33 | 478821.50 | 1116033.48 | |
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: None) | 2.28 | 543512.61 | 1239956.25 | bimodal |
Perf_FileStream.FlushAsync(fileSize: 1024, options: None) | 1.61 | 3479778.75 | 5612725.00 | |
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) | 1.44 | 8197915.63 | 11814575.00 | |
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: None) | 1.33 | 168291900.00 | 224355000.00 | |
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: None) | 1.33 | 5313143.75 | 7063218.75 | |
Perf_FileStream.Flush(fileSize: 1024, options: None) | 1.25 | 3448111.25 | 4313832.81 | |
Perf_FileStream.CopyToFile(fileSize: 104857600, options: None) | 1.12 | 50055150.00 | 56309025.00 | |
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: None) | 1.12 | 56033900.00 | 62803900.00 | |
Perf_FileStream.Write(fileSize: 104857600, userBufferSize: 4096, options: None) | 1.12 | 133329050.00 | 148833000.00 | bimodal |
Faster | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Perf_FileStream.SeekBackward(fileSize: 1024, options: Asynchronous) | 117.07 | 5489541.67 | 46889.31 | |
Perf_FileStream.SeekForward(fileSize: 1024, options: Asynchronous) | 65.93 | 2811281.25 | 42642.74 | |
Perf_FileStream.SeekBackward(fileSize: 1024, options: None) | 57.84 | 2571736.46 | 44462.45 | |
Perf_FileStream.SeekForward(fileSize: 1024, options: None) | 21.09 | 876923.26 | 41586.04 | |
Perf_FileStream.OpenClose(fileSize: 1024, options: None) | 5.00 | 152781.66 | 30548.62 | |
Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) | 4.76 | 153731.81 | 32314.04 | |
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) | 4.03 | 2463834750.00 | 611183400.00 | |
Perf_FileStream.ReadByte(fileSize: 1024, options: None) | 3.93 | 165696.68 | 42208.34 | |
Perf_FileStream.ReadByte(fileSize: 1024, options: Asynchronous) | 3.52 | 200423.36 | 56865.52 | |
Perf_FileStream.LockUnlock(fileSize: 1024, options: None) | 2.62 | 230670.38 | 87990.73 | |
Perf_FileStream.LockUnlock(fileSize: 1024, options: Asynchronous) | 2.58 | 238920.22 | 92478.58 | |
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: None) | 2.34 | 3222202.50 | 1377939.84 | |
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: Asynchronous) | 1.87 | 156593600.00 | 83681962.50 | |
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) | 1.66 | 19598723.08 | 11803641.67 | |
Perf_FileStream.Flush(fileSize: 1024, options: Asynchronous) | 1.63 | 22492768.75 | 13818900.00 | |
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) | 1.57 | 7025697.92 | 4487121.88 | |
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) | 1.47 | 686776300.00 | 466936500.00 | bimodal |
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 4096, options: None) | 1.33 | 987460.55 | 743825.45 | |
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 512, options: None) | 1.24 | 975146.09 | 784156.88 | |
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) | 1.19 | 5221529.17 | 4387085.94 | |
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: None) | 1.18 | 1301925.00 | 1103803.79 | |
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: None) | 1.04 | 116486775.00 | 112407200.00 | |
Perf_FileStream.FlushAsync(fileSize: 1024, options: Asynchronous) | 1.03 | 21953590.91 | 21241122.73 | |
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 4096, options: None) | 1.01 | 88348175.00 | 87467100.00 |
RunContinuationsAsynchronously = false
dotnet run -c release --base "D:\fs50" --diff "D:\fs60_NEW_ON_IVTS_asyncfalse" --threshold 0.0001%
summary:
better: 23, geomean: 3.592
worse: 20, geomean: 2.135
total diff: 43
Slower | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Perf_FileStream.CopyToFile(fileSize: 1048576, options: None) | 5.36 | 882252.57 | 4730875.00 | |
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: None) | 5.05 | 1003671.88 | 5071912.50 | bimodal |
Perf_FileStream.WriteByte(fileSize: 1024, options: None) | 4.25 | 257218.23 | 1094164.38 | several? |
Perf_FileStream.WriteByte(fileSize: 1024, options: Asynchronous) | 4.05 | 286859.38 | 1161160.27 | |
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 4096, options: None) | 3.23 | 1795459.38 | 5791481.25 | |
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 512, options: None) | 3.10 | 1844293.36 | 5709820.83 | several? |
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: None) | 3.09 | 2209384.38 | 6836504.17 | |
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: Asynchronous) | 3.07 | 1868294.53 | 5733883.33 | |
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: Asynchronous) | 2.46 | 557062.05 | 1368034.90 | |
Perf_FileStream.CopyToFile(fileSize: 1024, options: None) | 2.31 | 478821.50 | 1103696.25 | |
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: None) | 2.24 | 543512.61 | 1218234.38 | bimodal |
Perf_FileStream.FlushAsync(fileSize: 1024, options: None) | 1.60 | 3479778.75 | 5572682.29 | |
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) | 1.42 | 8197915.63 | 11634850.00 | |
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: None) | 1.33 | 168291900.00 | 223141800.00 | bimodal |
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: None) | 1.31 | 5313143.75 | 6949831.25 | |
Perf_FileStream.Flush(fileSize: 1024, options: None) | 1.26 | 3448111.25 | 4332915.63 | |
Perf_FileStream.Write(fileSize: 104857600, userBufferSize: 4096, options: None) | 1.13 | 133329050.00 | 150223800.00 | bimodal |
Perf_FileStream.CopyToFile(fileSize: 104857600, options: None) | 1.11 | 50055150.00 | 55699050.00 | |
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: None) | 1.10 | 56033900.00 | 61710550.00 | |
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 4096, options: None) | 1.00 | 88348175.00 | 88759625.00 |
Faster | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Perf_FileStream.SeekBackward(fileSize: 1024, options: Asynchronous) | 116.97 | 5489541.67 | 46929.42 | |
Perf_FileStream.SeekForward(fileSize: 1024, options: Asynchronous) | 65.76 | 2811281.25 | 42753.29 | |
Perf_FileStream.SeekBackward(fileSize: 1024, options: None) | 57.56 | 2571736.46 | 44676.15 | |
Perf_FileStream.SeekForward(fileSize: 1024, options: None) | 21.52 | 876923.26 | 40756.47 | |
Perf_FileStream.OpenClose(fileSize: 1024, options: None) | 5.06 | 152781.66 | 30216.63 | |
Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) | 4.81 | 153731.81 | 31975.57 | |
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) | 4.05 | 2463834750.00 | 608864800.00 | |
Perf_FileStream.ReadByte(fileSize: 1024, options: None) | 3.99 | 165696.68 | 41526.73 | |
Perf_FileStream.ReadByte(fileSize: 1024, options: Asynchronous) | 3.54 | 200423.36 | 56610.08 | |
Perf_FileStream.LockUnlock(fileSize: 1024, options: None) | 2.62 | 230670.38 | 87877.10 | |
Perf_FileStream.LockUnlock(fileSize: 1024, options: Asynchronous) | 2.59 | 238920.22 | 92291.75 | |
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: None) | 2.40 | 3222202.50 | 1345116.25 | |
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: Asynchronous) | 1.86 | 156593600.00 | 83969875.00 | |
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) | 1.64 | 19598723.08 | 11925173.68 | |
Perf_FileStream.Flush(fileSize: 1024, options: Asynchronous) | 1.62 | 22492768.75 | 13859764.71 | |
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) | 1.56 | 7025697.92 | 4493881.25 | |
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) | 1.46 | 686776300.00 | 470158300.00 | bimodal |
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 4096, options: None) | 1.32 | 987460.55 | 750294.64 | |
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 512, options: None) | 1.24 | 975146.09 | 784056.56 | |
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: None) | 1.17 | 1301925.00 | 1108880.36 | |
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) | 1.17 | 5221529.17 | 4470673.44 | |
Perf_FileStream.FlushAsync(fileSize: 1024, options: Asynchronous) | 1.07 | 21953590.91 | 20580725.00 | |
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: None) | 1.01 | 116486775.00 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the result comparison if we compare the benchmark results of setting it to true
vs false
:
base: RunContinuationsAsynchronously=true, diff: RunContinuationsAsynchronously=false
dotnet run -c release --base "D:\fs60_NEW_ON_IVTS_asynctrue" --diff "D:\fs60_NEW_ON_IVTS_asyncfalse" --threshold 0.0001%
summary:
better: 14, geomean: 1.018
worse: 11, geomean: 1.017
total diff: 25
Slower | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Perf_FileStream.WriteAsync_NoBuffering(fileSize: 104857600, userBufferSize: 16384, options: None) | 1.05 | 67143550.00 | 70259187.50 | |
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: None) | 1.02 | 112407200.00 | 115107550.00 | |
Perf_FileStream.Read_NoBuffering(fileSize: 1048576, userBufferSize: 16384, options: None) | 1.02 | 240016.67 | 244921.93 | |
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) | 1.02 | 4387085.94 | 4470673.44 | |
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 4096, options: None) | 1.01 | 87467100.00 | 88759625.00 | |
Perf_FileStream.WriteAsync_NoBuffering(fileSize: 1048576, userBufferSize: 16384, options: None) | 1.01 | 21684354.55 | 21982809.09 | |
Perf_FileStream.ReadAsync_NoBuffering(fileSize: 1048576, userBufferSize: 16384, options: Asynchronous) | 1.01 | 1236709.90 | 1253242.19 | |
Perf_FileStream.Read(fileSize: 1024, userBufferSize: 1024, options: None) | 1.01 | 39289.54 | 39730.73 | |
Perf_FileStream.Write_NoBuffering(fileSize: 1048576, userBufferSize: 16384, options: None) | 1.01 | 21445427.27 | 21646650.00 | |
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 4096, options: None) | 1.01 | 743825.45 | 750294.64 | |
Perf_FileStream.SeekBackward(fileSize: 1024, options: None) | 1.00 | 44462.45 | 44676.15 |
Faster | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Perf_FileStream.WriteByte(fileSize: 1024, options: None) | 1.04 | 1141827.01 | 1094164.38 | several? |
Perf_FileStream.ReadAsync_NoBuffering(fileSize: 104857600, userBufferSize: 16384, options: None) | 1.03 | 45688612.50 | 44474575.00 | |
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: None) | 1.02 | 1377939.84 | 1345116.25 | |
Perf_FileStream.CopyToFile(fileSize: 1048576, options: None) | 1.02 | 4840275.00 | 4730875.00 | |
Perf_FileStream.SeekForward(fileSize: 1024, options: None) | 1.02 | 41586.04 | 40756.47 | |
Perf_FileStream.ReadAsync_NoBuffering(fileSize: 104857600, userBufferSize: 16384, options: Asynchronous) | 1.02 | 130849050.00 | 128463775.00 | |
Perf_FileStream.ReadByte(fileSize: 1024, options: None) | 1.02 | 42208.34 | 41526.73 | |
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) | 1.02 | 11814575.00 | 11634850.00 | |
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: Asynchronous) | 1.01 | 5807825.00 | 5733883.33 | |
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: None) | 1.01 | 5136350.00 | 5071912.50 | |
Perf_FileStream.CopyToFile(fileSize: 104857600, options: None) | 1.01 | 56309025.00 | 55699050.00 | |
Perf_FileStream.Read_NoBuffering(fileSize: 104857600, userBufferSize: 16384, options: None) | 1.01 | 33747814.29 | 3338345.14 | |
Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) | 1.01 | 32314.04 | 31975.57 | |
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 1024, options: Asynchronous) | 1.00 | 81988.09 | 81641.78 |
long packedResult = Interlocked.CompareExchange(ref _result, FileStreamHelpers.RegisteringCancellation, FileStreamHelpers.NoResult); | ||
if (packedResult == FileStreamHelpers.NoResult) | ||
{ | ||
_cancellationRegistration = cancellationToken.UnsafeRegister(static (s, token) => ((FileStreamValueTaskSource)s!).Cancel(token), this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It made more sense to pass this as the second argument of UnsafeRegister
FileStreamCompletionSource was also passing this
as 2nd argument.
The old code also allocates an Action<object?> instance to save a reference to the Cancel method.
Yes, seems that it was caching the method group to avoid allocating everytime it was passed to UnsafeRegister
.
However I think you can avoid the allocation without caching yourself by doing this:
_cancellationRegistration = cancellationToken.UnsafeRegister((s, token) => Cancel(s, token), this);
Just a suggestion, not something that you should block on.
return vt.IsCompleted? | ||
vt.Result : | ||
vt.AsTask().GetAwaiter().GetResult(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to apply this same pattern in void Write
?
https://github.com/dotnet/runtime/blob/9bebe1f4b5ab5ec63ff537140d89a166a44d1d7f/src/libraries/System.Private.CoreLib/src/System/IO/Strategies/AsyncWindowsFileStreamStrategy.cs#L238-L239
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried, but a ValueTask
does not have a Result
property, like a ValueTask<T>
does, because it returns void. So I'm not sure if it's possible to apply an equivalent pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public override void Write(byte[] buffer, int offset, int count)
{
ValueTask vt = WriteAsyncInternal(new ReadOnlyMemory<byte>(buffer, offset, count), CancellationToken.None);
if (vt.IsCompleted)
{
vt.GetAwaiter().GetResult();
}
else
{
vt.AsTask().GetAwaiter().GetResult();
}
}
That said, it's not much of an issue for this case. The read side is an issue because AsTask() will almost certainly need to allocate a new Task<int>
in order to hand back the result, but for the write side, AsTask() on an already completed ValueTask will just return Task.Completed, assuming the operation completed successfully. So the only time the existing code for the write side will allocate is when the operation was canceled or faulted, in which case we're about to throw an exception anyway, and the extra task allocation won't really matter. Up to you whether you want to change it.
14754f6
to
be19e14
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM; thanks Carlos!
Comparison against 5.0 is a comparison of all of the changes that were merged to main since we have snapped 5.0 with your change against 5.0. What we care about is how your change (and nothing else) is affecting the perf, so we need to isolate the runs to only your change. To be able to do that you need to run the benchmarks using two CoreRuns:
My personal workflow:
build -c Release -subset clr+libs
build -c Release -subset clr.corelib+clr.nativecorelib+libs.PreTest
dotnet run -c Release -f net6.0 --filter *ABC* --corerun C:\Projects\runtime\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\before\CoreRun.exe C:\Projects\runtime\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\6.0.0\corerun.exe |
@adamsitnik @jozkee here's the table comparing the before (original) and after of this PR. Most results had allocation improvements, but worse speed when FileOptions = Asynchronous: ResultsCommand:
Note: The runtime2 corerun contains the baseline commit before my changes, the runtime corerun contains this PR's changes.
|
…Source only needed by Net5CompatFileStreamStrategy.
…ep in sync with the name of FileStreamCompletionSource.
As discussed offline, such a regression needs to be solved before merging the PR. I've even built your fork and run the benchmarks myself to be 100% sure, but unfortunately, I've confirmed the regression:
If I can suggest something, |
I compared the CPU usage before and after, and discovered we spend a considerable amount of time inside The new code (left) consumes 62% of the time calling The root cause seems to be this line, where we call Which is adding up the time it takes to run the rest of the There's a comment stating that I wasn't expecting all the continuation code to be counted as part of the asynchronous calls. I wonder if this has anything to do with the removal of |
We don't need to restore the ExecutionContext in the overlapped, that restore should be avoided in these APIs. The async state machine already does it. This is what you need #42549 😄 |
That's inclusive (what "Total" means in the VS view). The callback is actually running the continuation now (because of RunContinuationsAsynchronously=false), rather than queueing it to be run on a different thread, so the 62% is inclusive of the actual app code. |
...m.Private.CoreLib/src/System/IO/Strategies/AsyncWindowsFileStreamStrategy.ValueTaskSource.cs
Show resolved
Hide resolved
{ | ||
// Successfully got the callback, finish the callback | ||
valueTaskSource.CompleteCallback(packedResult); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you're copying this logic from what was there before, but it seems to be unnecessarily complex, with lots of interlocked operations for transitioning from one state to another to another. If we're looking for places to reduce overheads, revisiting this whole implementation is likely a good place to start.
I've run all the async benchmarks for 3 configurations:
With the following settings:
and confirmed that with @stephentoub should we:
|
be19e14
to
d5344be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you @carlossanlop !
With the unit tests passing, the micro benchmarks having no regressions after reverting |
The purpose of this PR is to remove a large allocation found in
FileStream.ReadAsync
andFileStream.WriteAsync
, which we confirmed in our profiling investigation around the newFileStream
rewrite code.The allocation improvement can be achieved by creating a new type, called
FileStreamValueTaskSource
, that will have the same purpose asFileStreamCompletionSource
, but instead of implementingTaskCompletionSource
(which returns aTask
to represent the asynchronous work) it implementsIValueTaskSource
, which returns aValueTask
.The behavior was largely left untouched, except for the following differences:
Net5CompatFileStreamStrategy.Windows
will be the only strategy that consumes theFileStreamCompletionSource
, so it is now a private nested class.IValueTaskSource
(used for write) andIValueTaskSource<int>
(used for read).FileStreamCompletionSource
were moved toFileStreamHelpers
so they could be reused by the new type.Unit tests
Manually ran unit tests from
System.IO.FileSystem
,System.IO
,System.Runtime
andSystem.Runtime.Extensions
, they all passed. I made sure to run them with theNet5Compat
flag turned off, to ensure my new code was executed.Benchmarks
FileStream
performance benchmarks against .NET 5.0, then against the code of this PR's baseline commit, and then against this PR's new code. I compared the results, and there are subtle improvements in some benchmark tests:.NET 5.0 vs .NET 6.0 before this PR
dotnet run --project D:\performance\src\tools\ResultsComparer\ResultsComparer.csproj -c release --base 'D:\fs50' --diff 'D:\fs60_before_pr' --threshold 0.0001%
summary:
better: 39, geomean: 2.801
worse: 22, geomean: 2.098
total diff: 61
.NET 5.0 vs .NET 6.0 after this PR
dotnet run --project D:\performance\src\tools\ResultsComparer\ResultsComparer.csproj -c release --base 'D:\fs50' --diff 'D:\fs60_after_pr' --threshold 0.0001%
summary:
better: 36, geomean: 3.006
worse: 22, geomean: 2.080
total diff: 58
Profiling
I reused the same code we have in our benchmarks, but I executed it inside a console app that forces the consumption of the compiled runtime code.
You'll notice the top
Task
allocation went away, and theFileStreamCompletionSource
allocations were substituted byFileStreamValueTaskSource
allocations.ReadAsync(bufferSize: 1, userBufferSize: FourKibibytes, FileOptions.Asynchronous)
Allocations before this PR
Allocations after this PR
ReadAsync(bufferSize: 1, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)
Allocations before this PR
Allocations after this PR
ReadAsync(bufferSize: FourKibibytes, userBufferSize: FourKibibytes, FileOptions.Asynchronous)
Allocations before this PR
Allocations after this PR
ReadAsync(bufferSize: FourKibibytes, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)
Allocations before this PR
Allocations after this PR
WriteAsync(bufferSize: 1, userBufferSize: FourKibibytes, FileOptions.Asynchronous)
Allocations before this PR
Allocations after this PR
WriteAsync(bufferSize: 1, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)
Allocations before this PR
Allocations after this PR
WriteAsync(bufferSize: FourKibibytes, userBufferSize: FourKibibytes, FileOptions.Asynchronous)
Allocations before this PR
Allocations after this PR
WriteAsync(bufferSize: FourKibibytes, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)
Allocations before this PR
Allocations after this PR
@adamsitnik @jozkee @jeffhandley @stephentoub
Also: Don't forget to ignore whitespace changes so it's easier to review the nested type's changes 😃