FileStream rewrite: Use IValueTaskSource instead of TaskCompletionSource #50802

Merged
merged 19 commits into from
Apr 15, 2021

carlossanlop
Member

This PR removes a large allocation in FileStream.ReadAsync and FileStream.WriteAsync that we confirmed while profiling the new FileStream rewrite code.

The allocation is removed by introducing a new type, FileStreamValueTaskSource, that serves the same purpose as FileStreamCompletionSource. Instead of deriving from TaskCompletionSource (which exposes a Task to represent the asynchronous work), it implements IValueTaskSource, which backs a ValueTask.

The behavior was largely left untouched, except for the following differences:

  • Net5CompatFileStreamStrategy.Windows will be the only strategy that consumes the FileStreamCompletionSource, so it is now a private nested class.
  • A single new type implements both IValueTaskSource (used for write) and IValueTaskSource<int> (used for read).
  • The logic inside the callback and the cancellation token registration is the same, aside from the code that needs to report exceptions or return values.
  • We had interop errors redefined in multiple places.
  • The constants used by FileStreamCompletionSource were moved to FileStreamHelpers so they could be reused by the new type.
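
To make the "single type, two interfaces" point above concrete, here is a minimal sketch of the pattern. It is not the PR's FileStreamValueTaskSource: the names DualValueTaskSource and Demo are hypothetical, the overlapped I/O callback and cancellation registration are left out, and delegating to ManualResetValueTaskSourceCore<int> is an assumption made to keep the example short.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical illustration type; the real FileStreamValueTaskSource also owns
// the overlapped I/O state and the cancellation registration.
sealed class DualValueTaskSource : IValueTaskSource<int>, IValueTaskSource
{
    // Mutable struct that handles token, status and continuation bookkeeping.
    // It must not be declared readonly.
    private ManualResetValueTaskSourceCore<int> _core;

    public short Version => _core.Version;

    // Producer side: for FileStream this role is played by the I/O completion callback.
    public void SetResult(int bytesTransferred) => _core.SetResult(bytesTransferred);
    public void SetException(Exception error) => _core.SetException(error);

    // These two public members satisfy both interfaces at once.
    public ValueTaskSourceStatus GetStatus(short token) => _core.GetStatus(token);

    public void OnCompleted(Action<object?> continuation, object? state, short token,
                            ValueTaskSourceOnCompletedFlags flags)
        => _core.OnCompleted(continuation, state, token, flags);

    // Read path: the awaiter receives the number of bytes transferred.
    int IValueTaskSource<int>.GetResult(short token) => _core.GetResult(token);

    // Write path: same call, so a stored exception is rethrown; the count is discarded.
    void IValueTaskSource.GetResult(short token) => _core.GetResult(token);
}

static class Demo
{
    // Completes one simulated read and one simulated write, one source object each.
    public static async Task<int> ReadThenWriteAsync()
    {
        var readSource = new DualValueTaskSource();
        var read = new ValueTask<int>(readSource, readSource.Version);
        readSource.SetResult(512);          // simulate the I/O callback
        int bytesRead = await read;

        var writeSource = new DualValueTaskSource();
        var write = new ValueTask(writeSource, writeSource.Version);
        writeSource.SetResult(512);         // the byte count is ignored on this path
        await write;

        return bytesRead;
    }
}
```

Awaiting Demo.ReadThenWriteAsync() from any async method yields the byte count passed to SetResult on the read path. In this sketch the only per-operation object is the source itself; no separate Task is created unless a caller converts the ValueTask with AsTask().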

Unit tests

I manually ran the unit tests from System.IO.FileSystem, System.IO, System.Runtime and System.Runtime.Extensions; they all passed. I made sure to run them with the Net5Compat flag turned off, to ensure my new code was executed.

Benchmarks

  • Ran the FileStream performance benchmarks against .NET 5.0, then against the code of this PR's baseline commit, and then against this PR's new code. I compared the results, and there are subtle improvements in some benchmark tests:
.NET 5.0 vs .NET 6.0 before this PR

dotnet run --project D:\performance\src\tools\ResultsComparer\ResultsComparer.csproj -c release --base 'D:\fs50' --diff 'D:\fs60_before_pr' --threshold 0.0001%

summary:
better: 39, geomean: 2.801
worse: 22, geomean: 2.098
total diff: 61

Slower diff/base Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.CopyToFile(fileSize: 1048576, options: None) 4.93 882252.57 4348913.28
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: None) 4.62 1003671.88 4633945.31 bimodal
Perf_FileStream.WriteByte(fileSize: 1024, options: None) 3.13 257218.23 805205.31 several?
Perf_FileStream.WriteByte(fileSize: 1024, options: Asynchronous) 3.13 286859.38 896524.31
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 4096, options: None) 3.06 1795459.38 5498006.25
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: Asynchronous) 3.05 1868294.53 5689532.29
Perf_FileStream.Write(fileSize: 1024, userBufferSize: 512, options: None) 2.99 254148.29 758897.83
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 512, options: Asynchronous) 2.97 283860.42 843052.43
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 512, options: None) 2.90 1844293.36 5353701.04 several?
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: None) 2.89 2209384.38 6395554.17 bimodal
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 512, options: None) 2.65 301619.83 798302.63
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: Asynchronous) 1.86 557062.05 1035575.21 bimodal
Perf_FileStream.CopyToFile(fileSize: 1024, options: None) 1.81 478821.50 868866.12
Perf_FileStream.FlushAsync(fileSize: 1024, options: None) 1.69 3479778.75 5891160.42 several?
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: None) 1.66 543512.61 903388.82 bimodal
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.40 168291900.00 235314600.00
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.24 8197915.63 10200393.75
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: None) 1.21 5313143.75 6444323.96
Perf_FileStream.Flush(fileSize: 1024, options: None) 1.18 3448111.25 4052132.81
Perf_FileStream.CopyToFile(fileSize: 104857600, options: None) 1.11 50055150.00 55473900.00
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: None) 1.09 56033900.00 61052537.50
Perf_FileStream.Write(fileSize: 104857600, userBufferSize: 4096, options: None) 1.03 133329050.00 137489800.00
Faster base/diff Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.SeekBackward(fileSize: 1024, options: Asynchronous) 115.64 5489541.67 47470.79
Perf_FileStream.SeekForward(fileSize: 1024, options: Asynchronous) 64.90 2811281.25 43319.93
Perf_FileStream.SeekBackward(fileSize: 1024, options: None) 57.82 2571736.46 44479.08
Perf_FileStream.SeekForward(fileSize: 1024, options: None) 21.29 876923.26 41197.72
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 5.06 2463834750.00 486578800.00
Perf_FileStream.OpenClose(fileSize: 1024, options: None) 4.94 152781.66 30909.09
Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) 4.72 153731.81 32593.67
Perf_FileStream.Read(fileSize: 1024, userBufferSize: 4096, options: None) 4.27 161781.84 37889.31 several?
Perf_FileStream.ReadByte(fileSize: 1024, options: None) 3.98 165696.68 41632.18
Perf_FileStream.Read(fileSize: 1024, userBufferSize: 512, options: None) 3.97 159526.58 40152.10
Perf_FileStream.ReadByte(fileSize: 1024, options: Asynchronous) 3.50 200423.36 57204.05
Perf_FileStream.LockUnlock(fileSize: 1024, options: None) 2.59 230670.38 89181.93
Perf_FileStream.LockUnlock(fileSize: 1024, options: Asynchronous) 2.55 238920.22 93543.27
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 4096, options: None) 2.50 193629.08 77357.59
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: None) 2.44 3222202.50 1318286.88
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 512, options: None) 2.39 194808.87 81591.76
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 4096, options: Asynchronous) 2.36 216916.98 92096.58
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 512, options: None) 2.24 325116900.00 145245400.00
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 2.15 7025697.92 3271982.50
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 512, options: Asynchronous) 2.10 199647.89 95019.35
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 2.04 686776300.00 336991400.00 bimodal
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 1.95 19598723.08 10074393.75
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 512, options: None) 1.85 508393800.00 274360000.00
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 512, options: Asynchronous) 1.74 796591200.00 457934500.00 bimodal
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: Asynchronous) 1.73 156593600.00 90313262.50
Perf_FileStream.Flush(fileSize: 1024, options: Asynchronous) 1.70 22492768.75 13248333.33
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 512, options: Asynchronous) 1.64 542213800.00 331399500.00
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.61 5221529.17 3244615.63
Perf_FileStream.FlushAsync(fileSize: 1024, options: Asynchronous) 1.50 21953590.91 14593046.88
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 4096, options: None) 1.33 987460.55 742610.27
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 512, options: None) 1.25 975146.09 777023.21
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: None) 1.19 1301925.00 1096436.88
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 4096, options: None) 1.12 142000.45 126451.42
Perf_FileStream.Write(fileSize: 104857600, userBufferSize: 512, options: None) 1.11 162435300.00 146191400.00
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 4096, options: Asynchronous) 1.10 141476.75 128484.00
Perf_FileStream.Write(fileSize: 1024, userBufferSize: 4096, options: None) 1.10 142083.81 129048.19
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.03 116486775.00 113456750.00
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 4096, options: None) 1.02 88348175.00 86894587.50
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 512, options: None) 1.01 92483275.00 91193650.00
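
For readers checking the numbers above: the ratio column is the quotient of the two medians (diff/base in the "Slower" table, base/diff in the "Faster" table). The sketch below reproduces two of the ratios. The RatioMath helper is hypothetical, and treating the summary "geomean" as the geometric mean of the per-benchmark ratios is an assumption about ResultsComparer, not something stated in this PR.

```csharp
using System;

// Hypothetical helper, not part of ResultsComparer.
static class RatioMath
{
    // "Slower" rows: how many times longer the diff run took than the base run.
    public static double SlowerRatio(double baseMedianNs, double diffMedianNs)
        => diffMedianNs / baseMedianNs;

    // "Faster" rows: how many times longer the base run took than the diff run.
    public static double FasterRatio(double baseMedianNs, double diffMedianNs)
        => baseMedianNs / diffMedianNs;

    // Geometric mean computed through logarithms to avoid overflow on long lists.
    public static double GeometricMean(params double[] ratios)
    {
        double logSum = 0;
        foreach (double ratio in ratios)
            logSum += Math.Log(ratio);
        return Math.Exp(logSum / ratios.Length);
    }
}

static class RatioDemo
{
    public static void Print()
    {
        // CopyToFile(fileSize: 1048576, options: None): about 4.93, as in the "Slower" table.
        Console.WriteLine(RatioMath.SlowerRatio(882252.57, 4348913.28).ToString("F2"));

        // SeekBackward(fileSize: 1024, options: Asynchronous): about 115.64, as in the "Faster" table.
        Console.WriteLine(RatioMath.FasterRatio(5489541.67, 47470.79).ToString("F2"));
    }
}
```

A geometric mean is the usual choice for summarizing ratios because a 2x regression and a 2x improvement cancel out to 1.0, which an arithmetic mean would not do.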
.NET 5.0 vs .NET 6.0 after this PR

dotnet run --project D:\performance\src\tools\ResultsComparer\ResultsComparer.csproj -c release --base 'D:\fs50' --diff 'D:\fs60_after_pr' --threshold 0.0001%
summary:
better: 36, geomean: 3.006
worse: 22, geomean: 2.080
total diff: 58

Slower diff/base Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.CopyToFile(fileSize: 1048576, options: None) 4.92 882252.57 4341552.34
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: None) 4.60 1003671.88 4615128.91 bimodal
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 512, options: Asynchronous) 3.24 283860.42 919970.59
Perf_FileStream.Write(fileSize: 1024, userBufferSize: 512, options: None) 3.08 254148.29 783231.25
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: Asynchronous) 3.06 1868294.53 5710579.17
Perf_FileStream.WriteByte(fileSize: 1024, options: Asynchronous) 3.05 286859.38 874430.38
Perf_FileStream.WriteByte(fileSize: 1024, options: None) 3.04 257218.23 780960.42
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 4096, options: None) 2.97 1795459.38 5329393.75
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 512, options: None) 2.95 1844293.36 5432295.83 several?
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: None) 2.92 2209384.38 6445677.08
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 512, options: None) 2.52 301619.83 758761.88
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: Asynchronous) 1.95 557062.05 1087235.83
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: None) 1.73 543512.61 937893.38 bimodal
Perf_FileStream.CopyToFile(fileSize: 1024, options: None) 1.66 478821.50 793440.46 bimodal
Perf_FileStream.FlushAsync(fileSize: 1024, options: None) 1.47 3479778.75 5113488.54 several?
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.32 168291900.00 221660000.00
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: None) 1.23 5313143.75 6512252.08
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.22 8197915.63 10012295.65
Perf_FileStream.Flush(fileSize: 1024, options: None) 1.21 3448111.25 4163660.16
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: None) 1.13 56033900.00 63237650.00
Perf_FileStream.CopyToFile(fileSize: 104857600, options: None) 1.09 50055150.00 54423625.00
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 4096, options: None) 1.01 88348175.00 88983100.00
Faster base/diff Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.SeekBackward(fileSize: 1024, options: Asynchronous) 115.50 5489541.67 47526.62
Perf_FileStream.SeekForward(fileSize: 1024, options: Asynchronous) 64.94 2811281.25 43292.23
Perf_FileStream.SeekBackward(fileSize: 1024, options: None) 57.06 2571736.46 45070.47
Perf_FileStream.SeekForward(fileSize: 1024, options: None) 21.05 876923.26 41661.14
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 5.36 2463834750.00 459292800.00
Perf_FileStream.OpenClose(fileSize: 1024, options: None) 4.95 152781.66 30850.74
Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) 4.75 153731.81 32349.21
Perf_FileStream.Read(fileSize: 1024, userBufferSize: 4096, options: None) 4.27 161781.84 37926.07 several?
Perf_FileStream.Read(fileSize: 1024, userBufferSize: 512, options: None) 3.98 159526.58 40096.31
Perf_FileStream.ReadByte(fileSize: 1024, options: None) 3.96 165696.68 41824.65
Perf_FileStream.LockUnlock(fileSize: 1024, options: None) 2.59 230670.38 89051.28
Perf_FileStream.LockUnlock(fileSize: 1024, options: Asynchronous) 2.56 238920.22 93276.60
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 4096, options: None) 2.53 193629.08 76383.11
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 512, options: None) 2.45 194808.87 79416.04
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: None) 2.34 3222202.50 1376330.63
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 4096, options: Asynchronous) 2.31 216916.98 93711.06
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 512, options: None) 2.28 325116900.00 142880200.00
Perf_FileStream.ReadByte(fileSize: 1024, options: Asynchronous) 2.22 200423.36 90245.25
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 2.16 7025697.92 3250746.88
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 512, options: None) 2.11 508393800.00 241032100.00
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 512, options: Asynchronous) 2.11 199647.89 94675.64
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 2.06 686776300.00 333108950.00 bimodal
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 1.93 19598723.08 10171781.82
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: Asynchronous) 1.74 156593600.00 90035450.00
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 512, options: Asynchronous) 1.65 542213800.00 329569650.00
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.64 5221529.17 3182177.50
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 512, options: Asynchronous) 1.64 796591200.00 486630000.00
Perf_FileStream.Flush(fileSize: 1024, options: Asynchronous) 1.52 22492768.75 14797453.85
Perf_FileStream.FlushAsync(fileSize: 1024, options: Asynchronous) 1.50 21953590.91 14625460.00
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 4096, options: None) 1.33 987460.55 742460.71
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 512, options: None) 1.25 975146.09 781337.50
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: None) 1.20 1301925.00 1085495.19
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 4096, options: Asynchronous) 1.11 141476.75 127158.59
Perf_FileStream.Write(fileSize: 1024, userBufferSize: 4096, options: None) 1.11 142083.81 128134.40
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 4096, options: None) 1.10 142000.45 129068.15
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.03 116486775.00 112942550.00

Profiling

I reused the same code we have in our benchmarks, but executed it inside a console app that is forced to run against the compiled runtime code.

You'll notice that the top Task allocation went away, and that the FileStreamCompletionSource allocations were replaced by FileStreamValueTaskSource allocations.

ReadAsync(bufferSize: 1, userBufferSize: FourKibibytes, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 64,001
- System.IO.MemoryFileStreamCompletionSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,797
- System.String 2,153
- System.Object[] 952
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.AsyncWindowsFileStreamStrategy 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,803
- System.String 2,144
- System.Object[] 952
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225

ReadAsync(bufferSize: 1, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 512,001
- System.IO.MemoryFileStreamCompletionSource 512,000
- System.Threading.ThreadPoolBoundHandleOverlapped 512,000
- System.Threading.OverlappedData 512,000
- System.SByte[] 2,806
- System.String 2,153
- System.Object[] 957
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 512,000
- System.Threading.ThreadPoolBoundHandleOverlapped 512,000
- System.Threading.OverlappedData 512,000
- System.SByte[] 2,803
- System.String 2,144
- System.Object[] 964
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225
- System.RuntimeType[] 178

ReadAsync(bufferSize: FourKibibytes, userBufferSize: FourKibibytes, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 64,013
- System.IO.MemoryFileStreamCompletionSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,840
- System.String 2,153
- System.Object[] 955
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- System.IO.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,836
- System.String 2,144
- System.Object[] 957
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Reflection.ParameterInfo[] 281
- System.IO.Strategies.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Runtime.CompilerServices.StrongBox<> 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225

ReadAsync(bufferSize: FourKibibytes, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 64,235
- System.Threading.Tasks.Task<> 64,015
- System.IO.FileStreamCompletionSource 64,000
- System.SByte[] 2,916
- System.String 2,155
- System.Object[] 956
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- System.IO.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandleOverlapped 250
- System.Threading.OverlappedData 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.FileStreamValueTaskSource 64,000
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 63,812
- System.SByte[] 2,909
- System.String 2,144
- System.Object[] 964
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Reflection.ParameterInfo[] 281
- System.IO.Strategies.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandleOverlapped 250
- System.Threading.OverlappedData 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
- System.Runtime.CompilerServices.StrongBox<> 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225

WriteAsync(bufferSize: 1, userBufferSize: FourKibibytes, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 64,001
- System.IO.MemoryFileStreamCompletionSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,798
- System.String 2,153
- System.Object[] 954
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,790
- System.String 2,144
- System.Object[] 959
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225
- System.RuntimeType[] 178

WriteAsync(bufferSize: 1, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 512,001
- System.IO.MemoryFileStreamCompletionSource 512,000
- System.Threading.ThreadPoolBoundHandleOverlapped 512,000
- System.Threading.OverlappedData 512,000
- System.SByte[] 2,812
- System.String 2,153
- System.Object[] 959
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 512,000
- System.Threading.ThreadPoolBoundHandleOverlapped 512,000
- System.Threading.OverlappedData 512,000
- System.SByte[] 2,790
- System.String 2,144
- System.Object[] 958
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225
- System.RuntimeType[] 178

WriteAsync(bufferSize: FourKibibytes, userBufferSize: FourKibibytes, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 64,014
- System.Threading.ThreadPoolBoundHandleOverlapped 32,250
- System.Threading.OverlappedData 32,250
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 32,249
- System.IO.MemoryFileStreamCompletionSource 32,000
- System.IO.FileStreamCompletionSource 32,000
- System.SByte[] 2,915
- System.String 2,155
- System.Object[] 952
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- System.IO.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
Allocations after this PR
Type Allocations
- System.Threading.ThreadPoolBoundHandleOverlapped 32,250
- System.Threading.OverlappedData 32,250
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 32,246
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 32,000
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.FileStreamValueTaskSource 32,000
- System.SByte[] 2,893
- System.String 2,144
- System.Object[] 954
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Reflection.ParameterInfo[] 281
- System.IO.Strategies.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
- System.Runtime.CompilerServices.StrongBox<> 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225

WriteAsync(bufferSize: FourKibibytes, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 64,014
- System.IO.FileStreamCompletionSource 64,000
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 63,963
- System.SByte[] 2,915
- System.String 2,155
- System.Object[] 956
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- System.IO.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandleOverlapped 250
- System.Threading.OverlappedData 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.FileStreamValueTaskSource 64,000
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 63,830
- System.SByte[] 2,912
- System.String 2,144
- System.Object[] 966
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Reflection.ParameterInfo[] 281
- System.IO.Strategies.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandleOverlapped 250
- System.Threading.OverlappedData 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
- System.Runtime.CompilerServices.StrongBox<> 250
- System.Threading.Tasks.ValueTask.ValueTaskSourceAsTask 249
- System.Threading.QueueUserWorkItemCallbackDefaultContext<> 249

@adamsitnik @jozkee @jeffhandley @stephentoub

Also: Don't forget to ignore whitespace changes so it's easier to review the nested type's changes 😃

@ghost

ghost commented Apr 6, 2021

Tagging subscribers to this area: @carlossanlop
See info in area-owners.md if you want to be subscribed.

Perf_FileStream.SeekBackward(fileSize: 1024, options: Asynchronous) 115.64 5489541.67 47470.79
Perf_FileStream.SeekForward(fileSize: 1024, options: Asynchronous) 64.90 2811281.25 43319.93
Perf_FileStream.SeekBackward(fileSize: 1024, options: None) 57.82 2571736.46 44479.08
Perf_FileStream.SeekForward(fileSize: 1024, options: None) 21.29 876923.26 41197.72
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 5.06 2463834750.00 486578800.00
Perf_FileStream.OpenClose(fileSize: 1024, options: None) 4.94 152781.66 30909.09
Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) 4.72 153731.81 32593.67
Perf_FileStream.Read(fileSize: 1024, userBufferSize: 4096, options: None) 4.27 161781.84 37889.31 several?
Perf_FileStream.ReadByte(fileSize: 1024, options: None) 3.98 165696.68 41632.18
Perf_FileStream.Read(fileSize: 1024, userBufferSize: 512, options: None) 3.97 159526.58 40152.10
Perf_FileStream.ReadByte(fileSize: 1024, options: Asynchronous) 3.50 200423.36 57204.05
Perf_FileStream.LockUnlock(fileSize: 1024, options: None) 2.59 230670.38 89181.93
Perf_FileStream.LockUnlock(fileSize: 1024, options: Asynchronous) 2.55 238920.22 93543.27
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 4096, options: None) 2.50 193629.08 77357.59
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: None) 2.44 3222202.50 1318286.88
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 512, options: None) 2.39 194808.87 81591.76
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 4096, options: Asynchronous) 2.36 216916.98 92096.58
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 512, options: None) 2.24 325116900.00 145245400.00
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 2.15 7025697.92 3271982.50
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 512, options: Asynchronous) 2.10 199647.89 95019.35
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 2.04 686776300.00 336991400.00 bimodal
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 1.95 19598723.08 10074393.75
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 512, options: None) 1.85 508393800.00 274360000.00
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 512, options: Asynchronous) 1.74 796591200.00 457934500.00 bimodal
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: Asynchronous) 1.73 156593600.00 90313262.50
Perf_FileStream.Flush(fileSize: 1024, options: Asynchronous) 1.70 22492768.75 13248333.33
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 512, options: Asynchronous) 1.64 542213800.00 331399500.00
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.61 5221529.17 3244615.63
Perf_FileStream.FlushAsync(fileSize: 1024, options: Asynchronous) 1.50 21953590.91 14593046.88
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 4096, options: None) 1.33 987460.55 742610.27
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 512, options: None) 1.25 975146.09 777023.21
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: None) 1.19 1301925.00 1096436.88
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 4096, options: None) 1.12 142000.45 126451.42
Perf_FileStream.Write(fileSize: 104857600, userBufferSize: 512, options: None) 1.11 162435300.00 146191400.00
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 4096, options: Asynchronous) 1.10 141476.75 128484.00
Perf_FileStream.Write(fileSize: 1024, userBufferSize: 4096, options: None) 1.10 142083.81 129048.19
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.03 116486775.00 113456750.00
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 4096, options: None) 1.02 88348175.00 86894587.50
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 512, options: None) 1.01 92483275.00 91193650.00
.NET 5.0 vs .NET 6.0 after this PR

dotnet run --project D:\performance\src\tools\ResultsComparer\ResultsComparer.csproj -c release --base 'D:\fs50' --diff 'D:\fs60_after_pr' --threshold 0.0001%
summary:
better: 36, geomean: 3.006
worse: 22, geomean: 2.080
total diff: 58

Slower diff/base Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.CopyToFile(fileSize: 1048576, options: None) 4.92 882252.57 4341552.34
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: None) 4.60 1003671.88 4615128.91 bimodal
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 512, options: Asynchronous) 3.24 283860.42 919970.59
Perf_FileStream.Write(fileSize: 1024, userBufferSize: 512, options: None) 3.08 254148.29 783231.25
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: Asynchronous) 3.06 1868294.53 5710579.17
Perf_FileStream.WriteByte(fileSize: 1024, options: Asynchronous) 3.05 286859.38 874430.38
Perf_FileStream.WriteByte(fileSize: 1024, options: None) 3.04 257218.23 780960.42
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 4096, options: None) 2.97 1795459.38 5329393.75
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 512, options: None) 2.95 1844293.36 5432295.83 several?
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: None) 2.92 2209384.38 6445677.08
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 512, options: None) 2.52 301619.83 758761.88
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: Asynchronous) 1.95 557062.05 1087235.83
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: None) 1.73 543512.61 937893.38 bimodal
Perf_FileStream.CopyToFile(fileSize: 1024, options: None) 1.66 478821.50 793440.46 bimodal
Perf_FileStream.FlushAsync(fileSize: 1024, options: None) 1.47 3479778.75 5113488.54 several?
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.32 168291900.00 221660000.00
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: None) 1.23 5313143.75 6512252.08
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.22 8197915.63 10012295.65
Perf_FileStream.Flush(fileSize: 1024, options: None) 1.21 3448111.25 4163660.16
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: None) 1.13 56033900.00 63237650.00
Perf_FileStream.CopyToFile(fileSize: 104857600, options: None) 1.09 50055150.00 54423625.00
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 4096, options: None) 1.01 88348175.00 88983100.00
Faster base/diff Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.SeekBackward(fileSize: 1024, options: Asynchronous) 115.50 5489541.67 47526.62
Perf_FileStream.SeekForward(fileSize: 1024, options: Asynchronous) 64.94 2811281.25 43292.23
Perf_FileStream.SeekBackward(fileSize: 1024, options: None) 57.06 2571736.46 45070.47
Perf_FileStream.SeekForward(fileSize: 1024, options: None) 21.05 876923.26 41661.14
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 5.36 2463834750.00 459292800.00
Perf_FileStream.OpenClose(fileSize: 1024, options: None) 4.95 152781.66 30850.74
Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) 4.75 153731.81 32349.21
Perf_FileStream.Read(fileSize: 1024, userBufferSize: 4096, options: None) 4.27 161781.84 37926.07 several?
Perf_FileStream.Read(fileSize: 1024, userBufferSize: 512, options: None) 3.98 159526.58 40096.31
Perf_FileStream.ReadByte(fileSize: 1024, options: None) 3.96 165696.68 41824.65
Perf_FileStream.LockUnlock(fileSize: 1024, options: None) 2.59 230670.38 89051.28
Perf_FileStream.LockUnlock(fileSize: 1024, options: Asynchronous) 2.56 238920.22 93276.60
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 4096, options: None) 2.53 193629.08 76383.11
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 512, options: None) 2.45 194808.87 79416.04
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: None) 2.34 3222202.50 1376330.63
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 4096, options: Asynchronous) 2.31 216916.98 93711.06
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 512, options: None) 2.28 325116900.00 142880200.00
Perf_FileStream.ReadByte(fileSize: 1024, options: Asynchronous) 2.22 200423.36 90245.25
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 2.16 7025697.92 3250746.88
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 512, options: None) 2.11 508393800.00 241032100.00
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 512, options: Asynchronous) 2.11 199647.89 94675.64
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 2.06 686776300.00 333108950.00 bimodal
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 1.93 19598723.08 10171781.82
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: Asynchronous) 1.74 156593600.00 90035450.00
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 512, options: Asynchronous) 1.65 542213800.00 329569650.00
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.64 5221529.17 3182177.50
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 512, options: Asynchronous) 1.64 796591200.00 486630000.00
Perf_FileStream.Flush(fileSize: 1024, options: Asynchronous) 1.52 22492768.75 14797453.85
Perf_FileStream.FlushAsync(fileSize: 1024, options: Asynchronous) 1.50 21953590.91 14625460.00
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 4096, options: None) 1.33 987460.55 742460.71
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 512, options: None) 1.25 975146.09 781337.50
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: None) 1.20 1301925.00 1085495.19
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 4096, options: Asynchronous) 1.11 141476.75 127158.59
Perf_FileStream.Write(fileSize: 1024, userBufferSize: 4096, options: None) 1.11 142083.81 128134.40
Perf_FileStream.WriteAsync(fileSize: 1024, userBufferSize: 4096, options: None) 1.10 142000.45 129068.15
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.03 116486775.00 112942550.00

Profiling

I reused the same code we have in our benchmarks, but I executed it inside a console app that forces the consumption of the compiled runtime code.

You'll notice that the top Task allocation went away, and that the FileStreamCompletionSource allocations were replaced by FileStreamValueTaskSource allocations.

ReadAsync(bufferSize: 1, userBufferSize: FourKibibytes, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 64,001
- System.IO.MemoryFileStreamCompletionSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,797
- System.String 2,153
- System.Object[] 952
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.AsyncWindowsFileStreamStrategy 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,803
- System.String 2,144
- System.Object[] 952
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225

ReadAsync(bufferSize: 1, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 512,001
- System.IO.MemoryFileStreamCompletionSource 512,000
- System.Threading.ThreadPoolBoundHandleOverlapped 512,000
- System.Threading.OverlappedData 512,000
- System.SByte[] 2,806
- System.String 2,153
- System.Object[] 957
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 512,000
- System.Threading.ThreadPoolBoundHandleOverlapped 512,000
- System.Threading.OverlappedData 512,000
- System.SByte[] 2,803
- System.String 2,144
- System.Object[] 964
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225
- System.RuntimeType[] 178

ReadAsync(bufferSize: FourKibibytes, userBufferSize: FourKibibytes, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 64,013
- System.IO.MemoryFileStreamCompletionSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,840
- System.String 2,153
- System.Object[] 955
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- System.IO.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,836
- System.String 2,144
- System.Object[] 957
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Reflection.ParameterInfo[] 281
- System.IO.Strategies.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Runtime.CompilerServices.StrongBox<> 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225

ReadAsync(bufferSize: FourKibibytes, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 64,235
- System.Threading.Tasks.Task<> 64,015
- System.IO.FileStreamCompletionSource 64,000
- System.SByte[] 2,916
- System.String 2,155
- System.Object[] 956
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- System.IO.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandleOverlapped 250
- System.Threading.OverlappedData 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.FileStreamValueTaskSource 64,000
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 63,812
- System.SByte[] 2,909
- System.String 2,144
- System.Object[] 964
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Reflection.ParameterInfo[] 281
- System.IO.Strategies.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandleOverlapped 250
- System.Threading.OverlappedData 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
- System.Runtime.CompilerServices.StrongBox<> 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225

WriteAsync(bufferSize: 1, userBufferSize: FourKibibytes, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 64,001
- System.IO.MemoryFileStreamCompletionSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,798
- System.String 2,153
- System.Object[] 954
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 64,000
- System.Threading.ThreadPoolBoundHandleOverlapped 64,000
- System.Threading.OverlappedData 64,000
- System.SByte[] 2,790
- System.String 2,144
- System.Object[] 959
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225
- System.RuntimeType[] 178

WriteAsync(bufferSize: 1, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 512,001
- System.IO.MemoryFileStreamCompletionSource 512,000
- System.Threading.ThreadPoolBoundHandleOverlapped 512,000
- System.Threading.OverlappedData 512,000
- System.SByte[] 2,812
- System.String 2,153
- System.Object[] 959
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 512,000
- System.Threading.ThreadPoolBoundHandleOverlapped 512,000
- System.Threading.OverlappedData 512,000
- System.SByte[] 2,790
- System.String 2,144
- System.Object[] 958
- System.Reflection.RuntimeParameterInfo 650
- System.Byte[] 539
- System.Reflection.ParameterInfo[] 281
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 251
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225
- System.RuntimeType[] 178

WriteAsync(bufferSize: FourKibibytes, userBufferSize: FourKibibytes, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 64,014
- System.Threading.ThreadPoolBoundHandleOverlapped 32,250
- System.Threading.OverlappedData 32,250
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 32,249
- System.IO.MemoryFileStreamCompletionSource 32,000
- System.IO.FileStreamCompletionSource 32,000
- System.SByte[] 2,915
- System.String 2,155
- System.Object[] 952
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- System.IO.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
Allocations after this PR
Type Allocations
- System.Threading.ThreadPoolBoundHandleOverlapped 32,250
- System.Threading.OverlappedData 32,250
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 32,246
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.MemoryFileStreamValueTaskSource 32,000
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.FileStreamValueTaskSource 32,000
- System.SByte[] 2,893
- System.String 2,144
- System.Object[] 954
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Reflection.ParameterInfo[] 281
- System.IO.Strategies.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
- System.Runtime.CompilerServices.StrongBox<> 250
- System.Reflection.RuntimeMethodInfo 241
- System.Signature 225

WriteAsync(bufferSize: FourKibibytes, userBufferSize: HalfKibibyte, FileOptions.Asynchronous)

Allocations before this PR
Type Allocations
- System.Threading.Tasks.Task<> 64,014
- System.IO.FileStreamCompletionSource 64,000
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 63,963
- System.SByte[] 2,915
- System.String 2,155
- System.Object[] 956
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Diagnostics.Tracing.ScalarTypeInfo 489
- System.Func<,,> 485
- System.Reflection.ParameterInfo[] 281
- System.IO.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandleOverlapped 250
- System.Threading.OverlappedData 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
Allocations after this PR
Type Allocations
- System.IO.Strategies.AsyncWindowsFileStreamStrategy.FileStreamValueTaskSource 64,000
- System.Runtime.CompilerServices.AsyncTaskMethodBuilder<>.AsyncStateMachineBox<> 63,830
- System.SByte[] 2,912
- System.String 2,144
- System.Object[] 966
- System.Byte[] 789
- System.Reflection.RuntimeParameterInfo 650
- System.Reflection.ParameterInfo[] 281
- System.IO.Strategies.BufferedFileStreamStrategy 252
- Microsoft.Win32.SafeHandles.SafeFileHandle 252
- System.IO.FileStream 252
- System.IO.Strategies.AsyncWindowsFileStreamStrategy 250
- System.Threading.ThreadPoolBoundHandleOverlapped 250
- System.Threading.OverlappedData 250
- System.Threading.SemaphoreSlim 250
- System.Threading.ThreadPoolBoundHandle 250
- System.Threading.PreAllocatedOverlapped 250
- System.Runtime.CompilerServices.StrongBox<> 250
- System.Threading.Tasks.ValueTask.ValueTaskSourceAsTask 249
- System.Threading.QueueUserWorkItemCallbackDefaultContext<> 249

@adamsitnik @jozkee @jeffhandley @stephentoub

Also: Don't forget to ignore whitespace changes so it's easier to review the nested type's changes 😃
Author: carlossanlop
Assignees: carlossanlop
Labels:

area-System.IO

Milestone: 6.0.0

@carlossanlop
Member Author

Tagging subscribers to this area: @carlossanlop

Yes, thank you bot... 😆

Member

@jozkee left a comment

Otherwise, LGTM. I will approve if/when the comments are resolved; most of them are nits and/or questions.

Comment on lines 17 to 22
internal const long NoResult = 0;
internal const long ResultSuccess = (long)1 << 32;
internal const long ResultError = (long)2 << 32;
internal const long RegisteringCancellation = (long)4 << 32;
internal const long CompletedCallback = (long)8 << 32;
internal const ulong ResultMask = ((ulong)uint.MaxValue) << 32;
Member

Consider prefixing these constants with AsyncCompletionSource_ or something similar, since these names are too ambiguous, or keep them in their respective classes even if that means duplicating them.

Member

or put them into a nested static class called ReturnCodes:

    internal static partial class FileStreamHelpers
    {
        internal static class ReturnCodes
        {
            internal const long NoResult = 0;
            internal const long ResultSuccess = (long)1 << 32;
            internal const long ResultError = (long)2 << 32;
            internal const long RegisteringCancellation = (long)4 << 32;
            internal const long CompletedCallback = (long)8 << 32;
            internal const ulong ResultMask = ((ulong)uint.MaxValue) << 32;
        }
    }

but it's a nit (not a blocker)
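
For readers unfamiliar with these constants: they occupy the upper 32 bits of a long, which leaves the lower 32 bits free for a payload such as an error code (the review excerpt further down extracts one with `packedResult & uint.MaxValue`). A minimal sketch of that packing scheme follows. The PackedResult class and its Pack and Unpack helpers are hypothetical names used only for illustration and do not appear in the PR; the constant values are copied from the snippet above.

```csharp
using System;

// Hypothetical helper for illustration; not part of the PR.
public static class PackedResult
{
    public const long ResultSuccess = (long)1 << 32;
    public const long ResultError = (long)2 << 32;
    public const ulong ResultMask = ((ulong)uint.MaxValue) << 32;

    // Combines a state flag (upper 32 bits) with a payload (lower 32 bits).
    public static long Pack(long state, uint payload) => state | payload;

    // Splits a packed value back into its state flag and payload.
    public static (long State, uint Payload) Unpack(long packed)
        => ((long)((ulong)packed & ResultMask), unchecked((uint)(packed & uint.MaxValue)));
}
```

For example, packing ResultError with the Windows error code 995 (ERROR_OPERATION_ABORTED) and unpacking it again yields the same state and code, which is what allows a single Interlocked.CompareExchange on one long to publish both at once.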

Comment on lines 66 to 67
_strategy.CompareExchangeCurrentOverlappedOwner(this, null) == null ?
_strategy._fileHandle.ThreadPoolBinding!.AllocateNativeOverlapped(preallocatedOverlapped!) : // allocated when buffer was created, and buffer is non-null
Member

Suggested change
- _strategy.CompareExchangeCurrentOverlappedOwner(this, null) == null ?
- _strategy._fileHandle.ThreadPoolBinding!.AllocateNativeOverlapped(preallocatedOverlapped!) : // allocated when buffer was created, and buffer is non-null
+ strategy.CompareExchangeCurrentOverlappedOwner(this, null) == null ?
+ strategy._fileHandle.ThreadPoolBinding!.AllocateNativeOverlapped(preallocatedOverlapped!) : // allocated when buffer was created, and buffer is non-null

// be directly the AwaitableProvider that's completing (in the case where the preallocated
// overlapped was already in use by another operation).
object? state = ThreadPoolBoundHandle.GetNativeOverlappedState(pOverlapped);
Debug.Assert(state is (AsyncWindowsFileStreamStrategy or FileStreamValueTaskSource));
Member

Suggested change
- Debug.Assert(state is (AsyncWindowsFileStreamStrategy or FileStreamValueTaskSource));
+ Debug.Assert(state is AsyncWindowsFileStreamStrategy or FileStreamValueTaskSource);

int errorCode = unchecked((int)(packedResult & uint.MaxValue));
if (errorCode == Interop.Errors.ERROR_OPERATION_ABORTED)
{
_source.SetException(ExceptionDispatchInfo.SetCurrentStackTrace(new OperationCanceledException(cancellationToken.IsCancellationRequested ? cancellationToken : new CancellationToken(canceled: true))));
Member

Suggested change
- _source.SetException(ExceptionDispatchInfo.SetCurrentStackTrace(new OperationCanceledException(cancellationToken.IsCancellationRequested ? cancellationToken : new CancellationToken(canceled: true))));
+ Exception e = new OperationCanceledException(cancellationToken.IsCancellationRequested ? cancellationToken : new CancellationToken(canceled: true));
+ e.SetCurrentStackTrace();
+ _source.SetException(e);

}

private void Cancel(CancellationToken token)
{
Member

Missing comment and asserts:

// WARNING: This may potentially be called under a lock (during cancellation registration)
Debug.Assert(state is FileStreamCompletionSource, "Unknown state passed to cancellation");
FileStreamCompletionSource completionSource = (FileStreamCompletionSource)state;
Debug.Assert(completionSource._overlapped != null && !completionSource.Task.IsCompleted, "IO should not have completed yet");

Member Author

The first assert is not needed because the method argument is not an object? state.

I'll add the assert that checks for overlapped and completed status.

Member

The state param could be added to this method, which would let you keep the first Debug.Assert.

long packedResult = Interlocked.CompareExchange(ref _result, FileStreamHelpers.RegisteringCancellation, FileStreamHelpers.NoResult);
if (packedResult == FileStreamHelpers.NoResult)
{
_cancellationRegistration = cancellationToken.UnsafeRegister(static (s, token) => ((FileStreamValueTaskSource)s!).Cancel(token), this);
Member

Could we pass the callback in a similar fashion to FileStreamCompletionSource?

_cancellationRegistration = cancellationToken.UnsafeRegister(cancelCallback, this);

Or why did you opt for this way instead?

Member Author

@carlossanlop carlossanlop Apr 8, 2021

It made more sense to pass this as the second argument of UnsafeRegister, which ensures it's a ValueTaskSource instance, and saves us from the check of the object? state parameter like in the old code. The old code also allocates an Action<object?> instance to save a reference to the Cancel method.

We have a similar example in System.Threading.Channels.
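
For readers following the thread, here is a minimal sketch of the registration pattern being discussed: a static lambda that receives the instance through the state argument of `CancellationToken.UnsafeRegister`. `PendingOperation` and its members are hypothetical names used only for illustration; they are not types from this PR.

```csharp
#nullable enable
using System;
using System.Threading;

// Hypothetical stand-in for the value task source; not a type from this PR.
sealed class PendingOperation : IDisposable
{
    private CancellationTokenRegistration _registration;

    public bool CancelRequested { get; private set; }

    public void RegisterForCancellation(CancellationToken cancellationToken)
    {
        // The lambda is static, so it cannot capture 'this'. The instance travels
        // through the state argument instead, and the compiler can cache the delegate.
        _registration = cancellationToken.UnsafeRegister(
            static (state, token) => ((PendingOperation)state!).Cancel(token),
            this);
    }

    // In the real type this is where the pending IO would be canceled.
    private void Cancel(CancellationToken token) => CancelRequested = true;

    public void Dispose() => _registration.Dispose();
}
```

Because the callback is static, no closure object is needed per registration, and no field is required to hold a cached delegate.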

Member

It made more sense to pass this as the second argument of UnsafeRegister

FileStreamCompletionSource was also passing this as 2nd argument.

The old code also allocates an Action<object?> instance to save a reference to the Cancel method.

Yes, it seems that it was caching the method group to avoid allocating every time it was passed to UnsafeRegister.
However I think you can avoid the allocation without caching yourself by doing this:

_cancellationRegistration = cancellationToken.UnsafeRegister((s, token) => Cancel(s, token), this);

Just a suggestion, not something that you should block on.

@davidfowl
Member

This is the change I was waiting for 👏🏾👏🏾

@davidfowl
Member

I haven't looked but I assume we'll reuse the operation objects if there's no concurrency? This is what sockets do today so in the 95% case of no overlapping writes or reads, there'll be a single reusable allocation on the FileStream

@stephentoub
Member

This is the change I was waiting for

It's not, actually. You want the next one where the IValueTaskSource instance is cached.
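
To illustrate what a cached, reusable source could look like, here is a minimal sketch built on `ManualResetValueTaskSourceCore<int>`. `ReusableReadSource`, `Start`, `Complete` and `Fail` are hypothetical names, and the sketch assumes a single operation in flight at a time; it is not the implementation from this PR or the follow-up.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical reusable source; assumes only one operation is in flight at a time.
sealed class ReusableReadSource : IValueTaskSource<int>
{
    // Mutable struct: must not be declared readonly.
    private ManualResetValueTaskSourceCore<int> _core;

    public ValueTask<int> Start()
    {
        // Reset bumps the version, so a ValueTask from a previous operation
        // can no longer observe this operation's result.
        _core.Reset();
        return new ValueTask<int>(this, _core.Version);
    }

    public void Complete(int bytesRead) => _core.SetResult(bytesRead);
    public void Fail(Exception error) => _core.SetException(error);

    public int GetResult(short token) => _core.GetResult(token);
    public ValueTaskSourceStatus GetStatus(short token) => _core.GetStatus(token);
    public void OnCompleted(Action<object?> continuation, object? state, short token, ValueTaskSourceOnCompletedFlags flags)
        => _core.OnCompleted(continuation, state, token, flags);
}
```

The same instance can back one operation after another, which is why a cached source avoids the per-call allocation as long as reads or writes do not overlap.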

}
// else: Some other thread stole the result, so now it is responsible to finish the callback
}
// else: Some other thread is registering a cancellation, so it *must* finish the callback
Member

All this logic looks more complicated than it needs to be. I realize you're just following the same flow as what was already there, so nothing needs to be changed in this PR. But we should revisit all of this synchronization later; I suspect we can do it more succinctly and efficiently.

Member Author

I opened an issue to follow up on this and on the caching of the IValueTaskSource.
#50972

Member

@adamsitnik adamsitnik left a comment

Overall looks good to me, but before we merge I would like to see the following benchmark results:

.NET 6 with new strategy enabled: before and after your change. (not .NET 5 vs 6 as there are too many moving targets)

You should be able to achieve that by running the following commands in the perf repo:

git clone https://github.com/dotnet/performance.git
cd performance\src\benchmarks\micro
dotnet run -c Release -f net6.0 --filter *Perf_FileStream* --coreRun $pathToCoreRunWithoutYourChanges $pathToCoreRunWithYourChanges --envVars "DOTNET_SYSTEM_IO_USENET5COMPATFILESTREAM:0"

/// <see cref="MemoryHandle"/> when the operation has completed. This should only be used
/// when memory doesn't wrap a byte[].
/// </summary>
private sealed class MemoryAwaitableProvider : FileStreamValueTaskSource
Member

This should only be used when memory doesn't wrap a byte[].

I just realized that we don't have benchmarks for this scenario as Memory used in our micro-benchmarks always wraps a byte array. Could you please add such benchmarks to https://github.com/dotnet/performance/blob/main/src/benchmarks/micro/libraries/System.IO.FileSystem/Perf.FileStream.cs ? One test case should be enough as this should be rare.

Member Author

@carlossanlop carlossanlop Apr 7, 2021

Some of the benchmarks do consume this type. If you look at some of the profiling tests I executed, you'll see that MemoryValueTaskSource shows up a few times. The profiling tests reuse the code we had in our benchmarks.

I don't mind adding an explicit one though. Do you want me to add one anyway?

Comment on lines 17 to 22
internal const long NoResult = 0;
internal const long ResultSuccess = (long)1 << 32;
internal const long ResultError = (long)2 << 32;
internal const long RegisteringCancellation = (long)4 << 32;
internal const long CompletedCallback = (long)8 << 32;
internal const ulong ResultMask = ((ulong)uint.MaxValue) << 32;
Member

or put them into a nested static class called ReturnCodes:

    internal static partial class FileStreamHelpers
    {
        internal static class ReturnCodes
        {
            internal const long NoResult = 0;
            internal const long ResultSuccess = (long)1 << 32;
            internal const long ResultError = (long)2 << 32;
            internal const long RegisteringCancellation = (long)4 << 32;
            internal const long CompletedCallback = (long)8 << 32;
            internal const ulong ResultMask = ((ulong)uint.MaxValue) << 32;
        }

but it's a nit (not a blocker)
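
As a rough illustration of how these constants are used, the sketch below packs a state flag into the high 32 bits of a `long` and a payload (an error code or a byte count) into the low 32 bits, then unpacks them again. The constant values match the ones quoted above; `PackedResultSketch`, `Pack`, `GetState` and `GetPayload` are hypothetical helper names, not members of `FileStreamHelpers`.

```csharp
using System;

// Hypothetical helpers; only the constant values come from the code under review.
static class PackedResultSketch
{
    public const long ResultSuccess = (long)1 << 32;
    public const long ResultError = (long)2 << 32;
    public const ulong ResultMask = ((ulong)uint.MaxValue) << 32;

    // High 32 bits carry the state flag, low 32 bits carry the payload.
    public static long Pack(long state, uint payload) => state | payload;

    // Keep only the high 32 bits.
    public static long GetState(long packed) => unchecked((long)((ulong)packed & ResultMask));

    // Keep only the low 32 bits, same expression as the errorCode extraction in the callback.
    public static int GetPayload(long packed) => unchecked((int)(packed & uint.MaxValue));
}
```

For example, `Pack(ResultError, 995)` yields a value whose state is `ResultError` and whose payload is 995, the Win32 code for ERROR_OPERATION_ABORTED.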

Comment on lines 62 to 63
// Using RunContinuationsAsynchronously for compat reasons (old API used Task.Factory.StartNew for continuations)
_source.RunContinuationsAsynchronously = true;
Member

Did you run the perf. benchmarks after doing this change and did you notice any impact?

Member Author

Pending.

Member Author

Confirmed - There is a slight improvement.

I made sure to compare apples to apples: I used my latest commit, and built it setting RunContinuationsAsynchronously to true, then to false. I compared it both times against our code in .NET 5.0.

The improvement is clear when setting it to false (or not setting it at all).

RunContinuationsAsynchronously = true

❯ dotnet run -c release --base "D:\fs50" --diff "D:\fs60_NEW_ON_IVTS_asynctrue" --threshold 0.0001%
summary:
better: 24, geomean: 3.402
worse: 19, geomean: 2.244
total diff: 43

Slower diff/base Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.CopyToFile(fileSize: 1048576, options: None) 5.49 882252.57 4840275.00
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: None) 5.12 1003671.88 5136350.00 bimodal
Perf_FileStream.WriteByte(fileSize: 1024, options: None) 4.44 257218.23 1141827.01
Perf_FileStream.WriteByte(fileSize: 1024, options: Asynchronous) 4.08 286859.38 1171600.96
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 4096, options: None) 3.25 1795459.38 5832233.33
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 512, options: None) 3.11 1844293.36 5741847.92 several?
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: Asynchronous) 3.11 1868294.53 5807825.00
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: None) 3.08 2209384.38 6796443.75
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: Asynchronous) 2.46 557062.05 1371566.15
Perf_FileStream.CopyToFile(fileSize: 1024, options: None) 2.33 478821.50 1116033.48
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: None) 2.28 543512.61 1239956.25 bimodal
Perf_FileStream.FlushAsync(fileSize: 1024, options: None) 1.61 3479778.75 5612725.00
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.44 8197915.63 11814575.00
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.33 168291900.00 224355000.00
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: None) 1.33 5313143.75 7063218.75
Perf_FileStream.Flush(fileSize: 1024, options: None) 1.25 3448111.25 4313832.81
Perf_FileStream.CopyToFile(fileSize: 104857600, options: None) 1.12 50055150.00 56309025.00
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: None) 1.12 56033900.00 62803900.00
Perf_FileStream.Write(fileSize: 104857600, userBufferSize: 4096, options: None) 1.12 133329050.00 148833000.00 bimodal
Faster base/diff Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.SeekBackward(fileSize: 1024, options: Asynchronous) 117.07 5489541.67 46889.31
Perf_FileStream.SeekForward(fileSize: 1024, options: Asynchronous) 65.93 2811281.25 42642.74
Perf_FileStream.SeekBackward(fileSize: 1024, options: None) 57.84 2571736.46 44462.45
Perf_FileStream.SeekForward(fileSize: 1024, options: None) 21.09 876923.26 41586.04
Perf_FileStream.OpenClose(fileSize: 1024, options: None) 5.00 152781.66 30548.62
Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) 4.76 153731.81 32314.04
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 4.03 2463834750.00 611183400.00
Perf_FileStream.ReadByte(fileSize: 1024, options: None) 3.93 165696.68 42208.34
Perf_FileStream.ReadByte(fileSize: 1024, options: Asynchronous) 3.52 200423.36 56865.52
Perf_FileStream.LockUnlock(fileSize: 1024, options: None) 2.62 230670.38 87990.73
Perf_FileStream.LockUnlock(fileSize: 1024, options: Asynchronous) 2.58 238920.22 92478.58
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: None) 2.34 3222202.50 1377939.84
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: Asynchronous) 1.87 156593600.00 83681962.50
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 1.66 19598723.08 11803641.67
Perf_FileStream.Flush(fileSize: 1024, options: Asynchronous) 1.63 22492768.75 13818900.00
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 1.57 7025697.92 4487121.88
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 1.47 686776300.00 466936500.00 bimodal
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 4096, options: None) 1.33 987460.55 743825.45
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 512, options: None) 1.24 975146.09 784156.88
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.19 5221529.17 4387085.94
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: None) 1.18 1301925.00 1103803.79
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.04 116486775.00 112407200.00
Perf_FileStream.FlushAsync(fileSize: 1024, options: Asynchronous) 1.03 21953590.91 21241122.73
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 4096, options: None) 1.01 88348175.00 87467100.00
RunContinuationsAsynchronously = false

dotnet run -c release --base "D:\fs50" --diff "D:\fs60_NEW_ON_IVTS_asyncfalse" --threshold 0.0001%
summary:
better: 23, geomean: 3.592
worse: 20, geomean: 2.135
total diff: 43

Slower diff/base Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.CopyToFile(fileSize: 1048576, options: None) 5.36 882252.57 4730875.00
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: None) 5.05 1003671.88 5071912.50 bimodal
Perf_FileStream.WriteByte(fileSize: 1024, options: None) 4.25 257218.23 1094164.38 several?
Perf_FileStream.WriteByte(fileSize: 1024, options: Asynchronous) 4.05 286859.38 1161160.27
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 4096, options: None) 3.23 1795459.38 5791481.25
Perf_FileStream.Write(fileSize: 1048576, userBufferSize: 512, options: None) 3.10 1844293.36 5709820.83 several?
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: None) 3.09 2209384.38 6836504.17
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: Asynchronous) 3.07 1868294.53 5733883.33
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: Asynchronous) 2.46 557062.05 1368034.90
Perf_FileStream.CopyToFile(fileSize: 1024, options: None) 2.31 478821.50 1103696.25
Perf_FileStream.CopyToFileAsync(fileSize: 1024, options: None) 2.24 543512.61 1218234.38 bimodal
Perf_FileStream.FlushAsync(fileSize: 1024, options: None) 1.60 3479778.75 5572682.29
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.42 8197915.63 11634850.00
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.33 168291900.00 223141800.00 bimodal
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: None) 1.31 5313143.75 6949831.25
Perf_FileStream.Flush(fileSize: 1024, options: None) 1.26 3448111.25 4332915.63
Perf_FileStream.Write(fileSize: 104857600, userBufferSize: 4096, options: None) 1.13 133329050.00 150223800.00 bimodal
Perf_FileStream.CopyToFile(fileSize: 104857600, options: None) 1.11 50055150.00 55699050.00
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: None) 1.10 56033900.00 61710550.00
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 4096, options: None) 1.00 88348175.00 88759625.00
Faster base/diff Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.SeekBackward(fileSize: 1024, options: Asynchronous) 116.97 5489541.67 46929.42
Perf_FileStream.SeekForward(fileSize: 1024, options: Asynchronous) 65.76 2811281.25 42753.29
Perf_FileStream.SeekBackward(fileSize: 1024, options: None) 57.56 2571736.46 44676.15
Perf_FileStream.SeekForward(fileSize: 1024, options: None) 21.52 876923.26 40756.47
Perf_FileStream.OpenClose(fileSize: 1024, options: None) 5.06 152781.66 30216.63
Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) 4.81 153731.81 31975.57
Perf_FileStream.WriteAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 4.05 2463834750.00 608864800.00
Perf_FileStream.ReadByte(fileSize: 1024, options: None) 3.99 165696.68 41526.73
Perf_FileStream.ReadByte(fileSize: 1024, options: Asynchronous) 3.54 200423.36 56610.08
Perf_FileStream.LockUnlock(fileSize: 1024, options: None) 2.62 230670.38 87877.10
Perf_FileStream.LockUnlock(fileSize: 1024, options: Asynchronous) 2.59 238920.22 92291.75
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: None) 2.40 3222202.50 1345116.25
Perf_FileStream.CopyToFileAsync(fileSize: 104857600, options: Asynchronous) 1.86 156593600.00 83969875.00
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 1.64 19598723.08 11925173.68
Perf_FileStream.Flush(fileSize: 1024, options: Asynchronous) 1.62 22492768.75 13859764.71
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: Asynchronous) 1.56 7025697.92 4493881.25
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: Asynchronous) 1.46 686776300.00 470158300.00 bimodal
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 4096, options: None) 1.32 987460.55 750294.64
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 512, options: None) 1.24 975146.09 784056.56
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 4096, options: None) 1.17 1301925.00 1108880.36
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.17 5221529.17 4470673.44
Perf_FileStream.FlushAsync(fileSize: 1024, options: Asynchronous) 1.07 21953590.91 20580725.00
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.01 116486775.00

Member Author

Here is the result comparison if we compare the benchmark results of setting it to true vs false:

base: RunContinuationsAsynchronously=true, diff: RunContinuationsAsynchronously=false

dotnet run -c release --base "D:\fs60_NEW_ON_IVTS_asynctrue" --diff "D:\fs60_NEW_ON_IVTS_asyncfalse" --threshold 0.0001%
summary:
better: 14, geomean: 1.018
worse: 11, geomean: 1.017
total diff: 25

Slower diff/base Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.WriteAsync_NoBuffering(fileSize: 104857600, userBufferSize: 16384, options: None) 1.05 67143550.00 70259187.50
Perf_FileStream.ReadAsync(fileSize: 104857600, userBufferSize: 4096, options: None) 1.02 112407200.00 115107550.00
Perf_FileStream.Read_NoBuffering(fileSize: 1048576, userBufferSize: 16384, options: None) 1.02 240016.67 244921.93
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.02 4387085.94 4470673.44
Perf_FileStream.Read(fileSize: 104857600, userBufferSize: 4096, options: None) 1.01 87467100.00 88759625.00
Perf_FileStream.WriteAsync_NoBuffering(fileSize: 1048576, userBufferSize: 16384, options: None) 1.01 21684354.55 21982809.09
Perf_FileStream.ReadAsync_NoBuffering(fileSize: 1048576, userBufferSize: 16384, options: Asynchronous) 1.01 1236709.90 1253242.19
Perf_FileStream.Read(fileSize: 1024, userBufferSize: 1024, options: None) 1.01 39289.54 39730.73
Perf_FileStream.Write_NoBuffering(fileSize: 1048576, userBufferSize: 16384, options: None) 1.01 21445427.27 21646650.00
Perf_FileStream.Read(fileSize: 1048576, userBufferSize: 4096, options: None) 1.01 743825.45 750294.64
Perf_FileStream.SeekBackward(fileSize: 1024, options: None) 1.00 44462.45 44676.15
Faster base/diff Base Median (ns) Diff Median (ns) Modality
Perf_FileStream.WriteByte(fileSize: 1024, options: None) 1.04 1141827.01 1094164.38 several?
Perf_FileStream.ReadAsync_NoBuffering(fileSize: 104857600, userBufferSize: 16384, options: None) 1.03 45688612.50 44474575.00
Perf_FileStream.ReadAsync(fileSize: 1048576, userBufferSize: 512, options: None) 1.02 1377939.84 1345116.25
Perf_FileStream.CopyToFile(fileSize: 1048576, options: None) 1.02 4840275.00 4730875.00
Perf_FileStream.SeekForward(fileSize: 1024, options: None) 1.02 41586.04 40756.47
Perf_FileStream.ReadAsync_NoBuffering(fileSize: 104857600, userBufferSize: 16384, options: Asynchronous) 1.02 130849050.00 128463775.00
Perf_FileStream.ReadByte(fileSize: 1024, options: None) 1.02 42208.34 41526.73
Perf_FileStream.WriteAsync(fileSize: 1048576, userBufferSize: 512, options: Asynchronous) 1.02 11814575.00 11634850.00
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: Asynchronous) 1.01 5807825.00 5733883.33
Perf_FileStream.CopyToFileAsync(fileSize: 1048576, options: None) 1.01 5136350.00 5071912.50
Perf_FileStream.CopyToFile(fileSize: 104857600, options: None) 1.01 56309025.00 55699050.00
Perf_FileStream.Read_NoBuffering(fileSize: 104857600, userBufferSize: 16384, options: None) 1.01 33747814.29 33383457.14
Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) 1.01 32314.04 31975.57
Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 1024, options: Asynchronous) 1.00 81988.09 81641.78

Comment on lines 120 to 122
return vt.IsCompleted?
vt.Result :
vt.AsTask().GetAwaiter().GetResult();
Member

Member Author

I tried, but a ValueTask does not have a Result property, like a ValueTask<T> does, because it returns void. So I'm not sure if it's possible to apply an equivalent pattern.

Member

public override void Write(byte[] buffer, int offset, int count)
{
    ValueTask vt = WriteAsyncInternal(new ReadOnlyMemory<byte>(buffer, offset, count), CancellationToken.None);
    if (vt.IsCompleted)
    {
        vt.GetAwaiter().GetResult();
    }
    else
    {
        vt.AsTask().GetAwaiter().GetResult();
    }
}

That said, it's not much of an issue for this case. The read side is an issue because AsTask() will almost certainly need to allocate a new Task<int> in order to hand back the result, but for the write side, AsTask() on an already completed ValueTask will just return Task.CompletedTask, assuming the operation completed successfully. So the only time the existing code for the write side will allocate is when the operation was canceled or faulted, in which case we're about to throw an exception anyway, and the extra task allocation won't really matter. Up to you whether you want to change it.
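
A small sketch of the non-allocating case described above, using the simplest successfully completed ValueTask (a default one) rather than one backed by the file stream's source. `CompletedValueTaskSketch` is a hypothetical name used only for illustration.

```csharp
using System;
using System.Threading.Tasks;

static class CompletedValueTaskSketch
{
    // Returns true when AsTask() hands back the shared cached task
    // instead of allocating a new Task.
    public static bool CompletedWriteReusesCachedTask()
    {
        ValueTask completed = default; // a default ValueTask is already successfully completed
        return ReferenceEquals(completed.AsTask(), Task.CompletedTask);
    }
}
```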

Member

@jozkee jozkee left a comment

LGTM; thanks Carlos!

@adamsitnik
Member

I compared it both times against our code in .NET 5.0.

A comparison against 5.0 measures every change that was merged to main since we snapped 5.0, plus your change. What we care about is how your change (and nothing else) affects perf, so we need to isolate the runs to only your change.

To be able to do that you need to run the benchmarks using two CoreRuns:

  • as a baseline the corerun that was built from main when you have created a new branch, without any of your changes
  • your branch with your changes

My personal workflow:

  1. When I create a new branch, I perform a full build in Release:
build -c Release -subset clr+libs
  2. I copy the folder that contains CoreRun.exe (runtime\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\6.0.0) to a separate folder. Typically I call it "before".
  3. I apply some changes and perform an incremental Release build. In case of System.Private.CoreLib.dll it's:
build -c Release -subset clr.corelib+clr.nativecorelib+libs.PreTest
  4. I run the benchmarks using the old and new CoreRun.exe:
dotnet run -c Release -f net6.0 --filter *ABC* --corerun C:\Projects\runtime\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\before\CoreRun.exe C:\Projects\runtime\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\6.0.0\corerun.exe

@carlossanlop
Member Author

carlossanlop commented Apr 13, 2021

  • as a baseline the corerun that was built from main when you have created a new branch, without any of your changes
  • your branch with your changes

@adamsitnik @jozkee here's the table comparing the before (original) and after of this PR. Most results had allocation improvements, but worse speed when FileOptions = Asynchronous:

Results

Command:

 dotnet run
    -c Release
    -f net6.0
    --corerun
        D:\runtime2\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\6.0.0\corerun.exe
        D:\runtime\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\6.0.0\corerun.exe
    --filter System.IO.Tests.Perf_FileStream.*Async*
    --statisticalTest 3ms
    --minIterationCount 100
    --maxIterationCount 101
    --artifacts "D:\finalresults"

Note: The runtime2 corerun contains the baseline commit before my changes, the runtime corerun contains this PR's changes.

Method Toolchain fileSize userBufferSize options Mean Error StdDev Median Min Max Ratio Allocated
ReadAsync BEFORE 1024 1024 None 78.69 μs 0.255 μs 0.736 μs 78.53 μs 77.61 μs 80.66 μs 1.00 5 KB
ReadAsync AFTER 1024 1024 None 79.10 μs 0.237 μs 0.673 μs 79.08 μs 76.96 μs 80.71 μs 1.01 5 KB
WriteAsync BEFORE 1024 1024 None 316.97 μs 2.512 μs 6.919 μs 315.92 μs 305.11 μs 339.35 μs 1.00 4 KB
WriteAsync AFTER 1024 1024 None 300.13 μs 5.643 μs 15.730 μs 297.59 μs 272.46 μs 348.15 μs 0.95 4 KB
ReadAsync BEFORE 1024 1024 Asynchronous 94.13 μs 0.259 μs 0.731 μs 94.02 μs 92.27 μs 96.16 μs 1.00 5 KB
ReadAsync AFTER 1024 1024 Asynchronous 85.48 μs 0.842 μs 2.291 μs 85.11 μs 82.58 μs 92.09 μs 0.91 5 KB
WriteAsync BEFORE 1024 1024 Asynchronous 292.63 μs 5.659 μs 15.869 μs 290.59 μs 264.25 μs 338.98 μs 1.00 5 KB
WriteAsync AFTER 1024 1024 Asynchronous 324.35 μs 13.906 μs 39.222 μs 311.27 μs 274.34 μs 428.64 μs 1.11 5 KB
FlushAsync BEFORE 1024 ? None 4,606.81 μs 46.736 μs 133.341 μs 4,633.92 μs 4,374.04 μs 4,952.63 μs 1.00 269 KB
FlushAsync AFTER 1024 ? None 4,587.88 μs 58.046 μs 168.403 μs 4,638.98 μs 4,322.79 μs 5,034.48 μs 1.00 269 KB
CopyToFileAsync BEFORE 1024 ? None 474.57 μs 5.497 μs 15.141 μs 474.66 μs 440.31 μs 520.31 μs 1.00 5 KB
CopyToFileAsync AFTER 1024 ? None 488.94 μs 3.631 μs 10.061 μs 488.62 μs 469.93 μs 518.76 μs 1.03 5 KB
FlushAsync BEFORE 1024 ? Asynchronous 14,680.35 μs 150.967 μs 437.982 μs 14,637.76 μs 13,623.50 μs 15,645.01 μs 1.00 301 KB
FlushAsync AFTER 1024 ? Asynchronous 20,333.22 μs 116.277 μs 339.185 μs 20,299.93 μs 19,724.25 μs 21,148.73 μs 1.39 261 KB
CopyToFileAsync BEFORE 1024 ? Asynchronous 520.97 μs 5.982 μs 16.674 μs 519.35 μs 490.05 μs 572.29 μs 1.00 6 KB
CopyToFileAsync AFTER 1024 ? Asynchronous 523.30 μs 6.294 μs 17.650 μs 520.91 μs 479.14 μs 583.85 μs 1.00 6 KB
ReadAsync BEFORE 1048576 512 None 1,375.28 μs 5.930 μs 17.015 μs 1,376.60 μs 1,340.44 μs 1,413.73 μs 1.00 85 KB
ReadAsync AFTER 1048576 512 None 1,347.16 μs 5.128 μs 14.958 μs 1,345.14 μs 1,323.57 μs 1,382.14 μs 0.98 85 KB
WriteAsync BEFORE 1048576 512 None 3,530.03 μs 38.149 μs 108.222 μs 3,521.78 μs 3,299.91 μs 3,823.63 μs 1.00 76 KB
WriteAsync AFTER 1048576 512 None 3,473.85 μs 47.943 μs 134.438 μs 3,469.00 μs 3,222.90 μs 3,832.17 μs 0.98 76 KB
ReadAsync BEFORE 1048576 512 Asynchronous 3,278.98 μs 30.616 μs 87.844 μs 3,257.98 μs 3,151.46 μs 3,507.69 μs 1.00 93 KB
ReadAsync AFTER 1048576 512 Asynchronous 4,399.34 μs 16.153 μs 47.119 μs 4,401.64 μs 4,294.71 μs 4,531.75 μs 1.34 83 KB
WriteAsync BEFORE 1048576 512 Asynchronous 5,304.10 μs 73.935 μs 214.500 μs 5,309.27 μs 4,771.82 μs 5,852.43 μs 1.00 85 KB
WriteAsync AFTER 1048576 512 Asynchronous 6,506.15 μs 94.189 μs 271.756 μs 6,505.61 μs 6,042.59 μs 7,132.02 μs 1.23 75 KB
ReadAsync BEFORE 1048576 4096 None 1,090.98 μs 6.156 μs 18.152 μs 1,096.14 μs 1,039.20 μs 1,120.19 μs 1.00 29 KB
ReadAsync AFTER 1048576 4096 None 1,084.03 μs 8.601 μs 25.225 μs 1,086.13 μs 1,042.07 μs 1,131.85 μs 0.99 29 KB
WriteAsync BEFORE 1048576 4096 None 3,247.60 μs 43.259 μs 124.117 μs 3,243.69 μs 2,954.42 μs 3,556.50 μs 1.00 55 KB
WriteAsync AFTER 1048576 4096 None 3,250.35 μs 36.664 μs 100.985 μs 3,234.77 μs 3,039.74 μs 3,505.95 μs 1.00 55 KB
ReadAsync BEFORE 1048576 4096 Asynchronous 3,273.61 μs 17.591 μs 50.753 μs 3,264.72 μs 3,190.18 μs 3,410.97 μs 1.00 79 KB
ReadAsync AFTER 1048576 4096 Asynchronous 4,547.38 μs 73.438 μs 216.535 μs 4,398.34 μs 4,327.10 μs 5,035.98 μs 1.39 69 KB
WriteAsync BEFORE 1048576 4096 Asynchronous 5,384.46 μs 83.792 μs 240.415 μs 5,346.69 μs 4,897.35 μs 5,955.92 μs 1.00 84 KB
WriteAsync AFTER 1048576 4096 Asynchronous 6,845.63 μs 88.794 μs 256.192 μs 6,883.75 μs 6,079.92 μs 7,286.73 μs 1.27 74 KB
ReadAsync_NoBuffering BEFORE 1048576 16384 None 406.34 μs 2.413 μs 7.113 μs 403.28 μs 387.16 μs 419.08 μs 1.00 7 KB
ReadAsync_NoBuffering AFTER 1048576 16384 None 380.35 μs 3.413 μs 10.063 μs 378.36 μs 363.73 μs 402.82 μs 0.94 7 KB
WriteAsync_NoBuffering BEFORE 1048576 16384 None 2,545.17 μs 30.174 μs 83.613 μs 2,526.42 μs 2,397.67 μs 2,776.47 μs 1.00 7 KB
WriteAsync_NoBuffering AFTER 1048576 16384 None 2,537.77 μs 27.489 μs 76.628 μs 2,528.02 μs 2,421.65 μs 2,750.68 μs 1.00 7 KB
ReadAsync_NoBuffering BEFORE 1048576 16384 Asynchronous 949.54 μs 6.336 μs 18.077 μs 943.69 μs 926.29 μs 1,002.20 μs 1.00 20 KB
ReadAsync_NoBuffering AFTER 1048576 16384 Asynchronous 1,195.21 μs 3.070 μs 8.507 μs 1,194.94 μs 1,179.45 μs 1,218.92 μs 1.26 17 KB
WriteAsync_NoBuffering BEFORE 1048576 16384 Asynchronous 2,965.91 μs 40.616 μs 114.558 μs 2,952.10 μs 2,677.93 μs 3,244.55 μs 1.00 20 KB
WriteAsync_NoBuffering AFTER 1048576 16384 Asynchronous 3,021.90 μs 40.697 μs 116.110 μs 3,000.74 μs 2,803.04 μs 3,343.84 μs 1.02 17 KB
CopyToFileAsync BEFORE 1048576 ? None 2,440.43 μs 17.056 μs 49.754 μs 2,437.81 μs 2,343.18 μs 2,576.37 μs 1.00 3 KB
CopyToFileAsync AFTER 1048576 ? None 2,421.23 μs 15.376 μs 43.869 μs 2,413.43 μs 2,350.26 μs 2,521.45 μs 0.99 3 KB
CopyToFileAsync BEFORE 1048576 ? Asynchronous 2,878.14 μs 16.682 μs 47.863 μs 2,876.69 μs 2,771.42 μs 3,001.49 μs 1.00 4 KB
CopyToFileAsync AFTER 1048576 ? Asynchronous 2,769.09 μs 15.649 μs 43.882 μs 2,760.85 μs 2,691.56 μs 2,885.67 μs 0.96 4 KB
ReadAsync BEFORE 104857600 4096 None 115,657.20 μs 352.576 μs 1,039.577 μs 115,639.45 μs 113,670.15 μs 117,832.15 μs 1.00 2,801 KB
ReadAsync AFTER 104857600 4096 None 116,612.88 μs 249.577 μs 731.966 μs 116,751.05 μs 114,615.00 μs 117,903.05 μs 1.01 2,801 KB
WriteAsync BEFORE 104857600 4096 None 205,378.59 μs 7,785.258 μs 22,462.248 μs 194,584.50 μs 184,125.30 μs 268,874.60 μs 1.00 5,005 KB
WriteAsync AFTER 104857600 4096 None 209,453.56 μs 10,355.532 μs 30,043.276 μs 194,064.20 μs 173,778.40 μs 293,593.30 μs 1.03 5,005 KB
ReadAsync BEFORE 104857600 4096 Asynchronous 327,225.97 μs 1,532.011 μs 4,346.059 μs 326,564.00 μs 320,187.10 μs 339,508.20 μs 1.00 7,801 KB
ReadAsync AFTER 104857600 4096 Asynchronous 457,654.89 μs 902.286 μs 2,559.634 μs 457,207.00 μs 451,652.60 μs 465,093.00 μs 1.40 6,801 KB
WriteAsync BEFORE 104857600 4096 Asynchronous 454,994.00 μs 7,661.181 μs 21,857.780 μs 449,253.50 μs 421,512.80 μs 511,199.10 μs 1.00 7,905 KB
WriteAsync AFTER 104857600 4096 Asynchronous 600,076.62 μs 10,088.501 μs 29,268.570 μs 592,035.20 μs 559,265.50 μs 677,403.80 μs 1.32 6,905 KB
ReadAsync_NoBuffering BEFORE 104857600 16384 None 45,086.23 μs 156.169 μs 453.074 μs 45,005.65 μs 43,497.93 μs 46,041.93 μs 1.00 701 KB
ReadAsync_NoBuffering AFTER 104857600 16384 None 42,133.53 μs 174.922 μs 457.740 μs 42,093.18 μs 41,187.75 μs 43,731.75 μs 0.93 701 KB
WriteAsync_NoBuffering BEFORE 104857600 16384 None 61,814.53 μs 685.277 μs 1,887.453 μs 61,419.69 μs 59,079.38 μs 68,035.43 μs 1.00 701 KB
WriteAsync_NoBuffering AFTER 104857600 16384 None 61,435.05 μs 720.962 μs 1,973.622 μs 61,127.18 μs 58,506.18 μs 67,827.25 μs 0.99 701 KB
ReadAsync_NoBuffering BEFORE 104857600 16384 Asynchronous 97,220.19 μs 667.147 μs 1,881.701 μs 96,464.10 μs 94,386.65 μs 102,972.80 μs 1.00 1,951 KB
ReadAsync_NoBuffering AFTER 104857600 16384 Asynchronous 128,169.40 μs 556.512 μs 1,587.760 μs 127,771.80 μs 125,455.30 μs 133,061.15 μs 1.32 1,701 KB
WriteAsync_NoBuffering BEFORE 104857600 16384 Asynchronous 136,593.81 μs 1,249.681 μs 3,462.860 μs 136,822.00 μs 127,534.30 μs 146,281.60 μs 1.00 1,951 KB
WriteAsync_NoBuffering AFTER 104857600 16384 Asynchronous 169,656.11 μs 2,245.877 μs 6,515.696 μs 168,284.30 μs 157,404.20 μs 189,342.10 μs 1.25 1,701 KB
CopyToFileAsync BEFORE 104857600 ? None 56,537.38 μs 245.101 μs 695.312 μs 56,440.93 μs 55,567.65 μs 58,581.80 μs 1.00 177 KB
CopyToFileAsync AFTER 104857600 ? None 56,184.32 μs 275.348 μs 785.585 μs 56,116.43 μs 54,790.00 μs 58,574.72 μs 0.99 177 KB
CopyToFileAsync BEFORE 104857600 ? Asynchronous 87,668.39 μs 312.004 μs 905.181 μs 87,555.38 μs 85,324.65 μs 89,838.85 μs 1.00 246 KB
CopyToFileAsync AFTER 104857600 ? Asynchronous 78,490.65 μs 501.805 μs 1,431.678 μs 78,350.84 μs 75,379.32 μs 82,799.10 μs 0.90 214 KB

@adamsitnik

> Most results had allocation improvements, but worse speed when FileOptions = Asynchronous:

As discussed offline, such a regression needs to be solved before merging the PR.

I've even built your fork and run the benchmarks myself to be 100% sure, but unfortunately, I've confirmed the regression:

| Method | Toolchain | fileSize | userBufferSize | options | Mean | Ratio | Allocated |
|---|---|---|---|---|---|---|---|
| ReadAsync_NoBuffering | \after\corerun.exe | 1048576 | 16384 | Asynchronous | 838.1 us | 1.12 | 17 KB |
| ReadAsync_NoBuffering | \before\corerun.exe | 1048576 | 16384 | Asynchronous | 748.1 us | 1.00 | 20 KB |
| ReadAsync_NoBuffering | \after\corerun.exe | 104857600 | 16384 | Asynchronous | 93,214.6 us | 1.12 | 1,700 KB |
| ReadAsync_NoBuffering | \before\corerun.exe | 104857600 | 16384 | Asynchronous | 83,445.4 us | 1.00 | 1,950 KB |

If I can suggest something, ReadAsync_NoBuffering with fileSize=1048576 and options=Asynchronous should be the best benchmark for creating a repro app and profiling it. It's the simplest benchmark (buffering is not involved) and ReadAsync has a smaller deviation than WriteAsync. If you fail to find the regression with VS Profiler, you can use PerfView.
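A repro app along those lines could look like the sketch below. It is not the benchmark's actual code: the class name `ReadAsyncRepro`, the temp-file handling, and the iteration count are made up for illustration, and passing `bufferSize: 1` to turn off FileStream's internal buffering is an assumption about how the NoBuffering variants are configured.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class ReadAsyncRepro
{
    // Reads the whole file with a caller-supplied buffer and returns the number of bytes read.
    public static async Task<long> ReadAllAsync(string path, int userBufferSize, FileOptions options)
    {
        // Assumption: bufferSize: 1 disables FileStream's own buffering ("NoBuffering").
        using var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize: 1, options);
        byte[] buffer = new byte[userBufferSize];
        long total = 0;
        int read;
        while ((read = await fs.ReadAsync(new Memory<byte>(buffer))) > 0)
        {
            total += read;
        }
        return total;
    }

    public static async Task Main()
    {
        // Hypothetical setup: a 1 MB temp file, matching fileSize=1048576.
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        File.WriteAllBytes(path, new byte[1048576]);
        try
        {
            // Arbitrary iteration count, only there to give the profiler enough samples.
            for (int i = 0; i < 1000; i++)
            {
                await ReadAllAsync(path, userBufferSize: 16384, FileOptions.Asynchronous);
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Running the same binary against the "before" and "after" runtimes under a profiler should make the extra time in the completion path visible, if the regression reproduces outside the benchmark harness.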

@adamsitnik adamsitnik added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Apr 14, 2021
@carlossanlop

I compared the CPU usage before and after, and discovered we spend a considerable amount of time inside Overlapped.PerformIOCompletionCallback.

The new code (left) consumes 62% of the time calling ExecutionContext.RunInternal, while the old code (right) only took 3.82%.

[Screenshot]

The root cause seems to be this line, where we call _source.SetResult:

[Screenshot]

That call adds up the time it takes to run the rest of the ReadAsync calls in the loop, including the disposal of the FileStream object in the main method.

There's a comment on ExecutionContext.RunInternal (part of the call stack) warning that the code is a very hot path:

[Screenshot]

I wasn't expecting all the continuation code to be counted as part of the asynchronous calls. I wonder if this has anything to do with the removal of the RunContinuationsAsynchronously = true flag. I'll bring that flag back and compare the results.

cc @jozkee @adamsitnik @stephentoub

@davidfowl

We don't need to restore the ExecutionContext in the overlapped; that restore should be avoided in these APIs. The async state machine already does it. This is what you need: #42549 😄

@stephentoub

stephentoub commented Apr 14, 2021

> The new code (left) consumes 62% of the time calling ExecutionContext.RunInternal, while the old code (right) only took 3.82%.

That's inclusive (what "Total" means in the VS view). The callback is actually running the continuation now (because of RunContinuationsAsynchronously=false), rather than queueing it to be run on a different thread, so the 62% is inclusive of the actual app code.
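The difference between the two settings can be observed with a small sketch built on `ManualResetValueTaskSourceCore<int>`. The `DemoSource` and `ContinuationDemo` types are hypothetical and are not the PR's `FileStreamValueTaskSource`; a dedicated thread stands in for the I/O completion callback.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical minimal IValueTaskSource<int>, used only for this demonstration.
public sealed class DemoSource : IValueTaskSource<int>
{
    private ManualResetValueTaskSourceCore<int> _core; // mutable struct, so it must not be readonly

    public DemoSource(bool runContinuationsAsynchronously)
    {
        _core.RunContinuationsAsynchronously = runContinuationsAsynchronously;
    }

    public ValueTask<int> AsValueTask() => new ValueTask<int>(this, _core.Version);

    public void Complete(int result) => _core.SetResult(result);

    public int GetResult(short token) => _core.GetResult(token);

    public ValueTaskSourceStatus GetStatus(short token) => _core.GetStatus(token);

    public void OnCompleted(Action<object> continuation, object state, short token, ValueTaskSourceOnCompletedFlags flags)
        => _core.OnCompleted(continuation, state, token, flags);
}

public static class ContinuationDemo
{
    // Returns true when the awaiting code resumed on the thread that called Complete.
    public static bool ContinuationRanOnCompletingThread(bool runContinuationsAsynchronously)
    {
        var source = new DemoSource(runContinuationsAsynchronously);
        int continuationThread = -1;
        int completingThread = -2;

        async Task ConsumeAsync()
        {
            await source.AsValueTask().ConfigureAwait(false);
            continuationThread = Environment.CurrentManagedThreadId;
        }

        // The operation is still pending here, so this only registers the continuation.
        Task consumer = ConsumeAsync();

        // A dedicated (non-pool) thread plays the role of the I/O completion callback.
        var completer = new Thread(() =>
        {
            completingThread = Environment.CurrentManagedThreadId;
            source.Complete(42);
        });
        completer.Start();
        completer.Join();
        consumer.Wait();

        return continuationThread == completingThread;
    }
}
```

With `false`, the code after the `await` runs inside `Complete`, so a profiler attributes the application's continuation to the completing callback's inclusive time. With `true`, the continuation is queued to the thread pool and runs on a different thread.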

{
// Successfully got the callback, finish the callback
valueTaskSource.CompleteCallback(packedResult);
}

I know you're copying this logic from what was there before, but it seems to be unnecessarily complex, with lots of interlocked operations for transitioning from one state to another to another. If we're looking for places to reduce overheads, revisiting this whole implementation is likely a good place to start.

@adamsitnik

> The callback is actually running the continuation now (because of RunContinuationsAsynchronously=false)

I've run all the async benchmarks for 3 configurations:

  • "before" (@carlossanlop fork main branch)
  • "false" (the current state of this PR as RunContinuationsAsynchronously is not set so it's false)
  • "true" (_source.RunContinuationsAsynchronously = true; in ctor)

With the following settings:

  • 100 instead of 20 iterations (--minIterationCount 100 --maxIterationCount 101)
  • no outliers removal (--outliers DontRemove)

and confirmed that with _source.RunContinuationsAsynchronously = true; we have no regression in the microbenchmarks

@stephentoub should we:

  • set RunContinuationsAsynchronously = true; and include this PR in preview4
  • get some "real world" numbers from ASP.NET benchmarks (if there are none, write some and contribute to https://github.com/aspnet/benchmarks) and see how RunContinuationsAsynchronously setting affects non-micro benchmarks. This means preview5 for this PR.

| Method | Toolchain | fileSize | userBufferSize | options | Mean | StdDev | Median | Min | Max | Ratio | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ReadAsync | \before\corerun.exe | 1024 | 1024 | Asynchronous | 82.69 us | 1.946 us | 82.22 us | 81.60 us | 94.90 us | 1.00 | 5 KB |
| ReadAsync | \false\corerun.exe | 1024 | 1024 | Asynchronous | 78.76 us | 0.718 us | 78.56 us | 78.04 us | 82.13 us | 0.95 | 5 KB |
| ReadAsync | \true\corerun.exe | 1024 | 1024 | Asynchronous | 83.28 us | 0.825 us | 83.12 us | 82.48 us | 89.82 us | 1.01 | 5 KB |
| WriteAsync | \before\corerun.exe | 1024 | 1024 | Asynchronous | 404.05 us | 22.574 us | 400.81 us | 379.20 us | 494.53 us | 1.00 | 5 KB |
| WriteAsync | \false\corerun.exe | 1024 | 1024 | Asynchronous | 400.76 us | 17.211 us | 397.80 us | 380.07 us | 472.21 us | 0.99 | 5 KB |
| WriteAsync | \true\corerun.exe | 1024 | 1024 | Asynchronous | 405.20 us | 25.469 us | 401.47 us | 383.59 us | 510.74 us | 1.01 | 5 KB |
| ReadAsync | \before\corerun.exe | 1048576 | 512 | Asynchronous | 2,418.83 us | 55.684 us | 2,413.59 us | 2,361.95 us | 2,804.46 us | 1.00 | 93 KB |
| ReadAsync | \false\corerun.exe | 1048576 | 512 | Asynchronous | 2,648.89 us | 73.366 us | 2,626.45 us | 2,550.29 us | 2,959.92 us | 1.10 | 83 KB |
| ReadAsync | \true\corerun.exe | 1048576 | 512 | Asynchronous | 2,386.25 us | 9.985 us | 2,386.16 us | 2,356.79 us | 2,419.58 us | 0.99 | 83 KB |
| WriteAsync | \before\corerun.exe | 1048576 | 512 | Asynchronous | 4,254.70 us | 277.778 us | 4,213.63 us | 3,967.55 us | 6,203.71 us | 1.00 | 85 KB |
| WriteAsync | \false\corerun.exe | 1048576 | 512 | Asynchronous | 4,583.08 us | 183.949 us | 4,559.42 us | 4,317.29 us | 5,334.19 us | 1.08 | 75 KB |
| WriteAsync | \true\corerun.exe | 1048576 | 512 | Asynchronous | 4,207.77 us | 179.340 us | 4,173.53 us | 3,896.65 us | 4,865.02 us | 0.99 | 75 KB |
| ReadAsync | \before\corerun.exe | 1048576 | 4096 | Asynchronous | 2,343.06 us | 35.339 us | 2,355.62 us | 2,225.73 us | 2,379.72 us | 1.00 | 79 KB |
| ReadAsync | \false\corerun.exe | 1048576 | 4096 | Asynchronous | 2,603.91 us | 40.916 us | 2,603.11 us | 2,513.55 us | 2,715.90 us | 1.11 | 69 KB |
| ReadAsync | \true\corerun.exe | 1048576 | 4096 | Asynchronous | 2,324.68 us | 38.225 us | 2,337.29 us | 2,226.14 us | 2,374.04 us | 0.99 | 69 KB |
| WriteAsync | \before\corerun.exe | 1048576 | 4096 | Asynchronous | 4,110.48 us | 265.902 us | 4,050.53 us | 3,844.25 us | 6,007.74 us | 1.00 | 84 KB |
| WriteAsync | \false\corerun.exe | 1048576 | 4096 | Asynchronous | 4,572.16 us | 320.658 us | 4,542.09 us | 4,256.30 us | 7,440.25 us | 1.12 | 74 KB |
| WriteAsync | \true\corerun.exe | 1048576 | 4096 | Asynchronous | 4,048.01 us | 184.553 us | 4,015.01 us | 3,739.28 us | 4,754.61 us | 0.99 | 74 KB |
| ReadAsync_NoBuffering | \before\corerun.exe | 1048576 | 16384 | Asynchronous | 747.53 us | 5.076 us | 747.87 us | 729.80 us | 755.69 us | 1.00 | 20 KB |
| ReadAsync_NoBuffering | \false\corerun.exe | 1048576 | 16384 | Asynchronous | 822.87 us | 6.877 us | 822.90 us | 809.94 us | 838.18 us | 1.10 | 17 KB |
| ReadAsync_NoBuffering | \true\corerun.exe | 1048576 | 16384 | Asynchronous | 726.82 us | 11.262 us | 726.44 us | 699.96 us | 754.36 us | 0.97 | 17 KB |
| WriteAsync_NoBuffering | \before\corerun.exe | 1048576 | 16384 | Asynchronous | 2,726.54 us | 136.755 us | 2,712.48 us | 2,534.40 us | 3,234.12 us | 1.00 | 20 KB |
| WriteAsync_NoBuffering | \false\corerun.exe | 1048576 | 16384 | Asynchronous | 2,761.05 us | 146.361 us | 2,792.11 us | 2,507.70 us | 3,188.16 us | 1.01 | 17 KB |
| WriteAsync_NoBuffering | \true\corerun.exe | 1048576 | 16384 | Asynchronous | 2,736.46 us | 194.495 us | 2,705.98 us | 2,415.11 us | 4,022.13 us | 1.01 | 17 KB |
| ReadAsync | \before\corerun.exe | 104857600 | 4096 | Asynchronous | 250,490.85 us | 2,604.707 us | 250,256.45 us | 244,814.30 us | 256,607.90 us | 1.00 | 7,801 KB |
| ReadAsync | \false\corerun.exe | 104857600 | 4096 | Asynchronous | 278,260.89 us | 4,931.050 us | 278,009.45 us | 269,882.60 us | 297,338.40 us | 1.11 | 6,803 KB |
| ReadAsync | \true\corerun.exe | 104857600 | 4096 | Asynchronous | 252,682.05 us | 19,763.442 us | 250,702.50 us | 243,911.20 us | 447,534.10 us | 1.01 | 6,801 KB |
| WriteAsync | \before\corerun.exe | 104857600 | 4096 | Asynchronous | 370,893.59 us | 6,590.799 us | 369,781.05 us | 352,834.20 us | 390,366.40 us | 1.00 | 7,905 KB |
| WriteAsync | \false\corerun.exe | 104857600 | 4096 | Asynchronous | 397,619.03 us | 8,020.098 us | 397,052.95 us | 384,609.00 us | 446,559.70 us | 1.07 | 6,905 KB |
| WriteAsync | \true\corerun.exe | 104857600 | 4096 | Asynchronous | 358,128.05 us | 6,536.171 us | 357,755.45 us | 346,707.00 us | 382,735.70 us | 0.97 | 6,905 KB |

@adamsitnik adamsitnik removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Apr 15, 2021
@adamsitnik adamsitnik left a comment

LGTM, thank you @carlossanlop !

@carlossanlop

With the unit tests passing, the micro benchmarks having no regressions after reverting RunContinuationsAsynchronously to true, and having allocation improvements, I feel confident in merging this change to include it in Preview4, and will continue working on #50972 to add caching and improve this code for Preview5.

@carlossanlop carlossanlop merged commit a38d0c2 into dotnet:main Apr 15, 2021
@carlossanlop carlossanlop deleted the IValueTaskSource branch April 15, 2021 22:51
@ghost ghost locked as resolved and limited conversation to collaborators May 15, 2021