Asynchronous random file access with FileStream Read vs ReadAsync in a loop / parallel (ReadAsync is slow) #27047

Closed
virzak opened this issue Aug 2, 2018 · 4 comments
Labels: area-System.IO; enhancement (Product code improvement that does NOT require public API changes/additions); tenet-performance (Performance related issue)
@virzak
Contributor

virzak commented Aug 2, 2018

There seems to be a performance issue with ReadAsync in a loop: it is 8 times slower than the synchronous Read. A sample solution is here: https://github.com/virzak/DotNetPerformance

A detailed description of the issue is on Stack Overflow: https://stackoverflow.com/questions/51560443/asynchronous-random-file-access-with-filestream-read-vs-readasync-in-a-loop-pa

@JeremyKuhne
Member

cc: @stephentoub

@JeremyKuhne
Member

FileStream overhaul is on our bucket list, but it is a non-trivial endeavor. :)

@virzak
Contributor Author

virzak commented Feb 23, 2019

@JeremyKuhne Thanks for the response, Jeremy.

Hopefully in the not too distant future (3.1? :) ). This was by far the biggest issue we faced when converting our rather large code base from sync to async. It sucked having to convert back and use synchronous code blocks inside async functions.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@JeremyKuhne JeremyKuhne removed the untriaged New issue has not been triaged by the area owner label Mar 3, 2020
@adamsitnik adamsitnik modified the milestones: Future, 6.0.0 Mar 23, 2021
@adamsitnik adamsitnik self-assigned this Apr 17, 2021
@adamsitnik
Member

We have recently invested a LOT in FileStream and IMO now it's as fast as possible. Every ReadAsync performs a single syscall (previously it was at least three syscalls) and might not allocate managed memory at all (as long as you are using the ValueTask<int> returning overload in a sequential way and have buffering disabled).

I've fixed the provided benchmarks as they were:

And extended them with:

  • having buffering enabled and disabled
  • using buffers of various sizes (1 KB, 8 KB, 64 KB).

The code:

using System;
using System.IO;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;

public class ReadAsyncImprovements
{
    const long FileSize = 8000 * 1000; // "we need to retrieve 8000 samples of 1000 bytes"
    private readonly byte[] _1kb = new byte[1_000];
    private readonly byte[] _8kb = new byte[8_000];
    private readonly byte[] _64kb = new byte[64_000];

    string _filePath;

    [Params(true, false)]
    public bool BufferingEnabled { get; set; }

    [Params(1_000, 8_000, 64_000)]
    public int UserBufferSize { get; set; }

    [GlobalSetup]
    public void Setup()
    {
        // GetTempFileName already returns a full path to a newly created, empty temp file
        _filePath = Path.GetTempFileName();

        if (File.Exists(_filePath))
        {
            File.Delete(_filePath);
        }

        File.WriteAllBytes(_filePath, new byte[FileSize]);
    }

    [GlobalCleanup]
    public void Cleanup() => File.Delete(_filePath);

    [Benchmark]
    public void ReadSync()
    {
        // don't allocate the buffer in the benchmark, otherwise allocation stats include it
        byte[] userBuffer = UserBufferSize == 1_000 ? _1kb : (UserBufferSize == 8_000 ? _8kb : _64kb);

        using FileStream fs = new FileStream(_filePath, FileMode.Open, FileAccess.Read, FileShare.Read, BufferingEnabled ? 4096 : 1, FileOptions.None);
        while (fs.Read(userBuffer, 0, userBuffer.Length) > 0) ;
    }

    [Benchmark]
    public async Task ReadAsync()
    {
        byte[] userBuffer = UserBufferSize == 1_000 ? _1kb : (UserBufferSize == 8_000 ? _8kb : _64kb);

        using FileStream fs = new FileStream(_filePath, FileMode.Open, FileAccess.Read, FileShare.Read, BufferingEnabled ? 4096 : 1, FileOptions.Asynchronous);
#if NETFRAMEWORK
        while (await fs.ReadAsync(userBuffer, 0, userBuffer.Length) > 0) ;
#else
        // it's recommended to use ValueTask-returning overloads
        while (await fs.ReadAsync(new Memory<byte>(userBuffer, 0, userBuffer.Length)) > 0) ;
#endif
    }
}

dotnet run -c Release -f net6.0 --filter *ReadAsyncImprovements* --runtimes net48 net5.0 net6.0

Using my hardware (an SSD with Windows BitLocker enabled) and your config (buffering enabled, reading data into a 1 KB array), we can see that ReadAsync is now just two times slower than Read (it's 11 times slower for .NET 4.8 and 6 times slower for .NET 5):

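| Method    | Toolchain | BufferingEnabled | UserBufferSize | Mean       | Ratio | Allocated   |
|-----------|-----------|------------------|----------------|------------|-------|-------------|
| ReadSync  | net5.0    | True             | 1000           | 9.330 ms   | 1.04  | 4,288 B     |
| ReadSync  | net6.0    | True             | 1000           | 9.428 ms   | 1.05  | 4,348 B     |
| ReadSync  | net48     | True             | 1000           | 8.985 ms   | 1.00  | 4,352 B     |
| ReadAsync | net5.0    | True             | 1000           | 55.997 ms  | 0.53  | 734,842 B   |
| ReadAsync | net6.0    | True             | 1000           | 19.102 ms  | 0.18  | 411,707 B   |
| ReadAsync | net48     | True             | 1000           | 104.822 ms | 1.00  | 2,600,960 B |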

If we disable buffering and keep using a small buffer (1 KB), ReadAsync is now 3 times slower than Read (9 times for .NET 4.8 and 7 times for .NET 5):

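| Method    | Toolchain | BufferingEnabled | UserBufferSize | Mean       | Ratio | Allocated   |
|-----------|-----------|------------------|----------------|------------|-------|-------------|
| ReadSync  | net5.0    | False            | 1000           | 21.165 ms  | 1.11  | 265 B       |
| ReadSync  | net6.0    | False            | 1000           | 20.048 ms  | 1.02  | 172 B       |
| ReadSync  | net48     | False            | 1000           | 19.568 ms  | 1.00  | 683 B       |
| ReadAsync | net5.0    | False            | 1000           | 147.746 ms | 0.81  | 2,496,720 B |
| ReadAsync | net6.0    | False            | 1000           | 64.924 ms  | 0.36  | 756 B       |
| ReadAsync | net48     | False            | 1000           | 182.782 ms | 1.00  | 3,137,536 B |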

But if we increase the array size to 8 KB, and so reduce the number of syscalls, we get much better perf:

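| Method    | Toolchain | BufferingEnabled | UserBufferSize | Mean      | Ratio | Allocated |
|-----------|-----------|------------------|----------------|-----------|-------|-----------|
| ReadSync  | net5.0    | False            | 8000           | 3.940 ms  | 1.10  | 168 B     |
| ReadSync  | net6.0    | False            | 8000           | 3.945 ms  | 1.09  | 162 B     |
| ReadSync  | net48     | False            | 8000           | 3.600 ms  | 1.00  | 205 B     |
| ReadAsync | net5.0    | False            | 8000           | 20.835 ms | 0.83  | 312,720 B |
| ReadAsync | net6.0    | False            | 8000           | 9.994 ms  | 0.40  | 726 B     |
| ReadAsync | net48     | False            | 8000           | 24.964 ms | 1.00  | 394,035 B |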

And with an even bigger array (64 KB), we can see that ReadAsync is just 1.7x slower than Read:

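| Method    | Toolchain | BufferingEnabled | UserBufferSize | Mean     | Ratio | Allocated |
|-----------|-----------|------------------|----------------|----------|-------|-----------|
| ReadSync  | net5.0    | False            | 64000          | 1.670 ms | 1.16  | 168 B     |
| ReadSync  | net6.0    | False            | 64000          | 1.653 ms | 1.14  | 161 B     |
| ReadSync  | net48     | False            | 64000          | 1.446 ms | 1.00  | 186 B     |
| ReadAsync | net5.0    | False            | 64000          | 3.981 ms | 0.88  | 39,720 B  |
| ReadAsync | net6.0    | False            | 64000          | 2.814 ms | 0.62  | 722 B     |
| ReadAsync | net48     | False            | 64000          | 4.520 ms | 1.00  | 50,048 B  |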
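
As a rough illustration of why the larger buffers help, the number of reads needed to consume the benchmark file can be worked out with simple arithmetic. The sketch below is not part of the benchmark; the `ReadCount` class and `DataReads` helper are hypothetical names, and it assumes buffering is disabled and every read fills the whole user buffer.

```csharp
using System;

public static class ReadCount
{
    // Number of reads that return data, assuming each read fills the whole
    // user buffer (an assumption; a read may legally return fewer bytes).
    public static long DataReads(long fileSize, int userBufferSize) =>
        (fileSize + userBufferSize - 1) / userBufferSize; // ceiling division

    public static void Main()
    {
        const long fileSize = 8000 * 1000; // same file size as the benchmark above

        foreach (int size in new[] { 1_000, 8_000, 64_000 })
        {
            Console.WriteLine($"{size} bytes per read: {DataReads(fileSize, size)} reads");
        }
    }
}
```

Under these assumptions the 8,000,000-byte file takes 8000 reads with the 1 KB buffer, 1000 with the 8 KB buffer, and 125 with the 64 KB buffer, plus one final read that returns 0 and ends the loop.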

We are soon going to write a dedicated blog post about our recent improvements. We are also planning to release a doc that explains how to use FileStream for the best possible perf.

Since the async FileStream implementation for Windows is now optimal, I am going to close the issue. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators May 19, 2021