Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase the default value of PipeOptions.MinimumSegmentSize #43480

Open
halter73 opened this issue Oct 16, 2020 · 9 comments
Open

Increase the default value of PipeOptions.MinimumSegmentSize #43480

halter73 opened this issue Oct 16, 2020 · 9 comments
Labels
Milestone

Comments

@halter73
Copy link
Member

halter73 commented Oct 16, 2020

Description

Copying data is a lot faster with larger buffers. And copying data is a large use case for System.IO.Pipelines. For example, in ASP.NET Core, we plan to use PipeWriter to write files to response bodies (dotnet/aspnetcore#24851).

@brporter got us looking at the performance of using Pipes to copy files, and this once again demonstrated how crucial large buffers are. This is why System.IO.Stream's DefaultCopyBufferSize is 81920 bytes.

// We pick a value that is the largest multiple of 4096 that is still smaller than the large object heap threshold (85K).
// The CopyTo/CopyToAsync buffer is short-lived and is likely to be collected at Gen0, and it offers a significant
// improvement in Copy performance.
private const int DefaultCopyBufferSize = 81920;

PipeOptions.DefaultMinimumSegmentSize is only 4096 bytes, and this leads to terrible performance when calling something like the default implementation of WriteAsync, CopyToAsync or anything that calls PipeWriter.GetMemory() or PipeWriter.GetSpan() without a sizeHint.

private const int DefaultMinimumSegmentSize = 4096;

In my testing, copying a 2GB file went from taking 8607ms to 1771ms by increasing PipeOptions.MinimumSegmentSize from 4096 bytes to 655350 bytes. Even with the in-between Stream.DefaultCopyBufferSize value of 81920 bytes the copy time dropped to 2458ms.

You

Configuration

You can find the benchmark app at https://github.com/halter73/PipeTest/blob/master/Program.cs.

F:\dev\halter73\PipeTest [master +3 ~1 -0 !]> dotnet --info
.NET SDK (reflecting any global.json):
 Version:   6.0.100-alpha.1.20472.11
 Commit:    e55929c5a5

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.20231
 OS Platform: Windows
 RID:         win10-x64
 Base Path:   F:\dev\aspnet\AspNetCore\.dotnet\sdk\6.0.100-alpha.1.20472.11\

Host (useful for support):
  Version: 6.0.0-alpha.1.20507.4
  Commit:  4fef87c65e

.NET SDKs installed:
  6.0.100-alpha.1.20472.11 [F:\dev\aspnet\AspNetCore\.dotnet\sdk]

.NET runtimes installed:
  Microsoft.NETCore.App 6.0.0-alpha.1.20468.7 [F:\dev\aspnet\AspNetCore\.dotnet\shared\Microsoft.NETCore.App]

Regression?

No.

Data

Before (4096)

F:\dev\halter73\PipeTest [slice +3 ~1 -0 !]> dotnet run .\test.in .\test.out PipelinesSE
Copying .\test.in to .\test.out with method PipelinesSE...done!

GetTotalAllocatedBytes(true): [117,722,376] bytes
GetTotalMemory(false): [3,353,520] bytes
Executed for 8607ms.

After (655350)

F:\dev\halter73\PipeTest [master +3 ~1 -0 !]> dotnet run .\test.in .\test.out PipelinesSE
Copying .\test.in to .\test.out with method PipelinesSE...done!

GetTotalAllocatedBytes(true): [1,672,544] bytes
GetTotalMemory(false): [1,718,512] bytes
Executed for 1771ms.
@halter73 halter73 added the tenet-performance Performance related issue label Oct 16, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.IO.Pipelines untriaged New issue has not been triaged by the area owner labels Oct 16, 2020
@halter73
Copy link
Member Author

@BrennanConroy
Copy link
Member

Triage: Alternatively, we can add an API to pass a memory hint to WriteAsync and CopyToAsync

@halter73
Copy link
Member Author

We could do both.

@davidfowl
Copy link
Member

I don't really understand what we'd change the default to based on this since it's scenario specific. I do think we could change the defaults for CopyToAsync and have a way to pass in a sizehint for CopyToAsync and WriteAsync.

@benaadams
Copy link
Member

In my testing, copying a 2GB file went from taking 8607ms to 1771ms by increasing PipeOptions.MinimumSegmentSize from 4096 bytes to 655350 bytes. Even with the in-between Stream.DefaultCopyBufferSize value of 81920 bytes the copy time dropped to 2458ms.

Is that doing multiple smaller async reads from a stream rather than repeated smaller copies?

e.g. if you read to a large chunk (65k); then copy that to the small 4k blocks what's the time difference?

@halter73
Copy link
Member Author

Is that doing multiple smaller async reads from a stream rather than repeated smaller copies?

e.g. if you read to a large chunk (65k); then copy that to the small 4k blocks what's the time difference?

When experimenting with file copying on Windows, I tested that by repeatedly calling GetMemory()/Advance() without flushing every time in my flush-less branch. That only yielded an ~8% improvement.

Then I created a cheat branch that went even further and Advanced prior to receiving data so it could do concurrent reads into smaller buffers. This was yielded a ~48% improvement which was equivalent to using a larger buffer (where larger buffer size = concurrent reads * smaller buffer size). But since you can only really read into one buffer at a time with a PipeWriter, that's not an option unless we add PipeWriter APIs.

You can see all the branches I experimented with at https://github.com/halter73/PipeTest/branches. I've included my local perf measurements in my commit messages.

@benaadams
Copy link
Member

Ah, DefaultMinimumSegmentSize is independent of the Memory pool block size; I was thinking it was increasing that.

Though it would cause by default allocations from the ArrayPool rather than the Memory pool for Kestrel (if the size is not specified)?

@halter73
Copy link
Member Author

Though it would cause by default allocations from the ArrayPool rather than the Memory pool for Kestrel (if the size is not specified)?

If we do this, Kestrel likely wouldn't use the DefaultMinimumSegmentSize in order to avoid this.

@Chacoon3
Copy link

Thank you guys for discussing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants