Double the size of the previous segment #66930
Conversation
This change comes after long observations of how pipelines don't work well for copying large blocks of data, mainly due to the segment allocation strategy. Today we allocate each segment at the minimum segment size (bounded by the max pool size). This works well if data is being quickly consumed, because we can keep memory allocations to a minimum, but it doesn't work well when large chunks of data are being written. This change doubles the segment size based on the previous segment, up to 1MB (an arbitrary limit). The default behavior should work much the same as today, but performance of larger writes/reads should improve.
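A minimal sketch of the growth strategy described above. The constants match the PR; the function and loop are illustrative, not the actual `Pipe` allocation code:

```csharp
using System;

// Constants from the PR: 4K default minimum, arbitrary 1MB cap.
const int MinimumSegmentSize = 4096;
const int MaximumSegmentSize = 1024 * 1024;

// Each new segment doubles the previous segment's capacity, clamped to the cap.
int NextSegmentSize(int previousCapacity) =>
    Math.Min(MaximumSegmentSize, previousCapacity * 2);

// A sustained large write walks 4K, 8K, 16K, ... up to 1MB, then stays there.
int size = MinimumSegmentSize;
for (int i = 0; i < 10; i++)
{
    Console.WriteLine(size);
    size = NextSegmentSize(size);
}
```

Because growth restarts at the minimum whenever a segment is fully consumed, small steady-state writes keep today's 4K behavior while bulk copies quickly reach large segments.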
@@ -11,6 +11,9 @@ public class PipeOptions
 {
     private const int DefaultMinimumSegmentSize = 4096;

+    // Arbitrary 1MB max segment size
+    internal const int MaximumSegmentSize = 1024 * 1024;
Maybe this should be higher for large copy scenarios? Not sure.
We should think about how this interacts with logic like SocketConnection.MinAllocBufferSize. I could see us unnecessarily doing syscalls with a relatively small 2KB buffer for reads after this change, as we reach the end of a block. That threshold is a DoS mitigation given today's smaller segment sizes, but with larger segments we could afford to leave more bytes unfilled at the end of a segment (allocating a new one instead of issuing a small read) without wasting too much memory proportionally.
Right, we'll need to tweak how we think about this (and whether it matters). You'll end up throwing away 2K (which maybe is too aggressive anyway?)
I don't think throwing away up to 2K at the end of the segment is that bad, especially if the segments get larger; I think we could throw away more. We want to reduce syscalls, so reading into a small tail space would be counterproductive given large amounts of data. When we're copying data from one buffer to another in user mode, it makes far more sense to use all the available space.
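The trade-off being discussed can be sketched as a simple predicate: before issuing a read, check whether the unwritten tail of the current segment is worth filling. The name and the 2KB default here are assumptions for illustration (echoing the MinAllocBufferSize idea), not the actual runtime/Kestrel API:

```csharp
using System;

// Hypothetical helper: skip a small tail and allocate a fresh segment
// instead of issuing a small read syscall into the leftover space.
// The 2048-byte threshold is an assumed value, not the real constant.
bool ShouldAbandonTail(int bytesLeftInTail, int minReadBufferSize = 2048) =>
    bytesLeftInTail < minReadBufferSize;

// A 1KB tail is abandoned (small syscall avoided)...
Console.WriteLine(ShouldAbandonTail(1024));
// ...while a 3KB tail is still used for the next read.
Console.WriteLine(ShouldAbandonTail(3072));
```

With doubling segments the abandoned tail stays a small fraction of the segment, which is why a larger threshold becomes affordable.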
Are you suggesting anything or just saying this is interesting?
@@ -135,15 +135,16 @@ private void AllocateMemory(int sizeHint)
         _tailBytesBuffered = 0;
     }

-    BufferSegment newSegment = AllocateSegment(sizeHint);
+    int newSegmentSize = Math.Min(PipeOptions.MaximumSegmentSize, _tail.Capacity * 2);
This is simple to understand, but another possible technique could be to base the growth on the number of segments rather than the last segment (taking the speed of the consumer into account to shrink the next segment). I'm not sure it's a big concern though, because:
- If the last segment is ever fully consumed, it'll restart at the minimum segment size (default 4K).
- If the last segment isn't fully consumed because the reader can't keep up, the pause threshold can be adjusted, and working with bigger blocks of memory is better for the reader.
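One possible reading of the count-based alternative mentioned here: size the next segment from how many unconsumed segments the pipe currently holds, so a reader that keeps up (count near zero) keeps allocations at the minimum. This is purely illustrative and not code from this PR; the function name and shift-based formula are assumptions:

```csharp
using System;

const int MinimumSegmentSize = 4096;
const int MaximumSegmentSize = 1024 * 1024;

// Hypothetical count-based sizing: each outstanding (unconsumed) segment
// doubles the next allocation, capped at 1MB (4096 << 8 == 1MB).
int NextSegmentSizeByCount(int unconsumedSegments) =>
    Math.Min(MaximumSegmentSize,
             MinimumSegmentSize << Math.Min(unconsumedSegments, 8));

Console.WriteLine(NextSegmentSizeByCount(0)); // fast reader: stay at the minimum
Console.WriteLine(NextSegmentSizeByCount(5)); // writer far ahead: grow
```

Compared to doubling off the last segment's capacity, this ties growth to backpressure rather than to write history.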
@@ -206,7 +208,7 @@ private void AllocateWriteHeadSynchronized(int sizeHint)
     }
 }

-private BufferSegment AllocateSegment(int sizeHint)
+private BufferSegment AllocateSegment(int minimumSegmentSize, int sizeHint)
Nit: We should rename sizeHint to minSize or something in internal/private APIs, because it now violates the contract to provide spans/memory smaller than that, so it's not really a "hint". With the new variables, this is becoming somewhat confusing.
Yes, we should, but I dislike unrelated changes. I'll make that change as well.
@@ -197,7 +197,9 @@ private void AllocateWriteHeadSynchronized(int sizeHint)
         _writingHeadBytesBuffered = 0;
     }

-    BufferSegment newSegment = AllocateSegment(sizeHint);
+    // Double the minimum segment size on subsequent segments
+    int newSegmentSize = Math.Min(PipeOptions.MaximumSegmentSize, _writingHead.Capacity * 2);
What pool are you testing this with?
ArrayPool. I need to change the implementation of the PinnedBlockMemoryPool to make this work the way it's intended.
/azp run
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using the /azp run [pipelines] command. You can specify multiple pipelines using a comma-separated list.
/azp run help
No pipelines are associated with this pull request.
/azp run runtime
Azure Pipelines successfully started running 1 pipeline(s).
@davidfowl, are you still working on this? |
Punt to future. |
PS: Running more performance experiments.
Makes #49259 more feasible, as there will be fewer segments per pipe because of the growth strategy.
Might improve #43480 in the default case.