
Double the size of the previous segment #66930

Closed · wants to merge 4 commits
Conversation

@davidfowl (Member) commented Mar 21, 2022

  • This change comes after long observation of how pipelines don't work well for copying large blocks of data, mainly due to the segment allocation strategy. Today each segment is allocated at a size between the minimum segment size and the pool's maximum buffer size. That works well when data is consumed quickly, because it keeps memory allocations to a minimum, but it works poorly when large chunks of data are being written. This change doubles each new segment's size relative to the previous segment, up to 1 MB (an arbitrary limit). The default behavior should remain pretty much the same as today, but the performance of larger writes/reads should improve (see the sketch below).

PS: Running more performance experiments.

Makes #49259 more feasible, as the growth strategy reduces the number of segments per pipe.

Might improve #43480 in the default case

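To make the growth strategy concrete, here is a minimal sketch of the sizing rule described above. It is illustrative only: the constants mirror the values added in this PR, but NextSegmentSize is not an actual API.

```csharp
using System;

// Minimal sketch of the doubling growth strategy described in this PR:
// each new segment doubles the previous segment's capacity, capped at an
// arbitrary 1 MB; the first segment still starts at the minimum size.
class SegmentGrowthSketch
{
    const int DefaultMinimumSegmentSize = 4096;   // 4 KB default minimum
    const int MaximumSegmentSize = 1024 * 1024;   // arbitrary 1 MB cap

    static int NextSegmentSize(int previousSegmentCapacity) =>
        Math.Min(MaximumSegmentSize, previousSegmentCapacity * 2);

    static void Main()
    {
        // Under sustained writing, sizes grow 4 KB -> 8 KB -> ... -> 1 MB.
        int size = DefaultMinimumSegmentSize;
        while (true)
        {
            Console.WriteLine(size);
            if (size == MaximumSegmentSize) break;
            size = NextSegmentSize(size);
        }
    }
}
```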
@@ -11,6 +11,9 @@ public class PipeOptions
{
private const int DefaultMinimumSegmentSize = 4096;

// Arbitrary 1MB max segment size
internal const int MaximumSegmentSize = 1024 * 1024;
@davidfowl (Member, Author) · Mar 21, 2022:

Maybe this should be higher for large copy scenarios? Not sure.

Member:

We should think about how this interacts with logic like SocketConnection.MinAllocBufferSize. I could see us unnecessarily doing syscalls into a relatively small 2 KB buffer for reads after this change as we reach the end of a block. That minimum is a DoS mitigation given the smaller segment sizes, but with larger segment sizes we could leave more bytes unfilled at the end of a block before allocating a new one, avoiding those small reads, without wasting too much memory proportionally.

@davidfowl (Member, Author):

Right, we'll need to tweak how we think about this (and whether it matters). You'll end up throwing away 2K (which maybe is too aggressive anyway?).

Member:

I don't think throwing away up to 2K at the end of a segment is that bad, especially if the segments get larger; I think we could throw away more. We want to reduce syscalls, so reading into a small tail space would be counterproductive given large amounts of data. When we're copying data from one buffer to another in user mode, it makes far more sense to use all the available space.
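To illustrate the trade-off being discussed here, below is a hedged sketch of a reader that avoids small-read syscalls by asking the pipe for a minimum amount of space. The threshold value and the GetReadBuffer helper are hypothetical; this is not Kestrel's actual SocketConnection code.

```csharp
using System;
using System.IO.Pipelines;

static class MinReadBufferSketch
{
    // Hypothetical threshold: never issue a read syscall into less than 2 KB.
    const int MinReadBufferSize = 2048;

    // Requesting at least MinReadBufferSize makes the pipe hand back a fresh
    // segment whenever the current segment's unfilled tail is smaller than
    // that, at the cost of leaving up to MinReadBufferSize - 1 bytes of the
    // old segment unused.
    public static Memory<byte> GetReadBuffer(PipeWriter writer) =>
        writer.GetMemory(MinReadBufferSize);
}
```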

@davidfowl (Member, Author):

Are you suggesting anything or just saying this is interesting?

@@ -135,15 +135,16 @@ private void AllocateMemory(int sizeHint)
_tailBytesBuffered = 0;
}

BufferSegment newSegment = AllocateSegment(sizeHint);
int newSegmentSize = Math.Min(PipeOptions.MaximumSegmentSize, _tail.Capacity * 2);
@davidfowl (Member, Author) · Mar 21, 2022:

This is simple to understand, but another possible technique would be to base the growth on the number of segments rather than on the last segment (taking the speed of the consumer into account to shrink the next segment). I'm not sure it's a big concern, though (see the sketch after this list), because:

  • If the last segment is ever fully consumed, allocation restarts at the minimum segment size (default 4K).
  • If the last segment isn't fully consumed because the reader can't keep up, the pause threshold can be adjusted, and working with bigger blocks of memory is better for the reader.
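A hedged sketch of what the first bullet implies for sizing (illustrative only, not the actual Pipe internals; a null liveTailCapacity stands in for "every previous segment has been consumed and returned"):

```csharp
using System;

static class GrowthResetSketch
{
    // The next segment doubles the live tail's capacity, but once the pipe
    // has been fully drained (no live tail) sizing restarts at the minimum.
    public static int NextSegmentSize(int? liveTailCapacity,
                                      int minimumSegmentSize = 4096,
                                      int maximumSegmentSize = 1024 * 1024)
    {
        return liveTailCapacity is null
            ? minimumSegmentSize
            : Math.Min(maximumSegmentSize, liveTailCapacity.Value * 2);
    }
}
```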

@davidfowl changed the title from "Double the size of each segment" to "Double the size of the previous segment" on Mar 21, 2022
@@ -206,7 +208,7 @@ private void AllocateWriteHeadSynchronized(int sizeHint)
}
}

private BufferSegment AllocateSegment(int sizeHint)
private BufferSegment AllocateSegment(int minimumSegmentSize, int sizeHint)
Member:

Nit: we should rename sizeHint to minSize or something similar in the internal/private APIs, because providing spans/memory smaller than that would now violate the contract, so it's not really a "hint" anymore. With the new variables, this is becoming somewhat confusing.

@davidfowl (Member, Author):

Yes, we should, but I dislike unrelated changes. I'll make that change as well.

@@ -197,7 +197,9 @@ private void AllocateWriteHeadSynchronized(int sizeHint)
_writingHeadBytesBuffered = 0;
}

BufferSegment newSegment = AllocateSegment(sizeHint);
// Double the minimum segment size on subsequent segments
int newSegmentSize = Math.Min(PipeOptions.MaximumSegmentSize, _writingHead.Capacity * 2);
Member:

What pool are you testing this with?

@davidfowl (Member, Author):

ArrayPool. I need to change the implementation of the PinnedBlockMemoryPool to make this work the way it's intended.

@davidfowl marked this pull request as ready for review April 10, 2022 08:31
@davidfowl (Member, Author):

/azp run

@azure-pipelines:

You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.

@davidfowl (Member, Author):

/azp run help

@azure-pipelines:

No pipelines are associated with this pull request.

@davidfowl (Member, Author):

/azp run runtime

@azure-pipelines:

Azure Pipelines successfully started running 1 pipeline(s).

@stephentoub (Member):

@davidfowl, are you still working on this?

@davidfowl (Member, Author):

Punt to future.

@davidfowl closed this Jul 23, 2022
@jkotas deleted the davidfowl/double-buffers branch July 28, 2022 19:08
@ghost locked as resolved and limited conversation to collaborators Aug 28, 2022
3 participants