
src: Limit data frame payload size #111

Merged: 2 commits into libp2p:develop on Feb 11, 2021

Conversation

mxinden (Member) commented Feb 10, 2021

By limiting the data frame payload size one gains the following
benefits:

  1. Reduce delays sending time-sensitive frames, e.g. window updates.

  2. Minimize head-of-line blocking across streams.

  3. Enable better interleaving of send and receive operations, as each is
    carried out atomically instead of concurrently with its respective
    counterpart.

Limiting the frame size to 16KiB does not introduce a large overhead. A
Yamux frame header is 12 bytes, so this change introduces an overhead of
~0.07%.

See #100 (comment) and comments below that for details.

Might reduce the probability of #104 occurring.
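For illustration, here is a minimal sketch (not the code from this PR) of splitting an outgoing payload into frames of at most 16KiB. The names `split_into_frames`, `MAX_FRAME_PAYLOAD` and `HEADER_LEN` are made up for this sketch; only the 16KiB limit and the 12-byte header come from the description above.

```rust
// Illustrative sketch only, not the code from this PR: split an outgoing
// payload into data frames whose payload is at most 16 KiB each.
const MAX_FRAME_PAYLOAD: usize = 16 * 1024; // 16 KiB
const HEADER_LEN: usize = 12; // Yamux frame header size in bytes

fn split_into_frames(payload: &[u8]) -> Vec<Vec<u8>> {
    payload
        .chunks(MAX_FRAME_PAYLOAD)
        .map(|chunk| chunk.to_vec())
        .collect()
}

fn main() {
    let payload = vec![0u8; 100 * 1024]; // e.g. one 100 KiB write
    let frames = split_into_frames(&payload);
    println!("{} frames", frames.len()); // 7: six full 16 KiB frames plus one partial
    // Per-frame header overhead for a full frame: 12 / 16384 ≈ 0.073 %.
    println!(
        "overhead ≈ {:.3}%",
        100.0 * HEADER_LEN as f64 / MAX_FRAME_PAYLOAD as f64
    );
}
```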

romanb (Contributor) left a comment


Looks good to me, though I would prefer for this size to be configurable. It could then also be made configurable in rust-libp2p, analogous to the "split send size" that is configurable for libp2p-mplex and currently defaults to 8KiB after measurement results very similar to yours for yamux. But I don't insist and leave it up to you.

mxinden (Member, Author) commented Feb 11, 2021

though I would prefer for this size to be configurable.

Can you think of a scenario where one would want to configure the limit?

romanb (Contributor) commented Feb 11, 2021

Can you think of a scenario where one would want to configure the limit?

If you consider your benchmark results to be representative in every case, then no. I just tend to be a bit more cautious: different people may see a different optimum in different constellations and configurations of their networking stack, and I would hence allow them to experiment with this setting for themselves, should they want to do so.

mxinden (Member, Author) commented Feb 11, 2021

If you consider your benchmark results to be representative in every case, then no.

You are right, I do not.

7a77cfb makes the limit configurable. I tried to mirror the mplex naming while keeping the Yamux conventions. As you usually have a good eye for easy-to-grasp documentation, let me know if you have suggestions for alternative wording @romanb.
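For reference, a rough sketch of how the configurable limit could look from the API side. The setter name `set_split_send_size` mirrors the mplex-style naming discussed above and is my assumption here; check 7a77cfb for the actual signature and default.

```rust
// Sketch, assuming the setter introduced in 7a77cfb follows the naming used
// below; see the Config documentation for the exact name and default.
use yamux::Config;

fn make_config() -> Config {
    let mut cfg = Config::default();
    // Split outgoing data into frames of at most 32 KiB instead of the
    // 16 KiB default, e.g. to experiment with a different optimum.
    cfg.set_split_send_size(32 * 1024);
    cfg
}
```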

mxinden merged commit 29f5009 into libp2p:develop on Feb 11, 2021
romanb (Contributor) commented Feb 22, 2021

Limiting the frame size to 16KiB does not introduce a large overhead.

Maybe not in terms of bandwidth, but at least the test tests/concurrent.rs seems to show that, somewhat as expected, the smaller the frame size, the slower the test. For example, on the branch of #112 where the test uses a 128KiB payload size, the test takes ~1 second with a 128KiB frame size, but ~4s with a 16KiB frame size (on my machine, debug mode). The difference is less pronounced in release mode (~0.4s vs ~0.6s). Very similar numbers for develop, where the test currently uses 100KiB payloads (but actually stalls very often with >=128KiB frame sizes unless one configures a generous TCP send buffer - a problem that is no longer present in #112). I guess a somewhat noticeable difference in CPU time is to be expected due to the significant increase in function calls (and probably system calls as well). I'm not suggesting that it outweighs the benefits, but it shows that it was probably a good idea to make the size at which frames are split configurable.
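As a back-of-the-envelope illustration (my own sketch, not part of the thread) of why smaller frames mean more work: the number of frames, and hence frame writes, grows with the payload size divided by the frame size.

```rust
// Sketch: number of frames (and thus frame writes) needed for the payload
// and frame sizes mentioned above.
fn frames_needed(payload: usize, frame: usize) -> usize {
    (payload + frame - 1) / frame // ceiling division
}

fn main() {
    const KIB: usize = 1024;
    // 128 KiB payload: 1 frame at a 128 KiB frame size vs. 8 frames at 16 KiB.
    assert_eq!(frames_needed(128 * KIB, 128 * KIB), 1);
    assert_eq!(frames_needed(128 * KIB, 16 * KIB), 8);
    // 100 KiB payload (the test on develop): 7 frames at a 16 KiB frame size.
    assert_eq!(frames_needed(100 * KIB, 16 * KIB), 7);
}
```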
