Set max concurrent uni streams accordingly -- do not over allocate open uni streams #1060
base: master
Conversation
Can we split the new functionality into its own struct, and add unit tests for the math?
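(For illustration, a minimal sketch of the kind of split being requested: the stream-limit math in its own helper struct so it can be unit-tested without the connection machinery. The PR later names this struct UniStreamQosUtil; the constants and method bodies below are stand-ins, not the PR's exact math.)

```rust
// Sketch only: isolate the QoS stream-limit math in a dedicated struct.
// The stake thresholds and caps here are illustrative stand-ins.
pub struct UniStreamQosUtil;

impl UniStreamQosUtil {
    /// Per-stake cap on concurrent uni streams (stand-in values).
    pub fn compute_max_allowed_uni_streams(is_staked: bool, total_stake: u64) -> usize {
        if is_staked {
            // Stand-in: scale with stake, bounded to a sane range.
            ((total_stake / 1_000) as usize).clamp(8, 512)
        } else {
            128
        }
    }

    /// Never advertise more concurrent streams than the throttle window admits.
    pub fn max_concurrent_uni_streams_per_throttling_interval(
        max_allowed_uni_streams: u64,
        max_streams_per_throttle_window: u64,
    ) -> u64 {
        max_allowed_uni_streams.min(max_streams_per_throttle_window)
    }
}
```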
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:

@@           Coverage Diff            @@
##           master    #1060    +/-   ##
=========================================
- Coverage    82.1%    82.1%    -0.1%
=========================================
  Files         893      893
  Lines      236600   236677      +77
=========================================
+ Hits       194429   194451      +22
- Misses      42171    42226      +55
Done.
streamer/src/nonblocking/quic.rs
Outdated
) -> u64 {
    let max_streams_per_throttle_window =
        ema.available_load_capacity_in_throttling_duration(peer_type, total_stake);
    (UniStreamQosUtil::compute_max_allowed_uni_streams(peer_type, total_stake) as u64)
nit: UniStreamQosUtil::compute_max_allowed_uni_streams could be replaced with Self::compute_max_allowed_uni_streams
Done
/// Given the max_streams_per_throttling_interval, derive the streams per throttle window.
/// Do not allow more concurrent streams than the max streams per throttle window.
pub fn max_concurrent_uni_streams_per_throttling_interval(
Do we really need a function for this? It's just a wrapper on min(). Why not directly use min() where we are calling this?
I think this makes the design goal more explicit and easier to test.
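(A sketch of the kind of unit test the named wrapper enables; the function is just min(), as noted above, but naming it lets the clamp be exercised directly. The concrete values mirror the unstaked example from the PR description.)

```rust
// Sketch: the wrapper is a thin min(), but a named function makes the
// design goal testable in isolation.
fn max_concurrent_uni_streams_per_throttling_interval(
    max_allowed_per_stake: u64,
    max_streams_per_throttle_window: u64,
) -> u64 {
    max_allowed_per_stake.min(max_streams_per_throttle_window)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn window_budget_caps_the_limit() {
        // Unstaked example: per-stake cap 128, window budget 2 -> limit 2.
        assert_eq!(max_concurrent_uni_streams_per_throttling_interval(128, 2), 2);
    }

    #[test]
    fn per_stake_cap_wins_when_budget_is_large() {
        assert_eq!(max_concurrent_uni_streams_per_throttling_interval(128, 10_000), 128);
    }
}
```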
@@ -811,8 +844,15 @@ async fn handle_connection(
            stats.total_streams.load(Ordering::Relaxed),
            stats.total_connections.load(Ordering::Relaxed),
        );
        connection.set_max_concurrent_uni_streams(max_uni_streams);
        if let Some(receive_window) = receive_window {
Any benefit to moving the receive_window setting into this function?
It encapsulates things better to put the QoS-related config for connections in one place (receive window and max concurrent uni streams).
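(A minimal sketch of that grouping, assuming quinn's per-connection setters set_max_concurrent_uni_streams() and set_receive_window(); the helper name is hypothetical.)

```rust
use quinn::{Connection, VarInt};

// Hypothetical helper: apply all per-connection QoS settings in one place,
// as suggested above (stream limit and receive window together).
fn apply_connection_qos(
    connection: &Connection,
    max_uni_streams: VarInt,
    receive_window: Option<VarInt>,
) {
    connection.set_max_concurrent_uni_streams(max_uni_streams);
    if let Some(receive_window) = receive_window {
        connection.set_receive_window(receive_window);
    }
}
```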
streamer/src/nonblocking/quic.rs
Outdated
@@ -856,6 +896,20 @@ async fn handle_connection(
                sleep(throttle_duration).await;
            }
        }
        let max_concurrent_uni_streams =
nit: max_concurrent_uni_streams is very overloaded here. Can we simplify the code? Maybe if we use min() instead of UniStreamQosUtil::max_concurrent_uni_streams_per_throttling_interval(), the code could be compressed and simplified.
I renamed the variables a little to clarify
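(For context, a sketch of the per-interval update being discussed: recompute the remaining window budget and lower the advertised stream limit with a plain min(), per the nit above. Names and the budget parameter are illustrative, not the exact PR code.)

```rust
use quinn::{Connection, VarInt};

// Sketch: each throttle interval, clamp the advertised concurrent uni-stream
// limit to whatever budget remains in the current throttle window.
fn update_uni_stream_limit(connection: &Connection, per_stake_cap: u64, window_budget: u64) {
    let limit = per_stake_cap.min(window_budget);
    // VarInt::from_u64 only fails above 2^62 - 1, far beyond any stream limit.
    if let Ok(limit) = VarInt::from_u64(limit) {
        connection.set_max_concurrent_uni_streams(limit);
    }
}
```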
Some nits. Otherwise the logic looks good to me.
LGTM. Please give @alessandrod a chance to look at it before merging.
@alessandrod -- this is the formal change for the experiments we did in '3gG', where we hard-coded concurrent streams to 1 for unstaked connections based on the calculation of the max allowed stream count per throttle window. This change makes the stream count limit the minimum of (the original max concurrent uni streams per stake, the max allowed streams per throttle window).
Oops sorry I had missed this! I'll take a look in the morning.
Hi @alessandrod, any further comments on this PR? I'd like to wrap it up.
From what I understand, I don't think this code is needed; it adds complexity and probably some round trips to communicate the new limit to peers (and locking and wakeups of the connection task, but admittedly those should be infrequent). I could be wrong of course, but I haven't seen any plausible explanation of why this is needed (or why multiple streams are needed to begin with).
Can you clarify why it is not needed? I think you would agree that over-allocating is bad. If you are questioning why we have multiple streams in the first place: as I mentioned, it is based on (1) reducing the head-of-line issue and (2) better performance through parallelism. And I mentioned in the Slack channel that between 1 stream and the current default there is at least a 3x difference in the bench-tps test. Your point that multiple streams may cause fragmentation among the streams is valid, but it is orthogonal to this PR; I am not making it worse. If you are proposing to just change the stream count to 1 for everything, I think that is too drastic a change. Please be explicit if you are proposing something different.
Over allocate what? From what I understand - and again I could be wrong - we're not allocating anything more on the server side whether we allow 1 stream or 1000 streams. The max streams limit is a protocol limit that is communicated to the client; the server doesn't pre-reserve anything for it. The client will simply stop opening new streams once it hits the limit. Stream id tracking is done in the client and requires no synchronization with the server. We have one task per connection. We read one stream at a time - not in parallel: we pop the next stream from the connection and process it. What gets over-allocated? The thing that changes how much the server allocates is the receive window, and we already bound that.
I thought we agreed on slack that there's no HOL issue? If you don't agree, can you explain to me where the HOL issue is exactly? And can you explain where the parallelism is, if the server has one task per connection and pops one stream at a time? How is parallelism increased exactly?
You are adding code that I don't think is necessary. More code is always bad: more complexity and more bugs. In that sense, it's worse. Obviously I can be wrong, but I'd like to know how I'm wrong if you want me to approve the PR.
I'm not proposing to set streams to 1 in the context of this PR. I do think we should do it, but as you said it's orthogonal to this PR and likely requires fixes in the client before we can do it.
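(To make the server-side model concrete: a sketch of the one-task-per-connection pattern described above, using quinn's accept_uni(). Streams are popped and processed one at a time, so a higher stream limit does not by itself add server-side parallelism or allocation. Illustrative only.)

```rust
use quinn::Connection;

// Sketch: one task per connection; uni streams are accepted and read
// sequentially, not in parallel.
async fn connection_task(connection: Connection) {
    while let Ok(mut stream) = connection.accept_uni().await {
        // Bounded read keeps per-stream memory limited regardless of the
        // advertised stream limit.
        if let Ok(bytes) = stream.read_to_end(64 * 1024).await {
            // ... process the packet bytes ...
            let _ = bytes;
        }
    }
}
```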
Lijun: To follow up, I mentioned in the PR description:
Problem
We can allocate more max concurrent open uni streams than the total streams allowed within a throttle window.
For example, an unstaked node might be eligible for 2 uni streams per throttle window, but we might allow it 128 concurrent uni streams.
This has two problems: it allocates resources on the server side unnecessarily, and it allows the client to open concurrent uni streams that will only be throttled on the server side, causing more timeout errors on the client side and more load on the server.
Summary of Changes
Do not allow more open concurrent uni streams than the number permitted in a throttle window.
Fixes #