-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Squash network inbound queues #13956
Conversation
⏱️ 12h 13m total CI duration on this PR
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick pass. Will return to look at the internals 😄
mempool/src/tests/node.rs
Outdated
|
||
#[cfg(wrong_network_abstraction)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bchocho, any thoughts on updating or removing all these tests? Should that be done before this PR?
(My feeling is that these should be fixed/removed first, e.g., to ensure we don't accidentally miss something. Feels odd just ignoring them all.)
data: data.into(), | ||
res_tx, | ||
let notification = ReceivedMessage { | ||
message: NetworkMessage::RpcRequest(RpcRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it might be worth exposing a test-only method to create these for tests (e.g., to avoid us having to assign 0's to request_id
, priority
and rx_at
everywhere)?
@@ -26,6 +26,7 @@ pub enum PeerManagerRequest { | |||
} | |||
|
|||
/// Notifications sent by PeerManager to upstream actors. | |||
/// TODO: PeerManagerNotification now only exists in test code and should be deleted; probably use `ReceivedMessage` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the mempool-side changes, this can now be removed completely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable to me. 😄 Will wait for @bchocho to comment.
let mut connection = MultiplexMessageSink::new(connection, MAX_FRAME_SIZE); | ||
for _ in 0..30 { | ||
// The client should then send the network message. | ||
connection.send(&send_msg).await.unwrap(); | ||
} | ||
info!("client sent"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: do we want to commit these, or were these for local debugging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test code logging is pretty cheap? it gets thrown away if the test passes, right?
network/framework/src/peer/mod.rs
Outdated
); | ||
match self.upstream_handlers.get(&direct.protocol_id) { | ||
None => { | ||
// TODO: better label than "declined"? more like "garbage-in" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to prioritize following up on the TODOs in a future PR 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not_recognized
?
I took a stab at mempool fixing and it wasn't as bad as the full network2 branch with changing both inbound and outbound paths. Merged the fix in here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mempool
changes look great! I'll continue looking at the rest of the PR
@@ -26,6 +26,7 @@ pub enum PeerManagerRequest { | |||
} | |||
|
|||
/// Notifications sent by PeerManager to upstream actors. | |||
/// TODO: PeerManagerNotification now only exists in test code and should be deleted; probably use `ReceivedMessage` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the mempool-side changes, this can now be removed completely?
network/framework/src/peer/mod.rs
Outdated
); | ||
match self.upstream_handlers.get(&direct.protocol_id) { | ||
None => { | ||
// TODO: better label than "declined"? more like "garbage-in" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not_recognized
?
@@ -214,31 +262,52 @@ impl<TMessage> Stream for NetworkEvents<TMessage> { | |||
} | |||
} | |||
|
|||
fn unix_micros() -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to use time service?
fn now_unix_time(&self) -> Duration; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the usage sites are pretty awkward to plumb that through to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @brianolson 😄
network/framework/src/counters.rs
Outdated
@@ -26,6 +26,7 @@ pub const RECEIVED_LABEL: &str = "received"; | |||
pub const SENT_LABEL: &str = "sent"; | |||
pub const SUCCEEDED_LABEL: &str = "succeeded"; | |||
pub const FAILED_LABEL: &str = "failed"; | |||
pub const UNKNOWN_LABEL: &str = "unk"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we just make this unknown
? 😄
@@ -124,7 +122,7 @@ fn build_test_peer_manager( | |||
peer_manager, | |||
peer_manager_request_tx, | |||
connection_reqs_tx, | |||
hello_rx, | |||
// hello_rx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove?
} | ||
}, | ||
NetworkMessage::RpcResponse(response) => { | ||
NetworkMessage::RpcResponse(_) => { | ||
// non-reference cast identical to this match case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason we can't just match this directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the other branches work better with a match over a reference, and *response
didn't work.
} = message; | ||
let dequeue_at = unix_micros(); | ||
let dt_micros = dequeue_at - rx_at; | ||
let dt_seconds = (dt_micros as f64) / 1000000.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd move this to a helper:
let dequeue_delta_secs = calculate_dequeue_delta(rx_at);
...
fn calculate_dequeue_delta(rx_at: u64) -> u64 {
let dequeue_at = unix_micros();
let dt_micros = dequeue_at - rx_at;
(dt_micros as f64) / 1000000.0
}
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
✅ Forge suite
|
✅ Forge suite
|
✅ Forge suite
|
Description
Remove several queues of indirection, move received messages directly from the reader thread of a peer to application code.
add inbound queue delay metric
aptos_network_inbound_queue_time
Type of Change
Which Components or Systems Does This Change Impact?
How Has This Been Tested?
Local cluster tests. unit tests. forge cluster tests.
Key Areas to Review
Some unit tests in mempool were lost due to relying on layers of the network that are being phased out.
Some parts of the inbound path may still be there and unused, further code cleanup is possible.
Checklist