-
Notifications
You must be signed in to change notification settings - Fork 4.5k
Send messages to EntryNotifierService in Tpu #31889
Conversation
ebb1a45
to
c32d28b
Compare
} | ||
} | ||
|
||
// TODO: in PohRecorder, we panic if the send to BroadcastStage fails. Should we do the same here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on this question?
Also, would you want to see this send in a separate thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we panic in pohrecorder? PohService waits for an exit signal https://github.com/apfitzge/solana/blob/abbf96f17aca9647766559db5b5de714726da033/poh/src/poh_service.rs#L371, and poh_recorder doesn't panic on record either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would you want to see this send in a separate thread?
This is already running in a separate TpuEntryNotifier
thread right? Seems like this send is perfectly fine here since it's nonblocking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we panic in pohrecorder? PohService waits for an exit signal https://github.com/apfitzge/solana/blob/abbf96f17aca9647766559db5b5de714726da033/poh/src/poh_service.rs#L371, and poh_recorder doesn't panic on record either
This is the panic I was referring to:
solana/poh/src/poh_recorder.rs
Line 186 in 37ebb70
Err(e) => panic!("Poh recorder returned unexpected error: {e:?}"), |
It's a bit of a chore to track all the results around, so I might be misreading. But I think that if the BroadcastStage send fails with an Err, this returns an Err, which gets returned to that method with the panic I linked above here. Is that right?
Codecov Report
@@ Coverage Diff @@
## master #31889 +/- ##
=========================================
- Coverage 81.9% 81.9% -0.1%
=========================================
Files 760 761 +1
Lines 207397 207528 +131
=========================================
+ Hits 169962 170006 +44
- Misses 37435 37522 +87 |
} | ||
} | ||
|
||
// TODO: in PohRecorder, we panic if the send to BroadcastStage fails. Should we do the same here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we panic in pohrecorder? PohService waits for an exit signal https://github.com/apfitzge/solana/blob/abbf96f17aca9647766559db5b5de714726da033/poh/src/poh_service.rs#L371, and poh_recorder doesn't panic on record either
core/src/tpu.rs
Outdated
@@ -229,10 +231,18 @@ impl Tpu { | |||
prioritization_fee_cache, | |||
); | |||
|
|||
let (tpu_entry_sender, tpu_entry_receiver) = unbounded(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would it be clearer to rename these to broadcast_entry_sender
and broadcast_entry_receiver
to match the names of the constructor arguments in TpuEntryNotifier
} | ||
} | ||
|
||
// TODO: in PohRecorder, we panic if the send to BroadcastStage fails. Should we do the same here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would you want to see this send in a separate thread?
This is already running in a separate TpuEntryNotifier
thread right? Seems like this send is perfectly fine here since it's nonblocking
poh/src/poh_recorder.rs
Outdated
pub entry_index: Option<usize>, | ||
} | ||
|
||
impl WorkingBank { | ||
fn increment_entry_index(&mut self) { | ||
if let Some(index) = self.entry_index.as_mut() { | ||
*index += 1 | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like we could just make entry_index
always a usize? It doesn't seem to be saving much work other than incrementing an integer.
This has the added benefit of removing the additional track_entry_indexes
in set_bank()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also thinking maybe it would be better to just track all these indexes in tpu_entry_notifier
? I.e. if you see a new entry for the same slot, then bump the index. This has a few benefits
- would allow us to leave
poh_recorder
unchanged and unaware of this additional logic - if the track entry option is off, we can avoid spinning up the tpu_entry_notifier service altogether, and no additional work is incurred
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it. Thanks for the idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, there is an additional reason for tracking the entry indexes in poh_recorder: in order to be able to figure out which transactions fell in which entry, we are planning to add the entry index to the geyser transaction notification. The easiest way to do that is to include the entry index with the transaction-starting-index that's returned here
…ntrySummarys alongside BroadcastStage messages
…ed; eliminate duplicate WorkingBankEntry type
b3775f4
to
173147b
Compare
Last 3 commits are new, based on review comments (I'm happy to clean up+squash things later). However, this persists tracking of entry_indexes in poh_recorder, as per this comment: #31889 (comment) |
173147b
to
7a207b6
Compare
7a207b6
to
743b071
Compare
Problem
#31290 adds an entry notification service, which is currently only fed from the Tvu pipeline. But leaders should also support geyser entry notifications -- this is especially useful for testing plugins on
solana-test-validator
, since the TestValidator is the leader for every slot.Summary of Changes
Adds a little service to sit between PohRecorder and BroadcastStage in the Tpu pipeline. If geyser entry notifications are supported, the entry summary information is copied from the message and sent to the entry notification service. All entries are passed through unchanged to the BroadcastStage receiver.
I recommend reviewing by commit, because a change to the
PohRecorder::set_bank()
caused a lot of churn in tests (like in #25688)