-
Notifications
You must be signed in to change notification settings - Fork 4.5k
Connect TPU's broadcast service with TVU's blob fetch stage #2587
Conversation
BlobFetchStage::new( | ||
Arc::new(node.sockets.tvu.pop().unwrap()), | ||
blob_receiver_exit.clone(), | ||
let (blob_fetch_sender, blob_fetch_receiver) = channel(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this test be in-ignored? Seems like it’s the only test that would be affected by this patch, which really means that it’s all untested (beyond running a full-blown testnet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are tests in TVU which should exercise this change, as BlobFetchStage is part of TVU.
I am not sure why this test is ignored. Maybe a separate PR to test/unignore it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Seems a little bizarre architecturally. Check out: https://github.com/solana-labs/solana/blob/master/book/art/fullnode.bob Does this PR remove that arrow that's going from TPU to Bank and add one from BroadcastService to TVU? If so, that seems less desirable in that it'd mean sigverify and process_transactions has to be run twice for every transaction. |
src/fullnode.rs
Outdated
@@ -314,6 +317,7 @@ impl Fullnode { | |||
tpu_sockets: node.sockets.tpu, | |||
broadcast_socket: node.sockets.broadcast, | |||
role_notifiers: (to_leader_receiver, to_validator_receiver), | |||
blob_sender: Some(blob_sender), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this would be better owned by the Tvu struct? That way the exit() function for tvu can take care of dropping this instead of fullnode having to drop it on exit. That will also fix the failing validator_parallel_exit() test :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, but doesn't that require tvu::exit() to take a mutable reference to self? Same problem here as well, if we want to drop it in fullnode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes t does, so basically it would just move this chunk of logic surrounding the Option into the tvu's exit function. The benefit of this is that the current fullnode's exit() will encapsulate the entire shutdown process without having special case logic here in the fullnode that's specific to the TVU. Right now that test is failing because some of the code in the codebase assumes that calling exit() on fullnode exits the tpu/tvu, when in fact with this change, only close() does that.
@garious , It doesn't remove the arrow from TPU to bank. The bank will run the transactions twice for now. I agree that it's not desirable. But this will get streamlined via bank forking PR. |
@pgarg66, the intent of @sagar-solana's PR was that the TPU would LBYL (Look Before You Leap), and not write anything to its Bank fork that the TVU can't use directly. I thought we had worked out how that should be possible and that the PR implemented it. Is it not possible until the forking PR is in? Is all this churn a short-term band-aid that's going to slow us down when the forking PR is merged? |
By "all this churn", I just mean the architectural rewiring and its implications. The code changes are quite tiny. ;) |
@garious my understanding is that this work was supposed to be part of Sagar's PR. Maybe we can wait for bank forking PR to be merged before fixing (if at all required) this. However, the perf testnet on master will be broken till then. |
perf testnet on master being broken in the short-term is no problem. Whatever gets the forking bank PR stable and integrated needs to be the top priority. Any work outside that can assume the forking bank functionality exists. |
@garious this is the correct band-aid until we have a forking bank |
- This is needed since ledger is being written only in TVU now
Codecov Report
@@ Coverage Diff @@
## master #2587 +/- ##
========================================
- Coverage 78.1% 77.8% -0.3%
========================================
Files 113 113
Lines 18251 18274 +23
========================================
- Hits 14259 14231 -28
- Misses 3992 4043 +51 |
…abs#2587) * Connect TPU's broadcast service with TVU's blob fetch stage - This is needed since ledger is being written only in TVU now * fix clippy warnings * fix failing test * fix broken tests * fixed failing tests
* genesis: Make "cluster-type" aware of features #### Problem The `--cluster-type` parameter in solana-genesis mentions cluster features, but refers to things like epoch length and hashes per tick. This behavior is confusing because it doesn't include the feature set. #### Summary of changes For "mainnet-beta", "testnet", and "devnet", clone the cluster feature set, and deactivate the appropriate features. * Allow deactivating individual features with --cluster-type
Problem
The testnet is stopping within a few minutes of start.
#2468 change removed ledger writing from broadcast service. The ledger is only written in TVU now. However, the channel between broadcast service and TVU was not setup.
Summary of Changes
Expose the TVU channel (blob fetch receiver) to TPU's broadcast service.
Fixes #2586