-
Notifications
You must be signed in to change notification settings - Fork 764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Syncing strategy refactoring (part 3) #5737
Syncing strategy refactoring (part 3) #5737
Conversation
…n_warp_proof_response()` with `SyncingAction::SendGenericRequest` and `SyncingStrategy::on_generic_response()`
@dmitry-markin @lexnv this is probably the last one in a series, I'm reasonably happy with the achieved result and will try to rebase our downstream changes on top of these new APIs without modifying Substrate itself. |
@ggwpez you commented on CI stuff before, so maybe you can help. Apparently Rust version used for formatting has changed, but since it just says Request: can preflight workflow print the versions maybe for reference? It prints some stuff, but nothing actually useful IMO. UPD: I was applying formatting in a different subdirectory, but the request is still relevant. |
938b988
to
6012b67
Compare
…gAction::StartRequest`
…ock_response()` with `SyncingAction::StartRequest` and `SyncingStrategy::on_generic_response()`
… we already effectively more common `Arc<dyn Link<B>>` in many places
…uild_default_syncing_engine` to make construction of the default syncing engine a bit easier
6012b67
to
53b34d0
Compare
…_network` API before paritytech#5666
…sc_service::Configuration`
…refactoring-part-3 # Conflicts: # substrate/client/network/sync/src/engine.rs
Addressed Bastian's comment from #5666 and resolved conflicts with |
Yea am very much in favour of printing all versions at the beginning of CI jobs, (Rust, rustup, cargo, clippy, fmt, nextest) to easily debug stuff. |
bot fmt |
@ggwpez https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7399653 was started for your command Comment |
Friendly ping here, would be nice to merge this before merge conflicts happen. It shouldn't be as scary as the diff might suggest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good and indeed finally allow custom syncing strategies use with substrate. This is what was originally planned when separate syncing strategies were introduced. Thank you for implementing this!
} | ||
if remove_obsolete { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove pending responses first, then check if our peers
contain the peer_id
?
I think we were removing the pending_responses
before send_block_request
checked if this is a known peer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above branch is never ever supposed to happen (see debug_assert!(false)
), so I don't think it makes any difference either way whether we remove pending responses before crashing or not.
Also removed redundant comment below in latest push.
self.protocol_name.clone(), | ||
request.encode_to_vec(), | ||
tx, | ||
IfDisconnected::ImmediateError, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dq: Should we try to establish connection instead of immediate errors?
Would you think it is beneficial for strategies to provide this option in their actions
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same as it was before, so I didn't change the behavior here and so far there was no need to do this.
Maybe the assumption is that we only send requests to connected peers, so if someone disconnected we just error instead of attempting to dial them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for contributing!
…refactoring-part-3 # Conflicts: # substrate/client/network/sync/src/strategy/chain_sync.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also resolved a minor merge conflict in imports after #5686
self.protocol_name.clone(), | ||
request.encode_to_vec(), | ||
tx, | ||
IfDisconnected::ImmediateError, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same as it was before, so I didn't change the behavior here and so far there was no need to do this.
Maybe the assumption is that we only send requests to connected peers, so if someone disconnected we just error instead of attempting to dial them?
} | ||
if remove_obsolete { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above branch is never ever supposed to happen (see debug_assert!(false)
), so I don't think it makes any difference either way whether we remove pending responses before crashing or not.
Also removed redundant comment below in latest push.
…refactoring-part-3
Merged |
I will run some final tests, and, if everything fine, we can merge the PR. |
We're running a fork with these changes for over a month now without issues |
@dmitry-markin How does this PR look like, is it fine to be merged? |
I am testing it on Versi to make sure everything is good. Code-wise it is fine. In any case, let's stabilize it in master first skipping stable2412. @nazar-pc you are using a fork anyway and not need it in the polkadot-sdk release, aren't you? |
I was really-really trying to upstream out patches wherever possible, it is kind of a pain to rebase everything on each release. I was hoping that opening PR in September will make it into December build, I'd be disappointed if it didn't. There isn't really anything crazy in this PR, mostly just moving things around, I'd say the risk of regressions is low and will be evident immediately. We just launched Mainnet yesterday (already 1239 consensus nodes) with this included after more than a month of testnets, there was zero complaints so far about anything that would be potentially related to this for what it is worth. |
Then I got it wrong, sorry @nazar-pc. Let's see what we can do regrading the December release. |
12d9052
Thanks everyone! |
Description
This is a continuation of #5666 that finally fixes #5333.
This should allow developers to create custom syncing strategies or even the whole syncing engine if they so desire. It also moved syncing engine creation and addition of corresponding protocol outside
build_network_advanced
method, which is something Bastian expressed as desired in #5 (comment)Here I replaced strategy-specific types and methods in
SyncingStrategy
trait with generic ones. SpecificallySyncingAction
is now used by all strategies instead of strategy-specific types with conversions.StrategyKey
was an enum with a fixed set of options and now replaced with an opaque type that strategies create privately and send to upper layers as an opaque type. Requests and responses are now handled in a generic way regardless of the strategy, which reduced and simplified strategy API.PolkadotSyncingStrategy
now lives in its dedicated module (had to edit .gitignore for this) like other strategies.build_network_advanced
takes genericSyncingService
as an argument alongside with a few other low-level types (that can probably be extracted in the future as well) without any notion of specifics of the way syncing is actually done. All the protocol and tasks are created outside and not a part of the network anymore. It still adds a bunch of protocols like for light client and some others that should eventually be restructured makingbuild_network_advanced
just building generic network and not application-specific protocols handling.Integration
Just like #5666 introduced
build_polkadot_syncing_strategy
, this PR introducesbuild_default_block_downloader
, but for convenience and to avoid typical boilerplate a simpler high-level functionbuild_default_syncing_engine
is added that will take care of creating typical block downloader, syncing strategy and syncing engine, which is what most users will be using going forward.build_network
towards the end of the PR was renamed tobuild_network_advanced
andbuild_network
's API was reverted to pre-#5666, so most users will not see much of a difference during upgrade unless they opt-in to use new API.Review Notes
For
StrategyKey
I was thinking about using something like private type and then storingTypeId
inside instead of a static string in it, let me know if that would preferred.The biggest change happened to requests that different strategies make and how their responses are handled. The most annoying thing here is that block response decoding, in contrast to all other responses, is dependent on request. This meant request had to be sent throughout the system. While originally
Response
wasVec<u8>
, I didn't want to re-encode/decode request and response just to fit into that API, so I ended up withBox<dyn Any + Send>
. This allows responses to be truly generic and each strategy will know how to downcast it back to the concrete type when handling the response.Import queue refactoring was needed to move
SyncingEngine
construction out ofbuild_network
that awkwardly implemented forSyncingService
, but due to&mut self
wasn't usable onArc<SyncingService>
for no good reason.Arc<SyncingService>
itself is of course useless, but refactoring to replace it with justSyncingService
was unfortunately rejected in #5454As usual I recommend to review this PR as a series of commits instead of as the final diff, it'll make more sense that way.
Checklist
T
required)