-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Don't send ActiveLeaves
from leaves in db on startup in Overseer
#6727
Conversation
…ait for fresh leaves instead.
And if this gets merged we should revert #6691 |
node/overseer/src/lib.rs
Outdated
ActiveLeavesUpdate::start_work(ActivatedLeaf { hash, number, status, span }); | ||
self.broadcast_signal(OverseerSignal::ActiveLeaves(update)).await?; | ||
} | ||
self.on_head_activated(&hash, None).await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this call actually do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and why do we care about stale leaves here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on_head_activated
creates a span for the leaf and notifies external listeners (in case someone is subscribed for it). I think it's better not to remove it.
The entries it creates are cleaned up on finalization.
and why do we care about stale leaves here?
My theory: the initial idea was to start the subsystems as fast as possible. And there is a contract, that first leaf is special and it should be used just for a initialization. For example the dispute-coordinator waits for the first leaf before starting its main loop.
But at some point the subsystems started doing actual work when the first leaf is received and we started seeing the state discarded issues.
I misunderstood your question. After discussing it online - we don't care. I'll remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
I want to test how this performs on Versi and then I'll merge |
Not sure about that. The issue #6452 was addressing is related to node falling out of sync after startup. So it will go into major sync from time to time due to some slowness. It could be due to network issues or CPU/IO or some load problems in dispute-coordinator or other subsystems. Reverting #6691 would reintroduce #6656 I'm afraid. This PR doesn't seem to address finality notifications being sent during major sync (probably by design), so I believe #6691 and #6452 are still relevant. |
@tdimitrov the recent tests on versi - have they been based on this one? And if so - can we merge? |
The recent tests didn't include this PR. |
The PR is working fine on versi for 3+ days. I don't think it solves paritytech/polkadot-sdk#793 completely but definitely makes it better. I'm merging once CI is green, |
bot merge |
Waiting for commit status. |
Merge cancelled due to error. Error: Statuses failed for 4c52942 |
This is a companion for paritytech/polkadot#6727
bot merge |
Waiting for commit status. |
* Don't pass `leaves()` to `Overseer::builder()` This is a companion for paritytech/polkadot#6727 * update lockfile for {"substrate", "polkadot"} --------- Co-authored-by: parity-processbot <>
* master: (27 commits) bump `zombienet` version to v1.3.37 (#6773) Bump `blake2b_simd` to 1.0.1 (#6829) changelog: update template for new label behavior (E3/E4) (#6804) Companion for paritytech/substrate#12828 (#6380) Don't send `ActiveLeaves` from leaves in db on startup in Overseer (#6727) Polkadot XCM Body constants (#6788) Decrease expected peer count in zombinenet tests (#6826) Additional tracing in `provisioner`, `vote_selection` and `dispute-coordinator` (#6775) Change node-key for bootnodes (#6772) Change handle_import_statements to FatalResult (#6820) Introduce XCM matcher for writing barriers (#6756) Freeze note on `SessionInfo`. (#6818) Bump parity-db (#6816) Removing Outdated References to Misbehavior Arbitration Subsystem (#6814) Forgotten re-export for `MatchedConvertedConcreteId` (#6815) Companion for substrate#13509: bump API versions of {Beefy,Mmr}Api (#6809) Migrate to `Weight::from_parts` (#6794) [XCM] Multiple `FungiblesAdapter`s support + `WeightTrader::buy_weight` more accurate error (#6739) Get rid of unnecessary cloning and work. (#6808) changelog: fix migration listing (#6806) ...
* master: (27 commits) bump `zombienet` version to v1.3.37 (#6773) Bump `blake2b_simd` to 1.0.1 (#6829) changelog: update template for new label behavior (E3/E4) (#6804) Companion for paritytech/substrate#12828 (#6380) Don't send `ActiveLeaves` from leaves in db on startup in Overseer (#6727) Polkadot XCM Body constants (#6788) Decrease expected peer count in zombinenet tests (#6826) Additional tracing in `provisioner`, `vote_selection` and `dispute-coordinator` (#6775) Change node-key for bootnodes (#6772) Change handle_import_statements to FatalResult (#6820) Introduce XCM matcher for writing barriers (#6756) Freeze note on `SessionInfo`. (#6818) Bump parity-db (#6816) Removing Outdated References to Misbehavior Arbitration Subsystem (#6814) Forgotten re-export for `MatchedConvertedConcreteId` (#6815) Companion for substrate#13509: bump API versions of {Beefy,Mmr}Api (#6809) Migrate to `Weight::from_parts` (#6794) [XCM] Multiple `FungiblesAdapter`s support + `WeightTrader::buy_weight` more accurate error (#6739) Get rid of unnecessary cloning and work. (#6808) changelog: fix migration listing (#6806) ...
* Don't pass `leaves()` to `Overseer::builder()` This is a companion for paritytech/polkadot#6727 * update lockfile for {"substrate", "polkadot"} --------- Co-authored-by: parity-processbot <>
This PR modifies Overseer not to send
ActiveLeaves
on startup with the leaves passed to it on construction. These leaves are obtained from the database and they usually are stale. Sending them to subsystems for processing leads to failing runtime calls (because the leaves are already pruned).The new behaviour is only to import the leaves but not to notify the subsystems about it. First
ActiveLeaves
will be sent when fresh a fresh leaf is imported.Partially addresses paritytech/polkadot-sdk#793. Fixes #6694.
Also check #6689 for more context.
cumulus companion: paritytech/cumulus#2275