Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Finality notification: Optimize calculation of stale heads #11200

Merged
merged 8 commits into from
Apr 12, 2022

Conversation

bkchr
Copy link
Member

@bkchr bkchr commented Apr 11, 2022

While looking into some problem on Versi where a collator seemed to be stuck. I found out that it
was not stuck but there was a huge gap between last finalized and best block. This lead to a lot
leaves and it was basically trapped inside some loop of reading block headers from the db to find
the stale heads. While looking into this I found out that leaves already supports the feature to
give us the stale heads relative easily. However, the semantics change a little bit. Instead of
returning all stale heads of blocks that are not reachable anymore after finalizing a block, we
currently only return heads with a number lower than the finalized block. This should be no problem,
because these other leaves that are stale will be returned later when a block gets finalized which
number is bigger than the block number of these leaves.

While doing that, I also changed tree_route of the FinalityNotification to include the
old_finalized. Based on the comment I assumed that this was already part of it. However, if
wanted, I can revert this change.

While looking into some problem on Versi where a collator seemed to be stuck. I found out that it
was not stuck but there was a huge gap between last finalized and best block. This lead to a lot
leaves and it was basically trapped inside some loop of reading block headers from the db to find
the stale heads. While looking into this I found out that `leaves` already supports the feature to
give us the stale heads relative easily. However, the semantics change a little bit. Instead of
returning all stale heads of blocks that are not reachable anymore after finalizing a block, we
currently only return heads with a number lower than the finalized block. This should be no problem,
because these other leaves that are stale will be returned later when a block gets finalized which
number is bigger than the block number of these leaves.

While doing that, I also changed `tree_route` of the `FinalityNotification` to include the
`old_finalized`. Based on the comment I assumed that this was already part of it. However, if
wanted, I can revert this change.
@bkchr bkchr added A0-please_review Pull request needs code review. B5-clientnoteworthy C1-low PR touches the given topic and has a low impact on builders. labels Apr 11, 2022
/// This means that no changes are done.
///
/// Returns the leaves that would be displaced.
pub fn simulate_finalize_height(&self, number: N) -> FinalizationDisplaced<H, N> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to accept any kind of better name :D

Copy link
Member

@davxy davxy Apr 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finalize_height_dry_run ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

displaced_by_finalize_height?

Comment on lines 309 to 311
/// Path from the old finalized to new finalized parent (implicitly finalized blocks).
///
/// This maps to the range `[old_finalized, new_finalized]`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I was also under the impression that this would already include the last finalized block.

/// This means that no changes are done.
///
/// Returns the leaves that would be displaced.
pub fn simulate_finalize_height(&self, number: N) -> FinalizationDisplaced<H, N> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

displaced_by_finalize_height?

client/service/src/client/client.rs Outdated Show resolved Hide resolved
Comment on lines -913 to -922
// It is not guaranteed that `backend.blockchain().leaves()` doesn't return
// heads that were in a stale state before this finalization and thus already
// included in previous notifications. We want to skip such heads.
// Given the "route" from the currently finalized block to the head under
// analysis, the condition for it to be added to the new stale heads list is:
// `!retracted.is_empty() && last_finalized_number <= pivot.number`
// 1. "route" has some "retractions".
// 2. previously finalized block number is not greater than the "route" pivot:
// - if `last_finalized_number <= pivot.number` then this is a new stale head;
// - else the stale head was already included by some previous finalization.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way we also don't need to deal with this as we are guaranteed that stuff only gets displaced once. Sorry for making you waste brain cycles @davxy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries. Has been very instructive anyway :-)

@@ -307,6 +307,8 @@ pub struct FinalityNotification<Block: BlockT> {
/// Finalized block header.
pub header: Block::Header,
/// Path from the old finalized to new finalized parent (implicitly finalized blocks).
///
/// This maps to the range `[old_finalized, new_finalized]`.
Copy link
Member

@davxy davxy Apr 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// This maps to the range `[old_finalized, new_finalized]`.
/// This maps to the range `(old_finalized, new_finalized)`.

The new_finalized hash is not included in the FinalityNotification::tree_route. Is extracted here when mapping the FinalizeSummary to FinalityNotification.

As has been discussed at the time, the last block is included in the FinalityNotification as a separate explicit field to be more "inline" with the design of the import notification.

The tree_route field should contain from old-finalized (excluded) to new finalized (excluded).

(the prev explicitly finalized block is not included here to prevent duplication, see request for change comment below)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It already contained the new_finalized, so this suggestion is not really correct.

Comment on lines 899 to 901
let finalized = std::iter::once(last_finalized)
.chain(route_from_finalized.enacted().iter().map(|elem| elem.hash))
.collect::<Vec<_>>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let finalized = std::iter::once(last_finalized)
.chain(route_from_finalized.enacted().iter().map(|elem| elem.hash))
.collect::<Vec<_>>();
let finalized =
route_from_finalized.enacted().iter().map(|elem| elem.hash).collect::<Vec<_>>();

Copy link
Member

@davxy davxy Apr 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't included the last_fnalized (i.e. the previous last finalized block) to prevent a double notification for that block:

  • 1st notification when it is the last finalized block (explicitly finalized one)
  • 2nd notification when it is the fist implicitly finalized block

Indeed in the tests now a2 is repeated twice:

finality_notification_check(&mut finality_notifications, &[genesis, a1.hash(), a2.hash()], &[]);
finality_notification_check(&mut finality_notifications, &[a2.hash(), a3.hash()], &[b2.hash()]);

Genesis is the only block for which a notification was never sent since it is never explicitly finalized via a block import (implicitly finalized at startup)

In this way if blocks are (as it is usually the "normal" case) finalized one after the other then for each finalized block you'll always receive two finalizations (currently finalized one + prev finalized one)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You convinced me :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davxy done :)

@davxy
Copy link
Member

davxy commented Apr 11, 2022

Excluded the requested modification about removing the previous last finalized block from the notification, LGTM and is indeed faster :-)

assert!(finality_notifications.try_next().is_err());
}

#[test]
fn finality_notifications_content() {
sp_tracing::try_init_simple();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is a leftover

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort of, but also can stay :)

@bkchr bkchr added the D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit label Apr 11, 2022
@bkchr bkchr requested a review from davxy April 11, 2022 18:38
Copy link
Member

@davxy davxy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@davxy
Copy link
Member

davxy commented Apr 12, 2022

Nevermind, you've already fixed it :-)

@bkchr
Copy link
Member Author

bkchr commented Apr 12, 2022

bot merge

@paritytech-processbot
Copy link

Waiting for commit status.

@paritytech-processbot
Copy link

Merge cancelled due to error. Error: Checks failed for 7ed136b

@bkchr bkchr merged commit ebef053 into master Apr 12, 2022
@bkchr bkchr deleted the bkchr-optimize-finality-notification branch April 12, 2022 11:12
@xlc
Copy link
Contributor

xlc commented Apr 12, 2022

Does this help paritytech/cumulus#432?

@bkchr
Copy link
Member Author

bkchr commented Apr 12, 2022

No

@bkchr
Copy link
Member Author

bkchr commented Apr 12, 2022

That is something completely different.

DaviRain-Su pushed a commit to octopus-network/substrate that referenced this pull request Aug 23, 2022
…h#11200)

* Finality notification: Optimize calculation of stale heads

While looking into some problem on Versi where a collator seemed to be stuck. I found out that it
was not stuck but there was a huge gap between last finalized and best block. This lead to a lot
leaves and it was basically trapped inside some loop of reading block headers from the db to find
the stale heads. While looking into this I found out that `leaves` already supports the feature to
give us the stale heads relative easily. However, the semantics change a little bit. Instead of
returning all stale heads of blocks that are not reachable anymore after finalizing a block, we
currently only return heads with a number lower than the finalized block. This should be no problem,
because these other leaves that are stale will be returned later when a block gets finalized which
number is bigger than the block number of these leaves.

While doing that, I also changed `tree_route` of the `FinalityNotification` to include the
`old_finalized`. Based on the comment I assumed that this was already part of it. However, if
wanted, I can revert this change.

* FMT

* Update client/service/src/client/client.rs

Co-authored-by: André Silva <[email protected]>

* Do not include the last finalized block

* Rename function

* FMT

* Fix tests

* Update figure

Co-authored-by: André Silva <[email protected]>
ark0f pushed a commit to gear-tech/substrate that referenced this pull request Feb 27, 2023
…h#11200)

* Finality notification: Optimize calculation of stale heads

While looking into some problem on Versi where a collator seemed to be stuck. I found out that it
was not stuck but there was a huge gap between last finalized and best block. This lead to a lot
leaves and it was basically trapped inside some loop of reading block headers from the db to find
the stale heads. While looking into this I found out that `leaves` already supports the feature to
give us the stale heads relative easily. However, the semantics change a little bit. Instead of
returning all stale heads of blocks that are not reachable anymore after finalizing a block, we
currently only return heads with a number lower than the finalized block. This should be no problem,
because these other leaves that are stale will be returned later when a block gets finalized which
number is bigger than the block number of these leaves.

While doing that, I also changed `tree_route` of the `FinalityNotification` to include the
`old_finalized`. Based on the comment I assumed that this was already part of it. However, if
wanted, I can revert this change.

* FMT

* Update client/service/src/client/client.rs

Co-authored-by: André Silva <[email protected]>

* Do not include the last finalized block

* Rename function

* FMT

* Fix tests

* Update figure

Co-authored-by: André Silva <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants