Skip to content
This repository has been archived by the owner on Jan 11, 2024. It is now read-only.

FM-259: Sign all incomplete checkpoints #292

Merged
merged 1 commit into from
Oct 3, 2023
Merged

Conversation

aakoshh
Copy link
Contributor

@aakoshh aakoshh commented Sep 29, 2023

Closes consensus-shipyard/ipc#298

The PR changes the end-of-period checkpoint signature broadcasting by validators to apply on all incomplete (a.k.a. pending) checkpoints where they were validators.

So, instead of doing this on node start, it's done on each checkpoint period end. This has the following benefits:

  • main.rs stays simpler and the checkpointing logic is kept in the interpreter
  • a validator doesn't have to restart their node to fill in their signatures if e.g. they ran out of funds, they just need to get some tokens and it will automatically be retried

@aakoshh aakoshh marked this pull request as draft October 2, 2023 13:02
@aakoshh aakoshh marked this pull request as ready for review October 2, 2023 13:02
.find(|v| v.public_key.0 == validator_ctx.public_key)
.cloned()
{
// TODO: Code generation in the ipc-solidity-actors repo should cater for this.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cryptoAtwill, can you double-check that this is the case once you migrate to the ipc-solidity-actors bindings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember seeing that this has been implemented already, we just talked about the fact that the bindings generate as many versions of SubnetID and whatnot as there are facets. So it's just a note for the future that if this gets out of hand, we should try to solve it there.

let height = checkpoint.block_height;
let validator_ctx = ctx.clone();

tokio::spawn(async move {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it possibel to do the broadcast of incomplete signatures in parallel here without resulting in nonce races? (I also ask because in the comment above you mention that broadcasts can't be done in parallel).

Copy link
Contributor Author

@aakoshh aakoshh Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, it looks like my comment wasn't as clear as it could have been.

What I meant was that we should not try to kick off a separate background task for each pending checkpoint like this:

for cp in pending_checkpoints {
  tokio::spawn(async move {
    checkpoint::broadcast_signature(broadcaster, cp).await;
  });
}

That's because every time the broadcaster is asked to send a transaction it will query the state for the nonce, and in this case that will surely mean all of the checkpoint submissions will get the same nonce and only one will go through.

Instead, what we have is effectively this:

tokio::spawn(async move {
  for cp in pending_checkpoints {
    checkpoint::broadcast_signature(broadcaster, cp).await;
  }
});

So it's not doing the submissions in parallel, they are done one after the other, but in the background.

Base automatically changed from fm-255-no-broadcast-if-syncing to main October 3, 2023 09:07
@aakoshh aakoshh force-pushed the fm-259-sign-incomplete-ckpt branch from ef135e5 to 808e14c Compare October 3, 2023 09:17
@aakoshh aakoshh merged commit b2ac4ff into main Oct 3, 2023
1 check passed
@aakoshh aakoshh deleted the fm-259-sign-incomplete-ckpt branch October 3, 2023 09:17
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upon restart list incomplete checkpoints and re-broadcast signatures if necessary
3 participants