Fix "synced block height too far ahead of the tip: dropped downloaded block" #3603

Closed
teor2345 opened this issue Feb 22, 2022 · 2 comments
Labels
C-bug Category: This is a bug C-enhancement Category: This is an improvement I-heavy Problems with excessive memory, disk, or CPU usage I-integration-fail Continuous integration fails, including build and test failures I-slow Problems with performance or responsiveness

teor2345 commented Feb 22, 2022

Motivation

Sometimes Zebra downloads a block that is a long way ahead of the state tip. Currently, we're dropping those blocks, which can waste a lot of network bandwidth.

There are two cases we need to handle here:

  1. the block is near the estimated tip, but the state tip is a long way from the estimated tip
  2. the block is close to the state tip

Here is how we currently handle them:

  1. drop the block, because verification would time out anyway, and download it again in an hour or two
  2. drop the block, and download it again a few minutes later

For the second case, we could pause extending tips instead.
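
The two cases above could be told apart by how far the block is ahead of the state tip. The following is only a minimal sketch of that idea, not Zebra's actual code: `AheadAction`, `classify_block`, `pipeline_limit`, and `drop_limit` are hypothetical names, and the thresholds are illustrative values that the issue does not specify.

```rust
/// Hypothetical outcome for a downloaded block, based on how far it is
/// ahead of the state tip. These names are illustrative, not Zebra's API.
#[derive(Debug, PartialEq, Eq)]
pub enum AheadAction {
    /// The block is within the pipeline limit: queue it for verification.
    Verify,
    /// The block is somewhat over the limit (case 2): keep it, and pause
    /// extending tips until the state tip catches up.
    PauseExtendingTips,
    /// The block is a very long way ahead (case 1): drop it with a
    /// "too far ahead" error, and download it again later.
    DropTooFarAhead,
}

/// Classifies a block by its distance from the state tip.
///
/// `pipeline_limit` and `drop_limit` are assumed thresholds, measured in
/// blocks, with `pipeline_limit <= drop_limit`.
pub fn classify_block(
    block_height: u32,
    state_tip_height: u32,
    pipeline_limit: u32,
    drop_limit: u32,
) -> AheadAction {
    // Blocks at or below the tip have a distance of zero.
    let distance = block_height.saturating_sub(state_tip_height);

    if distance <= pipeline_limit {
        AheadAction::Verify
    } else if distance <= drop_limit {
        AheadAction::PauseExtendingTips
    } else {
        AheadAction::DropTooFarAhead
    }
}
```

With these assumed thresholds, only blocks beyond `drop_limit` would be downloaded twice. Blocks between the two limits would cost syncer latency rather than network bandwidth.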

Designs

Syncer:

  • add a state pipeline limit to the syncer, and wait for it after waiting for the download & verify lookahead limit
    • await the current state tip height changing in a loop, until we are under the limit
  • only return a "too far ahead" error when the block is a very long way ahead of the tip
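
The "wait until we are under the limit" loop in the syncer design could look roughly like the sketch below. It is a blocking stand-in built on the standard library's `Mutex` and `Condvar`. The real syncer is asynchronous and would presumably await a tip change notification instead. `TipHeight`, `set`, and `wait_until_under_limit` are hypothetical names that are not taken from Zebra.

```rust
use std::sync::{Condvar, Mutex};

/// Hypothetical shared view of the current state tip height.
pub struct TipHeight {
    height: Mutex<u32>,
    changed: Condvar,
}

impl TipHeight {
    pub fn new(height: u32) -> Self {
        TipHeight {
            height: Mutex::new(height),
            changed: Condvar::new(),
        }
    }

    /// Called by the (assumed) state service whenever the tip changes.
    pub fn set(&self, new_height: u32) {
        *self.height.lock().unwrap() = new_height;
        // Wake every waiter so it can re-check its own limit.
        self.changed.notify_all();
    }

    /// Blocks until the highest queued block is at most `limit` blocks
    /// ahead of the state tip, then returns the tip height it observed.
    pub fn wait_until_under_limit(&self, highest_queued: u32, limit: u32) -> u32 {
        let mut tip = self.height.lock().unwrap();

        // Loop on the condition: a wakeup only means the tip may have
        // changed, not that we are under the limit yet.
        while highest_queued.saturating_sub(*tip) > limit {
            tip = self.changed.wait(tip).unwrap();
        }

        *tip
    }
}
```

In this sketch the syncer would call `wait_until_under_limit` after the existing download and verify lookahead wait, and before extending tips any further.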

If we also get these errors from the inbound downloads, we could increase its limit, or just turn down the log level. (We can't pause the inbound downloads, because they are gossiped.)

Related Work

@teor2345 teor2345 added C-bug Category: This is a bug C-enhancement Category: This is an improvement S-needs-triage Status: A bug report needs triage P-Medium ⚡ I-heavy Problems with excessive memory, disk, or CPU usage I-slow Problems with performance or responsiveness I-integration-fail Continuous integration fails, including build and test failures labels Feb 22, 2022
ftm1000 commented Feb 24, 2022

@teor2345

This isn't causing us any issues right now.

@mpguerra mpguerra removed the S-needs-triage Status: A bug report needs triage label Sep 28, 2022