Very large block validation times during / after chocolate update #7565

Closed
travisperson opened this issue Oct 26, 2021 · 8 comments
Labels: area/chain, kind/enhancement, P1 (must be resolved)

Comments

@travisperson
Contributor

We are running v1.13.0 on a few nodes with the experimental split-store enabled. Immediately after the chocolate update, several of these nodes experienced desync issues, with block validation times of 200 seconds or more (one node had no issues at all).

Profiles: https://bafybeiesexrmvtw7kftq4dul5hgbet7izan43imyhttphamwgg54bn3sd4.ipfs.dweb.link/

@travisperson
Contributor Author

The nodes seem to have recovered on their own at first, but then started to desync again later. I restarted all the affected nodes and they quickly started syncing again. I will monitor these nodes and report if the issue recurs.

I would still like to review the above profiles with someone to understand what was actually happening during these periods.

@jennijuju added the P1 (must be resolved) label on Oct 26, 2021
@vyzo
Contributor

vyzo commented Oct 27, 2021

So it seems that there were problems with the migration.
My theory is that the premigration output got garbage collected as unreachable, which left us running the migration online, or something along those lines.

One possible solution for this problem is to register a protector for migrated state so that it doesn't get garbage collected.
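
Editorial sketch of the protector idea, for illustration only: a toy mark-and-sweep store in which registered callbacks contribute extra roots before the sweep. The names `Store`, `AddProtector`, and `GC`, and the string keys, are hypothetical and are not taken from the Lotus splitstore code.

```go
package main

import "fmt"

// Store is a toy content-addressed store (hypothetical, not the Lotus API).
// Each key maps to the keys it links to.
type Store struct {
	blocks     map[string][]string
	protectors []func(mark func(key string))
}

func NewStore() *Store {
	return &Store{blocks: make(map[string][]string)}
}

// Put stores a block together with its outgoing links.
func (s *Store) Put(key string, links ...string) { s.blocks[key] = links }

// Has reports whether a block is still present.
func (s *Store) Has(key string) bool {
	_, ok := s.blocks[key]
	return ok
}

// AddProtector registers a callback that marks extra roots as live
// during every garbage collection.
func (s *Store) AddProtector(p func(mark func(key string))) {
	s.protectors = append(s.protectors, p)
}

// GC keeps everything reachable from the given roots or from any
// protector-supplied root, deletes the rest, and returns the number
// of blocks removed.
func (s *Store) GC(roots ...string) int {
	live := make(map[string]bool)
	var mark func(key string)
	mark = func(key string) {
		if live[key] {
			return
		}
		links, ok := s.blocks[key]
		if !ok {
			return
		}
		live[key] = true
		for _, l := range links {
			mark(l)
		}
	}
	for _, r := range roots {
		mark(r)
	}
	for _, p := range s.protectors {
		p(mark)
	}
	removed := 0
	for key := range s.blocks {
		if !live[key] {
			delete(s.blocks, key)
			removed++
		}
	}
	return removed
}

func main() {
	s := NewStore()
	s.Put("head", "old-state")
	s.Put("old-state")
	// Premigrated state is not yet reachable from the chain head.
	s.Put("premigrated-state", "new-actors")
	s.Put("new-actors")

	s.AddProtector(func(mark func(string)) { mark("premigrated-state") })

	removed := s.GC("head")
	fmt.Println("removed:", removed, "premigrated kept:", s.Has("premigrated-state"))
}
```

Without the protector, the premigrated blocks would be swept because nothing reachable from `head` links to them yet, which matches the failure mode proposed in the theory above.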

@BigLep
Member

BigLep commented Oct 28, 2021

I'm assuming that as part of addressing this, we'll include a regression test as well. Let me know if that isn't the case.

@vyzo
Contributor

vyzo commented Oct 28, 2021

The simplest solution is to inhibit compaction around upgrade/migration; this will sidestep the issue in a non-intrusive way.
We will also want sortless compaction to make sure we can compact large state trees safely.
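
Editorial sketch of the first suggestion above (inhibiting compaction around an upgrade), for illustration only. The type `upgradeRange`, the functions `upgradeWindows`, `nearUpgrade`, and `shouldCompact`, and all epoch values are hypothetical; this is not the code from the actual fix.

```go
package main

import "fmt"

// upgradeRange is a hypothetical window of epochs around a network
// upgrade during which compaction is suppressed.
type upgradeRange struct{ start, end int64 }

// upgradeWindows builds one window per upgrade epoch, extending
// margin epochs before and after it.
func upgradeWindows(upgradeEpochs []int64, margin int64) []upgradeRange {
	ranges := make([]upgradeRange, 0, len(upgradeEpochs))
	for _, e := range upgradeEpochs {
		ranges = append(ranges, upgradeRange{start: e - margin, end: e + margin})
	}
	return ranges
}

// nearUpgrade reports whether epoch falls inside any window (inclusive).
func nearUpgrade(epoch int64, windows []upgradeRange) bool {
	for _, w := range windows {
		if epoch >= w.start && epoch <= w.end {
			return true
		}
	}
	return false
}

// shouldCompact allows compaction only when enough epochs have passed
// since the last one and the node is not close to an upgrade.
func shouldCompact(epoch, lastCompaction, interval int64, windows []upgradeRange) bool {
	if nearUpgrade(epoch, windows) {
		return false
	}
	return epoch-lastCompaction >= interval
}

func main() {
	// Illustrative numbers only: one upgrade at epoch 10000, 900-epoch margin.
	windows := upgradeWindows([]int64{10000}, 900)
	for _, epoch := range []int64{9000, 9500, 10000, 10900, 10901} {
		fmt.Printf("epoch %d: compact=%v\n", epoch, shouldCompact(epoch, 0, 1000, windows))
	}
}
```

The check is deliberately conservative: skipping a compaction only delays space reclamation, whereas compacting at the wrong moment risks discarding premigrated state.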

@vyzo
Contributor

vyzo commented Dec 3, 2021

Fix in #7734.

@TippyFlitsUK
Contributor

Hi @vyzo

This looks like a reminder. Should we close it or keep it open?

@TippyFlitsUK added the kind/enhancement, area/chain, and need/author-input labels on Mar 24, 2022
@vyzo
Contributor

vyzo commented Mar 24, 2022

afaict it should be fixed.

@TippyFlitsUK
Contributor

Many thanks!! Fix in #7734.

@TippyFlitsUK removed the need/author-input label on Mar 30, 2022