Support Warp sync from non genesis #35
Comments
The challenge here is that we'd need to delete the existing state from the database somehow.
Isn't this just the same as pruning?
Each block execution produces a set of trie nodes that must be inserted into or deleted from the state db to reflect insertions/deletions in the trie. Pruning simply delays deletions until some later point in time by keeping temporary journals of what has been deleted. Warping to a new state means there's no delta update to the trie; instead the old trie must be deleted and the new one populated from the warp snapshot. Deleting the whole trie may be done either by enumerating the trie to get all the nodes (slow) or by clearing the whole state column.
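To make the two deletion strategies concrete, here is a minimal sketch using hypothetical in-memory types (not the actual Substrate database API): enumerating the old trie node by node versus clearing the column wholesale.

```rust
use std::collections::HashMap;

type Hash = [u8; 32];

/// Hypothetical stand-in for the state column: trie node hash -> encoded node.
struct StateColumn {
    nodes: HashMap<Hash, Vec<u8>>,
}

impl StateColumn {
    /// Option 1: walk the old trie from its root and delete every reachable node.
    /// Slow (touches each node individually) and, if interrupted, it leaves a
    /// partially deleted trie behind.
    fn delete_by_enumeration(&mut self, root: Hash, children_of: impl Fn(&[u8]) -> Vec<Hash>) {
        let mut stack = vec![root];
        while let Some(hash) = stack.pop() {
            if let Some(encoded) = self.nodes.remove(&hash) {
                stack.extend(children_of(&encoded));
            }
        }
    }

    /// Option 2: clear the whole column in one operation. Fast and effectively
    /// atomic, but it also drops anything else stored in that column.
    fn delete_by_clearing(&mut self) {
        self.nodes.clear();
    }
}
```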
This issue has been mentioned on Polkadot Forum. There might be relevant details there: https://forum.polkadot.network/t/community-bootnodes-and-snapshot-providers/630/1
But the latter approach is still "better" because it's more or less atomic, no? Whereas the former approach (enumerating the trie) can leave the DB in a corrupted state if it is interrupted.
The best would probably be to clear and write the new data to the state column in one go. This would be some kind of "overwrite transaction" to the db.
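As a rough illustration, such an "overwrite transaction" could bundle a column clear and all the new inserts into a single commit. The types below are hypothetical, not an existing kvdb/parity-db API.

```rust
/// Hypothetical database operations; a real backend would apply the whole
/// transaction atomically, so readers never observe a half-cleared state column.
enum DbOp {
    ClearColumn { col: u32 },
    Insert { col: u32, key: Vec<u8>, value: Vec<u8> },
}

struct Transaction {
    ops: Vec<DbOp>,
}

/// Build a transaction that drops the old state and writes the warped state in one go.
fn overwrite_state(state_col: u32, new_state: Vec<(Vec<u8>, Vec<u8>)>) -> Transaction {
    let mut ops = vec![DbOp::ClearColumn { col: state_col }];
    ops.extend(
        new_state
            .into_iter()
            .map(|(key, value)| DbOp::Insert { col: state_col, key, value }),
    );
    Transaction { ops }
}
```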
Well, you can warp sync and insert the new state first and then enumerate-delete the old one. This way even if deletion is interrupted you may end up with junk in the DB, but it will still be working.
This is indeed the simplest way as long as everything fits in memory. However, if we want incremental writes to the DB while state sync is in progress, it becomes more complicated. Theoretically, sync should insert into temp columns and replace the state and state_meta columns once it is done. It makes more sense to sync to a new DB rather than fiddle with columns. Block pruning logic, offline storage, etc. might all leave junk when the block number suddenly jumps by e.g. +1M. Also, polkadot storage needs to be able to handle this.
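For the "sync into a fresh DB, then swap" variant, the rough shape could be the following; the directory names and the rename-based swap are assumptions for illustration, not how the client does it today.

```rust
use std::{fs, io, path::Path};

/// Once state sync into a scratch database has completed, swap it into place.
/// The old database can be removed lazily (or kept until the swap is confirmed).
fn finalize_warp_sync(db_root: &Path) -> io::Result<()> {
    let live = db_root.join("full");          // currently used database (assumed layout)
    let scratch = db_root.join("full_warp");  // database populated by state sync
    let old = db_root.join("full_old");

    fs::rename(&live, &old)?;     // retire the pre-warp database
    fs::rename(&scratch, &live)?; // promote the freshly synced one
    fs::remove_dir_all(&old)      // best-effort cleanup of the old data
}
```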
What do you mean by Polkadot storage?
parachains db, av-store, etc. For example, I'm not sure if av-store pruning will handle the case when the block number suddenly changes from 1M to 2M. Basically, all the subsystems that rely on finality notifications being sent at least once every few hundred blocks need to be checked to make sure they clean up correctly when suddenly a million blocks are finalized.
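To illustrate the concern, a pruning routine written for "finality advances a few blocks at a time" might need to become something like the bounded-batch loop below (purely a sketch, not the actual av-store code):

```rust
/// Prune finalized blocks in bounded batches so that a finality notification that
/// jumps by a million blocks neither blows up memory nor blocks the task forever.
fn prune_up_to(last_pruned: &mut u64, newly_finalized: u64, mut prune_block: impl FnMut(u64)) {
    const BATCH: u64 = 1_000;
    while *last_pruned < newly_finalized {
        let end = (*last_pruned + BATCH).min(newly_finalized);
        for n in (*last_pruned + 1)..=end {
            prune_block(n);
        }
        *last_pruned = end;
        // A real subsystem would yield back to the executor here between batches.
    }
}
```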
Thinking about this again: when we warp sync to block X, we will finalize this block. When we finalize it, the db logic should start to prune all the old blocks and thus generate the required delta updates for the trie? Other logic in the node should see the finality notification and then be able to react and do its own pruning? For sure all of this isn't super performant, and we could maybe include some information indicating that we skipped a lot of blocks.
I'm not sure what you mean. Pruning finalized state X simply means that trie nodes that were removed in X as part of the state transition are now actually removed from the database. This won't help clean up the existing trie nodes. I.e. suppose there's a trie node that is inserted at block 10 and removed in block 30. The client is at block 20 and warps to 40. The trie node will not be removed from the database because the "delta" that removed it was generated during execution of block 30, and the client skipped it when doing warp sync. The only way to get to this node now (and remove it) is to take the state root for block 10 and iterate the trie.
Probably. I'm just saying that other modules might not be expecting to clean up after finalizing huge chains. We might see things like querying long tree routes, or loading a lot of stuff into memory again. It makes more sense for them to either start with a fresh db after a warp sync, or handle some kind of special signal that tells them that ALL data must be cleared.
Yeah sorry, you are right and I had mixed up some stuff. |
When this is implemented, we should think about what to do with old justifications/headers of era changes: #2710 (comment)
Currently, warp sync always starts from genesis. We should make it work from the latest available finalized block, which should also make warp syncing work after a restart.
We should add multiple zombienet tests that abort the execution of the syncing node at different stages, to ensure that we can always continue to sync after the next restart.
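The intended behaviour could be summarised by the following selection logic (hypothetical types, just to pin down what is being requested):

```rust
#[derive(Clone, Copy)]
struct BlockId {
    number: u64,
    hash: [u8; 32],
}

/// Pick the warp sync starting point: resume from the latest finalized block already
/// in the database (e.g. left behind by an interrupted sync) instead of always
/// restarting the proof download from genesis.
fn warp_sync_start(latest_finalized: Option<BlockId>, genesis: BlockId) -> BlockId {
    latest_finalized.unwrap_or(genesis)
}
```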