Allow pruning to multiple targets, including without snapshots #186
some notes & thoughts
// don't belong to the target state and the genesis state
// - iterate the snapshot, reconstruct the relevant state
// - iterate the database, delete all other state entries which
// don't belong to the target state and the genesis state
It would be better to avoid any unnecessary geth changes.
This is just gofmt which happens every time I save the file. I'd be happy to change it back, but I want to make sure I won't need to make any future changes first, so I'll do it after you approve. Though at this point I've already changed a large percentage of the file.
core/state/pruner/pruner.go
Outdated
@@ -261,6 +261,8 @@ func (p *Pruner) Prune(root common.Hash) error {
		}
		// Use the bottom-most diff layer as the target
		root = layers[len(layers)-1].Root()
	} else if p.snaptree.Snapshot(root) == nil {
		p.snaptree.Rebuild(root, false)
This seems like it will work. However, I think a better alternative would be to skip creating the snaptree in NewPruner and pass the right root to snapshot.New, which wouldn't require any change in snapshot.
For nitro, I think we'll need to support multiple roots (so everything is cleaned unless it is reachable from one of these roots). That should probably be done with multiple snaptrees, each with a different root, each updating the same stateBloom.
At least in a validator, we'd like to keep state for: something recent, the latest validated, and the latest accepted on-chain.
I've added support for multiple roots, but I found that having multiple snaptrees was way too expensive, and would completely destroy any existing snapshots. Instead, I've added support for generating the pruning bloom filters without a snapshot :)
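To illustrate the idea of marking several target roots into one shared filter without a snapshot, here is a minimal sketch. It is not the PR's implementation: `nodeDB`, `markReachable`, and `pruneExcept` are hypothetical names, the database is a toy in-memory map from node hash to child hashes, and an exact set stands in for the stateBloom.

```go
package main

import "fmt"

// nodeDB is a hypothetical stand-in for the trie database:
// it maps a node hash to the hashes of its children.
type nodeDB map[string][]string

// markReachable adds every node reachable from root to keep.
// keep plays the role of the state bloom filter. A real bloom filter can
// also report false positives, which only means extra nodes are retained.
func markReachable(db nodeDB, root string, keep map[string]bool) {
	stack := []string{root}
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if keep[n] {
			continue // already marked via another root; shared subtrees are walked once
		}
		keep[n] = true
		stack = append(stack, db[n]...)
	}
}

// pruneExcept deletes every node that is not marked in keep
// and returns the number of deleted nodes.
func pruneExcept(db nodeDB, keep map[string]bool) int {
	deleted := 0
	for h := range db {
		if !keep[h] {
			delete(db, h)
			deleted++
		}
	}
	return deleted
}

func main() {
	db := nodeDB{
		"rootA":  {"shared", "a1"},
		"rootB":  {"shared", "b1"},
		"rootC":  {"c1"},
		"shared": nil, "a1": nil, "b1": nil, "c1": nil,
	}
	keep := map[string]bool{}
	// Every target root feeds the same filter before anything is deleted.
	for _, root := range []string{"rootA", "rootB"} {
		markReachable(db, root, keep)
	}
	fmt.Println("deleted:", pruneExcept(db, keep), "remaining:", len(db))
}
```

Because all roots update the same filter before deletion starts, a node shared between two targets is kept once and traversed once.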
core/state/snapshot/snapshot.go
Outdated
func (t *Tree) Rebuild(root common.Hash) {
func (t *Tree) Rebuild(root common.Hash, async bool) {
	var genPending chan struct{}
	if !async {
- Why use defer and not just check async at the end of the function? (If using defer, I'd at least check that genPending is not nil.)
- Why not use waitBuild for waiting?
I used a defer because I wanted it to happen after the mutex was unlocked. I'll switch it to waitBuild though.
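The ordering argument relies on Go running deferred calls in last-in, first-out order. A minimal sketch of that pattern follows; the `tree` type and `rebuild` method are hypothetical stand-ins, not the snapshot package's code, and the "wait" step is only recorded as an event rather than actually blocking.

```go
package main

import (
	"fmt"
	"sync"
)

// tree is a hypothetical stand-in for a structure guarded by a mutex.
type tree struct {
	lock sync.Mutex
}

// rebuild appends to events so the caller can observe the order
// in which the deferred calls run.
func (t *tree) rebuild(events *[]string) {
	// Registered first, so it runs last: after the lock has been released.
	defer func() { *events = append(*events, "wait for generation") }()

	t.lock.Lock()
	// Registered second, so it runs first when the function returns.
	defer func() {
		t.lock.Unlock()
		*events = append(*events, "unlock")
	}()

	*events = append(*events, "start generation")
}

func main() {
	var events []string
	(&tree{}).rebuild(&events)
	fmt.Println(events)
}
```

Checking a flag at the end of the function body would instead run while the lock is still held, since the deferred unlock only fires on return.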
I've removed the need for this change
core/state/pruner/pruner.go
Outdated
}

// We assume output does not need the value, only the key
func dumpRawTrieDescendants(db ethdb.Database, root common.Hash, output ethdb.KeyValueWriter) error {
taking a stateBloom instead of KeyValueWriter would be better for self-documentation
I'm mostly comparing dumpRawTrieDescendants to extractGenesis.
If there is a good reason for the difference, LGTM. Otherwise, sharing code between the two might have nice advantages.
core/state/pruner/pruner.go
Outdated
@@ -228,10 +233,130 @@ func prune(snaptree *snapshot.Tree, root common.Hash, maindb ethdb.Database, sta
	return nil
}

func nodeIteratorKey(it trie.NodeIterator) (common.Hash, error) {
	if it.Leaf() {
Looking through extractGenesis, it doesn't have this part, only it.Hash().
Also, if I'm reading this correctly, LeafKey would be the path to the data, and I don't recall that it should appear as a key in the database (?)
extractGenesis is right here :) I'm updating my function and having extractGenesis just call mine since mine is parallel and has ETA tracking
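The PR's ETA tracking is not shown in this thread. As a rough sketch of one way such an estimate can be derived, assume the traversal visits keys in ascending order over a uniformly distributed 64-bit key prefix, so the current prefix approximates the fraction of work done. `estimateETA` and its parameters are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// estimateETA extrapolates the remaining time from the current position in
// the key space. pos is the first 8 bytes of the current key interpreted as
// a big-endian integer (an assumption of this sketch). It returns 0 when no
// progress has been made yet and no estimate is possible.
func estimateETA(pos uint64, elapsed time.Duration) time.Duration {
	if pos == 0 {
		return 0
	}
	left := uint64(math.MaxUint64) - pos
	// remaining = elapsed * left / pos, computed in floating point to avoid overflow.
	return time.Duration(float64(elapsed) * float64(left) / float64(pos))
}

func main() {
	// Halfway through the key space after 10 seconds of work.
	fmt.Println(estimateETA(1<<63, 10*time.Second))
}
```

With several workers each covering a slice of the key space, the same calculation could be applied per slice and the largest estimate reported.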
Store bloom filter roots in the file content
LGTM + small suggestion (can be done separately or skipped)

func bloomFilterName(datadir string, hash common.Hash) string {
	return filepath.Join(datadir, fmt.Sprintf("%s.%s.%s", stateBloomFilePrefix, hash.Hex(), stateBloomFileSuffix))
	return dumpRawTrieDescendants(db, genesis.Root(), stateBloom)
The code has already changed; you could also make this function handle the real Arbitrum genesis.
Unfortunately we need the genesis block number separately on the nitro side to compute the latest finalized L2 block from the latest finalized message number. Getting the block number is a bit trickier here without the ReadChainConfig helper (it means we'd need to first read block 0, and then read the actual genesis block), so I've kept it on the nitro side for now.
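To show why the genesis block number is needed for that conversion, here is a minimal sketch. It assumes one L2 block per message and that the first message corresponds to the genesis block; the helper name and signature are hypothetical and only illustrate the arithmetic.

```go
package main

import "fmt"

// messageCountToBlockNumber is a hypothetical helper. It assumes each message
// produces exactly one block and that the first message maps to the genesis
// block, so a count of N messages ends at block genesis+N-1.
// A count of 0 with genesis 0 yields -1, meaning "no block yet".
func messageCountToBlockNumber(messageCount, genesisBlockNumber uint64) int64 {
	return int64(messageCount+genesisBlockNumber) - 1
}

func main() {
	// Illustrative numbers only: genesis at block 100, 5 finalized messages.
	fmt.Println(messageCountToBlockNumber(5, 100))
}
```

Without the genesis offset, the same message count would map to a different block on every chain whose genesis is not block 0.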
No description provided.