Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor transition post genesis #311

Merged
merged 5 commits into from
Nov 30, 2023

Conversation

gballet
Copy link
Owner

@gballet gballet commented Nov 24, 2023

The base branch is able to perform the transition in one block, but it fails if there are more than one block. The problem stems from the fact ended and started are global variables, and that during block creation, several attempts are made at creating a block with a different tx set each time. So the state transition pointers were incorrect. To make things worse, once the block has been created, it's passed to the CL, and the latter passes it back to the EL for insertion, at which points the block is re-executed.

This PR implements a cleaner approach in which there is only one "active" set of conversion pointers at a time, but once the block has been created, these sets of pointers (called TransitionState) are saved on a per-root basis. This way, if (for whatever reason) multiple blocks are built on top of this one, the pointers are always going to be the same.

When building on top of the parent block, the state database copies the final TransitionState of its parent, so that it can updates the pointers as it goes through the transition. When it's done, and the final, translated root root is calculated, the current TransitionState is saved into the state database list of transition states, referenced by the final root.

TODO

Copy link
Owner Author

@gballet gballet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few explanatory comments.

@@ -25,6 +25,7 @@ import (
"github.com/ethereum/go-ethereum/consensus"
"github.com/ethereum/go-ethereum/consensus/misc/eip1559"
"github.com/ethereum/go-ethereum/consensus/misc/eip4844"
"github.com/ethereum/go-ethereum/core"
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this causes a cycle in tests

@@ -368,6 +369,12 @@ func (beacon *Beacon) Finalize(chain consensus.ChainHeaderReader, header *types.
state.Witness().TouchAddressOnWriteAndComputeGas(header.Coinbase[:], uint256.Int{}, utils.NonceLeafKey)
state.Witness().TouchAddressOnWriteAndComputeGas(header.Coinbase[:], uint256.Int{}, utils.CodeKeccakLeafKey)
state.Witness().TouchAddressOnWriteAndComputeGas(header.Coinbase[:], uint256.Int{}, utils.CodeSizeLeafKey)

if chain.Config().IsPrague(header.Number, header.Time) {
fmt.Println("at block", header.Number, "performing transition?", state.Database().InTransition())
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fmt.Println("at block", header.Number, "performing transition?", state.Database().InTransition())

if chain.Config().IsPrague(header.Number, header.Time) {
fmt.Println("at block", header.Number, "performing transition?", state.Database().InTransition())
parent := chain.GetHeaderByHash(header.ParentHash)
core.OverlayVerkleTransition(state, parent.Root)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this PR does the same thing as #309, but it was easier to reason about when called in a single location. It should still be possible to move it to e.g. Prepare but that's definitely enough work for a single PR.

@@ -405,6 +413,7 @@ func (beacon *Beacon) FinalizeAndAssemble(chain consensus.ChainHeaderReader, hea
return nil, fmt.Errorf("nil parent header for block %d", header.Number)
}

state.Database().LoadTransitionState(parent.Root)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When producing the proof, the pre-tree is re-opened. Unfortunately, the OpenTrie function needs the pointers, who have to be reloaded. This is a bit tacky and could be avoided if the pre tree was opened before, e.g. in Prepare.

Comment on lines +447 to +454
// This should only happen for the first block,
// so the previous tree is a merkle tree. Logically,
// the "previous" verkle tree is an empty tree.
okpre = true
vtrpre = trie.NewVerkleTrie(verkle.New(), state.Database().TrieDB(), utils.NewPointCache(), false)
post := state.GetTrie().(*trie.TransitionTrie)
vtrpost = post.Overlay()
okpost = true
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change caused this panic to trigger, because OpenTrie needs the parameters to decide if it's a verkle or merkle tree. But I don't think it was correct to begin with, because for the first block, the "pre" tree is technically a MPT. But if we want to prove against something, that something we prove against is an empty verkle tree. So this is what this code does: it returns an empty verkle tree so that the proof can be built against it.

For some reason I haven't determined yet, the first proof is still empty. That is an investigation for later though.

Comment on lines +279 to +282
if ts.CurrentAccountAddress != nil {
ret.CurrentAccountAddress = &common.Address{}
copy(ret.CurrentAccountAddress[:], ts.CurrentAccountAddress[:])
}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super important: don't just copy the pointer to a unique address.

LastMerkleRoot common.Hash // root hash of the read-only base tree
LastMerkleRoot common.Hash // root hash of the read-only base tree
CurrentTransitionState *TransitionState
TransitionStatePerRoot map[common.Hash]*TransitionState
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TransitionStatePerRoot map[common.Hash]*TransitionState
TransitionStatePerRoot map[common.Hash]*TransitionState // List of per-root conversion state pointers

ts, ok := db.TransitionStatePerRoot[root]
if !ok || ts == nil {
// Start with a fresh state
ts = &TransitionState{ended: db.triedb.IsVerkle()}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is where we handle the "verkle at genesis case" since if that happens, then the loaded transition state will be nil and so it is initialized.

Note that this hasn't been tested yet, but should be before everything is merged back into kaustinen-post-shapella.

Comment on lines -109 to -112
parent := p.bc.GetHeaderByHash(header.ParentHash)
if err := OverlayVerkleTransition(statedb, parent.Root); err != nil {
return nil, nil, 0, fmt.Errorf("error performing verkle overlay transition: %w", err)
}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved the transition to Finalize.

Comment on lines -907 to -909
if w.chain.Config().IsPrague(header.Number, header.Time) {
core.OverlayVerkleTransition(state, parent.Root)
}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved the conversion to Finalize.

@gballet
Copy link
Owner Author

gballet commented Nov 30, 2023

merging into parent because the devops infrastructure needs smaller branch names 😅

@gballet gballet merged commit 1d80ebd into transition-post-genesis Nov 30, 2023
gballet added a commit that referenced this pull request Feb 8, 2024
* rewrite per-block conversion pointer management

* remove unused method

* fix: a branch that can verge at genesis or post genesis (#314)

* fix: import cycle in conversion refactor (#315)

* fix shadowfork panic in OpenStorageTrie
gballet added a commit that referenced this pull request Feb 28, 2024
* rewrite per-block conversion pointer management

* remove unused method

* fix: a branch that can verge at genesis or post genesis (#314)

* fix: import cycle in conversion refactor (#315)

* fix shadowfork panic in OpenStorageTrie
gballet added a commit that referenced this pull request Mar 26, 2024
* rewrite per-block conversion pointer management

* remove unused method

* fix: a branch that can verge at genesis or post genesis (#314)

* fix: import cycle in conversion refactor (#315)

* fix shadowfork panic in OpenStorageTrie
gballet added a commit that referenced this pull request Apr 15, 2024
* rewrite per-block conversion pointer management

* remove unused method

* fix: a branch that can verge at genesis or post genesis (#314)

* fix: import cycle in conversion refactor (#315)

* fix shadowfork panic in OpenStorageTrie
gballet added a commit that referenced this pull request Apr 29, 2024
quell linter issue

support Transition post tree in conversion

Refactor transition post genesis (#311)

* rewrite per-block conversion pointer management

* remove unused method

* fix: a branch that can verge at genesis or post genesis (#314)

* fix: import cycle in conversion refactor (#315)

* fix shadowfork panic in OpenStorageTrie

fix OpenStorageTrie: return an error instead of panicking if the account tree not a verkle tree (#321)

add a switch to force proof in blocks (#322)

* add a switch to force proof in blocks

* activate switch

* fix switch type

add switch to override the stride of the overlay conversion (#323)

* add switch to override the stride of the overlay conversion

* set a default stride of 10k

add a few traces for relaunch

add missing step in the chain of setting the override for overlay stride

fix: save and load transition state for block processing (#324)

* fix: save and load transition state for block processing

* log whether the tree is verkle in LoadTransitionState

* fix: ensure the transition is marked as started in insertChain

* dump saved address

* fix nil pointer panic

* remove stacktrace that is no longer useful

* fix a panic

* fix build

* check: copy current account address BEFORE it's saved

* mandatory panic fix

* Remove debug fmt.Println

* more cleanup + comments

fix: ensure StorageProcessed is properly recovered

add traces for gas pool overflow

cmd/{geth, utils}: add a command to clear verkle costs (#326)

* cmd/{geth, utils}: add a command to clear verkle costs

fix:  boolean issue

fix: load finalization state in FinalizeAndAssemble (#340)

* Conversion and TransitionTrie fixes (#346)

* fixes

Signed-off-by: Ignacio Hagopian <[email protected]>

* remove old comment

Signed-off-by: Ignacio Hagopian <[email protected]>

---------

Signed-off-by: Ignacio Hagopian <[email protected]>

* trace cleanup

---------

Signed-off-by: Ignacio Hagopian <[email protected]>
Co-authored-by: Ignacio Hagopian <[email protected]>

fix: check for nil override in NewBlockChain (#357)

fix rebase issue

fix a few rebase issues

add debug traces

fix: assume that having to create a new transaction state mean it hasn't happened.

This is a workaround, because it will always be false when restarting a node
that has started the transition but not necessarily completed it.

add logs to debug shadowfork

persist conversion state to db and use an LRU cache for active transition states (#375)

* persist conversion state to db

* fix: don't recreate LRU when writing state

* opt: only write state to db if not already present in LRU

* fix: rlp can't encode TransitionState

* fix: use gob because binary.Write does not support structs 🤦‍♂️

* fix: nil pointer panic

* add logs to debug shadowfork

* no such thing as not enough traces

* ditto

* fix stupid bug

* add a comment for readability

* add more traces

* Lock the state transition during conversion (#384)

* heavy handed approach: lock the state transition during conversion

* also lock transition state loading/unloading

* reduce logs verbosity

* add conversion test to workflow (#386)

* add conversion test to workflow

* mandatory -f switch fix in rm

* forgot & at the end of the geth command

* remove incorrect kill

* add debug traces

* add an overlay stride

* fix typo

* Apply suggestions from code review

eth: add debug_conversionStatus RPC call (#392)

* eth: add debug_conversionStatus RPC call

* add debug trace4s

* Apply suggestions from code review

* export started/ended fields

fix post-verge sync (#404)

* fix post-verge sync

* review: fix truncated comment
gballet added a commit that referenced this pull request Apr 29, 2024
quell linter issue

support Transition post tree in conversion

Refactor transition post genesis (#311)

* rewrite per-block conversion pointer management

* remove unused method

* fix: a branch that can verge at genesis or post genesis (#314)

* fix: import cycle in conversion refactor (#315)

* fix shadowfork panic in OpenStorageTrie

fix OpenStorageTrie: return an error instead of panicking if the account tree not a verkle tree (#321)

add a switch to force proof in blocks (#322)

* add a switch to force proof in blocks

* activate switch

* fix switch type

add switch to override the stride of the overlay conversion (#323)

* add switch to override the stride of the overlay conversion

* set a default stride of 10k

add a few traces for relaunch

add missing step in the chain of setting the override for overlay stride

fix: save and load transition state for block processing (#324)

* fix: save and load transition state for block processing

* log whether the tree is verkle in LoadTransitionState

* fix: ensure the transition is marked as started in insertChain

* dump saved address

* fix nil pointer panic

* remove stacktrace that is no longer useful

* fix a panic

* fix build

* check: copy current account address BEFORE it's saved

* mandatory panic fix

* Remove debug fmt.Println

* more cleanup + comments

fix: ensure StorageProcessed is properly recovered

add traces for gas pool overflow

cmd/{geth, utils}: add a command to clear verkle costs (#326)

* cmd/{geth, utils}: add a command to clear verkle costs

fix:  boolean issue

fix: load finalization state in FinalizeAndAssemble (#340)

* Conversion and TransitionTrie fixes (#346)

* fixes

Signed-off-by: Ignacio Hagopian <[email protected]>

* remove old comment

Signed-off-by: Ignacio Hagopian <[email protected]>

---------

Signed-off-by: Ignacio Hagopian <[email protected]>

* trace cleanup

---------

Signed-off-by: Ignacio Hagopian <[email protected]>
Co-authored-by: Ignacio Hagopian <[email protected]>

fix: check for nil override in NewBlockChain (#357)

fix rebase issue

fix a few rebase issues

add debug traces

fix: assume that having to create a new transaction state mean it hasn't happened.

This is a workaround, because it will always be false when restarting a node
that has started the transition but not necessarily completed it.

add logs to debug shadowfork

persist conversion state to db and use an LRU cache for active transition states (#375)

* persist conversion state to db

* fix: don't recreate LRU when writing state

* opt: only write state to db if not already present in LRU

* fix: rlp can't encode TransitionState

* fix: use gob because binary.Write does not support structs 🤦‍♂️

* fix: nil pointer panic

* add logs to debug shadowfork

* no such thing as not enough traces

* ditto

* fix stupid bug

* add a comment for readability

* add more traces

* Lock the state transition during conversion (#384)

* heavy handed approach: lock the state transition during conversion

* also lock transition state loading/unloading

* reduce logs verbosity

* add conversion test to workflow (#386)

* add conversion test to workflow

* mandatory -f switch fix in rm

* forgot & at the end of the geth command

* remove incorrect kill

* add debug traces

* add an overlay stride

* fix typo

* Apply suggestions from code review

eth: add debug_conversionStatus RPC call (#392)

* eth: add debug_conversionStatus RPC call

* add debug trace4s

* Apply suggestions from code review

* export started/ended fields

fix post-verge sync (#404)

* fix post-verge sync

* review: fix truncated comment
@gballet gballet mentioned this pull request Apr 29, 2024
gballet added a commit that referenced this pull request May 2, 2024
quell linter issue

support Transition post tree in conversion

Refactor transition post genesis (#311)

* rewrite per-block conversion pointer management

* remove unused method

* fix: a branch that can verge at genesis or post genesis (#314)

* fix: import cycle in conversion refactor (#315)

* fix shadowfork panic in OpenStorageTrie

fix OpenStorageTrie: return an error instead of panicking if the account tree not a verkle tree (#321)

add a switch to force proof in blocks (#322)

* add a switch to force proof in blocks

* activate switch

* fix switch type

add switch to override the stride of the overlay conversion (#323)

* add switch to override the stride of the overlay conversion

* set a default stride of 10k

add a few traces for relaunch

add missing step in the chain of setting the override for overlay stride

fix: save and load transition state for block processing (#324)

* fix: save and load transition state for block processing

* log whether the tree is verkle in LoadTransitionState

* fix: ensure the transition is marked as started in insertChain

* dump saved address

* fix nil pointer panic

* remove stacktrace that is no longer useful

* fix a panic

* fix build

* check: copy current account address BEFORE it's saved

* mandatory panic fix

* Remove debug fmt.Println

* more cleanup + comments

fix: ensure StorageProcessed is properly recovered

add traces for gas pool overflow

cmd/{geth, utils}: add a command to clear verkle costs (#326)

* cmd/{geth, utils}: add a command to clear verkle costs

fix:  boolean issue

fix: load finalization state in FinalizeAndAssemble (#340)

* Conversion and TransitionTrie fixes (#346)

* fixes

Signed-off-by: Ignacio Hagopian <[email protected]>

* remove old comment

Signed-off-by: Ignacio Hagopian <[email protected]>

---------

Signed-off-by: Ignacio Hagopian <[email protected]>

* trace cleanup

---------

Signed-off-by: Ignacio Hagopian <[email protected]>
Co-authored-by: Ignacio Hagopian <[email protected]>

fix: check for nil override in NewBlockChain (#357)

fix rebase issue

fix a few rebase issues

add debug traces

fix: assume that having to create a new transaction state mean it hasn't happened.

This is a workaround, because it will always be false when restarting a node
that has started the transition but not necessarily completed it.

add logs to debug shadowfork

persist conversion state to db and use an LRU cache for active transition states (#375)

* persist conversion state to db

* fix: don't recreate LRU when writing state

* opt: only write state to db if not already present in LRU

* fix: rlp can't encode TransitionState

* fix: use gob because binary.Write does not support structs 🤦‍♂️

* fix: nil pointer panic

* add logs to debug shadowfork

* no such thing as not enough traces

* ditto

* fix stupid bug

* add a comment for readability

* add more traces

* Lock the state transition during conversion (#384)

* heavy handed approach: lock the state transition during conversion

* also lock transition state loading/unloading

* reduce logs verbosity

* add conversion test to workflow (#386)

* add conversion test to workflow

* mandatory -f switch fix in rm

* forgot & at the end of the geth command

* remove incorrect kill

* add debug traces

* add an overlay stride

* fix typo

* Apply suggestions from code review

eth: add debug_conversionStatus RPC call (#392)

* eth: add debug_conversionStatus RPC call

* add debug trace4s

* Apply suggestions from code review

* export started/ended fields

fix post-verge sync (#404)

* fix post-verge sync

* review: fix truncated comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant