cmd: command to dump preimages enumerated in snapshot order, in a flat file #27819

Closed
wants to merge 2 commits into from
29 changes: 29 additions & 0 deletions cmd/geth/chaincmd.go
@@ -144,6 +144,17 @@ It's deprecated, please use "geth db import" instead.
Description: `
The export-preimages command exports hash preimages to an RLP encoded stream.
It's deprecated, please use "geth db export" instead.
`,
}
exportOverlayPreimagesCommand = &cli.Command{
Action: exportOverlayPreimages,
Name: "export-overlay-preimages",
Contributor:
Suggested change
Name: "export-overlay-preimages",
Name: "export-preimages",

I mean, it's just exporting preimages, right?

Usage: "Export the preimages in overlay tree migration order",
ArgsUsage: "<dumpfile>",
Flags: flags.Merge([]cli.Flag{}, utils.DatabasePathFlags),
Description: `
The export-overlay-preimages command exports hash preimages to a flat file, in exactly
the expected order for the overlay tree migration.
`,
}
dumpCommand = &cli.Command{
@@ -399,6 +410,24 @@ func exportPreimages(ctx *cli.Context) error {
return nil
}

// exportOverlayPreimages dumps the preimage data to a flat file.
func exportOverlayPreimages(ctx *cli.Context) error {
if ctx.Args().Len() < 1 {
utils.Fatalf("This command requires an argument.")
}
stack, _ := makeConfigNode(ctx)
defer stack.Close()

chain, _ := utils.MakeChain(ctx, stack, true)

start := time.Now()
if err := utils.ExportOverlayPreimages(chain, ctx.Args().First()); err != nil {
utils.Fatalf("Export error: %v\n", err)
}
fmt.Printf("Export done in %v\n", time.Since(start))
return nil
}

func parseDumpConfig(ctx *cli.Context, stack *node.Node) (*state.DumpConfig, ethdb.Database, common.Hash, error) {
db := utils.MakeChainDatabase(ctx, stack, true)
var header *types.Header
1 change: 1 addition & 0 deletions cmd/geth/main.go
@@ -209,6 +209,7 @@ func init() {
exportCommand,
importPreimagesCommand,
exportPreimagesCommand,
exportOverlayPreimagesCommand,
removedbCommand,
dumpCommand,
dumpGenesisCommand,
67 changes: 67 additions & 0 deletions cmd/utils/cmd.go
@@ -374,6 +374,73 @@ func ExportPreimages(db ethdb.Database, fn string) error {
return nil
}

// ExportOverlayPreimages exports all known hash preimages into the specified file,
// in the same order as expected by the overlay tree migration.
Contributor:
Please specify what that "same order" entails

func ExportOverlayPreimages(chain *core.BlockChain, fn string) error {
log.Info("Exporting preimages", "file", fn)

fh, err := os.OpenFile(fn, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.ModePerm)
if err != nil {
return err
}
Comment on lines +382 to +385

Contributor:
truncate? Wouldn't it be nicer if we could abort/resume, somehow?

defer fh.Close()

writer := bufio.NewWriter(fh)
defer writer.Flush()

statedb, err := chain.State()
if err != nil {
return fmt.Errorf("failed to open statedb: %w", err)
}

mptRoot := chain.CurrentBlock().Root

accIt, err := chain.Snapshots().AccountIterator(mptRoot, common.Hash{})
if err != nil {
return err
}
defer accIt.Release()

count := 0
for accIt.Next() {
acc, err := types.FullAccount(accIt.Account())
if err != nil {
return fmt.Errorf("invalid account encountered during traversal: %w", err)
}
addr := rawdb.ReadPreimage(statedb.Database().DiskDB(), accIt.Hash())
Comment on lines +405 to +410

Contributor:
Hm, interesting. So, your way of doing this is to

  • Iterate the account-snapshot, (ordered by hashed address)
  • For each h(a), read preimage.

An alternative way to do it would be to iterate the premages: in a first phase, out to an external file. In a second phase, that file could be (iteratively) sorted by post-image instead of pre-image.

It would be interesting to see the differences in speed between the two approaches.

A benefit with the second approach is that it wouldn't be sensitive to new data -- if you get an additional chunk of preimages (either from a few more blocks, or some external source), you could just append them to the "unsorted" file, and then re-sort it.

Contributor:
Ah, I see now that the order you want is not just "ordered by post-hash", the ordering is "ordered by post-hash account, then the preimages for that account's storage trie preimages". So the final ordering would be

acc1 
acc2
storagekey1-acc2
storagekey2-acc2
acc3

Seems like a pretty strange ordering -- and also highly sensitive to changes in state. If you advance one block, you need to redo everything, because the ordering is state-dependent and cannot be performed given only the data itself.

if len(addr) != 20 {
return fmt.Errorf("invalid address preimage length: expected 20 bytes, got %d", len(addr))
}
if _, err := writer.Write(addr); err != nil {
return fmt.Errorf("failed to write addr preimage: %w", err)
}

if acc.Root != types.EmptyRootHash {
stIt, err := chain.Snapshots().StorageIterator(mptRoot, accIt.Hash(), common.Hash{})
Comment on lines +418 to +419

Contributor:
This seems bass-ackwards. No need to create an iterator if the root is the empty root hash. Tho it's faster for sure :)

if err != nil {
return fmt.Errorf("failed to create storage iterator: %w", err)
}
for stIt.Next() {
slotnr := rawdb.ReadPreimage(statedb.Database().DiskDB(), stIt.Hash())
if len(slotnr) != 32 {
return fmt.Errorf("invalid slot preimage length: expected 32 bytes, got %d", len(slotnr))
}
if _, err := writer.Write(slotnr); err != nil {
return fmt.Errorf("failed to write slotnr preimage: %w", err)
}
}
stIt.Release()
}
count++
if count%100000 == 0 {
Contributor:
Please instead add a log output every 8 seconds or so, Exporting preimages with some stats about how many, how much remaining, how much time elapsed etc.

log.Info("Last exported account", "account", accIt.Hash())
}
}

log.Info("Exported preimages", "file", fn)
return nil
}

// exportHeader is used in the export/import flow. When we do an export,
// the first element we output is the exportHeader.
// Whenever a backwards-incompatible change is made, the Version header