Question: state schema path & pebble db #2323

Open
Zorato opened this issue May 21, 2024 · 12 comments
Comments

@Zorato

Zorato commented May 21, 2024

Is there any chance of getting a pruned snapshot of an Arbitrum One node that uses the PebbleDB engine (instead of LevelDB) with state.schema=path (instead of the default hash scheme)?

Pruning the node every few weeks causes downtime and requires a lot of free disk space, while the path state scheme may resolve the problem of growing chaindata size.

Any advice on how to sync a node from scratch with these parameters is appreciated.

@daniil-parsiq

This would be incredibly useful

@yorickdowne

I do not believe this is possible. I have seen --persistent.db-engine pebble but nothing about path schema in the --help

@Zorato
Author

Zorato commented Aug 8, 2024

I suppose it's related to #2324
Is there any ETA for a PBSS (path-based state scheme) snapshot?

@iandennismiller

It is now possible to launch with:

/usr/local/bin/nitro ... \
        --execution.caching.state-scheme path \
        --persistent.db-engine pebble

However, I am unable to bootstrap a node like this. The existing genesis snapshot from https://snapshot.arbitrum.foundation/arb1/nitro-genesis.tar contains leveldb information, of course, so nitro exits with:

err="db.engine choice was pebble but found pre-existing leveldb database in specified data directory"

Without a pebbledb genesis snapshot, I presume it's not yet possible to use the new state scheme in production.

So, I would echo @Zorato 's request for pebble-backed snapshots.

Side note: I looked into low-level conversion from leveldb to rocksdb or pebble but this didn't seem trivial. However, if it worked, then existing snapshots could be re-used.
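
The engine mismatch error quoted above can be anticipated before launching. The following is a minimal sketch, not nitro's own check: it assumes that an initialized LevelDB or Pebble directory contains a CURRENT file and that only Pebble also writes OPTIONS-* files. The function names and the directory argument are hypothetical; pass whichever directory actually holds the chain database.

```shell
# Guess which storage engine created a database directory.
# ASSUMPTION: both engines write a CURRENT file, and only Pebble
# also writes OPTIONS-* files. This is a heuristic, not nitro's API.
detect_db_engine() {
    # $1: database directory (hypothetical path chosen by the caller)
    if [ ! -f "$1/CURRENT" ]; then
        # No CURRENT file: treat the directory as uninitialized.
        echo "none"
        return 0
    fi
    # If the glob matches nothing, $f holds the literal pattern and -e fails.
    for f in "$1"/OPTIONS*; do
        if [ -e "$f" ]; then
            echo "pebble"
            return 0
        fi
    done
    echo "leveldb"
}

# Succeed if the directory is empty or matches the requested engine.
check_engine() {
    # $1: requested engine (leveldb or pebble), $2: database directory
    found="$(detect_db_engine "$2")"
    if [ "$found" != "none" ] && [ "$found" != "$1" ]; then
        echo "requested $1 but $2 looks like $found" >&2
        return 1
    fi
    return 0
}
```

Running check_engine pebble on the database directory extracted from a LevelDB snapshot would fail here, which corresponds to the situation in which nitro exits with the error above.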

@yorickdowne

Correct, because of the historical data, this isn’t trivial. Optimism had a similar challenge. Team-provided pebble/path snapshots would be the solution.

@yorickdowne

yorickdowne commented Aug 10, 2024

> Side note: I looked into low-level conversion from leveldb to rocksdb or pebble but this didn't seem trivial. However, if it worked, then existing snapshots could be re-used.

That’s less interesting than path tbh. Sure, PebbleDB is better supported, shuts down faster, and is less likely to corrupt. These are all desirable and reasons to use Pebble.

A conversion of the existing DB would then be Pebble / hash, and you’re still stuck with manual offline pruning every so often.

It’s path that introduces continuous pruning. That’s the one that has the biggest operational impact. And yes sure, you may as well go with PebbleDB if a resync is inevitable anyway.

Fingers crossed the team can provide a historical snapshot in Pebble/path form.

@iandennismiller

iandennismiller commented Aug 10, 2024

> A conversion of the existing DB would then be Pebble / hash, and you’re still stuck with manual offline pruning every so often.

Good point.

A thought I had last night is that the testing code contributed in #2324 probably contains a good portion of the work required to stand up the new pebble/path state store. A utility script could be derived from the test code that inits the empty pebble/path database, iterates the genesis snapshot block-by-block, and writes to the pebble/path db. From there, wouldn't it be a matter of syncing to a live archive node, as usual? Once synced, prune the state store and make a tarball.

Easier said than done, I'm sure, but I am watching with great interest.

edit: removed user tag

@iandennismiller

I just noticed part of this problem has been solved in #2061 - specifically converting leveldb to pebbledb.

I do not think this tool provides conversion from the hash to the path state store. However, I did notice some testing code related to Path: https://github.com/OffchainLabs/nitro/pull/2061/files#diff-f6a86af8229f2ce0d04e0a35bb0a8caab9a94de92a36d85451303fbc2b560d30R100

@iandennismiller

There is now a pebbledb genesis snapshot:

https://snapshot.arbitrum.io/arb1/nitro-genesis-pebble.tar

This is another step towards path state schema.

@yorickdowne

> This is another step towards path state schema

I don't see how. Path and Pebble go together well, but they are also independent of each other. This snapshot is from January 2024 and is a Pebble / Hash DB.

A step towards Path would be a Pebble / Path genesis snapshot, which includes Classic.
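
To make the engine/scheme distinction concrete, here is a small sketch that maps each combination to the genesis snapshot named in this thread. It only restates what the comments say: the function name is hypothetical, and labelling the original nitro-genesis.tar as hash is an inference from the thread (it is the default scheme), not something stated outright.

```shell
# Print the genesis snapshot URL named in this thread for an
# engine/scheme pair, or fail if the thread names none.
# ASSUMPTION: nitro-genesis.tar is leveldb/hash (the thread only says leveldb).
snapshot_url() {
    # $1: engine (leveldb or pebble), $2: state scheme (hash or path)
    case "$1/$2" in
        leveldb/hash)
            echo "https://snapshot.arbitrum.foundation/arb1/nitro-genesis.tar" ;;
        pebble/hash)
            echo "https://snapshot.arbitrum.io/arb1/nitro-genesis-pebble.tar" ;;
        *)
            # Includes pebble/path, the combination this issue asks for.
            echo "no snapshot named in this thread for $1/$2" >&2
            return 1 ;;
    esac
}
```

Calling snapshot_url pebble path fails, which is the gap described above: both published snapshots use the hash scheme, regardless of engine.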

@iandennismiller

Yes, you're right it's pebble/hash, so there is still a missing piece. Where I could be incorrect is with my expectations of the database conversion tool, which may have a different scope than I understood.

level->pebble conversion is supported by the tool and appears to work. The tool is also aware of the path state scheme, in addition to hash, but it does not yet support writing path state.

My hope for the near future is that the dbconv tool would be extended to support writing path state scheme. The basic architecture of the database conversion script seems appropriate for eventually supporting hash->path conversion.

tl;dr:

  • the dbconv tool is now stable enough to create the pebbledb/hash genesis snapshot
  • the geth codebase has path state merged with working tests
  • the missing piece is to migrate some of the path-state testing code, which can build a path state database from scratch, into the dbconv tool

In the end, I'm just waiting for the team to finalize this and release the pebble/path snapshot. I hope to never prune again.

@joshuacolvin0
Member

pathdb currently has performance issues for a fast chain like Arbitrum One. We will release a path based snapshot when performance has been improved enough.
