-
Notifications
You must be signed in to change notification settings - Fork 486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tool for database conversion #2061
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2061 +/- ##
==========================================
+ Coverage 22.39% 28.94% +6.54%
==========================================
Files 226 236 +10
Lines 32034 34687 +2653
==========================================
+ Hits 7175 10041 +2866
+ Misses 23601 23504 -97
+ Partials 1258 1142 -116 |
I tried to run the script, and it exited (and deleted everything by default) because it found that the wasm database was already on pebble (maybe from the fresh sync?). Imo it should copy the database in that case, and not error and revert the whole operation. |
os.Exit(1) | ||
} | ||
stats := conv.Stats() | ||
log.Info("Conversion finished.", "entries", stats.Entries(), "MB", stats.Bytes()/1024/1024, "avg Ke/s", stats.AverageEntriesPerSecond()/1000, "avg MB/s", stats.AverageBytesPerSecond()/1024/1024, "elapsed", stats.Elapsed()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: Ke/s
is a little confusing, it could have the same unit defined in printProgress, i.e. entries/s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in: #2591
func DBConvConfigAddOptions(f *flag.FlagSet) { | ||
DBConfigAddOptions("src", f, &DefaultDBConvConfig.Src) | ||
DBConfigAddOptions("dst", f, &DefaultDBConvConfig.Dst) | ||
f.Int("ideal-batch-size", DefaultDBConvConfig.IdealBatchSize, "ideal write batch size") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick:
f.Int("ideal-batch-size", DefaultDBConvConfig.IdealBatchSize, "ideal write batch size") | |
f.Int("ideal-batch-size", DefaultDBConvConfig.IdealBatchSize, "ideal write batch size in bytes") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
included in: #2591
) | ||
|
||
func TestConversion(t *testing.T) { | ||
_ = testhelpers.InitTestLog(t, log.LvlTrace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: unnecessary line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in: #2591
bc := builder.L2.ExecNode.Backend.ArbInterface().BlockChain() | ||
current := bc.CurrentBlock() | ||
if current == nil { | ||
Fatal(t, "failed to get current block header") | ||
} | ||
triedb := bc.StateCache().TrieDB() | ||
visited := 0 | ||
i := uint64(0) | ||
// don't query historical blocks when PathSchem is used | ||
if builder.execConfig.Caching.StateScheme == rawdb.PathScheme { | ||
i = current.Number.Uint64() | ||
} | ||
for ; i <= current.Number.Uint64(); i++ { | ||
header := bc.GetHeaderByNumber(i) | ||
_, err := bc.StateAt(header.Root) | ||
Require(t, err) | ||
tr, err := trie.New(trie.TrieID(header.Root), triedb) | ||
Require(t, err) | ||
it, err := tr.NodeIterator(nil) | ||
Require(t, err) | ||
for it.Next(true) { | ||
visited++ | ||
} | ||
Require(t, it.Error()) | ||
} | ||
t.Log("visited nodes:", visited) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this is a useful test, but it is OK being here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's just to iterate over state tries to better check for possible db corruption. That's still not a complete check, but increases coverage.
|
||
convert_result= | ||
convert () { | ||
srcdir=$(echo $src/$1 | tr -s /) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: tr command is unecessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in: #2591
|
||
convert_result= | ||
convert () { | ||
srcdir=$(echo $src/$1 | tr -s /) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: in several places we could use double quote to prevent globbing and word splitting, such as here
srcdir=$(echo $src/$1 | tr -s /) | |
srcdir=$(echo "$src/$1" | tr -s /) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that it is already handled, but I am not sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why the behavior in this script is not implemented in dbconv golang tool instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that was decided on one of the nitro meetings. The idea was to keep dbconv tool as simple and flexible as possible and have a most common use case covered by a bash script.
removeDir() { | ||
cmd="rm -r \"$1\"" | ||
echo $cmd | ||
eval $cmd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to use eval here, you can call rm
directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's needed to support directories with spaces in name. Otherwise I get:
== Note: removing only failed destination directory
rm -r "dstdb with spaces/l2chaindata"
rm: "dstdb: No such file or directory
rm: with: No such file or directory
rm: spaces/l2chaindata": No such file or directory
Resolves NIT-1259
This PR adds tool for database conversion and a script automating typical conversion of nitro node databases.
database-conversion.bash script
The script coverts nitro databases from
leveldb
topebble
using the dbconv tool.Upon successful completion the script prints out:
dbconv tool
example usage
Local experiments
Experiments were conducted on
l2chaindata
database from https://snapshot.arbitrum.foundation/sepolia/nitro-archive.tar. The source database directory had size of aprox 57GB.--verify="keys"
) took aprox 5 minutesnote: the biggest impact had the size of write batch (--ideal-batch-size) which was increased to 100MB from geth's default 100KB.