-
Notifications
You must be signed in to change notification settings - Fork 20.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Legacy receipt converter tool #23454
Conversation
A thought: the first few thousand mainnet blocks were empty, so no use just checking block 1 for format.
|
Ah right. I should check blocks from num 1 until a non-empty receipt is found then check if its legacy |
That's it, yep
|
Do you have some numbers on how much space this saves on mainnet? I'll sync my node up and can do a receipt conversion |
0f2a255
to
05812de
Compare
To test this I did a full sync on an old commit for ~4mil blocks, then switched to version v1.9.0 for a while. Output of the migrate command: INFO [11-18|13:07:37.018] First legacy receipt number=46147
INFO [11-18|13:07:37.018] Starting migration ancients=4,207,670
INFO [11-18|13:07:37.026] Processing legacy elements number=0
INFO [11-18|13:07:43.214] Processing legacy elements number=500,000
INFO [11-18|13:08:04.514] Processing legacy elements number=1,000,000
INFO [11-18|13:08:41.157] Processing legacy elements number=1,500,000
INFO [11-18|13:09:18.377] Processing legacy elements number=2,000,000
INFO [11-18|13:09:57.718] Processing legacy elements number=2,500,000
INFO [11-18|13:10:38.356] Processing legacy elements number=3,000,000
INFO [11-18|13:11:48.204] Processing legacy elements number=3,500,000
INFO [11-18|13:14:09.616] Found first non-legacy element number=3,931,786 filenum=2 copyOver=true
INFO [11-18|13:14:21.536] Replacing table files
INFO [11-18|13:14:21.869] Migration finished duration=6m44.851466067s Now to make sure the content of the files are correct I'm concatenating the table files and comparing that against the freezer of a node with a "modern" db. i.e.: $ cmp -b ~/datadir-mainnet/geth/chaindata/ancient/receipts.part.cdat ~/datadir-receipts-backup/geth/chaindata/ancient/receipts.all.cdat
differ: byte 3312800212, line 12334563 is 257 M-/ 276 M-> But when I retrieve the encoded receipt for the corresponding block they seem to match. Wondering if it's snappy-related. |
So I wrote a small script that compares the tables of a certain kind against that of another freezer and the result seems to match. My hunch as to the differences in the binary files is due to different snappy output. From this thread:
So possible that in v1.9.0 geth was using an older snappy version than master, causing the difference in compressed receipts that I saw via |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to do some more review of this, but have some comments so far
log.Info("No legacy receipts to migrate", "number", firstIdx) | ||
return nil | ||
} | ||
log.Info("First legacy receipt", "number", firstIdx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Afaict, this is the last time you make any use of firstIdx
: the rest of the code starts over from 0
and feeds it into the transformer
. And the transformer will (?) return stop=true
for the first empty receipt list, so I don't see how this can work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to double-check to see why its working :) so an empty receipt list is both legacy and non-legacy and transformer will ask is this legacy? it is so stop=false
. stop
will be true only for the first non-empty non-legacy receipt list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I was planning to pass firstIdx
to MigrateTable
so we could start from there but we also need to copy over the first empty receipts so I decided to keep MigrateTable
as is and only use firstIdx
to determine whether the db is in the old format.
@@ -101,6 +101,10 @@ func (t *table) Sync() error { | |||
return t.db.Sync() | |||
} | |||
|
|||
func (t *table) MigrateTable(kind string, fn TransformerFn) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring
@@ -111,6 +111,9 @@ type AncientWriter interface { | |||
|
|||
// Sync flushes all in-memory ancient store data to disk. | |||
Sync() error | |||
|
|||
// MigrateTable Processes and migrates entries in a table to a new format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document the semantics of the func([]byte)([]byte, bool, error)
Co-authored-by: Martin Holst Swende <[email protected]>
Considering that this will only ever be used by a very very small percentage of users, and only ever once, I think we should prefer simplicity over speed. So maybe let's aim for the cleanest possible way to do it.
|
Fair enough, I'll do a re-write which simply copies over all entries. |
Closing because of #24028 |
See #23245 and #23247 for context. Adds a command
db freezer-migrate
that goes through the list of receipts in the freezer and converts the legacy ones.