-
Notifications
You must be signed in to change notification settings - Fork 156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Overhead expecations #171
Comments
For reference, we're inserting 1000 blocks with one transaction per block, which amounts to around 150kb. |
That seems a little high, but only like a factor of 2 too high. See here for an approximate formula: https://github.com/casey/ord/pull/20#issuecomment-1003267185 I've fixed at least one issue on master where space was getting wasted |
Ah, nice, I missed that. Is that formula still correct when value sizes are large? I.e. if I insert a 1mib value, should I expect database usage to increase 3mib after insert? |
No. I think expectation would be more like 1.5x in that case. For a 1mib value it will allocate the next power of 2 page size and store your value in that. For small values it's more like 3x due to overhead |
So, when I run that benchmark it prints |
Ah yes, all that other stuff 😅
They're all sha-256 (or double sha-256, because Satoshi was paranoid) so 32 bytes. Is it a reasonable strategy to try to allocate values in close to power-of-two sizes? I could store a bunch of blocks together, say just under 128MiB, which would reduce the space overhead at least for that part of the data. |
I think this issue can be closed, although maybe documenting the expected overhead in the readme or design doc would be useful. |
I think I'll add some stats which people can query to see the actual overhead. I'm not sure I want to document the behavior, at least not right now, since the overhead is dependent on heuristics about when to merge pages |
Doesn't really feel worth it to me, but ya if you're tight on space then I would suggest a power-of-two minus half a page (2kb). There's no way that redb's overhead will exceed half a page :) |
I added some methods to retrieve the storage stats. Looks like it checks out against the formula I gave:
|
Sweet, this is awesome. |
Currently working on performance benchmarks, and we noticed that when inserting 150K bytes into the database required the database size to be 1MiB. How much overhead should we be expecting in this case?
Here's the current state of the database code and the corresponding benchmark
The text was updated successfully, but these errors were encountered: