Trim extra shred bytes in blockstore #16602
Conversation
Codecov Report
@@ Coverage Diff @@
## master #16602 +/- ##
=========================================
- Coverage 82.7% 82.7% -0.1%
=========================================
Files 414 414
Lines 115686 115702 +16
=========================================
+ Hits 95760 95765 +5
- Misses 19926 19937 +11
@sakridge - I ended up cherry-picking these commits into my own branch since they were failing CI before; it appears those failures were either unrelated issues that have since been fixed or non-deterministic failures. Regardless of RocksDB vs. AccountsDB, I think we still want this change to keep our backend datastore as slim as possible, right? Assuming so, any thoughts on additional testing that should be done to validate this PR?

Edit: I'm pushing one more commit to add a comment or two so CI will re-run, but everything had passed.
Yep, we still want it, exactly as you said, to keep the db as small as possible.
Here are the performance run results:
It does seem to be worse than the recent runs on master's latest (which are hitting 55k+ max TPS).
I think it might be within the noise of this measurement. You could compare the specific
lgtm
So I dug a little deeper on this and compared my run with the previous nightly runs. For the next two graphs, the first two humps are the previous two nightly runs; the third hump is my run. Here is insert: Digging deeper into the Grafana graphs, the data is pretty "jagged", so I think I agree that any observed difference is within the noise, and I'm happy if you are.
nit: |
Force-pushed from ea3f8dd to 66c9c61
Force-pushed from e9017be to 28f7ad8
Force-pushed from 35b4da4 to f25c9f5
Yeah, I think you're right. We already have another similar validity check in
Force-pushed from f25c9f5 to 40ed492
Strip the zero-padding off of data shreds before insertion into blockstore

Co-authored-by: Stephen Akridge <[email protected]>
Co-authored-by: Nathan Hawkins <[email protected]>
This is a partial backport of solana-labs#16602 to allow compatibility with that change.
* Zero pad data shreds on fetch from blockstore

This is a partial backport of #16602 to allow compatibility with that change.

* Remove size check and resize shreds to consistent length
Problem
Data shred bytestreams are currently inserted into the blockstore with their zero padding intact. Every data shred carries at least some zero padding because of a "restricted" section at the end of its payload:
https://github.com/solana-labs/solana/blob/master/ledger/src/shred.rs#L44-L50
Furthermore, data shreds that are not filled to capacity carry additional zero padding to fill out the fixed-size packet.
The result of these two items is that the blockstore is bloated with extraneous bytes.
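For illustration, here is a minimal sketch of where that padding comes from. The constant values and the helper name below are assumptions for the example only and do not necessarily match the real constants in ledger/src/shred.rs:

```rust
// Illustrative sizes only; the real values live in ledger/src/shred.rs.
const PACKET_DATA_SIZE: usize = 1228; // assumed serialized shred size
const SIZE_OF_DATA_SHRED_HEADERS: usize = 88; // assumed common + data header bytes

/// Hypothetical helper: number of trailing zero bytes a data shred carries
/// when it holds `data_len` bytes of actual ledger data.
fn zero_padding_len(data_len: usize) -> usize {
    // Everything past the headers and the real data is zero padding,
    // and today all of it is written to the blockstore.
    PACKET_DATA_SIZE - SIZE_OF_DATA_SHRED_HEADERS - data_len
}

fn main() {
    // A shred carrying only 200 bytes of data would store 940 bytes of
    // padding under these assumed sizes.
    println!("padding bytes: {}", zero_padding_len(200));
}
```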
Summary of Changes
Trim the extra padding bytes off of the shred on insertion, and "re-expand" the bytestream when retrieving it from the blockstore to avoid breaking the erasure coding algorithm (which requires that coding and data shreds be the same length).
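As a rough sketch of the approach (the constant and function names here are illustrative assumptions, not the actual blockstore API in this PR):

```rust
// Sketch only: `SHRED_PAYLOAD_SIZE`, `trim_for_insert`, and `expand_on_fetch`
// are hypothetical names used to illustrate the trim/re-expand idea.
const SHRED_PAYLOAD_SIZE: usize = 1228; // assumed fixed serialized shred length

/// On insertion: drop the trailing zero padding so only headers + real data
/// are written to the blockstore.
fn trim_for_insert(mut payload: Vec<u8>, data_end: usize) -> Vec<u8> {
    payload.truncate(data_end);
    payload
}

/// On retrieval: restore the fixed length so erasure recovery still sees
/// data and coding shreds of equal size.
fn expand_on_fetch(mut payload: Vec<u8>) -> Vec<u8> {
    payload.resize(SHRED_PAYLOAD_SIZE, 0); // re-append the zero padding
    payload
}

fn main() {
    // Round trip: a payload with 100 real bytes followed by zero padding.
    let original = {
        let mut v = vec![1u8; 100];
        v.resize(SHRED_PAYLOAD_SIZE, 0);
        v
    };
    let trimmed = trim_for_insert(original.clone(), 100); // what gets stored
    let restored = expand_on_fetch(trimmed); // what readers get back
    assert_eq!(original, restored);
}
```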
Fixes #16236