-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Block per key-value checksum #11287
Block per key-value checksum #11287
Conversation
1cc86ff
to
87a559e
Compare
@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
48197de
to
3802ba1
Compare
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
1 similar comment
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
0a0803f
to
d5c68f5
Compare
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the complete test coverage and complete argument passing (that is, no default values for the new parameters)
I have a few files remaining to review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Congrats on finishing this huge feature
d5c68f5
to
9d8e70a
Compare
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
2 similar comments
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! I've update the PR summary with new benchmark results, should be good to review again.
@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@@ -564,12 +564,13 @@ Status VersionEditHandler::LoadTables(ColumnFamilyData* cfd, | |||
assert(builder_iter->second != nullptr); | |||
VersionBuilder* builder = builder_iter->second->version_builder(); | |||
assert(builder); | |||
const MutableCFOptions* moptions = cfd->GetLatestMutableCFOptions(); | |||
Status s = builder->LoadTableHandlers( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just so you know : potential merge conflict with #11288
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM again
@@ -1125,6 +1284,7 @@ size_t Block::ApproximateMemoryUsage() const { | |||
if (read_amp_bitmap_) { | |||
usage += read_amp_bitmap_->ApproximateMemoryUsage(); | |||
} | |||
usage += checksum_size_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for remembering this
6c0407d
to
eb622b1
Compare
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
eb622b1
to
7ce6820
Compare
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
7ce6820
to
240b522
Compare
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary: add option
block_protection_bytes_per_key
and implementation for block per key-value checksum. The main changes areblock_protection_bytes_per_key
around (mainly for methods defined in table_cache.h)Tests:
python3 tools/db_crashtest.py blackbox --simple --block_protection_bytes_per_key=1 --write_buffer_size=1048576
Follow up (maybe as a separate PR): make sure corruption status returned from BlockIters are correctly handled.
Performance:
Turning on block per KV protection has a non-trivial negative impact on read performance and costs additional memory.
For memory, each block includes additional 24 bytes for checksum-related states beside checksum itself. For CPU, I set up a DB of size ~1.2GB with 5M keys (32 bytes key and 200 bytes value) which compacts to ~5 SST files (target file size 256 MB) in L6 without compression. I tested readrandom performance with various block cache size (to mimic various cache hit rates):
The benchmark has a lot of variance, but there was a 5% to 25% regression in this benchmark with different cache hit rates.