-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix range deletion tombstone ingestion with global seqno #6429
Fix range deletion tombstone ingestion with global seqno #6429
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ASSERT_OK(sst_file_writer.Open(file)); | ||
ASSERT_OK(sst_file_writer.DeleteRange(Key(0), Key(30))); | ||
ExternalSstFileInfo file_info; | ||
ASSERT_OK(sst_file_writer.Finish(&file_info)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just say ASSERT_OK(s); here, or even combine it with the previous line (ASSERT_OK(sst_file_writer.Finish(&file_info));).
Done.
|
||
for (int i = 5; i < 25; i++) { | ||
std::string res; | ||
ASSERT_TRUE(db_->Get(ReadOptions(), Key(i), &res).IsNotFound()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ASSERT_TRUE(Get(Key(i)).IsNotFound()) would be a bit cleaner here.
Done.
@@ -150,8 +150,7 @@ class BlockReadAmpBitmap { | |||
class Block { | |||
public: | |||
// Initialize the block with the specified contents. | |||
explicit Block(BlockContents&& contents, SequenceNumber _global_seqno, | |||
size_t read_amp_bytes_per_bit = 0, | |||
explicit Block(BlockContents&& contents, size_t read_amp_bytes_per_bit = 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here (not with the PR per se but with the pre-existing default parameter values) is that removing
_global_seqno
changes the meaning of code like this:Block reader1(std::move(contents), kDisableGlobalSequenceNumber); // kDisableGlobalSequenceNumber is now read_amp_bytes_per_bit !!!
We should review all places where
Block
s are constructed and fix any problems like this.
Good catch. I went through all occurrences of Block
and found a bunch. You can see the fixes in the "address comments" commit (35be04d).
HISTORY.md
Outdated
@@ -12,6 +12,7 @@ | |||
* BlobDB now ignores trivially moved files when updating the mapping between blob files and SSTs. This should mitigate issue #6338 where out of order flush/compaction notifications could trigger an assertion with the earlier code. | |||
* Batched MultiGet() ignores IO errors while reading data blocks, causing it to potentially continue looking for a key and returning stale results. | |||
* `WriteBatchWithIndex::DeleteRange` returns `Status::NotSupported`. Previously it returned success even though reads on the batch did not account for range tombstones. The corresponding language bindings now cannot be used. In C, that includes `rocksdb_writebatch_wi_delete_range`, `rocksdb_writebatch_wi_delete_range_cf`, `rocksdb_writebatch_wi_delete_rangev`, and `rocksdb_writebatch_wi_delete_rangev_cf`. In Java, that includes `WriteBatchWithIndex::deleteRange`. | |||
* Fix a bug where range tombstone blocks in ingested files were cached incorrectly during ingestion. If range tombstones were read from those incorrectly cached blocks, the keys they covered would be exposed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, could you mention the fix in HISTORY.md?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one minor comment. Thanks!
If we are writing a global seqno for an ingested file, the range tombstone metablock gets accessed and put into the cache during ingestion preparation. At the time, the global seqno of the ingested file has not yet been determined, so the cached block will not have a global seqno. When the file is ingested and we read its range tombstone metablock, it will be returned from the cache with no global seqno. In that case, we use the actual seqnos stored in the range tombstones, which are all zero, so the tombstones cover nothing. This commit removes global_seqno_ variable from Block. When iterating over a block, the global seqno for the block is determined by the iterator instead of storing this mutable attribute in Block. Additionally, this commit adds a regression test to check that keys are deleted when ingesting a file with a global seqno and range deletion tombstones.
35be04d
to
380eade
Compare
@ajkr has updated the pull request. Re-import the pull request |
Thanks for the review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
…11252) Summary: CreateColumnFamilyWithImport() did not support range tombstones for two reasons: 1. it uses point keys of a input file to determine its boundary (smallest and largest internal key), which means range tombstones outside of the point key range will be effectively dropped. 2. it does not handle files with no point keys. Also included a fix in external_sst_file_ingestion_job.cc where the blocks read in `GetIngestedFileInfo()` can be added to block cache now (issue fixed in #6429). This PR adds support for exporting and importing column family with range tombstones. The main change is to add smallest internal key and largest internal key to `SstFileMetaData` that will be part of the output of `ExportColumnFamily()`. Then during `CreateColumnFamilyWithImport(...,const ExportImportFilesMetaData& metadata,...)`, file boundaries can be set from `metadata` directly. This is needed since when file boundaries are extended by range tombstones, sometimes they cannot be deduced from a file's content alone. Pull Request resolved: #11252 Test Plan: - added unit tests that fails before this change Closes #11245 Reviewed By: ajkr Differential Revision: D43577443 Pulled By: cbi42 fbshipit-source-id: 6bff78e583cc50c44854994dea0a8dd519398f2f
Original author: @jeffrey-xiao
If we are writing a global seqno for an ingested file, the range
tombstone metablock gets accessed and put into the cache during
ingestion preparation. At the time, the global seqno of the ingested
file has not yet been determined, so the cached block will not have a
global seqno. When the file is ingested and we read its range tombstone
metablock, it will be returned from the cache with no global seqno. In
that case, we use the actual seqnos stored in the range tombstones,
which are all zero, so the tombstones cover nothing.
This commit removes global_seqno_ variable from Block. When iterating
over a block, the global seqno for the block is determined by the
iterator instead of storing this mutable attribute in Block.
Additionally, this commit adds a regression test to check that keys are
deleted when ingesting a file with a global seqno and range deletion
tombstones.