-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: Compare can return incorrect results when only suffixes are compared #127914
Labels
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Comments
RaduBerinde
added
the
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
label
Jul 30, 2024
Hi @RaduBerinde, please add branch-* labels to identify which branch(es) this C-bug affects. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
nicktrav
moved this from Incoming
to In Progress (this milestone)
in [Deprecated] Storage
Jul 30, 2024
RaduBerinde
added a commit
to RaduBerinde/pebble
that referenced
this issue
Jul 31, 2024
We currently use `Compare` to compare both keys and bare suffixes. This makes the implementation more complicated as it has to handle both cases. It also prevents adding many useful assertions - for example, we can't verify that `Compare` does byte-wise comparison of prefixes because we can't unconditionally call `Split`. We add a mandatory `CompareSuffixes` method that is used when comparing bare suffixes. The result of `Compare` can be defined completely in terms of `Split` and `CompareSuffixes`; `Compare` now has a default implementation. We also add a `CheckComparer` testing facility that can be used to verify `Split` and `Compare` on combinations of prefixes and suffixes. We use it to add a test to all `Comparer`s and it will be used from CRDB code as well. Informs cockroachdb/cockroach#127914
RaduBerinde
added a commit
to RaduBerinde/pebble
that referenced
this issue
Jul 31, 2024
We currently use `Compare` to compare both keys and bare suffixes. This makes the implementation more complicated as it has to handle both cases. It also prevents adding many useful assertions - for example, we can't verify that `Compare` does byte-wise comparison of prefixes because we can't unconditionally call `Split`. We add a mandatory `CompareSuffixes` method that is used when comparing bare suffixes. The result of `Compare` can be defined completely in terms of `Split` and `CompareSuffixes`; `Compare` now has a default implementation. We also add a `CheckComparer` testing facility that can be used to verify `Split` and `Compare` on combinations of prefixes and suffixes. We use it to add a test to all `Comparer`s and it will be used from CRDB code as well. Informs cockroachdb/cockroach#127914
RaduBerinde
added a commit
to RaduBerinde/pebble
that referenced
this issue
Jul 31, 2024
We currently use `Compare` to compare both keys and bare suffixes. This makes the implementation more complicated as it has to handle both cases. It also prevents adding many useful assertions - for example, we can't verify that `Compare` does byte-wise comparison of prefixes because we can't unconditionally call `Split`. We add a mandatory `CompareSuffixes` method that is used when comparing bare suffixes. The result of `Compare` can be defined completely in terms of `Split` and `CompareSuffixes`; `Compare` now has a default implementation. We also add a `CheckComparer` testing facility that can be used to verify `Split` and `Compare` on combinations of prefixes and suffixes. We use it to add a test to all `Comparer`s and it will be used from CRDB code as well. Informs cockroachdb/cockroach#127914
RaduBerinde
added a commit
to RaduBerinde/pebble
that referenced
this issue
Jul 31, 2024
We currently use `Compare` to compare both keys and bare suffixes. This makes the implementation more complicated as it has to handle both cases. It also prevents adding many useful assertions - for example, we can't verify that `Compare` does byte-wise comparison of prefixes because we can't unconditionally call `Split`. We add a mandatory `CompareSuffixes` method that is used when comparing bare suffixes. The result of `Compare` can be defined completely in terms of `Split` and `CompareSuffixes`; `Compare` now has a default implementation. We also add a `CheckComparer` testing facility that can be used to verify `Split` and `Compare` on combinations of prefixes and suffixes. We use it to add a test to all `Comparer`s and it will be used from CRDB code as well. Informs cockroachdb/cockroach#127914
RaduBerinde
added a commit
to RaduBerinde/pebble
that referenced
this issue
Jul 31, 2024
We currently use `Compare` to compare both keys and bare suffixes. This makes the implementation more complicated as it has to handle both cases. It also prevents adding many useful assertions - for example, we can't verify that `Compare` does byte-wise comparison of prefixes because we can't unconditionally call `Split`. We add a mandatory `CompareSuffixes` method that is used when comparing bare suffixes. The result of `Compare` can be defined completely in terms of `Split` and `CompareSuffixes`; `Compare` now has a default implementation. We also add a `CheckComparer` testing facility that can be used to verify `Split` and `Compare` on combinations of prefixes and suffixes. We use it to add a test to all `Comparer`s and it will be used from CRDB code as well. Informs cockroachdb/cockroach#127914
RaduBerinde
added a commit
to cockroachdb/pebble
that referenced
this issue
Jul 31, 2024
We currently use `Compare` to compare both keys and bare suffixes. This makes the implementation more complicated as it has to handle both cases. It also prevents adding many useful assertions - for example, we can't verify that `Compare` does byte-wise comparison of prefixes because we can't unconditionally call `Split`. We add a mandatory `CompareSuffixes` method that is used when comparing bare suffixes. The result of `Compare` can be defined completely in terms of `Split` and `CompareSuffixes`; `Compare` now has a default implementation. We also add a `CheckComparer` testing facility that can be used to verify `Split` and `Compare` on combinations of prefixes and suffixes. We use it to add a test to all `Comparer`s and it will be used from CRDB code as well. Informs cockroachdb/cockroach#127914
RaduBerinde
added a commit
to RaduBerinde/cockroach
that referenced
this issue
Aug 1, 2024
The `EngineComparer` has a bug when comparing bare suffixes - we call the normalization function on a slice without the sentinel byte; that function's logic is based on length so it will not function properly. Currently, I don't think this bug can manifest in any visible problems in practice - we compare bare suffixes only for range keys and those suffixes never have a logical component or synthetic bit. It is conceivable that we compare those with some actual key suffix (e.g. when applying masking) but that would only matter if a range key prefix and an existing key in that range have the same timestamp, which should never happen. We now have a separate `CompareSuffixes` function which is used for bare suffixes. This simplifies the implementation, allows us to add more assertions, and allows testing the `Compare` function (since the result can be derived from `Split` and `CompareSuffixes`). We also improve the comparer testing to include these special cases of version bytes that must be ignored, and we use the new Pebble `CheckComparer` test to verify the internal consistency of the comparer. Thanks to the more strict checks, several cases of passing `roachpb.Key`s unencoded (i.e. without the sentinel byte) were found and fixed. Pebble changes: * [`1b22d0b7`](cockroachdb/pebble@1b22d0b7) base: add CompareSuffixes to Comparer * [`bcbe8d78`](cockroachdb/pebble@bcbe8d78) keyspan: use slices.Sort functions * [`8af904fa`](cockroachdb/pebble@8af904fa) sstable/block: move Compress, Decompress, DecompressedLen * [`21672e31`](cockroachdb/pebble@21672e31) sstable/block: move blockType to block.CompressionIndicator * [`c73ffdb8`](cockroachdb/pebble@c73ffdb8) sstable/block: move Compression enum from sstable * [`e3782de2`](cockroachdb/pebble@e3782de2) colblk: add keyspan block writer, reader, iterator * [`ad3a2f0b`](cockroachdb/pebble@ad3a2f0b) manifest: fix incorrect Annotator pointer aggregation * [`4589d053`](cockroachdb/pebble@4589d053) sstable: implement Writer.DeleteRange using RawWriter.EncodeSpan * [`6e60d6ba`](cockroachdb/pebble@6e60d6ba) internal/{rangedel,rangekey}: take Span by value * [`0ef38b39`](cockroachdb/pebble@0ef38b39) sstable: remove WriterOption * [`9cbad0b5`](cockroachdb/pebble@9cbad0b5) db: Add BenchmarkPointDeletedSwath * [`e326a015`](cockroachdb/pebble@e326a015) sstable: clean up use-filter-block logic * [`ce28942d`](cockroachdb/pebble@ce28942d) sstable: twoLevelIterator: unembed singleLevelIterator * [`b6400d63`](cockroachdb/pebble@b6400d63) sstable: rename NewIterWithBlockPropertyFiltersAndContextEtc * [`a6580b19`](cockroachdb/pebble@a6580b19) sstable: remove compaction iterator wrappers * [`1cc8f4c9`](cockroachdb/pebble@1cc8f4c9) sstable: remove NewIterWithBlockPropertyFilters * [`3fded19e`](cockroachdb/pebble@3fded19e) sstable: improve TrivialReaderProvider * [`034fd8ef`](cockroachdb/pebble@034fd8ef) db: remove IterOptions.TableFilter * [`cde3df0f`](cockroachdb/pebble@cde3df0f) manifest: clean up Level * [`368636f0`](cockroachdb/pebble@368636f0) sstable: show filenum when metadata is corrupted * [`f8e117d3`](cockroachdb/pebble@f8e117d3) db: fix missing close in TestOverlappingIngestedSSTs * [`d3209700`](cockroachdb/pebble@d3209700) deps: update snappy to 43d5d4c * [`f0a0e2bb`](cockroachdb/pebble@f0a0e2bb) sstable: split Writer into Writer and RawWriter * [`ed42fb43`](cockroachdb/pebble@ed42fb43) db: don't load large filter blocks for flushable ingests * [`ca2b210f`](cockroachdb/pebble@ca2b210f) db: strictly enforce prefix in [flushable]BatchIter.SeekPrefixGE * [`44e5fb43`](cockroachdb/pebble@44e5fb43) db: store *Comparer on Batch * [`e36d078a`](cockroachdb/pebble@e36d078a) manifest: improve `Annotator` interface with generics * [`1cedb60f`](cockroachdb/pebble@1cedb60f) sstable: don't skipBackward past empty blocks with hideObsolete * [`4cf4f852`](cockroachdb/pebble@4cf4f852) colblk: clarify PrefixBytes comments * [`2752abb9`](cockroachdb/pebble@2752abb9) keyspan: add contexts to keyspan iterators * [`54e5ba64`](cockroachdb/pebble@54e5ba64) db: use testkeys comparer in TestTracing * [`0e232926`](cockroachdb/pebble@0e232926) db: convert TestTracing to datadriven * [`89e5644a`](cockroachdb/pebble@89e5644a) db: Add BenchmarkQueueWorkload * [`72c3f550`](cockroachdb/pebble@72c3f550) base: make Compare more restrictive * [`4f01ec48`](cockroachdb/pebble@4f01ec48) sstable: plumb context to range iterator constructors * [`de0e731a`](cockroachdb/pebble@de0e731a) colblk: add facility for cloning an Array to a []T * [`a2f6d39a`](cockroachdb/pebble@a2f6d39a) colblk: randomize BundleSizes in block writer randomized test * [`b26e6326`](cockroachdb/pebble@b26e6326) colblk: implement a common interface across column data accessors * [`6492e866`](cockroachdb/pebble@6492e866) colblk: make decoder function signatures consistent * [`dbf9ab67`](cockroachdb/pebble@dbf9ab67) compact: remove unused parameter in pickAutoLPositive Release note: none. Fixes: cockroachdb#127914
craig bot
pushed a commit
that referenced
this issue
Aug 1, 2024
128043: storage: bump Pebble and fix Comparer r=RaduBerinde a=RaduBerinde The `EngineComparer` has a bug when comparing bare suffixes - we call the normalization function on a slice without the sentinel byte; that function's logic is based on length so it will not function properly. Currently, I don't think this bug can manifest in any visible problems in practice - we compare bare suffixes only for range keys and those suffixes never have a logical component or synthetic bit. It is conceivable that we compare those with some actual key suffix (e.g. when applying masking) but that would only matter if a range key prefix and an existing key in that range have the same timestamp, which should never happen. We now have a separate `CompareSuffixes` function which is used for bare suffixes. This simplifies the implementation, allows us to add more assertions, and allows testing the `Compare` function (since the result can be derived from `Split` and `CompareSuffixes`). We also improve the comparer testing to include these special cases of version bytes that must be ignored, and we use the new Pebble `CheckComparer` test to verify the internal consistency of the comparer. Thanks to the more strict checks, several cases of passing `roachpb.Key`s unencoded (i.e. without the sentinel byte) were found and fixed. Pebble changes: * [`1b22d0b7`](cockroachdb/pebble@1b22d0b7) base: add CompareSuffixes to Comparer * [`bcbe8d78`](cockroachdb/pebble@bcbe8d78) keyspan: use slices.Sort functions * [`8af904fa`](cockroachdb/pebble@8af904fa) sstable/block: move Compress, Decompress, DecompressedLen * [`21672e31`](cockroachdb/pebble@21672e31) sstable/block: move blockType to block.CompressionIndicator * [`c73ffdb8`](cockroachdb/pebble@c73ffdb8) sstable/block: move Compression enum from sstable * [`e3782de2`](cockroachdb/pebble@e3782de2) colblk: add keyspan block writer, reader, iterator * [`ad3a2f0b`](cockroachdb/pebble@ad3a2f0b) manifest: fix incorrect Annotator pointer aggregation * [`4589d053`](cockroachdb/pebble@4589d053) sstable: implement Writer.DeleteRange using RawWriter.EncodeSpan * [`6e60d6ba`](cockroachdb/pebble@6e60d6ba) internal/{rangedel,rangekey}: take Span by value * [`0ef38b39`](cockroachdb/pebble@0ef38b39) sstable: remove WriterOption * [`9cbad0b5`](cockroachdb/pebble@9cbad0b5) db: Add BenchmarkPointDeletedSwath * [`e326a015`](cockroachdb/pebble@e326a015) sstable: clean up use-filter-block logic * [`ce28942d`](cockroachdb/pebble@ce28942d) sstable: twoLevelIterator: unembed singleLevelIterator * [`b6400d63`](cockroachdb/pebble@b6400d63) sstable: rename NewIterWithBlockPropertyFiltersAndContextEtc * [`a6580b19`](cockroachdb/pebble@a6580b19) sstable: remove compaction iterator wrappers * [`1cc8f4c9`](cockroachdb/pebble@1cc8f4c9) sstable: remove NewIterWithBlockPropertyFilters * [`3fded19e`](cockroachdb/pebble@3fded19e) sstable: improve TrivialReaderProvider * [`034fd8ef`](cockroachdb/pebble@034fd8ef) db: remove IterOptions.TableFilter * [`cde3df0f`](cockroachdb/pebble@cde3df0f) manifest: clean up Level * [`368636f0`](cockroachdb/pebble@368636f0) sstable: show filenum when metadata is corrupted * [`f8e117d3`](cockroachdb/pebble@f8e117d3) db: fix missing close in TestOverlappingIngestedSSTs * [`d3209700`](cockroachdb/pebble@d3209700) deps: update snappy to 43d5d4c * [`f0a0e2bb`](cockroachdb/pebble@f0a0e2bb) sstable: split Writer into Writer and RawWriter * [`ed42fb43`](cockroachdb/pebble@ed42fb43) db: don't load large filter blocks for flushable ingests * [`ca2b210f`](cockroachdb/pebble@ca2b210f) db: strictly enforce prefix in [flushable]BatchIter.SeekPrefixGE * [`44e5fb43`](cockroachdb/pebble@44e5fb43) db: store *Comparer on Batch * [`e36d078a`](cockroachdb/pebble@e36d078a) manifest: improve `Annotator` interface with generics * [`1cedb60f`](cockroachdb/pebble@1cedb60f) sstable: don't skipBackward past empty blocks with hideObsolete * [`4cf4f852`](cockroachdb/pebble@4cf4f852) colblk: clarify PrefixBytes comments * [`2752abb9`](cockroachdb/pebble@2752abb9) keyspan: add contexts to keyspan iterators * [`54e5ba64`](cockroachdb/pebble@54e5ba64) db: use testkeys comparer in TestTracing * [`0e232926`](cockroachdb/pebble@0e232926) db: convert TestTracing to datadriven * [`89e5644a`](cockroachdb/pebble@89e5644a) db: Add BenchmarkQueueWorkload * [`72c3f550`](cockroachdb/pebble@72c3f550) base: make Compare more restrictive * [`4f01ec48`](cockroachdb/pebble@4f01ec48) sstable: plumb context to range iterator constructors * [`de0e731a`](cockroachdb/pebble@de0e731a) colblk: add facility for cloning an Array to a []T * [`a2f6d39a`](cockroachdb/pebble@a2f6d39a) colblk: randomize BundleSizes in block writer randomized test * [`b26e6326`](cockroachdb/pebble@b26e6326) colblk: implement a common interface across column data accessors * [`6492e866`](cockroachdb/pebble@6492e866) colblk: make decoder function signatures consistent * [`dbf9ab67`](cockroachdb/pebble@dbf9ab67) compact: remove unused parameter in pickAutoLPositive Release note: none. Fixes: #127914 Co-authored-by: Radu Berinde <[email protected]>
github-project-automation
bot
moved this from In Progress (this milestone)
to Done
in [Deprecated] Storage
Aug 1, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
The Compare code below handles the case where only suffixes are compared. However,
normalizeEngineKeyVersionForCompare
will later be called without the sentinel byte and it will not function properly, since its logic is based on the length.cockroach/pkg/storage/pebble.go
Line 414 in e8e4cce
This could in principle result in an incorrect result when a suffix has the synthetic bit or zero logical timestamp. In practice, it might not be an issue - we compare "bare" suffixes for range keys and range keys all use just the 8-byte wall time which is already normalized.
Jira issue: CRDB-40669
The text was updated successfully, but these errors were encountered: