Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/metamorphic: test block property filters #1725

Closed
jbowens opened this issue May 24, 2022 · 4 comments · Fixed by #1737
Closed

internal/metamorphic: test block property filters #1725

jbowens opened this issue May 24, 2022 · 4 comments · Fixed by #1737
Assignees

Comments

@jbowens
Copy link
Collaborator

jbowens commented May 24, 2022

The block-property filter codepaths don't get coverage in the metamorphic tests. Now that the metamorphic tests use keys with suffixes, we could add a block property collector and filter that mirrors the behavior of CockroachDB's time-bound iteration.

@jbowens
Copy link
Collaborator Author

jbowens commented May 24, 2022

Since this is a source of nondeterminism, this would require some interstitial iterator that skips over keys that match the block property's filtering condition but aren't eliminated by the block-property filter itself.

@jbowens jbowens self-assigned this May 25, 2022
@jbowens
Copy link
Collaborator Author

jbowens commented May 26, 2022

Range deletions are an obstacle here and to #1713. The following assumes reverse direction, but the problem exists in the forward direction too. Suppose you have a file [a,SET#2, z,SET#2] that contains the range deletion [b,y)#1. The containing level iterator is seeked using SeekLT(d). Suppose all of the file's point keys ≤d are filtered by block-property filters or range-key masking.

The levelIter.SeekLT(d) call will immediately find that the point iterator is exhausted, because all the point keys are skipped. Without any changes, the levelIter will skip right past the file when it sees that the smallest boundary is not a RANGEDEL (level_iter.go:712-715). We need some way to keep the file and the the file's range del iterator open until the other levels no longer need it. Elsewhere, we return some boundary key to ensure that the merging iterator keeps it open. But what would we use as a boundary key here?

  1. The ideal would be to return the smallest range deletion start in the file, ensuring we keep the file open until we exhaust the bounds of the files range deletions. We don't have this available on the *fileMetadata. We could seek the rangeDelIter to its beginning to retrieve it, but the mergingIter is using the range deletion iterator and likely has a range deletion Span from this iterator stored on l.tombstone. Repositioning the range deletion iterator would invalidate l.tombstone (specifically overwriting its Span.Keys slice).
  2. We could return a synthetic sentinel key, constructed using the file's smallest user key. However, this would be outside the bounds of the file. This isn't a problem in practice if user keys are never split across multiple sstables within a level, but we don't yet have that guarantee for existing files.
  3. We could return the file's smallest key as-is, with the recognition that low-level filtering like block-propety filtering and range-key masking is best effort only. We'd need to be able to differentiate between the case that we already returned l.iterFile.Smallest and the case that l.iterFile.Smallest was skipped by a low-level filter. We could begin saving returned point keys to a l.iterKey field, and then use it to compare against the smallest field when the point iter is exhausted and a range del iterator is present. We wouldn't have the value corresponding to the smallest key available though. We would need to rely on the higher level filtering to discard this value-less key (risky).
  4. ???

Another (daunting) thought is to refactor the way this mergingIter<>levelIter range deletion interaction works more generally by using the interleaving iterator to keep the table's point iterator and the range deletion iterator moving together within the levelIter. I think this would require an extension to the interleaving iterator to allow interleaving of span end keys too.

@sumeerbhola
Copy link
Collaborator

This is indeed tricky.

I think it is worth explicitly articulating what we are trying to do here with block property filters: block property filters were completely non-deterministic in that they could (a) expose points that should have been filtered, (b) expose points that have already been deleted. The goal here is to eliminate (b) so that the false positives in (a) can be removed simply by applying the filter again. Let's call this truthful-bpf.
There are 2 causes of (b)

  • rangedels: block property filters are oblivious to rangedels so block property filtering when aggregated to the file level can skip files containing relevant rangedels. For truthful-bpf iteration we are not going to skip files that have a rangedel block.
  • dels: block property filters can use both keys and values in constructing the filter. If the value is used, the corresponding del may not be seen. So for truthful-bbp iteration, the filters need to restrict themselves to the key.

Now coming back to this pausing problem in levelIter. I had to refresh my memory by reading the code, so I am writing down a summary of my understanding. All the pausing we do is to make sure rangedels continue to be alive until they are needed. Our current pausing is:

  • Pausing when reached iterator bounds:

    • Set *isSyntheticIterBoundsKey. This allows mergingIter to realize it is exhausted when such a key is the top of the heap.
    • Use InternalKeyRangeDeleteSentinel, since sorts before all point keys with the same user key. So can be used to transform exclusive iter upper bound to inclusive -- will be seen after all relevant point keys have been seen during forward iteration. And works for inclusive iter lower bound since during reverse iteration will be seen after the relevant keys have been seen. Iteration will stop for the levelIter since this is the iterator bounds -- so we don't need to worry about these pauses using keys that overlap with adjacent files in the same level.
  • Pausing when reach a file bound that is a rangedel: mergingIter skips these keys since it ignores based on InternalKeyKindRangeDelete.

  • Pausing for SeekPrefixGE: This also uses InternalKeyKindRangeDelete.

    • Use file upper bound (inclusive) with InternalKeyKindRangeDelete. It is possible the upperbound has the same userkey as the key in SeekPrefixGE(prefix, key, ...).
      • If the file upperbound already happened to be a rangedel, this is a noop.
      • Say file upperbound is c.SET#50. That did not match SeekPrefixGE. There may be a rangedel of the form [a, c)#x. By using c.RANGEDEL#50 as the pausing point we will make sure the rangedel is effective. We are moving to the next file before c.SET#50, but that is ok since we know c.SET#50 did not match. By choosing c.RANGEDEL#50 as the last key exposed by this file, we also ensure that there is no overlap with the next file, which may contain c.SET#49.

Our problem with truthful-bpf iteration resembles SeekPrefixGE but has some complications:

  • SeekPrefixGE is the absolute positioning method. So it knows c.SET#50 has not been seen by the user so it is safe to expose c.RANGEDEL#50 even though it sorts before c.SET#50.
  • We have to deal with reverse iteration.

These are basically a terse restatement of what you outlined.

Solution outline:

  • Expose up to levelIter (from the sstable iterator) whether Seek* or Next/Prev that exhausted the file skipped some data blocks.
  • If skipped some data blocks and the file is exhausted and the levelIter is iterating in:
    • forward direction: The userkey corresponding to the file largest could not have been exposed since truthful-bpf is restricted to keys. When file largest is not a rangedel, do the same thing as SeekPrefixGE does.
    • reverse direction: when file smallest is not a rangedel, we need to construct an InternalKey that is >= this file's smallest key (so it doesn't overlap with the prev file), and that will be ignored by mergingIter. The difficult case is this file having the smallest as a.SET#50 and the prev file having largest as a.DEL#50. There is no gap to squeeze something in.

We can fix this no-gap problem by explicitly introducing a isIgnorableSyntheticIterFileBoundsKey bool that is visible to mergingIter and have mergingIter ignore such keys regardless of the type. We would use it for both SeekPrefixGE and the 2 case above and simply use the smallest and largest keys with this bool set, instead of doing tricks with InternalKeyKindRangeDelete.

jbowens added a commit to jbowens/pebble that referenced this issue Jun 3, 2022
Add test coverage for block-property filters to the metamorphic tests.
Division of keys into blocks are nondeterministic, so this introduces
nondeterminism into Iterator behavior. To account for this, the metamorphic
test's iterator wrapper is adjusted to skip any keys that are eligible for
filtering.

Fix cockroachdb#1725.
jbowens added a commit to jbowens/pebble that referenced this issue Jun 3, 2022
Add test coverage for block-property filters to the metamorphic tests.
Division of keys into blocks are nondeterministic, so this introduces
nondeterminism into Iterator behavior. To account for this, the metamorphic
test's iterator wrapper is adjusted to skip any keys that are eligible for
filtering.

Fix cockroachdb#1725.
@jbowens
Copy link
Collaborator Author

jbowens commented Jun 6, 2022

Thanks @sumeerbhola, I implemented the bool-signalling approach in #1737.

jbowens added a commit to jbowens/pebble that referenced this issue Jun 29, 2022
Add test coverage for block-property filters to the metamorphic tests.
Division of keys into blocks are nondeterministic, so this introduces
nondeterminism into Iterator behavior. To account for this, the metamorphic
test's iterator wrapper is adjusted to skip any keys that are eligible for
filtering.

Fix cockroachdb#1725.
jbowens added a commit to jbowens/pebble that referenced this issue Jun 30, 2022
Add test coverage for block-property filters to the metamorphic tests.
Division of keys into blocks are nondeterministic, so this introduces
nondeterminism into Iterator behavior. To account for this, the metamorphic
test's iterator wrapper is adjusted to skip any keys that are eligible for
filtering.

Fix cockroachdb#1725.
jbowens added a commit to jbowens/pebble that referenced this issue Jun 30, 2022
Add test coverage for block-property filters to the metamorphic tests.
Division of keys into blocks are nondeterministic, so this introduces
nondeterminism into Iterator behavior. To account for this, the metamorphic
test's iterator wrapper is adjusted to skip any keys that are eligible for
filtering.

Fix cockroachdb#1725.
jbowens added a commit to jbowens/pebble that referenced this issue Jul 1, 2022
Add test coverage for block-property filters to the metamorphic tests.
Division of keys into blocks are nondeterministic, so this introduces
nondeterminism into Iterator behavior. To account for this, the metamorphic
test's iterator wrapper is adjusted to skip any keys that are eligible for
filtering.

Fix cockroachdb#1725.
jbowens added a commit to jbowens/pebble that referenced this issue Jul 6, 2022
Add test coverage for block-property filters to the metamorphic tests.
Division of keys into blocks are nondeterministic, so this introduces
nondeterminism into Iterator behavior. To account for this, the metamorphic
test's iterator wrapper is adjusted to skip any keys that are eligible for
filtering.

Fix cockroachdb#1725.
jbowens added a commit to jbowens/pebble that referenced this issue Jul 7, 2022
Add test coverage for block-property filters to the metamorphic tests.
Division of keys into blocks are nondeterministic, so this introduces
nondeterminism into Iterator behavior. To account for this, the metamorphic
test's iterator wrapper is adjusted to skip any keys that are eligible for
filtering.

Fix cockroachdb#1725.
jbowens added a commit to jbowens/pebble that referenced this issue Jul 14, 2022
Add test coverage for block-property filters to the metamorphic tests.
Division of keys into blocks are nondeterministic, so this introduces
nondeterminism into Iterator behavior. To account for this, the metamorphic
test's iterator wrapper is adjusted to skip any keys that are eligible for
filtering.

Fix cockroachdb#1725.
jbowens added a commit that referenced this issue Jul 18, 2022
Add test coverage for block-property filters to the metamorphic tests.
Division of keys into blocks are nondeterministic, so this introduces
nondeterminism into Iterator behavior. To account for this, the metamorphic
test's iterator wrapper is adjusted to skip any keys that are eligible for
filtering.

Fix #1725.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants