-
Notifications
You must be signed in to change notification settings - Fork 476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
internal/metamorphic: add block-property filter coverage #1737
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 15 files reviewed, 1 unresolved discussion
-- commits
line 5 at r4:
I swear I checked for this; but somehow they slipped through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haven't looked at the last commit yet. Generally looks good.
Reviewed 2 of 2 files at r1, 5 of 5 files at r2, 3 of 5 files at r3, 2 of 8 files at r5.
Reviewable status: 9 of 15 files reviewed, 10 unresolved discussions (waiting on @jbowens and @sumeerbhola)
level_iter.go
line 79 at r5 (raw file):
// iterators that perform filtering of keys. When a new file's iterator is // opened, it's tested to see if it implements filteredIter. If it does, // it's stored here to allow the level iterator recognize when keys were
to recognize ...
level_iter.go
line 680 at r5 (raw file):
// possible the file's range deletions are still relevant to other // levels. Return the largest boundary as a special ignorable marker // to avoid advancing to the next file.
I think its worth saying in a code comment here something like: it is safe to return l.iterFile.Largest since it must have been skipped, and therefore must be > what we have previously returned when iterating in this direction, and so the monotonicity required by the mergingIter heap is not violated.
level_iter.go
line 750 at r5 (raw file):
// possible the file's range deletions are still relevant to other // levels. Return the smallest boundary as a special ignorable key // to avoid advancing to the next file.
ditto
merging_iter.go
line 48 at r2 (raw file):
smallestUserKey, largestUserKey []byte // isLargestUserKeyRangeDelSentinel is set to true when a file's largest // boundary is an exclusive range deletion sentinel file. If true, the file
sentinel.
table_cache.go
line 400 at r5 (raw file):
// interface, so that the level iterator surfaces file boundaries when // range deletions are present. return filteredAll, rangeDelIter, err
would it be slightly more efficient if we returned filteredAll only if rangeDelIter was non-nil?
sstable/reader.go
line 96 at r5 (raw file):
// any keys due to block-property filters. This is used by the Pebble // levelIter to control when an iterator steps to the next sstable. FilteredAnyKeys() bool
this change could use some datadriven testing.
sstable/reader.go
line 219 at r5 (raw file):
// filteredKeys indicates whether the last iterator positioning operation // skipped any keys due to block-property filters. It is exposed through // FilteredAnyKeys and used by the level iterator.
for naming consistency, can you call this filteredAnyKeys
.
sstable/reader.go
line 527 at r5 (raw file):
} i.exhaustedBounds = 0 i.filteredKeys = false
This makes me nervous, in that I think this happens to be correct because of the specifics of the logic in seekGEHelper
that tries to use the current iterator position and avoids doing a seek.
But in general I am worried that if we have previously stopped because the block was irrelevant, it could also include keys that are relevant to this seek position. And so if we now decide to not do a full seek (which is correct) we may erroneously return filteredKey=false, when it is in fact true.
Maybe one way to build more confidence is to establish an invariant on these next optimizations for the seeks e.g. do we ever try to use a next optimization if the last positioning call (step or seek) returned nothing because of loadBlockIrrelevant
. I think not, because we seem to condition on !data.isDataInvalidated()
and on !i.index.isDataInvalidated()
, but I would appreciate you taking a closer look.
sstable/reader.go
line 695 at r5 (raw file):
func (i *singleLevelIterator) SeekLT(key []byte) (*InternalKey, []byte) { i.exhaustedBounds = 0 i.filteredKeys = false
same worry here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, I'm assuming you don't want to narrow the block property collector interface to only allow keys. I'm a bit torn about this -- I can see some utility to a future where values also contribute.
Either way, I think we could use a better version of the "truthful-bpf" comment from #1725 (comment) in block_property.go, to lay out the meaning of correctness.
Reviewable status: 9 of 15 files reviewed, 10 unresolved discussions (waiting on @jbowens and @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, yeah, I added a comment to block_property.go describing this 'deterministic' block-property filtering.
Reviewable status: 1 of 16 files reviewed, 10 unresolved discussions (waiting on @jbowens and @sumeerbhola)
level_iter.go
line 79 at r5 (raw file):
Previously, sumeerbhola wrote…
to recognize ...
Done.
level_iter.go
line 680 at r5 (raw file):
Previously, sumeerbhola wrote…
I think its worth saying in a code comment here something like: it is safe to return l.iterFile.Largest since it must have been skipped, and therefore must be > what we have previously returned when iterating in this direction, and so the monotonicity required by the mergingIter heap is not violated.
Done.
level_iter.go
line 750 at r5 (raw file):
Previously, sumeerbhola wrote…
ditto
Done.
merging_iter.go
line 48 at r2 (raw file):
Previously, sumeerbhola wrote…
sentinel.
Done.
table_cache.go
line 400 at r5 (raw file):
Previously, sumeerbhola wrote…
would it be slightly more efficient if we returned filteredAll only if rangeDelIter was non-nil?
i'm not sure. we won't ever attempt to call FilteredAnyKeys()
unless the level's rangedel iter is non-nil. i think the thing that might be slightly more efficient is to only populate levelIter.filteredIter
if the rangedel iter is non-nil, so we can avoid the interface type assertion: i'll make that change now.
sstable/reader.go
line 219 at r5 (raw file):
Previously, sumeerbhola wrote…
for naming consistency, can you call this
filteredAnyKeys
.
Done.
sstable/reader.go
line 527 at r5 (raw file):
Previously, sumeerbhola wrote…
This makes me nervous, in that I think this happens to be correct because of the specifics of the logic in
seekGEHelper
that tries to use the current iterator position and avoids doing a seek.
But in general I am worried that if we have previously stopped because the block was irrelevant, it could also include keys that are relevant to this seek position. And so if we now decide to not do a full seek (which is correct) we may erroneously return filteredKey=false, when it is in fact true.Maybe one way to build more confidence is to establish an invariant on these next optimizations for the seeks e.g. do we ever try to use a next optimization if the last positioning call (step or seek) returned nothing because of
loadBlockIrrelevant
. I think not, because we seem to condition on!data.isDataInvalidated()
and on!i.index.isDataInvalidated()
, but I would appreciate you taking a closer look.
I see what you mean. I scrutinized the code, and my understanding matches yours: The conditions on !i.data.isDataInvalidated()
and !i.index.isDataInvalidated()
ensure we'll never trySeekUsingNext
if the last positioning method returned nothing. I'm going to spend a little more time with this tomorrow to find the right invariants to document and assert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: this needs to be updated to apply similar logic for range-key deletions.
Reviewable status: 1 of 16 files reviewed, 10 unresolved discussions (waiting on @jbowens and @sumeerbhola)
200326b
to
ff822b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 18 files reviewed, 10 unresolved discussions (waiting on @jbowens and @sumeerbhola)
level_iter.go
line 680 at r5 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Done.
Actually, it is possible l.iterFile.Largest
was not skipped. A SeekGE that lands on a index separator k
only knows that the block at the index entry contains keys ≤ k. We can't know whether there were actually keys between the seek key and the index key. If the block is then excluded due to block-property filters, we'll think keys were filtered when none were.
I think this ≤ monotonicity—as opposed to strictly increasing monotonicity—is still okay with respect to the merging iterator. I adjusted the commentary here and renamed the FilteredAnyKeys
function to MaybeFilteredKeys()
.
sstable/reader.go
line 96 at r5 (raw file):
Previously, sumeerbhola wrote…
this change could use some datadriven testing.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r1, 14 of 14 files at r7, 5 of 5 files at r8, 6 of 6 files at r9, 12 of 12 files at r11, 6 of 6 files at r12, 3 of 3 files at r13, all commit messages.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @jbowens, @nicktrav, and @sumeerbhola)
sstable/block_property.go
line 53 at r9 (raw file):
// into blocks is nondeterministic. Block propery filtering may introduce two // kinds of errors: // a) Point keys that match the filter may be surfaced. This may happen
In the context of block property filters, doesn't "filter" imply that it matches (intersects / selects) the block? Yet here we're saying that "point keys matching the filter" is a type of error?
sstable/block_property.go
line 52 at r13 (raw file):
// Block property filtering is nondeterministic because the separation of keys // into blocks is nondeterministic. Block propery filtering may introduce two // kinds of errors:
Is "error" accurate in this context? Can this be thought of more like a "false positive"?
sstable/testdata/block_properties
line 443 at r11 (raw file):
<a@1:1> MaybeFilteredKeys()=true # Test FilteredAnyKeys().
nit: MaybeFilteredKeys
? Same elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 9 of 18 files reviewed, 13 unresolved discussions (waiting on @jbowens, @nicktrav, and @sumeerbhola)
sstable/block_property.go
line 53 at r9 (raw file):
Previously, nicktrav (Nick Travers) wrote…
In the context of block property filters, doesn't "filter" imply that it matches (intersects / selects) the block? Yet here we're saying that "point keys matching the filter" is a type of error?
Since block boundaries are completely variable, it is only meaningful to define block-property filters in order to apply a filter function that is a function of the key-value pairs. Block and table level filtering is just the mechanic, but the intended filter is always a function of the point keys themselves.
sstable/block_property.go
line 52 at r13 (raw file):
Previously, nicktrav (Nick Travers) wrote…
Is "error" accurate in this context? Can this be thought of more like a "false positive"?
I think "false positive" or "false negative" are errors, in the statistical sense. I used the "false negative" language down below to describe the first form of error. I'm not sure whether "false positive" (positive as in what survives the filter) or "false negative" (negative as in the filter failing to detect the key) is the appropriate term?
I don't think the second form of error can be described as a false negative/positive, because it's a violation of the semantics ordinarily guaranteed by Pebble: If you delete a key, it should appear deleted on all subsequent reads.
sstable/testdata/block_properties
line 443 at r11 (raw file):
Previously, nicktrav (Nick Travers) wrote…
nit:
MaybeFilteredKeys
? Same elsewhere.
good catch, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 9 files at r14, 9 of 9 files at r15, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @jbowens, @nicktrav, and @sumeerbhola)
sstable/block_property.go
line 53 at r9 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Since block boundaries are completely variable, it is only meaningful to define block-property filters in order to apply a filter function that is a function of the key-value pairs. Block and table level filtering is just the mechanic, but the intended filter is always a function of the point keys themselves.
The mechanics of it I understand. Just a matter of interpretation ... does "match the filter" mean "filtering for" or "filtering out"? /shrug
sstable/block_property.go
line 52 at r13 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I think "false positive" or "false negative" are errors, in the statistical sense. I used the "false negative" language down below to describe the first form of error. I'm not sure whether "false positive" (positive as in what survives the filter) or "false negative" (negative as in the filter failing to detect the key) is the appropriate term?
I don't think the second form of error can be described as a false negative/positive, because it's a violation of the semantics ordinarily guaranteed by Pebble: If you delete a key, it should appear deleted on all subsequent reads.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 5 files at r8, 3 of 6 files at r9, 2 of 12 files at r11, 4 of 9 files at r15, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @jbowens, @nicktrav, and @sumeerbhola)
level_iter.go
line 680 at r5 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Actually, it is possible
l.iterFile.Largest
was not skipped. A SeekGE that lands on a index separatork
only knows that the block at the index entry contains keys ≤ k. We can't know whether there were actually keys between the seek key and the index key. If the block is then excluded due to block-property filters, we'll think keys were filtered when none were.I think this ≤ monotonicity—as opposed to strictly increasing monotonicity—is still okay with respect to the merging iterator. I adjusted the commentary here and renamed the
FilteredAnyKeys
function toMaybeFilteredKeys()
.
I see. Can you also add this index separator example to the code comment to clarify to the reader why we need this maybe-semantics.
table_cache.go
line 400 at r5 (raw file):
Previously, jbowens (Jackson Owens) wrote…
i'm not sure. we won't ever attempt to call
FilteredAnyKeys()
unless the level's rangedel iter is non-nil. i think the thing that might be slightly more efficient is to only populatelevelIter.filteredIter
if the rangedel iter is non-nil, so we can avoid the interface type assertion: i'll make that change now.
ack
sstable/block_property.go
line 59 at r15 (raw file):
// used for filtering is not a strict function of the user key, a block // containing k.DEL may be filtered, while a block containing the deleted // key k.SET may not be filtered.
The comment is missing a correctness definition. I think it is worth separating out the abstract notion of the kv-filter. Also, it may be easier to read if the solution to each problem goes together with the problem definition.
So I'd suggest organizing it in a manner like (of course with all the other details in your comment):
- There is a filter that applies to key-value pairs. Correctness is defined as surfacing exactly the key-value pairs that correspond to applying the kv-filter after reading from the DB.
- Block property filtering that uses that kv-filter produces additional values that don't satisfy the filter because of the separation of keys into blocks. These can be removed by reapplying the kv-filter above the filtering applied by bpf.
- Block property filtering may surface point keys that have been deleted if the kv-filter looks at the value. Don't use such a filter (actually the more nuanced statement in your comment).
sstable/reader.go
line 527 at r5 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I see what you mean. I scrutinized the code, and my understanding matches yours: The conditions on
!i.data.isDataInvalidated()
and!i.index.isDataInvalidated()
ensure we'll nevertrySeekUsingNext
if the last positioning method returned nothing. I'm going to spend a little more time with this tomorrow to find the right invariants to document and assert.
I think we also need a code comment here talking about this potential interaction between maybeFilteredKeys
and trySeekUsingNext
so that the next person changing this code is careful about it. And it isn't limited to trySeekUsingNext
-- I believe it applies to all stepping instead of seeking optimizations here (there is the monotonic bounds one, that has nothing to do with trySeekUsingNext
). So it also needs a code comment in SeekLT
.
sstable/testdata/block_properties
line 508 at r15 (raw file):
seek-lt a seek-lt c seek-lt ca
can you add seek-lt f
and seek-lt e
and seek-lt d
-- I am assuming the first two will produce MaybeFilteredKeys=true.
And is it possible to exercise some of the stepping optimizations by doing SeekGE followed by Next and confirming the correctness of the MaybeFilteredKeys. Also with the one with monotonic bounds that apply to both SeekGE and SeekLT.
15939ba
to
8635bb0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sumeerbhola —I further narrowed MaybeFilteredKeys
to only be defined at exhausted iterator positions, because otherwise seeks that apply the try-seek-using-next optimization and don't move the iterator position at all will violate the contract. We only care about the exhausted iterator position for levelIter.
Reviewable status: 8 of 18 files reviewed, 11 unresolved discussions (waiting on @jbowens, @nicktrav, and @sumeerbhola)
level_iter.go
line 680 at r5 (raw file):
Previously, sumeerbhola wrote…
I see. Can you also add this index separator example to the code comment to clarify to the reader why we need this maybe-semantics.
Done.
sstable/block_property.go
line 59 at r15 (raw file):
Previously, sumeerbhola wrote…
The comment is missing a correctness definition. I think it is worth separating out the abstract notion of the kv-filter. Also, it may be easier to read if the solution to each problem goes together with the problem definition.
So I'd suggest organizing it in a manner like (of course with all the other details in your comment):
- There is a filter that applies to key-value pairs. Correctness is defined as surfacing exactly the key-value pairs that correspond to applying the kv-filter after reading from the DB.
- Block property filtering that uses that kv-filter produces additional values that don't satisfy the filter because of the separation of keys into blocks. These can be removed by reapplying the kv-filter above the filtering applied by bpf.
- Block property filtering may surface point keys that have been deleted if the kv-filter looks at the value. Don't use such a filter (actually the more nuanced statement in your comment).
Done.
sstable/reader.go
line 527 at r5 (raw file):
Previously, sumeerbhola wrote…
I think we also need a code comment here talking about this potential interaction between
maybeFilteredKeys
andtrySeekUsingNext
so that the next person changing this code is careful about it. And it isn't limited totrySeekUsingNext
-- I believe it applies to all stepping instead of seeking optimizations here (there is the monotonic bounds one, that has nothing to do withtrySeekUsingNext
). So it also needs a code comment inSeekLT
.
Done.
sstable/reader.go
line 695 at r5 (raw file):
Previously, sumeerbhola wrote…
same worry here.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because otherwise seeks that apply the try-seek-using-next optimization and don't move the iterator position at all will violate the contract
Because the only safe thing to do in that case is to return true?
Narrowing to exhausted sounds good.
Reviewed 1 of 12 files at r11, 2 of 10 files at r16, 8 of 8 files at r17.
Reviewable status: all files reviewed (commit messages unreviewed), 16 unresolved discussions (waiting on @jbowens and @sumeerbhola)
sstable/block_property.go
line 54 at r17 (raw file):
// implement efficient application of a filter F that applies to key-value // pairs. Consider correctness defined as surfacing exactly the same key-value // pairs that would be surfaced if one applied the filter F above block-property
I think it should say "if one applied the filter F above regular iteration"
sstable/block_property.go
line 55 at r17 (raw file):
// pairs. Consider correctness defined as surfacing exactly the same key-value // pairs that would be surfaced if one applied the filter F above block-property // filtering With this correctness definition, block property filtering may
nit: missing period
sstable/block_property.go
line 58 at r17 (raw file):
// introduce two kinds of errors: // // a) Block property filtering that uses a kv-filter may produce additional
we haven't defined "kv-filter". Perhaps the earlier comment that introduces F can say "... of a filter F that applies to key-value pairs (abbreviated as kv-filter)."
sstable/testdata/block_properties
line 538 at r17 (raw file):
. MaybeFilteredKeys()=true . MaybeFilteredKeys()=true <b@10:2> MaybeFilteredKeys()=true
with bounds [b,e), isn't b@10 the first key that is encountered regardless of block property filtering? I'm confused by this true value.
sstable/testdata/block_properties
line 540 at r17 (raw file):
<b@10:2> MaybeFilteredKeys()=true <c@15:3> MaybeFilteredKeys()=false
The corner case I was worried about was something like
set-bounds lower=b upper=ee
seek-ge d
set-bounds lower=ee upper=g
seek-ge ee
where I think the layout of blocks and indices will cause the first seek to skip the lower level index block containing e and f. Then the next set-bounds qualifies for the monotonic bounds stepping optimization, and we want to ensure that seek-ge ee
also returns true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 15 of 18 files reviewed, 15 unresolved discussions (waiting on @sumeerbhola)
sstable/block_property.go
line 54 at r17 (raw file):
Previously, sumeerbhola wrote…
I think it should say "if one applied the filter F above regular iteration"
Done.
sstable/block_property.go
line 58 at r17 (raw file):
Previously, sumeerbhola wrote…
we haven't defined "kv-filter". Perhaps the earlier comment that introduces F can say "... of a filter F that applies to key-value pairs (abbreviated as kv-filter)."
Done.
sstable/testdata/block_properties
line 508 at r15 (raw file):
Previously, sumeerbhola wrote…
can you add
seek-lt f
andseek-lt e
andseek-lt d
-- I am assuming the first two will produce MaybeFilteredKeys=true.And is it possible to exercise some of the stepping optimizations by doing SeekGE followed by Next and confirming the correctness of the MaybeFilteredKeys. Also with the one with monotonic bounds that apply to both SeekGE and SeekLT.
Done.
sstable/testdata/block_properties
line 538 at r17 (raw file):
Previously, sumeerbhola wrote…
with bounds [b,e), isn't b@10 the first key that is encountered regardless of block property filtering? I'm confused by this true value.
This is that case we discussed earlier that makes it hard to tell whether we skipped keys. The index separator between a@1
and b@10
is calculated to be b
. The seek-ge b
lands on the first block, sees that it might contain b
, sees that it's filtered by bpfs and proceeds to the next block.
sstable/testdata/block_properties
line 540 at r17 (raw file):
Previously, sumeerbhola wrote…
The corner case I was worried about was something like
set-bounds lower=b upper=ee seek-ge d set-bounds lower=ee upper=g seek-ge ee
where I think the layout of blocks and indices will cause the first seek to skip the lower level index block containing e and f. Then the next set-bounds qualifies for the monotonic bounds stepping optimization, and we want to ensure that
seek-ge ee
also returns true.
Thanks for this test case—it revealed a bug. We handle this case appropriately within the single-level iterator, but we don't do the comparable thing within the two-level iterator.
The second seek finds the two-level iterator is already positioned and eligible for the bounds optimization. When deciding whether to enter the slow path, it only checks if !i.twoLevelIndex.isDataInvalidated()
, so it falls into the optimization.
I made two code changes to fix this:
loadIndex
now invalidatesi.index
too, not justi.data
.twoLevelIterator.Seek{GE,PrefixGE}
do not fast path ifi.index.isDataInvalidated()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pondering a randomized test that randomizes the index block size and data block size. Maybe two variants, one that tests trySeekUsingNext and one that tests monotonically increasing/decreasing bounds. Might leave it as a TODO/issue to follow up with during stability period.
Reviewable status: 15 of 18 files reviewed, 15 unresolved discussions (waiting on @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 3 files at r18, all commit messages.
Reviewable status: 16 of 18 files reviewed, 17 unresolved discussions (waiting on @jbowens and @sumeerbhola)
sstable/reader.go
line 520 at r18 (raw file):
// reached. The i.data.isDataInvalidated() indicates that the sstable was // exhausted. if trySeekUsingNext && (i.exhaustedBounds == +1 || i.data.isDataInvalidated()) {
If this case ok because if maybeFilteredKeys=true
, it will stay true (since the reset to false is after this return)?
Are we losing this optimization for the twoLevelIterator
case with your latest change because trySeekUsingNext
will get set to false in the expanded slow-path for twoLevelIterator.SeekGE
?
sstable/reader.go
line 581 at r18 (raw file):
// caller claimed externally known invariant represented by // trySeekUsingNext=true. if trySeekUsingNext {
also, I wonder whether I overlooked this case in my original "This makes me nervous" comment. It doesn't use anything about i.data.isDataInvalidated
or i.index.isDataInvalidated
.
- isn't it possible for erroneously return
maybeFilteredKeys=false
- And this optimization seems to suffer from the same regression in the
twoLevelIterator
case.
If we ever use trySeekUsingNext
should we just conservatively return maybeFilteredKeys=true
? I'm generally more concerned about losing a seek optimization than returning an extra synthetic key -- and there are some microbenchmarks that suggested that optimizations for seeks on exhausted iterators were quite helpful since with 90% of the data in L6, it is not uncommon to have L1-L5 level iterators already exhausted.
In 4406dab a bounds-checking bug was fixed in the two-level index iterator's SeekLT method. The same bug existed within Last and skipBackward and was missed.
The levelIter and mergingIter are tightly coupled in a few places, with the levelIter passing additional context to the mergingIter by writing into fields owned by the mergingIter. This commit is a small refactor to organize these fields together into a `levelIterBoundaryContext` struct. This is in preparation for adding an additional field.
Previously, if an iterator configured with a block-property filter encountered a table that contained all excluded point keys, the iterator would skip the table entirely, including its range deletions. An iterator configured with a block-property filter could observe keys outside the filtered range that no longer exist. This behavior was fine from CockroachDB's perspective, which always pairs a filtered iterator with an unfiltered iterator, seeking the latter in tandem. The behavior is likely to be surprising in non-CockroachDB use cases. Additionally, the behavior is an obstacle to range-key masking which must respect range deletions for keys not covered by the masking range key. This change requires additional coordination between the mergingIter and levelIter to ensure that a levelIter doesn't close a file while other levels still require its range deletions. This coordination takes the form a new `isIgnorableBoundaryKey` field. The levelIter may return a file's smallest or largest boundary key [without any value], setting `isIgnorableBoundaryKey` to true to signal the key doesn't repesent a real KV pair. This scheme is also extended to SeekPrefixGE, which previously synthesized a RANGEDEL key to serve a similar purpose.
Add test coverage for block-property filters to the metamorphic tests. Division of keys into blocks are nondeterministic, so this introduces nondeterminism into Iterator behavior. To account for this, the metamorphic test's iterator wrapper is adjusted to skip any keys that are eligible for filtering. Fix cockroachdb#1725.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 18 files reviewed, 17 unresolved discussions (waiting on @nicktrav and @sumeerbhola)
sstable/reader.go
line 520 at r18 (raw file):
Previously, sumeerbhola wrote…
If this case ok because if
maybeFilteredKeys=true
, it will stay true (since the reset to false is after this return)?Are we losing this optimization for the
twoLevelIterator
case with your latest change becausetrySeekUsingNext
will get set to false in the expanded slow-path fortwoLevelIterator.SeekGE
?
Yeah, both of those are right.
sstable/reader.go
line 581 at r18 (raw file):
Previously, sumeerbhola wrote…
also, I wonder whether I overlooked this case in my original "This makes me nervous" comment. It doesn't use anything about
i.data.isDataInvalidated
ori.index.isDataInvalidated
.
- isn't it possible for erroneously return
maybeFilteredKeys=false
- And this optimization seems to suffer from the same regression in the
twoLevelIterator
case.If we ever use
trySeekUsingNext
should we just conservatively returnmaybeFilteredKeys=true
? I'm generally more concerned about losing a seek optimization than returning an extra synthetic key -- and there are some microbenchmarks that suggested that optimizations for seeks on exhausted iterators were quite helpful since with 90% of the data in L6, it is not uncommon to have L1-L5 level iterators already exhausted.
This case was relying on the invariant at the function start.
// Invariant: trySeekUsingNext => !i.data.isDataInvalidated() && i.exhaustedBounds != +1
The invariant is 'enforced' by the no-op case in SeekGE
and SeekPrefixGE
, since we never enter seekGEHelper
with trySeekUsingNext=true
if i.data.isDataInvalidated
.
Agreed, the reasoning of correctness was getting too complicated. I've updated to preserve the existing maybeFilteredKeys
value during fast paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might leave it as a TODO/issue to follow up with during stability period.
+1
Reviewed 3 of 9 files at r21.
Reviewable status: 3 of 18 files reviewed, 17 unresolved discussions (waiting on @nicktrav and @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
filed #1817
Reviewable status: 3 of 18 files reviewed, 17 unresolved discussions (waiting on @nicktrav and @sumeerbhola)
In cockroachdb#1737, we added some intricate interactions around block property filters, and added the contractual requirement that all sstable iterators return MaybeFilteredKeys() == true after a positioning operation that led to filtered keys due to BPFs. We want to build more confidence around that function obeying that contract after seek optimizations, so this change adds a randomized test that does repeated seeks (incl. some after monotonic bound changes and with trySeekUsingNext), and ensures `MaybeFilteredKeys()` returns true after any skipped keys. I would have put this test in the sstable package, except it'd cause a cyclical dependency between it and `internal.testkeys.blockprops` so I moved it to the outer package. Fixes cockroachdb#1817.
In cockroachdb#1737, we added some intricate interactions around block property filters, and added the contractual requirement that all sstable iterators return MaybeFilteredKeys() == true after a positioning operation that led to filtered keys due to BPFs. We want to build more confidence around that function obeying that contract after seek optimizations, so this change adds a randomized test that does repeated seeks (incl. some after monotonic bound changes and with trySeekUsingNext), and ensures `MaybeFilteredKeys()` returns true after any skipped keys. I would have put this test in the sstable package, except it'd cause a cyclical dependency between it and `internal.testkeys.blockprops` so I moved it to the outer package. Fixes cockroachdb#1817.
In #1737, we added some intricate interactions around block property filters, and added the contractual requirement that all sstable iterators return MaybeFilteredKeys() == true after a positioning operation that led to filtered keys due to BPFs. We want to build more confidence around that function obeying that contract after seek optimizations, so this change adds a randomized test that does repeated seeks (incl. some after monotonic bound changes and with trySeekUsingNext), and ensures `MaybeFilteredKeys()` returns true after any skipped keys. I would have put this test in the sstable package, except it'd cause a cyclical dependency between it and `internal.testkeys.blockprops` so I moved it to the outer package. Fixes #1817.
sstable: fix additional bounds check bugs
In 4406dab a bounds-checking bug was fixed in the two-level index
iterator's SeekLT method. The same bug existed within Last and
skipBackward and was missed.
db: refactor levelIter/mergingIter boundary interface
The levelIter and mergingIter are tightly coupled in a few places, with the
levelIter passing additional context to the mergingIter by writing into fields
owned by the mergingIter. This commit is a small refactor to organize these
fields together into a
levelIterBoundaryContext
struct. This is inpreparation for adding an additional field.
db: don't filter range deletions based on point-key block filters
Previously, if an iterator configured with a block-property filter encountered
a table that contained all excluded point keys, the iterator would skip the
table entirely, including its range deletions. An iterator configured with a
block-property filter could observe keys outside the filtered range that no
longer exist.
This behavior was fine from CockroachDB's perspective, which always pairs a
filtered iterator with an unfiltered iterator, seeking the latter in tandem.
The behavior is likely to be surprising in non-CockroachDB use cases.
Additionally, the behavior is an obstacle to range-key masking which must
respect range deletions for keys not covered by the masking range key.
This change requires additional coordination between the mergingIter and
levelIter to ensure that a levelIter doesn't close a file while other levels
still require its range deletions. This coordination takes the form a new
isIgnorableBoundaryKey
field. The levelIter may return a file's smallest orlargest boundary key [without any value], setting
isIgnorableBoundaryKey
totrue to signal the key doesn't repesent a real KV pair. This scheme is also
extended to SeekPrefixGE, which previously synthesized a RANGEDEL key to serve
a similar purpose.
internal/metamorphic: add block-property filter coverage
Add test coverage for block-property filters to the metamorphic tests.
Division of keys into blocks are nondeterministic, so this introduces
nondeterminism into Iterator behavior. To account for this, the metamorphic
test's iterator wrapper is adjusted to skip any keys that are eligible for
filtering.
Fix #1725.