-
Notifications
You must be signed in to change notification settings - Fork 465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
db: refactor/cleanup compaction iterator #3082
Comments
This was referenced Nov 17, 2023
itsbilal
pushed a commit
to itsbilal/pebble
that referenced
this issue
Nov 20, 2023
Previously the compaction iterator would avoid user key comparisons in some circumstances when invariants around sequence numbers implied user keys must be different. This commit removes this optimization, replacing it with an assertion that the sequence number invariants hold. This is done out of caution and with the realization that these user key comparisons form a relatively small portion of the CPU cost of a compaction. Informs cockroachdb#3082. Informs cockroachdb/cockroach#114421.
itsbilal
pushed a commit
that referenced
this issue
Nov 20, 2023
Previously the compaction iterator would avoid user key comparisons in some circumstances when invariants around sequence numbers implied user keys must be different. This commit removes this optimization, replacing it with an assertion that the sequence number invariants hold. This is done out of caution and with the realization that these user key comparisons form a relatively small portion of the CPU cost of a compaction. Informs #3082. Informs cockroachdb/cockroach#114421.
jbowens
added a commit
to jbowens/pebble
that referenced
this issue
Jan 11, 2024
Remove the keyspan.InternalIteratorShim that was intended to be a temporary bridge to allow range deletions to be surfaced within the compaction iterator, replacing it with the keyspan.InterleavingIter. This mirrors the mechanics of range keys. Follow up work will be able to simplify the compaction iterator logic now that range deletions are always interleaved at the maximal sequence number. Informs cockroachdb#3082.
jbowens
added a commit
to jbowens/pebble
that referenced
this issue
Jan 11, 2024
Remove the keyspan.InternalIteratorShim that was intended to be a temporary bridge to allow range deletions to be surfaced within the compaction iterator, replacing it with the keyspan.InterleavingIter. This mirrors the mechanics of range keys. Follow up work will be able to simplify the compaction iterator logic now that range deletions are always interleaved at the maximal sequence number. Informs cockroachdb#3082.
jbowens
added a commit
to jbowens/pebble
that referenced
this issue
Jan 11, 2024
Remove the keyspan.InternalIteratorShim that was intended to be a temporary bridge to allow range deletions to be surfaced within the compaction iterator, replacing it with the keyspan.InterleavingIter. This mirrors the mechanics of range keys. Follow up work will be able to simplify the compaction iterator logic now that range deletions are always interleaved at the maximal sequence number. Informs cockroachdb#3082.
jbowens
added a commit
to jbowens/pebble
that referenced
this issue
Jan 11, 2024
Remove the keyspan.InternalIteratorShim that was intended to be a temporary bridge to allow range deletions to be surfaced within the compaction iterator, replacing it with the keyspan.InterleavingIter. This mirrors the mechanics of range keys. Follow up work will be able to simplify the compaction iterator logic now that range deletions are always interleaved at the maximal sequence number. Informs cockroachdb#3082.
jbowens
added a commit
to jbowens/pebble
that referenced
this issue
Jan 12, 2024
This commit removes the compaction iterator's concept of a key that lies within the same snapshot stripe but is non-skippable. These "non-skippable" keys lead to bizarre behavior, such as a sequence like: a.MERGE.5 a.RANGEDEL.4 a.MERGE.4 yielding the separate unmerged a.MERGE.5 and a.MERGE.4 internal keys (Note a.RANGEDEL.4 does not delete a.MERGE.4 because they share the same sequence number.) The sameStripeNonSkippable fell out of the fact that range deletions were previously interleaved by the input iterator at their start key with their sequence number. These sameStripeNonSkippable could interrupt any logic iterating across the keys of a stripe, complicating the compaction iterator logic. With cockroachdb#3219, range deletions began to be interleaved at their start key with the maximal sequence number, ensuring they never interrupt the keys of a snapshot stripe. In some instances INVALID keys were also returned with sameStripeNonSkippable. These keys are now treated similarly to other errors encountered during iteration: i.err is set and newStripeNewKey is returned. Close cockroachdb#3082.
jbowens
added a commit
to jbowens/pebble
that referenced
this issue
Jan 16, 2024
Remove the keyspan.InternalIteratorShim that was intended to be a temporary bridge to allow range deletions to be surfaced within the compaction iterator, replacing it with the keyspan.InterleavingIter. This mirrors the mechanics of range keys. Follow up work will be able to simplify the compaction iterator logic now that range deletions are always interleaved at the maximal sequence number. Informs cockroachdb#3082.
jbowens
added a commit
to jbowens/pebble
that referenced
this issue
Jan 16, 2024
Remove the keyspan.InternalIteratorShim that was intended to be a temporary bridge to allow range deletions to be surfaced within the compaction iterator, replacing it with the keyspan.InterleavingIter. This mirrors the mechanics of range keys. Follow up work will be able to simplify the compaction iterator logic now that range deletions are always interleaved at the maximal sequence number. Informs cockroachdb#3082.
jbowens
added a commit
to jbowens/pebble
that referenced
this issue
Jan 16, 2024
Remove the keyspan.InternalIteratorShim that was intended to be a temporary bridge to allow range deletions to be surfaced within the compaction iterator, replacing it with the keyspan.InterleavingIter. This mirrors the mechanics of range keys. Additionally, this commit removes the compaction iterator's concept of a key that lies within the same snapshot stripe but is non-skippable. These "non-skippable" keys lead to bizarre behavior, such as a sequence like: a.MERGE.5 a.RANGEDEL.4 a.MERGE.4 yielding the separate unmerged a.MERGE.5 and a.MERGE.4 internal keys (Note a.RANGEDEL.4 does not delete a.MERGE.4 because they share the same sequence number.) The sameStripeNonSkippable fell out of the fact that range deletions were previously interleaved by the input iterator at their start key with their sequence number. These sameStripeNonSkippable could interrupt any logic iterating across the keys of a stripe, complicating the compaction iterator logic. In some instances INVALID keys were also returned with sameStripeNonSkippable. These keys are now treated similarly to other errors encountered during iteration: i.err is set and newStripeNewKey is returned. Close cockroachdb#3082.
jbowens
added a commit
to jbowens/pebble
that referenced
this issue
Jan 17, 2024
Remove the keyspan.InternalIteratorShim that was intended to be a temporary bridge to allow range deletions to be surfaced within the compaction iterator, replacing it with the keyspan.InterleavingIter. This mirrors the mechanics of range keys. Additionally, this commit removes the compaction iterator's concept of a key that lies within the same snapshot stripe but is non-skippable. These "non-skippable" keys lead to bizarre behavior, such as a sequence like: a.MERGE.5 a.RANGEDEL.4 a.MERGE.4 yielding the separate unmerged a.MERGE.5 and a.MERGE.4 internal keys (Note a.RANGEDEL.4 does not delete a.MERGE.4 because they share the same sequence number.) The sameStripeNonSkippable fell out of the fact that range deletions were previously interleaved by the input iterator at their start key with their sequence number. These sameStripeNonSkippable could interrupt any logic iterating across the keys of a stripe, complicating the compaction iterator logic. In some instances INVALID keys were also returned with sameStripeNonSkippable. These keys are now treated similarly to other errors encountered during iteration: i.err is set and newStripeNewKey is returned. Close cockroachdb#3082.
jbowens
added a commit
that referenced
this issue
Jan 17, 2024
Remove the keyspan.InternalIteratorShim that was intended to be a temporary bridge to allow range deletions to be surfaced within the compaction iterator, replacing it with the keyspan.InterleavingIter. This mirrors the mechanics of range keys. Additionally, this commit removes the compaction iterator's concept of a key that lies within the same snapshot stripe but is non-skippable. These "non-skippable" keys lead to bizarre behavior, such as a sequence like: a.MERGE.5 a.RANGEDEL.4 a.MERGE.4 yielding the separate unmerged a.MERGE.5 and a.MERGE.4 internal keys (Note a.RANGEDEL.4 does not delete a.MERGE.4 because they share the same sequence number.) The sameStripeNonSkippable fell out of the fact that range deletions were previously interleaved by the input iterator at their start key with their sequence number. These sameStripeNonSkippable could interrupt any logic iterating across the keys of a stripe, complicating the compaction iterator logic. In some instances INVALID keys were also returned with sameStripeNonSkippable. These keys are now treated similarly to other errors encountered during iteration: i.err is set and newStripeNewKey is returned. Close #3082.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
keyspan.InterleavingIter
to interleave range deletions at the maximal sequence number. This will simplify the logic considerably and allow us to eliminate thesameStripeNonSkippable
case.The text was updated successfully, but these errors were encountered: