-
Notifications
You must be signed in to change notification settings - Fork 466
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: snapshot keyspan bounds #1810
Comments
Can we make this less general, and limit predicates to a single key-span, and check that all iterator bounds are limited to that key-span? |
I think the fragmentation of a range's various key spaces (range-id, range-local, etc) force us to snapshot multiple key spans together. |
Ah yes. We could still have it be explicitly represented as a set of spans, yes? |
Yeah, for sure |
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by cockroachdb#2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots cockroachdb#1810).
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by cockroachdb#2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots cockroachdb#1810).
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by cockroachdb#2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots cockroachdb#1810).
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by cockroachdb#2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots cockroachdb#1810).
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by cockroachdb#2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots cockroachdb#1810).
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by cockroachdb#2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots cockroachdb#1810).
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by cockroachdb#2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots cockroachdb#1810).
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by cockroachdb#2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots cockroachdb#1810).
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by cockroachdb#2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots cockroachdb#1810).
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by cockroachdb#2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots cockroachdb#1810).
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by cockroachdb#2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots cockroachdb#1810).
Introduce a new type `frontiers`, designed to monitor several different user key frontiers during a compaction. When a user key is encountered that equals or exceeds the configured frontier, the code that specified the frontier is notified and given an opportunity to set a new frontier. Internally, `frontiers` uses a heap (code largely copied from the merging iterator's heap) to avoid N key comparisons for every key. This commit refactors the `limitFuncSplitter` type to make use of `frontiers`. The `limitFuncSplitter` type is used to split flushes to L0 flush split keys, and to split both flushes and compactions to avoid excessive overlap with grandparent files. This change is motivated by #2156, which will introduce an additional compaction-output splitter that must perform key comparisons against the next key to decide when to split a compaction. Additionally, the `frontiers` type may also be useful for other uses, such as applying key-space-dependent logic during a compaction (eg, compaction-time GC, disaggregated storage locality policies, or keyspan-restricted snapshots #1810).
Currently, a snapshot always applies universally across all keys in the database. In CockroachDB, snapshots are used to preserve state within the context of a single range. An LSM snapshot constructed to read range r1 still prevents removal of obsolete keys in range r2.
We could extend
NewSnapshot
to allow supplying a predicatep(k)→bool
that configures the snapshot to only snapshot keys for whichp
returns true. During compaction and flushes, when a snapshot appears to produce a new snapshot stripe within the same key,p(k)
is consulted and a new stripe is produced only ifp(k)→true
. This would allow overwritten keys in a non-snapshotted CockroachDB range to be dropped while preserving overwritten keys in a snapshotted CockroachDB range.There's still the question of what to do when an iterator constructed through
Snapshot.NewIter
reads a key for which the predicatep(k)→false
:p(k)
is undefined. The caller must be careful to never use iteration results outside the predicate. If the predicate is defined over swaths of the key space, this may be achieved through setting iterator bounds.p(k)
, skipping the key ifp(k)→false
.p(k)→true
are filtered at the snapshot's sequence number. User keys for whichp(k)→false
are filtered at the database's visible sequence number.I expect limiting the scope of active snapshots would reduce write amplification, in particular during periods of heavy rebalancing where there are open LSM snapshots and replicas are being simultaneously removed. Replica removal lays down range deletions, but those range deletions are unable to drop the replica's data. Compaction of these range deletions is still prioritized, because wide range deletions force ingested sstables into higher levels. The result is we suffer unnecessary write amplification moving the removed replica's data and the range tombstone into L6.
If we are to tackle this, I think we might want to expose a very limited interface, at least from the CockroachDB
pkg/storage
package that meets our specific snapshot usages. This can help avoid the possibility of reading unshapshotted keys while under the impression of reading through a consistent snapshot.The amount of write amplification saved is still unknown. Adding metrics for the size of obsolete keys preserved during compactions (#1204) would help us prioritize.
Jira issue: PEBBLE-127
The text was updated successfully, but these errors were encountered: