sstable: allow range key property collection
Currently, when point keys are added to an sstable, each key is run through every block property collector configured on the `sstable.Writer`. Range keys are not included.

Allow range keys to run through the same block property collector pipeline by looping through all configured collectors on the writer and passing each range key to them.

Any struct that implements `sstable.BlockPropertyCollector` may accept any key kind. It is up to the implementation to filter out keys that are not applicable. This allows for flexibility in crafting collectors that take specific key kinds or combinations of key kinds (e.g. point- and range-key specific collectors, or table-wide "global" collectors).

Rename the existing `DataBlockIntervalCollector` interface to `BlockIntervalCollector`, and rename the `FinishDataBlock` method to `FinishBlock`. Both of these changes are intended to make the interface applicable to keys other than point keys that may have dedicated blocks in the sstable, as is the case for range keys.

Rename the existing `BlockIntervalCollector` struct to `DataBlockIntervalCollector`. To make the struct specific to point keys, ignore range key kinds in calls to `Add`.

Add a new `BlockPropertyCollector` helper implementation, `RangeKeyBlockIntervalCollector`, that operates exclusively on range keys. All other key kinds are ignored. The collector is intended to support maintaining an upper and lower bound on the MVCC timestamps present in a range key block in an sstable.

Add a test implementation of a `BlockIntervalCollector` and a data-driven test that demonstrates maintaining upper and lower bounds on point and range keys with integer suffixes (e.g. `[email protected]:foo`, `[email protected]:bar [(@100=baz)]`, etc.).
One downside of this implementation is that the `BlockPropertyCollector` interface contains methods such as `FinishDataBlock` and `FinishIndexBlock` that are not applicable to range keys (range keys are all in a single block, and do not have an index block, respectively). However, retaining these specific methods allows for implementations that support both point and range keys (for example, a collector whose properties are intended to be applicable to all key kinds).

Alternative approaches considered, with relative downsides:

- Adding a new member field to `BlockIntervalCollector` for tracking the range key interval. The downside of this approach is that it requires both the point and range key intervals to be encoded into the properties for the table and block under the same name. This makes it difficult to disambiguate the intervals on the read path.

- Having member fields for both point and range key collectors in `sstable.Writer`. The downside of this approach is that it requires more intrusive changes to the `Writer` to call the specific type of property collector on the write path, which does not scale nicely to support various block types that require property collection (e.g. potentially supporting range-dels in the future). There is also some nuance to managing separate collections of collectors with the `shortID` mapping that is used when encoding the properties in the properties and index blocks (i.e. care is needed not to re-use the same ordinals, etc.).