kvstreamer: fully support InOrder mode
This commit extends the Streamer library to support Scan requests that can span multiple ranges in the InOrder mode, which allows us to use the Streamer for lookup joins when ordering needs to be maintained. As with the lookup joins without ordering, this is "poor man's" support: it relies on the de-duplication of requests in the join reader's span generators, and it doesn't remove the disk-backed row container that buffers all looked up rows in the join reader ordering strategy. Those caveats will be addressed in follow-up commits.

The contribution of this commit is that the in-order results buffer can now correctly return the results when a single original Scan request touches multiple ranges, as well as when each sub-request (against a single range) gets an arbitrary number of partial responses.

Previously, we only had one axis for ordering the results - the `Position` values, which identify the original request that a particular Result is a response to. This was sufficient for index joins (as well as for lookup joins when the `SingleRowLookup` hint is `true`); however, when a Scan request can touch multiple ranges, that single axis is no longer sufficient, since the results buffer could order two Results for a single Scan request arbitrarily. We work around this limitation by introducing a second dimension for ordering - the "sub-request index", which is the ordinal of a particular single-range request within the multi-range Scan request.

Consider the following example: the original Scan request is `Scan(b, f)`, and we have three ranges: `[a, c)`, `[c, e)`, `[e, g)`.
In `Streamer.Enqueue`, the original Scan is broken down into three single-range Scan requests:
```
singleRangeReq[0]:
  reqs = [Scan(b, c)]
  positions = [0]
  subRequestIdx = [0]
singleRangeReq[1]:
  reqs = [Scan(c, e)]
  positions = [0]
  subRequestIdx = [1]
singleRangeReq[2]:
  reqs = [Scan(e, f)]
  positions = [0]
  subRequestIdx = [2]
```
Note that the `positions` values are the same (indicating that each single-range request is a part of the same original multi-range request), but the values of `subRequestIdx` are different - they allow us to correctly order the responses to these single-range requests (which might come back in any order) when returning the results. This information is plumbed into the requests as well as the results.

There is yet another complication though - what if a single-range Scan request results in multiple partial responses? To make sure that these partial results are ordered correctly, we need yet another dimension, but at least that dimension can be fully hidden inside of the in-order results buffer. This is possible because partial responses for the same single-range Scan request are added into the buffer at different times, so we assign the results "add epochs".

Consider the following example: the original Scan request is `Scan(a, c)`, which goes to a single range `[a, c)` containing keys `a` and `b`. Imagine that the Scan response can only contain a single key, so we first get a partial `ScanResponse('a')` with `ResumeSpan(b, c)`, and then we get a partial `ScanResponse('b')` with an empty `ResumeSpan`. The first response is added to the buffer during the first `add` call, so its "epoch" is 0, whereas the second response is added during "epoch" 1 - thus, we can correctly return `a` before `b` although the `Position` and sub-request index values of the two `Result`s are the same.

Release note: None