colfetcher: respect limit hint
Previously, the colfetcher ignored limit hints: it always fetched data
from KV until its batch was full. This produced bad behavior whenever the
batch size was larger than the limit hint. For example, if the expected
row count was 500, causing us to create a 500-sized batch, but the limit
hint was for whatever reason only 20, we would still go ahead and fetch
500 rows.

In practice, this does not appear to show up often: if the optimizer is
doing its job, the batch size should always be equal to the limit hint
for limited scans.

Release note: None
jordanlewis authored and yuzefovich committed Mar 23, 2021
1 parent 45136e4 commit 240f96d
Showing 1 changed file with 17 additions and 0 deletions.
17 changes: 17 additions & 0 deletions pkg/sql/colfetcher/cfetcher.go
@@ -269,6 +269,10 @@ type cFetcher struct {
	// seekPrefix is the prefix to seek to in stateSeekPrefix.
	seekPrefix roachpb.Key

	// limitHint is a hint as to the number of rows that the caller expects to
	// be returned from this fetch.
	limitHint int

	// remainingValueColsByIdx is the set of value columns that are yet to be
	// seen during the decoding of the current row.
	remainingValueColsByIdx util.FastIntSet
@@ -649,6 +653,7 @@ func (rf *cFetcher) StartScan(
	}
	rf.fetcher = f
	rf.machine.lastRowPrefix = nil
	rf.machine.limitHint = int(limitHint)
	rf.machine.state[0] = stateInitFetch
	return nil
}
@@ -1038,7 +1043,19 @@ func (rf *cFetcher) nextBatch(ctx context.Context) (coldata.Batch, error) {
}
rf.machine.rowIdx++
rf.shiftState()

var emitBatch bool
if rf.machine.rowIdx >= rf.machine.batch.Capacity() {
	// We have no more room in our batch, so output it immediately.
	emitBatch = true
} else if rf.machine.limitHint > 0 && rf.machine.rowIdx >= rf.machine.limitHint {
	// If we made it to our limit hint, output our batch early to make sure
	// that we don't bother filling in extra data if we don't need to.
	emitBatch = true
	rf.machine.limitHint = 0
}

if emitBatch {
	rf.pushState(stateResetBatch)
	rf.machine.batch.SetLength(rf.machine.rowIdx)
	rf.machine.rowIdx = 0