Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: optimize read path for the no-range-keys case #82559

Closed
5 tasks done
nicktrav opened this issue Jun 7, 2022 · 10 comments
Closed
5 tasks done

storage: optimize read path for the no-range-keys case #82559

nicktrav opened this issue Jun 7, 2022 · 10 comments
Assignees
Labels
A-storage Relating to our storage engine (Pebble) on-disk storage. C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-storage Storage Team

Comments

@nicktrav
Copy link
Collaborator

nicktrav commented Jun 7, 2022

It was observed in #82045 that there is a ~20% slowdown in the existing, typical read path when using an iterator stack that supports range keys, even in the case when there are no range keys to iterate over.

We'll want to claw back some / all of this slowdown.

HasPointAndRange() has also been seen to be a significant contributor, and should be optimized. Related to #83801.

Once this is done, we should optimize and benchmark the following performance-critical paths again, comparing them with 22.1:

  • MVCCScan
  • MVCCGet
  • ComputeStats
  • MVCCExportToSST
  • CatchUpIterator.CatchUpScan

Jira issue: CRDB-16511

Epic CRDB-2624

@nicktrav nicktrav added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-storage Relating to our storage engine (Pebble) on-disk storage. T-storage Storage Team labels Jun 7, 2022
jbowens added a commit to jbowens/pebble that referenced this issue Jun 9, 2022
Previously the keyspan.Span type used to represent range deletions and range
keys were passed primarily by value, including through the FragmentIterator
interface. This was convenient when transforming spans (eg, truncating bounds),
but passing them by pointer is enough of a performance win to be worthwhile.

Informs cockroachdb/cockroach#82559.

```
name                                    old time/op    new time/op    delta
IteratorSeekNoRangeKeys/batch=false-10    4.28µs ± 1%    2.90µs ± 4%  -32.27%  (p=0.000 n=19+20)
IteratorSeekNoRangeKeys/batch=true-10     8.78µs ± 1%    5.37µs ± 1%  -38.86%  (p=0.000 n=20+20)

name                                    old alloc/op   new alloc/op   delta
IteratorSeekNoRangeKeys/batch=false-10     16.0B ± 0%     16.0B ± 0%     ~     (all equal)
IteratorSeekNoRangeKeys/batch=true-10       192B ± 0%      128B ± 0%  -33.33%  (p=0.000 n=20+20)
```
jbowens added a commit to jbowens/pebble that referenced this issue Jun 9, 2022
Previously the keyspan.Span type used to represent range deletions and range
keys were passed primarily by value, including through the FragmentIterator
interface. This was convenient when transforming spans (eg, truncating bounds),
but passing them by pointer is enough of a performance win to be worthwhile.

Informs cockroachdb/cockroach#82559.

```
name                                    old time/op    new time/op    delta
IteratorSeekNoRangeKeys/batch=false-10    4.28µs ± 1%    2.90µs ± 4%  -32.27%  (p=0.000 n=19+20)
IteratorSeekNoRangeKeys/batch=true-10     8.78µs ± 1%    5.37µs ± 1%  -38.86%  (p=0.000 n=20+20)

name                                    old alloc/op   new alloc/op   delta
IteratorSeekNoRangeKeys/batch=false-10     16.0B ± 0%     16.0B ± 0%     ~     (all equal)
IteratorSeekNoRangeKeys/batch=true-10       192B ± 0%      128B ± 0%  -33.33%  (p=0.000 n=20+20)
```
jbowens added a commit to jbowens/pebble that referenced this issue Jun 9, 2022
Previously the keyspan.Span type used to represent range deletions and range
keys was passed primarily by value, including through the FragmentIterator
interface. This was convenient when transforming spans (eg, truncating bounds),
but passing them by pointer is enough of a performance win to be worthwhile.

Informs cockroachdb/cockroach#82559.

```
name                                    old time/op    new time/op    delta
IteratorSeekNoRangeKeys/batch=false-10    4.28µs ± 1%    2.90µs ± 4%  -32.27%  (p=0.000 n=19+20)
IteratorSeekNoRangeKeys/batch=true-10     8.78µs ± 1%    5.37µs ± 1%  -38.86%  (p=0.000 n=20+20)

name                                    old alloc/op   new alloc/op   delta
IteratorSeekNoRangeKeys/batch=false-10     16.0B ± 0%     16.0B ± 0%     ~     (all equal)
IteratorSeekNoRangeKeys/batch=true-10       192B ± 0%      128B ± 0%  -33.33%  (p=0.000 n=20+20)
```
jbowens added a commit to jbowens/pebble that referenced this issue Jun 9, 2022
Previously the keyspan.Span type used to represent range deletions and range
keys was passed primarily by value, including through the FragmentIterator
interface. This was convenient when transforming spans (eg, truncating bounds),
but passing them by pointer is enough of a performance win to be worthwhile.

Informs cockroachdb/cockroach#82559.

```
name                                    old time/op    new time/op    delta
IteratorSeekNoRangeKeys/batch=false-10    4.28µs ± 1%    2.90µs ± 4%  -32.27%  (p=0.000 n=19+20)
IteratorSeekNoRangeKeys/batch=true-10     8.78µs ± 1%    5.37µs ± 1%  -38.86%  (p=0.000 n=20+20)

name                                    old alloc/op   new alloc/op   delta
IteratorSeekNoRangeKeys/batch=false-10     16.0B ± 0%     16.0B ± 0%     ~     (all equal)
IteratorSeekNoRangeKeys/batch=true-10       192B ± 0%      128B ± 0%  -33.33%  (p=0.000 n=20+20)
```
jbowens added a commit to cockroachdb/pebble that referenced this issue Jun 9, 2022
Previously the keyspan.Span type used to represent range deletions and range
keys was passed primarily by value, including through the FragmentIterator
interface. This was convenient when transforming spans (eg, truncating bounds),
but passing them by pointer is enough of a performance win to be worthwhile.

Informs cockroachdb/cockroach#82559.

```
name                                    old time/op    new time/op    delta
IteratorSeekNoRangeKeys/batch=false-10    4.28µs ± 1%    2.90µs ± 4%  -32.27%  (p=0.000 n=19+20)
IteratorSeekNoRangeKeys/batch=true-10     8.78µs ± 1%    5.37µs ± 1%  -38.86%  (p=0.000 n=20+20)

name                                    old alloc/op   new alloc/op   delta
IteratorSeekNoRangeKeys/batch=false-10     16.0B ± 0%     16.0B ± 0%     ~     (all equal)
IteratorSeekNoRangeKeys/batch=true-10       192B ± 0%      128B ± 0%  -33.33%  (p=0.000 n=20+20)
```
@jbowens
Copy link
Collaborator

jbowens commented Jun 22, 2022

Linking to cockroachdb/pebble#1780

@erikgrinaker
Copy link
Contributor

I ran some benchmarks comparing release-22.1 against master. The results aren't too bad -- I think we should aim to get the regressions below 5% before we ship, which seems like it should be within reach.

name                                                           old time/op    new time/op    delta
MVCCScan_Pebble/rows=1/versions=1/valueSize=64-24                5.04µs ± 1%    5.28µs ± 2%   +4.79%  (p=0.000 n=10+10)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64-24               6.87µs ± 1%    7.17µs ± 1%   +4.31%  (p=0.000 n=10+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64-24              34.6µs ± 1%    37.9µs ± 1%   +9.64%  (p=0.000 n=8+10)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64-24              113µs ± 1%     119µs ± 1%   +6.02%  (p=0.000 n=10+10)
MVCCScan_Pebble/rows=10000/versions=1/valueSize=64-24            2.45ms ± 2%    2.72ms ± 1%  +10.77%  (p=0.000 n=10+7)
MVCCScan_Pebble/rows=10000/versions=10/valueSize=64-24           9.12ms ± 1%   10.18ms ± 1%  +11.70%  (p=0.000 n=10+9)
MVCCReverseScan_Pebble/rows=1/versions=1/valueSize=64-24         5.51µs ± 1%    5.82µs ± 2%   +5.62%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=1/versions=10/valueSize=64-24        9.21µs ± 1%    9.77µs ± 0%   +6.03%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=100/versions=1/valueSize=64-24       46.3µs ± 1%    50.9µs ± 1%   +9.90%  (p=0.000 n=10+9)
MVCCReverseScan_Pebble/rows=100/versions=10/valueSize=64-24       316µs ± 1%     334µs ± 1%   +5.86%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=10000/versions=1/valueSize=64-24     3.62ms ± 1%    3.94ms ± 1%   +8.62%  (p=0.000 n=10+9)
MVCCReverseScan_Pebble/rows=10000/versions=10/valueSize=64-24    29.4ms ± 9%    31.7ms ± 2%   +7.58%  (p=0.002 n=10+10)
MVCCGet_Pebble/versions=1/valueSize=8-24                         4.81µs ± 1%    5.05µs ± 1%   +5.09%  (p=0.000 n=10+9)
MVCCGet_Pebble/versions=10/valueSize=8-24                        5.81µs ± 1%    6.14µs ± 1%   +5.60%  (p=0.000 n=9+10)
MVCCGet_Pebble/versions=100/valueSize=8-24                       13.9µs ± 3%    14.5µs ± 2%   +4.15%  (p=0.000 n=10+10)
MVCCComputeStats_Pebble/valueSize=32-24                           143ms ± 1%     173ms ± 1%  +20.89%  (p=0.000 n=9+10)

craig bot pushed a commit that referenced this issue Aug 14, 2022
86110: storage: use `RangeKeyChanged()` in `pebbleMVCCScanner` r=nicktrav a=erikgrinaker

This patch uses `RangeKeyChanged()` in `pebbleMVCCScanner` to detect
MVCC range keys and enable point key synthesis. This shows a ~5% speedup
for large scans with no range keys, but a minor regression for point
gets with range keys -- the latter caused by having to call both
`HasPointAndRange()` and `RangeKeyChanged()`, while previously only the
first was called.

```
name                                                                  old time/op    new time/op    delta
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=0-24        5.62µs ± 2%    5.68µs ± 1%    ~     (p=0.058 n=10+8)
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=1-24        12.2µs ± 3%    12.3µs ± 1%    ~     (p=0.113 n=10+9)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=0-24      38.9µs ± 1%    38.5µs ± 1%  -0.78%  (p=0.019 n=9+9)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=1-24      72.1µs ± 2%    72.7µs ± 2%  +0.83%  (p=0.043 n=10+10)
MVCCScan_Pebble/rows=10000/versions=1/valueSize=64/numRangeKeys=0-24    2.85ms ± 1%    2.72ms ± 1%  -4.63%  (p=0.000 n=9+10)
MVCCScan_Pebble/rows=10000/versions=1/valueSize=64/numRangeKeys=1-24    5.46ms ± 1%    5.48ms ± 1%    ~     (p=0.065 n=9+10)
MVCCGet_Pebble/batch=true/versions=1/valueSize=8/numRangeKeys=0-24      3.34µs ± 1%    3.33µs ± 2%    ~     (p=0.591 n=10+10)
MVCCGet_Pebble/batch=true/versions=1/valueSize=8/numRangeKeys=1-24      6.84µs ± 2%    7.00µs ± 1%  +2.35%  (p=0.000 n=10+8)
MVCCGet_Pebble/batch=true/versions=10/valueSize=8/numRangeKeys=0-24     4.31µs ± 3%    4.28µs ± 2%    ~     (p=0.225 n=10+10)
MVCCGet_Pebble/batch=true/versions=10/valueSize=8/numRangeKeys=1-24     9.56µs ± 2%    9.83µs ± 3%  +2.82%  (p=0.000 n=10+10)
```

Touches #82559.

Release note: None

Co-authored-by: Erik Grinaker <[email protected]>
@erikgrinaker
Copy link
Contributor

Updated benchmarks show some improvement, but still a ~10% regression for scans:

name                                                          old time/op    new time/op    delta
MVCCScan_Pebble/rows=1/versions=1/valueSize=64-24               5.09µs ± 2%    5.33µs ± 1%   +4.78%  (p=0.000 n=10+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64-24             34.7µs ± 0%    38.2µs ± 2%  +10.09%  (p=0.000 n=9+10)
MVCCScan_Pebble/rows=10000/versions=1/valueSize=64-24           2.47ms ± 2%    2.74ms ± 1%  +10.91%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=1/versions=1/valueSize=64-24        5.53µs ± 1%    5.81µs ± 1%   +5.10%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=100/versions=1/valueSize=64-24      46.4µs ± 1%    50.4µs ± 2%   +8.65%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=10000/versions=1/valueSize=64-24    3.68ms ± 2%    3.91ms ± 1%   +6.31%  (p=0.000 n=9+10)
MVCCGet_Pebble/versions=1/valueSize=8-24                        4.83µs ± 1%    5.02µs ± 1%   +3.93%  (p=0.000 n=9+8)
MVCCGet_Pebble/versions=10/valueSize=8-24                       5.88µs ± 2%    6.10µs ± 1%   +3.76%  (p=0.000 n=10+10)
MVCCGet_Pebble/versions=100/valueSize=8-24                      14.0µs ± 4%    14.1µs ± 3%     ~     (p=0.853 n=10+10)
MVCCComputeStats_Pebble/valueSize=8-24                           199ms ± 2%     219ms ± 1%   +9.77%  (p=0.000 n=10+10)
MVCCComputeStats_Pebble/valueSize=32-24                          139ms ± 1%     152ms ± 0%   +9.30%  (p=0.000 n=10+8)
MVCCComputeStats_Pebble/valueSize=256-24                        46.2ms ± 7%    48.5ms ± 9%     ~     (p=0.165 n=10+10)

@erikgrinaker
Copy link
Contributor

erikgrinaker commented Aug 19, 2022

Some more benchmarks with #86440, getting down into the single digits:

name                                                   old time/op    new time/op    delta
MVCCScan_Pebble/rows=1/versions=1/valueSize=64-24        5.12µs ± 1%    5.49µs ± 2%  +7.15%  (p=0.000 n=9+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64-24      36.3µs ± 2%    38.3µs ± 1%  +5.49%  (p=0.000 n=10+10)
MVCCScan_Pebble/rows=10000/versions=1/valueSize=64-24    2.52ms ± 1%    2.71ms ± 0%  +7.78%  (p=0.000 n=10+7)
MVCCGet_Pebble/versions=1/valueSize=8-24                 4.83µs ± 1%    5.10µs ± 2%  +5.61%  (p=0.000 n=10+10)
MVCCGet_Pebble/versions=10/valueSize=8-24                5.91µs ± 1%    6.17µs ± 2%  +4.50%  (p=0.000 n=10+9)
MVCCGet_Pebble/versions=100/valueSize=8-24               14.6µs ± 3%    14.9µs ± 4%  +2.07%  (p=0.043 n=10+10)
MVCCComputeStats_Pebble/valueSize=32-24                   141ms ± 2%     154ms ± 2%  +9.21%  (p=0.000 n=10+10)

There is some variance compared to previous results (also for 22.1 case), but this may well be due to VM/cloud differences, so I'm not sure if they can really be compared precisely across runs.

jbowens added a commit to jbowens/pebble that referenced this issue Aug 19, 2022
Use base.Equal for equality comparisons in more places. Equal(a,b) is faster
than Compare(a,b)==0. Additionally, embed the Comparer directly into the
Iterator struct for better cache locality and to avoid the additional pointer
indirection.

Informs cockroachdb#1819.
Informs cockroachdb/cockroach#82559.

```
name                                                                     old time/op    new time/op    delta
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=0-24           6.80µs ± 1%    6.72µs ± 2%    ~     (p=0.165 n=5+10)
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=1-24           14.1µs ± 1%    14.0µs ± 2%    ~     (p=0.107 n=5+10)
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=100-24          154µs ± 2%     152µs ± 2%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=0-24           7.16µs ± 2%    7.10µs ± 1%    ~     (p=0.147 n=5+9)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=1-24           15.0µs ± 1%    14.9µs ± 1%  -0.94%  (p=0.007 n=5+9)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=100-24          155µs ± 1%     155µs ± 1%    ~     (p=0.513 n=5+10)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=0-24          9.15µs ± 2%    8.91µs ± 0%  -2.61%  (p=0.003 n=5+7)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=1-24          18.2µs ± 2%    18.1µs ± 1%    ~     (p=0.518 n=5+9)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=100-24         147µs ± 4%     145µs ± 2%    ~     (p=0.129 n=5+10)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=0-24         18.1µs ± 2%    17.7µs ± 1%  -1.98%  (p=0.013 n=5+10)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=1-24         27.5µs ± 1%    27.2µs ± 2%    ~     (p=0.206 n=5+10)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=100-24        112µs ± 4%     109µs ± 1%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=0-24          12.1µs ± 2%    11.8µs ± 2%  -3.00%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=1-24          21.5µs ± 1%    20.8µs ± 1%  -3.02%  (p=0.001 n=5+9)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=100-24         105µs ± 3%     102µs ± 1%  -2.73%  (p=0.013 n=5+10)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=0-24          13.8µs ± 2%    13.4µs ± 4%  -2.89%  (p=0.013 n=5+10)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=1-24          24.3µs ± 2%    23.8µs ± 2%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=100-24         109µs ± 2%     105µs ± 2%  -3.43%  (p=0.005 n=5+10)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=0-24         25.0µs ± 1%    24.2µs ± 3%  -3.27%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=1-24         42.6µs ± 2%    42.0µs ± 2%    ~     (p=0.055 n=5+10)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=100-24        129µs ± 3%     127µs ± 2%  -1.61%  (p=0.040 n=5+10)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=0-24        63.2µs ± 1%    61.7µs ± 1%  -2.30%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=1-24        83.4µs ± 1%    81.6µs ± 1%  -2.16%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=100-24       178µs ± 2%     173µs ± 1%  -2.46%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=0-24         47.4µs ± 2%    46.0µs ± 1%  -2.89%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=1-24         74.2µs ± 1%    71.2µs ± 2%  -4.06%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=100-24        173µs ± 1%     170µs ± 3%    ~     (p=0.055 n=5+10)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=0-24         60.8µs ± 2%    59.1µs ± 2%  -2.82%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=1-24         95.8µs ± 1%    94.2µs ± 2%  -1.75%  (p=0.019 n=5+10)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=100-24        197µs ± 2%     193µs ± 2%  -1.85%  (p=0.019 n=5+10)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=0-24         148µs ± 1%     145µs ± 1%  -1.66%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=1-24         247µs ± 1%     241µs ± 2%  -2.32%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=100-24       390µs ± 2%     382µs ± 2%  -1.95%  (p=0.028 n=5+10)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=0-24        441µs ± 2%     437µs ± 1%    ~     (p=0.222 n=5+8)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=1-24        558µs ± 1%     547µs ± 1%  -2.01%  (p=0.004 n=4+10)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=100-24      790µs ± 2%     774µs ± 2%  -1.93%  (p=0.008 n=5+10)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=0-24         350µs ± 1%     346µs ± 1%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=1-24         539µs ± 1%     534µs ± 2%    ~     (p=0.206 n=5+10)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=100-24       759µs ± 1%     747µs ± 1%  -1.61%  (p=0.008 n=5+10)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=0-24         472µs ± 0%     466µs ± 0%  -1.12%  (p=0.003 n=4+9)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=1-24         755µs ± 1%     740µs ± 1%  -2.03%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=100-24      1.02ms ± 3%    0.96ms ± 2%  -5.05%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=0-24       1.29ms ± 1%    1.28ms ± 1%  -1.42%  (p=0.005 n=5+10)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=1-24       2.19ms ± 1%    2.13ms ± 1%  -2.64%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=100-24     2.82ms ± 2%    2.77ms ± 2%    ~     (p=0.129 n=5+10)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=0-24      4.11ms ± 1%    4.08ms ± 2%    ~     (p=0.859 n=5+10)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=1-24      5.17ms ± 2%    5.08ms ± 2%    ~     (p=0.055 n=5+10)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=100-24    6.69ms ± 2%    6.61ms ± 4%    ~     (p=0.206 n=5+10)
```
jbowens added a commit to jbowens/pebble that referenced this issue Aug 19, 2022
Replace the base.InternalIteratorWithStats interface with logic to propagate a
pointer to a shared *InternalIteratorStats down the iterator tree during
iterator construction. This reduces the cost of collecting stats, which no
longer needs to descend the iterator tree summing stats as it goes.

Informs cockroachdb/cockroach#86083.
Informs cockroachdb/cockroach#82559.

```
name                                                                     old time/op    new time/op    delta
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=0-24           6.80µs ± 1%    6.58µs ± 1%  -3.16%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=1-24           14.1µs ± 1%    13.4µs ± 2%  -4.65%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=100-24          154µs ± 2%     155µs ± 1%    ~     (p=1.000 n=5+5)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=0-24           7.16µs ± 2%    7.05µs ± 1%    ~     (p=0.095 n=5+5)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=1-24           15.0µs ± 1%    14.4µs ± 2%  -3.90%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=100-24          155µs ± 1%     156µs ± 2%    ~     (p=0.690 n=5+5)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=0-24          9.15µs ± 2%    8.72µs ± 3%  -4.71%  (p=0.032 n=5+5)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=1-24          18.2µs ± 2%    17.7µs ± 1%  -3.01%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=100-24         147µs ± 4%     144µs ± 4%    ~     (p=0.151 n=5+5)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=0-24         18.1µs ± 2%    17.6µs ± 2%  -2.52%  (p=0.032 n=5+5)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=1-24         27.5µs ± 1%    26.9µs ± 1%  -2.18%  (p=0.032 n=5+5)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=100-24        112µs ± 4%     111µs ± 1%    ~     (p=0.151 n=5+5)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=0-24          12.1µs ± 2%    11.7µs ± 4%    ~     (p=0.056 n=5+5)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=1-24          21.5µs ± 1%    20.8µs ± 3%  -3.26%  (p=0.016 n=5+5)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=100-24         105µs ± 3%     105µs ± 1%    ~     (p=0.310 n=5+5)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=0-24          13.8µs ± 2%    13.4µs ± 1%  -3.07%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=1-24          24.3µs ± 2%    23.5µs ± 2%  -2.98%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=100-24         109µs ± 2%     108µs ± 2%    ~     (p=0.690 n=5+5)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=0-24         25.0µs ± 1%    24.4µs ± 2%  -2.37%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=1-24         42.6µs ± 2%    41.0µs ± 1%  -3.83%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=100-24        129µs ± 3%     127µs ± 1%    ~     (p=0.095 n=5+5)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=0-24        63.2µs ± 1%    61.6µs ± 1%  -2.48%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=1-24        83.4µs ± 1%    81.1µs ± 1%  -2.71%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=100-24       178µs ± 2%     175µs ± 2%    ~     (p=0.056 n=5+5)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=0-24         47.4µs ± 2%    47.5µs ± 3%    ~     (p=1.000 n=5+5)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=1-24         74.2µs ± 1%    72.1µs ± 2%  -2.74%  (p=0.032 n=5+5)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=100-24        173µs ± 1%     174µs ± 2%    ~     (p=0.690 n=5+5)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=0-24         60.8µs ± 2%    60.5µs ± 3%    ~     (p=0.421 n=5+5)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=1-24         95.8µs ± 1%    93.4µs ± 1%  -2.55%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=100-24        197µs ± 2%     195µs ± 3%    ~     (p=0.421 n=5+5)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=0-24         148µs ± 1%     146µs ± 2%    ~     (p=0.310 n=5+5)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=1-24         247µs ± 1%     240µs ± 1%  -2.89%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=100-24       390µs ± 2%     385µs ± 1%    ~     (p=0.421 n=5+5)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=0-24        441µs ± 2%     443µs ± 2%    ~     (p=0.690 n=5+5)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=1-24        558µs ± 1%     540µs ± 2%  -3.22%  (p=0.016 n=4+5)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=100-24      790µs ± 2%     771µs ± 1%  -2.36%  (p=0.016 n=5+5)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=0-24         350µs ± 1%     352µs ± 1%    ~     (p=0.548 n=5+5)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=1-24         539µs ± 1%     531µs ± 1%  -1.42%  (p=0.016 n=5+5)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=100-24       759µs ± 1%     752µs ± 2%    ~     (p=0.222 n=5+5)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=0-24         472µs ± 0%     474µs ± 1%    ~     (p=0.556 n=4+5)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=1-24         755µs ± 1%     752µs ± 1%    ~     (p=0.548 n=5+5)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=100-24      1.02ms ± 3%    0.97ms ± 1%  -4.30%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=0-24       1.29ms ± 1%    1.27ms ± 2%    ~     (p=0.056 n=5+5)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=1-24       2.19ms ± 1%    2.11ms ± 1%  -3.80%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=100-24     2.82ms ± 2%    2.75ms ± 3%    ~     (p=0.056 n=5+5)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=0-24      4.11ms ± 1%    4.03ms ± 1%  -1.86%  (p=0.032 n=5+5)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=1-24      5.17ms ± 2%    4.96ms ± 2%  -4.07%  (p=0.008 n=5+5)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=100-24    6.69ms ± 2%    6.51ms ± 2%    ~     (p=0.056 n=5+5)
```
@jbowens
Copy link
Collaborator

jbowens commented Aug 19, 2022

Found some old, small fruit in Pebble in cockroachdb/pebble#1917 and cockroachdb/pebble#1918. 🍎 🍌 🍏 🍋 🍇

@erikgrinaker
Copy link
Contributor

Good news! I ran some MVCCExportToSST benchmarks on top of #86259 comparing them with 22.1, and we have a decent performance boost. I believe I made some simple, opportunistic optimizations here when I first added range key support (avoiding key comparisons and allocations and such).

TBI disabled (StartTS=0):

name                                                                       old time/op  new time/op  delta
MVCCExportToSST/numKeys=64/numRevisions=1/exportAllRevisions=false-24      60.4µs ± 1%  55.8µs ± 2%   -7.54%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=64/numRevisions=1/exportAllRevisions=true-24       58.3µs ± 1%  52.6µs ± 1%   -9.76%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=1024/numRevisions=1/exportAllRevisions=false-24     484µs ± 2%   410µs ± 1%  -15.28%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=1024/numRevisions=1/exportAllRevisions=true-24      433µs ± 3%   355µs ± 2%  -18.08%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=8192/numRevisions=1/exportAllRevisions=false-24    3.25ms ± 0%  2.99ms ± 1%   -8.19%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=8192/numRevisions=1/exportAllRevisions=true-24     2.82ms ± 0%  2.53ms ± 1%  -10.16%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=65536/numRevisions=1/exportAllRevisions=false-24   24.5ms ± 0%  23.5ms ± 1%   -4.33%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=65536/numRevisions=1/exportAllRevisions=true-24    21.2ms ± 0%  19.9ms ± 0%   -6.21%  (p=0.008 n=5+5)

TBI enabled (StartTS>0):

name                                                                       old time/op  new time/op  delta
MVCCExportToSST/numKeys=64/numRevisions=1/exportAllRevisions=false-24      78.0µs ± 3%  69.9µs ± 1%  -10.36%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=64/numRevisions=1/exportAllRevisions=true-24       72.6µs ± 1%  66.6µs ± 1%   -8.25%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=1024/numRevisions=1/exportAllRevisions=false-24     698µs ± 1%   614µs ± 1%  -12.09%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=1024/numRevisions=1/exportAllRevisions=true-24      644µs ± 1%   556µs ± 1%  -13.61%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=8192/numRevisions=1/exportAllRevisions=false-24    4.94ms ± 1%  4.57ms ± 1%   -7.49%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=8192/numRevisions=1/exportAllRevisions=true-24     4.52ms ± 0%  4.14ms ± 0%   -8.41%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=65536/numRevisions=1/exportAllRevisions=false-24   37.8ms ± 0%  36.1ms ± 1%   -4.56%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=65536/numRevisions=1/exportAllRevisions=true-24    34.5ms ± 1%  32.8ms ± 0%   -5.01%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=64/numRevisions=10/exportAllRevisions=false-24      170µs ± 5%   166µs ± 4%     ~     (p=0.222 n=5+5)
MVCCExportToSST/numKeys=64/numRevisions=10/exportAllRevisions=true-24       293µs ± 1%   270µs ± 2%   -7.83%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=1024/numRevisions=10/exportAllRevisions=false-24   2.16ms ± 0%  2.19ms ± 1%   +1.09%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=1024/numRevisions=10/exportAllRevisions=true-24    3.95ms ± 0%  3.79ms ± 1%   -3.97%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=8192/numRevisions=10/exportAllRevisions=false-24   17.5ms ± 0%  17.8ms ± 2%     ~     (p=0.056 n=5+5)
MVCCExportToSST/numKeys=8192/numRevisions=10/exportAllRevisions=true-24    31.2ms ± 0%  30.7ms ± 0%   -1.46%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=65536/numRevisions=10/exportAllRevisions=false-24   147ms ± 0%   152ms ± 4%   +3.27%  (p=0.008 n=5+5)
MVCCExportToSST/numKeys=65536/numRevisions=10/exportAllRevisions=true-24    255ms ± 0%   253ms ± 0%   -0.73%  (p=0.008 n=5+5)

@erikgrinaker
Copy link
Contributor

And #86515 will mostly eliminate the ComputeStats performance regression.

@erikgrinaker
Copy link
Contributor

CatchUpScan is looking at about ~5% with #86512, which is in line with other benchmarks, and I guess is OK for now. Note that these benchmarks are with TBI enabled, since they all use a non-zero start time.

name                                                 old time/op  new time/op  delta
CatchUpScan/mixed-case/withDiff=true/perc=0.00-24     483ms ± 1%   529ms ± 2%  +9.49%  (p=0.008 n=5+5)
CatchUpScan/mixed-case/withDiff=true/perc=50.00-24    993ms ± 0%  1026ms ± 0%  +3.30%  (p=0.008 n=5+5)
CatchUpScan/mixed-case/withDiff=true/perc=75.00-24    876ms ± 1%   903ms ± 1%  +3.10%  (p=0.008 n=5+5)
CatchUpScan/mixed-case/withDiff=true/perc=95.00-24    178ms ± 1%   195ms ± 1%  +9.43%  (p=0.008 n=5+5)
CatchUpScan/mixed-case/withDiff=true/perc=99.00-24    154ms ± 1%   163ms ± 0%  +5.82%  (p=0.008 n=5+5)
CatchUpScan/mixed-case/withDiff=false/perc=0.00-24    743ms ± 1%   786ms ± 1%  +5.75%  (p=0.008 n=5+5)
CatchUpScan/mixed-case/withDiff=false/perc=50.00-24   634ms ± 0%   666ms ± 0%  +5.05%  (p=0.008 n=5+5)
CatchUpScan/mixed-case/withDiff=false/perc=75.00-24   578ms ± 1%   597ms ± 1%  +3.28%  (p=0.008 n=5+5)
CatchUpScan/mixed-case/withDiff=false/perc=95.00-24   166ms ± 1%   176ms ± 1%  +6.08%  (p=0.008 n=5+5)
CatchUpScan/mixed-case/withDiff=false/perc=99.00-24   152ms ± 0%   159ms ± 0%  +4.64%  (p=0.016 n=5+4)

craig bot pushed a commit that referenced this issue Aug 22, 2022
86259: storage: implement `RangeKeyChanged()` for `MVCCIncrementalIterator` r=tbg a=erikgrinaker

**storage: use `RangeKeyChanged()` in `MVCCIncrementalIterator`**

This patch uses `RangeKeyChanged()` to detect changes to range keys in
`MVCCIncrementalIterator`. There are no functional changes.

Some quick benchmarks, using catchup scans:

```
name                                                                 old time/op  new time/op  delta
CatchUpScan/mixed-case/withDiff=true/perc=0.00/numRangeKeys=0-24      538ms ± 1%   535ms ± 1%     ~     (p=0.211 n=10+9)
CatchUpScan/mixed-case/withDiff=true/perc=0.00/numRangeKeys=1-24      690ms ± 0%   590ms ± 1%  -14.56%  (p=0.000 n=9+10)
CatchUpScan/mixed-case/withDiff=true/perc=0.00/numRangeKeys=100-24    743ms ± 1%   646ms ± 1%  -13.13%  (p=0.000 n=9+9)
CatchUpScan/mixed-case/withDiff=false/perc=0.00/numRangeKeys=0-24     794ms ± 1%   794ms ± 0%     ~     (p=0.579 n=10+10)
CatchUpScan/mixed-case/withDiff=false/perc=0.00/numRangeKeys=1-24     966ms ± 0%   911ms ± 1%   -5.72%  (p=0.000 n=10+10)
CatchUpScan/mixed-case/withDiff=false/perc=0.00/numRangeKeys=100-24   974ms ± 0%   920ms ± 0%   -5.51%  (p=0.000 n=10+10)
```

Release justification: bug fixes and low-risk updates to new functionality.

Release note: None

**storage: implement `RangeKeyChanged()` for `MVCCIncrementalIterator`**

This patch implements `RangeKeyChanged()` for `MVCCIncrementalIterator`.
It only fires if the time bound range keys change, not if a
`Next(Key)IgnoringTime()` operation reveals additional range keys.

Resolves #86105.

Release justification: bug fixes and low-risk updates to new functionality

Release note: None
  
**storage: add `MVCCIncrementalIterator.RangeKeysIgnoringTime()`**

This patch changes `MVCCIncrementalIterator` range key behavior
following a `Next(Key)IgnoringTime()` call. Previously, range key
methods would then expose unfiltered range keys. Now, the standard range
key methods only apply to filtered range keys, and an additional
`RangeKeysIgnoringTime()` method provides access to unfiltered range
keys.

This implies that if such a call steps onto a range key that's entirely
outside of the time bounds then:

* `HasPointAndRange()` will return `false`,`false` if on a bare range
  key.

* `RangeKeyChanged()` will not fire, unless stepping off of a range key
  within the time bounds.

* `RangeBounds()` and `RangeKeys()` will return empty results.

This is done to avoid conditional range key handling following these
calls, except for the exact sites where the caller is interested in
the unfiltered range keys.

Release justification: bug fixes and low-risk updates to new functionality

Release note: None

86440: storage: use concrete `pebbleIterator` in `intentInterleavingIter` r=tbg a=erikgrinaker

**storage: tweak `unsafeMVCCIterator` construction**

Release justification: non-production code changes

Release note: None

**storage: inline some `intentInterleavingIter` methods**

This patch splits up `maybeSkipIntentRangeKeys()` and
`maybeSuppressRangeKeyChanged()` to allow for mid-stack inlining.

I doubt that the gains are as large as these microbenchmarks claim, and
there's a fair bit of variance between runs, but it can't hurt.

```
name                                                                         old time/op    new time/op    delta
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=0-24               5.37µs ± 2%    5.43µs ± 2%  +1.13%  (p=0.041 n=10+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=0-24             38.2µs ± 2%    38.2µs ± 2%    ~     (p=0.971 n=10+10)
MVCCScan_Pebble/rows=10000/versions=1/valueSize=64/numRangeKeys=0-24           2.79ms ± 2%    2.71ms ± 2%  -2.59%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=0-24        5.99µs ± 1%    5.99µs ± 2%    ~     (p=0.541 n=10+10)
MVCCReverseScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=0-24      51.7µs ± 3%    52.1µs ± 1%    ~     (p=0.631 n=10+10)
MVCCReverseScan_Pebble/rows=10000/versions=1/valueSize=64/numRangeKeys=0-24    3.88ms ± 2%    3.87ms ± 1%    ~     (p=0.897 n=10+8)
MVCCComputeStats_Pebble/valueSize=32/numRangeKeys=0-24                          158ms ± 1%     155ms ± 1%  -2.34%  (p=0.000 n=10+9)
```

Touches #82559.

Release justification: bug fixes and low-risk updates to new functionality

Release note: None
  
**storage: use concrete `pebbleIterator` in `intentInterleavingIter`**

Since `intentInterleavingIter` always constructs the underlying
iterators from the given reader, and these readers always construct
`*pebbleIterator`, it can use the concrete type rather than interfaces
for both iterators. This avoids dynamic dispatch, yielding a decent
performance improvement.

Unfortunately, this requires disabling `unsafeMVCCIterator` inside
`intentInterleavingIter`. This wasn't always enabled anyway, since it
was omitted both for the engine iterator and when using an engine
directly (which doesn't have consistent iterators).

```
name                                                                         old time/op    new time/op    delta
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=0-24               5.43µs ± 2%    5.45µs ± 2%    ~     (p=0.566 n=10+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=0-24             38.2µs ± 2%    37.0µs ± 1%  -3.02%  (p=0.000 n=10+10)
MVCCScan_Pebble/rows=10000/versions=1/valueSize=64/numRangeKeys=0-24           2.71ms ± 2%    2.66ms ± 1%  -1.83%  (p=0.000 n=10+9)
MVCCReverseScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=0-24        5.99µs ± 2%    5.86µs ± 2%  -2.15%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=0-24      52.1µs ± 1%    50.2µs ± 2%  -3.77%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=10000/versions=1/valueSize=64/numRangeKeys=0-24    3.87ms ± 1%    3.83ms ± 1%  -1.26%  (p=0.000 n=8+10)
MVCCComputeStats_Pebble/valueSize=32/numRangeKeys=0-24                          155ms ± 1%     155ms ± 1%    ~     (p=0.842 n=9+10)
```

Touches #82559.

Release justification: low risk, high benefit changes to existing functionality

Release note: None

86478: storage: use concrete `pebbleIterator` in `verifyingIterator` r=tbg a=erikgrinaker

**storage: add Pebble SST iterator benchmarks**

Release justification: non-production code changes

Release note: None
  
**storage: use concrete `pebbleIterator` in `verifyingIterator`**

Gives a slight performance boost, since it avoid dynamic dispatch:

```
name                                                  old time/op  new time/op  delta
SSTIterator/keys=1/variant=pebble/verify=true-24      42.8µs ± 1%  42.4µs ± 1%  -0.79%  (p=0.043 n=10+10)
SSTIterator/keys=100/variant=pebble/verify=true-24    61.8µs ± 1%  60.7µs ± 1%  -1.64%  (p=0.000 n=10+10)
SSTIterator/keys=10000/variant=pebble/verify=true-24  1.91ms ± 0%  1.88ms ± 0%  -1.79%  (p=0.000 n=10+10)
```

An attempt was also made at using `RangeKeyChanged()` instead of
`HasPointAndRange()`, but this had no effect.

Touches #83051.

Release justification: bug fixes and low-risk updates to new functionality

Release note: None

86513: storage: use `RangeKeyChanged` in `ReadAsOfIterator` r=tbg a=erikgrinaker

This patch uses `RangeKeyChanged()` to detect range keys in
`ReadAsOfIterator`, and caches them to improve performance.
It also fixes a bug where the iterator would fail to detect tombstones
with a non-empty `MVCCValueHeader`.

Resolves #84714.

Release justification: bug fixes and low-risk updates to new functionality

Release note: None

86514: storage: use `RangeKeyChanged()` in `MVCCDeleteRangeUsingTombstone()` r=tbg a=erikgrinaker

Release justification: bug fixes and low-risk updates to new functionality

Release note: None

86529: storage: omit unnecessary range key calls during intent resolution r=tbg a=erikgrinaker

Release justification: bug fixes and low-risk updates to new functionality

Release note: None

Co-authored-by: Erik Grinaker <[email protected]>
jbowens added a commit to jbowens/pebble that referenced this issue Aug 22, 2022
Use base.Equal for equality comparisons in more places. Equal(a,b) is faster
than Compare(a,b)==0. Additionally, embed the Comparer directly into the
Iterator struct for better cache locality and to avoid the additional pointer
indirection.

Informs cockroachdb#1819.
Informs cockroachdb/cockroach#82559.

```
name                                                                     old time/op    new time/op    delta
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=0-24           6.80µs ± 1%    6.72µs ± 2%    ~     (p=0.165 n=5+10)
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=1-24           14.1µs ± 1%    14.0µs ± 2%    ~     (p=0.107 n=5+10)
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=100-24          154µs ± 2%     152µs ± 2%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=0-24           7.16µs ± 2%    7.10µs ± 1%    ~     (p=0.147 n=5+9)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=1-24           15.0µs ± 1%    14.9µs ± 1%  -0.94%  (p=0.007 n=5+9)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=100-24          155µs ± 1%     155µs ± 1%    ~     (p=0.513 n=5+10)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=0-24          9.15µs ± 2%    8.91µs ± 0%  -2.61%  (p=0.003 n=5+7)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=1-24          18.2µs ± 2%    18.1µs ± 1%    ~     (p=0.518 n=5+9)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=100-24         147µs ± 4%     145µs ± 2%    ~     (p=0.129 n=5+10)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=0-24         18.1µs ± 2%    17.7µs ± 1%  -1.98%  (p=0.013 n=5+10)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=1-24         27.5µs ± 1%    27.2µs ± 2%    ~     (p=0.206 n=5+10)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=100-24        112µs ± 4%     109µs ± 1%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=0-24          12.1µs ± 2%    11.8µs ± 2%  -3.00%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=1-24          21.5µs ± 1%    20.8µs ± 1%  -3.02%  (p=0.001 n=5+9)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=100-24         105µs ± 3%     102µs ± 1%  -2.73%  (p=0.013 n=5+10)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=0-24          13.8µs ± 2%    13.4µs ± 4%  -2.89%  (p=0.013 n=5+10)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=1-24          24.3µs ± 2%    23.8µs ± 2%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=100-24         109µs ± 2%     105µs ± 2%  -3.43%  (p=0.005 n=5+10)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=0-24         25.0µs ± 1%    24.2µs ± 3%  -3.27%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=1-24         42.6µs ± 2%    42.0µs ± 2%    ~     (p=0.055 n=5+10)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=100-24        129µs ± 3%     127µs ± 2%  -1.61%  (p=0.040 n=5+10)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=0-24        63.2µs ± 1%    61.7µs ± 1%  -2.30%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=1-24        83.4µs ± 1%    81.6µs ± 1%  -2.16%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=100-24       178µs ± 2%     173µs ± 1%  -2.46%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=0-24         47.4µs ± 2%    46.0µs ± 1%  -2.89%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=1-24         74.2µs ± 1%    71.2µs ± 2%  -4.06%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=100-24        173µs ± 1%     170µs ± 3%    ~     (p=0.055 n=5+10)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=0-24         60.8µs ± 2%    59.1µs ± 2%  -2.82%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=1-24         95.8µs ± 1%    94.2µs ± 2%  -1.75%  (p=0.019 n=5+10)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=100-24        197µs ± 2%     193µs ± 2%  -1.85%  (p=0.019 n=5+10)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=0-24         148µs ± 1%     145µs ± 1%  -1.66%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=1-24         247µs ± 1%     241µs ± 2%  -2.32%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=100-24       390µs ± 2%     382µs ± 2%  -1.95%  (p=0.028 n=5+10)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=0-24        441µs ± 2%     437µs ± 1%    ~     (p=0.222 n=5+8)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=1-24        558µs ± 1%     547µs ± 1%  -2.01%  (p=0.004 n=4+10)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=100-24      790µs ± 2%     774µs ± 2%  -1.93%  (p=0.008 n=5+10)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=0-24         350µs ± 1%     346µs ± 1%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=1-24         539µs ± 1%     534µs ± 2%    ~     (p=0.206 n=5+10)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=100-24       759µs ± 1%     747µs ± 1%  -1.61%  (p=0.008 n=5+10)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=0-24         472µs ± 0%     466µs ± 0%  -1.12%  (p=0.003 n=4+9)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=1-24         755µs ± 1%     740µs ± 1%  -2.03%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=100-24      1.02ms ± 3%    0.96ms ± 2%  -5.05%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=0-24       1.29ms ± 1%    1.28ms ± 1%  -1.42%  (p=0.005 n=5+10)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=1-24       2.19ms ± 1%    2.13ms ± 1%  -2.64%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=100-24     2.82ms ± 2%    2.77ms ± 2%    ~     (p=0.129 n=5+10)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=0-24      4.11ms ± 1%    4.08ms ± 2%    ~     (p=0.859 n=5+10)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=1-24      5.17ms ± 2%    5.08ms ± 2%    ~     (p=0.055 n=5+10)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=100-24    6.69ms ± 2%    6.61ms ± 4%    ~     (p=0.206 n=5+10)
```
jbowens added a commit to cockroachdb/pebble that referenced this issue Aug 22, 2022
Use base.Equal for equality comparisons in more places. Equal(a,b) is faster
than Compare(a,b)==0. Additionally, embed the Comparer directly into the
Iterator struct for better cache locality and to avoid the additional pointer
indirection.

Informs #1819.
Informs cockroachdb/cockroach#82559.

```
name                                                                     old time/op    new time/op    delta
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=0-24           6.80µs ± 1%    6.72µs ± 2%    ~     (p=0.165 n=5+10)
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=1-24           14.1µs ± 1%    14.0µs ± 2%    ~     (p=0.107 n=5+10)
MVCCScan_Pebble/rows=1/versions=1/valueSize=64/numRangeKeys=100-24          154µs ± 2%     152µs ± 2%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=0-24           7.16µs ± 2%    7.10µs ± 1%    ~     (p=0.147 n=5+9)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=1-24           15.0µs ± 1%    14.9µs ± 1%  -0.94%  (p=0.007 n=5+9)
MVCCScan_Pebble/rows=1/versions=2/valueSize=64/numRangeKeys=100-24          155µs ± 1%     155µs ± 1%    ~     (p=0.513 n=5+10)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=0-24          9.15µs ± 2%    8.91µs ± 0%  -2.61%  (p=0.003 n=5+7)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=1-24          18.2µs ± 2%    18.1µs ± 1%    ~     (p=0.518 n=5+9)
MVCCScan_Pebble/rows=1/versions=10/valueSize=64/numRangeKeys=100-24         147µs ± 4%     145µs ± 2%    ~     (p=0.129 n=5+10)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=0-24         18.1µs ± 2%    17.7µs ± 1%  -1.98%  (p=0.013 n=5+10)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=1-24         27.5µs ± 1%    27.2µs ± 2%    ~     (p=0.206 n=5+10)
MVCCScan_Pebble/rows=1/versions=100/valueSize=64/numRangeKeys=100-24        112µs ± 4%     109µs ± 1%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=0-24          12.1µs ± 2%    11.8µs ± 2%  -3.00%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=1-24          21.5µs ± 1%    20.8µs ± 1%  -3.02%  (p=0.001 n=5+9)
MVCCScan_Pebble/rows=10/versions=1/valueSize=64/numRangeKeys=100-24         105µs ± 3%     102µs ± 1%  -2.73%  (p=0.013 n=5+10)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=0-24          13.8µs ± 2%    13.4µs ± 4%  -2.89%  (p=0.013 n=5+10)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=1-24          24.3µs ± 2%    23.8µs ± 2%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=10/versions=2/valueSize=64/numRangeKeys=100-24         109µs ± 2%     105µs ± 2%  -3.43%  (p=0.005 n=5+10)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=0-24         25.0µs ± 1%    24.2µs ± 3%  -3.27%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=1-24         42.6µs ± 2%    42.0µs ± 2%    ~     (p=0.055 n=5+10)
MVCCScan_Pebble/rows=10/versions=10/valueSize=64/numRangeKeys=100-24        129µs ± 3%     127µs ± 2%  -1.61%  (p=0.040 n=5+10)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=0-24        63.2µs ± 1%    61.7µs ± 1%  -2.30%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=1-24        83.4µs ± 1%    81.6µs ± 1%  -2.16%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=10/versions=100/valueSize=64/numRangeKeys=100-24       178µs ± 2%     173µs ± 1%  -2.46%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=0-24         47.4µs ± 2%    46.0µs ± 1%  -2.89%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=1-24         74.2µs ± 1%    71.2µs ± 2%  -4.06%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64/numRangeKeys=100-24        173µs ± 1%     170µs ± 3%    ~     (p=0.055 n=5+10)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=0-24         60.8µs ± 2%    59.1µs ± 2%  -2.82%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=1-24         95.8µs ± 1%    94.2µs ± 2%  -1.75%  (p=0.019 n=5+10)
MVCCScan_Pebble/rows=100/versions=2/valueSize=64/numRangeKeys=100-24        197µs ± 2%     193µs ± 2%  -1.85%  (p=0.019 n=5+10)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=0-24         148µs ± 1%     145µs ± 1%  -1.66%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=1-24         247µs ± 1%     241µs ± 2%  -2.32%  (p=0.003 n=5+10)
MVCCScan_Pebble/rows=100/versions=10/valueSize=64/numRangeKeys=100-24       390µs ± 2%     382µs ± 2%  -1.95%  (p=0.028 n=5+10)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=0-24        441µs ± 2%     437µs ± 1%    ~     (p=0.222 n=5+8)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=1-24        558µs ± 1%     547µs ± 1%  -2.01%  (p=0.004 n=4+10)
MVCCScan_Pebble/rows=100/versions=100/valueSize=64/numRangeKeys=100-24      790µs ± 2%     774µs ± 2%  -1.93%  (p=0.008 n=5+10)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=0-24         350µs ± 1%     346µs ± 1%    ~     (p=0.075 n=5+10)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=1-24         539µs ± 1%     534µs ± 2%    ~     (p=0.206 n=5+10)
MVCCScan_Pebble/rows=1000/versions=1/valueSize=64/numRangeKeys=100-24       759µs ± 1%     747µs ± 1%  -1.61%  (p=0.008 n=5+10)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=0-24         472µs ± 0%     466µs ± 0%  -1.12%  (p=0.003 n=4+9)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=1-24         755µs ± 1%     740µs ± 1%  -2.03%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=1000/versions=2/valueSize=64/numRangeKeys=100-24      1.02ms ± 3%    0.96ms ± 2%  -5.05%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=0-24       1.29ms ± 1%    1.28ms ± 1%  -1.42%  (p=0.005 n=5+10)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=1-24       2.19ms ± 1%    2.13ms ± 1%  -2.64%  (p=0.001 n=5+10)
MVCCScan_Pebble/rows=1000/versions=10/valueSize=64/numRangeKeys=100-24     2.82ms ± 2%    2.77ms ± 2%    ~     (p=0.129 n=5+10)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=0-24      4.11ms ± 1%    4.08ms ± 2%    ~     (p=0.859 n=5+10)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=1-24      5.17ms ± 2%    5.08ms ± 2%    ~     (p=0.055 n=5+10)
MVCCScan_Pebble/rows=1000/versions=100/valueSize=64/numRangeKeys=100-24    6.69ms ± 2%    6.61ms ± 4%    ~     (p=0.206 n=5+10)
```
jbowens added a commit to jbowens/pebble that referenced this issue Aug 23, 2022
Replace the base.InternalIteratorWithStats interface with logic to propagate a
pointer to a shared *InternalIteratorStats down the iterator tree during
iterator construction. This reduces the cost of collecting stats, which no
longer needs to descend the iterator tree summing stats as it goes.

Informs cockroachdb/cockroach#86083.
Informs cockroachdb/cockroach#82559.
jbowens added a commit to cockroachdb/pebble that referenced this issue Aug 23, 2022
Replace the base.InternalIteratorWithStats interface with logic to propagate a
pointer to a shared *InternalIteratorStats down the iterator tree during
iterator construction. This reduces the cost of collecting stats, which no
longer needs to descend the iterator tree summing stats as it goes.

Informs cockroachdb/cockroach#86083.
Informs cockroachdb/cockroach#82559.
@erikgrinaker
Copy link
Contributor

Ran some more scan benchmarks, and despite recent optimizations we're unfortunately still at about 8%:

name                                                          old time/op    new time/op    delta
MVCCScan_Pebble/rows=1/versions=1/valueSize=64-24               4.99µs ± 1%    5.37µs ± 1%  +7.68%  (p=0.000 n=10+10)
MVCCScan_Pebble/rows=100/versions=1/valueSize=64-24             34.6µs ± 2%    37.4µs ± 2%  +7.81%  (p=0.000 n=10+10)
MVCCScan_Pebble/rows=10000/versions=1/valueSize=64-24           2.44ms ± 1%    2.64ms ± 1%  +8.24%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=1/versions=1/valueSize=64-24        5.55µs ± 1%    5.75µs ± 1%  +3.50%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=100/versions=1/valueSize=64-24      46.4µs ± 2%    49.5µs ± 1%  +6.52%  (p=0.000 n=10+10)
MVCCReverseScan_Pebble/rows=10000/versions=1/valueSize=64-24    3.61ms ± 0%    3.82ms ± 1%  +5.72%  (p=0.000 n=9+10)

@erikgrinaker
Copy link
Contributor

Given where we are in the release cycle, I think we've done what we can here. Closing this one out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-storage Relating to our storage engine (Pebble) on-disk storage. C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-storage Storage Team
Projects
None yet
Development

No branches or pull requests

3 participants