storage: MVCCGarbageCollect is slow in v22.2 compared to v22.1 #87277
cc @cockroachdb/replication
For reference, this is a change to the benchmark on master that adds stats and range tombstones, used to test #87261.
Thanks for running these benchmarks! This seems fairly pathological, so we'll definitely need to address this for 22.2. Seems likely to be an issue in Pebble. Maybe it'd be worthwhile to come up with a minimal repro benchmark for the Pebble team to investigate? Btw, for the future, we usually use benchstat to compare benchmark results, and do 5-10 runs of each benchmark to account for variance. It's also a good idea to stop any other processes on the system, and to do another run first to warm up the OS/CPU for the workload -- I've seen this affect benchmarks by as much as 10%. Something like:
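A plausible reconstruction of that command sequence (the original snippet was elided; the `./pkg/storage` package path and the exact flags are my assumptions):

```
go test ./pkg/storage -run - -bench MVCCGarbageCollect -count 10 > old.txt
# check out the revision under test, then:
go test ./pkg/storage -run - -bench MVCCGarbageCollect -count 10 > new.txt
benchstat old.txt new.txt
```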
I tried a test with the same data that the GC test has, but that uses an MVCCIterator and SeekLT for every object sequentially. The results, 22.1 vs. master, are:
Those tests use an in-memory engine; maybe that makes the difference more noticeable, since there are no I/O operations involved and the impact of comparisons and such is more visible?
20% is significant and worth looking into. However, it's pretty far from 1600%, disk I/O or not -- do you have any thoughts on where the discrepancy is coming from? Could we use the same engine setup for this benchmark and the GC benchmark to rule that out? Also, enabling intents and range keys on the iterator might make a difference.
Looking at the stacks, those are not the same paths that Next() takes, so it would probably be a red herring. As Jackson mentioned, Next() internally does switchToMinHeap, which calls batchskl.(*Iterator).Next, and that is what's causing the slowness. It doesn't happen when I just do SeekGT/SeekLT, and naively iterating with Next() over the data set doesn't trigger this behaviour either.
I think the key to the slowness is an updated batch. Tests with just a seek, or SeekLT/Next, don't show a significant slowdown by themselves. But if the test does SeekLT/Next and then writes to the batch for eligible keys, the degradation is significant. This test case is in the draft PR I posted above. Also note how the function above is executed: if the batch is replaced by eng (create the iterator and make updates directly), performance improves 10x.
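To make that access pattern concrete, here is a minimal self-contained sketch (my reconstruction, not the code from the draft PR; it uses raw Pebble rather than the CockroachDB storage layer) of SeekLT/Next over an indexed batch interleaved with writes to that same batch:

```go
package main

import (
	"fmt"

	"github.com/cockroachdb/pebble"
	"github.com/cockroachdb/pebble/vfs"
)

func main() {
	db, err := pebble.Open("", &pebble.Options{FS: vfs.NewMem()})
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Seed the store with some committed keys.
	for i := 0; i < 1000; i++ {
		key := []byte(fmt.Sprintf("key-%06d", i))
		if err := db.Set(key, []byte("value"), pebble.Sync); err != nil {
			panic(err)
		}
	}

	// An indexed batch, whose iterators can also read the batch's own writes.
	batch := db.NewIndexedBatch()
	defer batch.Close()

	// NewIter here follows the v22.2-era Pebble API, which returned only
	// *Iterator (newer Pebble versions also return an error).
	iter := batch.NewIter(nil)
	defer iter.Close()

	for i := 999; i > 0; i-- {
		seekKey := []byte(fmt.Sprintf("key-%06d", i))
		// Position just below the key of interest, then step forward:
		// the SeekLT/Next pattern the GC code uses.
		iter.SeekLT(seekKey)
		iter.Next()
		// Mutate the batch while the iterator is open. These entries were
		// written after the iterator was created, so they must stay
		// invisible to it -- and skipping them inside the batch iterator
		// is what degraded pre-#87714 Pebble.
		if err := batch.Delete(seekKey, nil); err != nil {
			panic(err)
		}
	}
}
```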
To verify the idea that the batch gives the worst effects: the batch was also affecting performance in 22.1, giving a 7x slowdown, but on master it gives a 98x slowdown.
cc: @jbowens
Had a closer look at the minimal repro benchmark in #87373, and it's fairly trivial.
@jbowens I see the multiple Pebble issues opened in response to this, but I don't quite understand what it is about this use case that causes the slowness in this release (compared to the past).
@sumeerbhola — The change to batch mutation visibility (cockroachdb/pebble#1640) is new in this release. I filed the issue cockroachdb/pebble#1640 as something we might want to explore in the longer term. For avoiding the regression this release, I'm going to look at whether we can enforce batch visibility at the merging iterator level rather than within the batch iterator itself. This would make batch visibility behave more like normal visibility. We don't need to step the batch iterator further unless we actually need to know whether it contains a lesser key.
Previously, visibility of batch point keys was enforced within the batch iterator. This commit moves this visibility enforcement into the merging iterator, alongside the visibility enforcement of committed keys. Span keys (eg, range deletions and range keys) are still subject to visibility filtering when the batch iterator is constructed, because it's necessary to process all these spans to fragment them appropriately.

This change addresses the pathological behavior observed in cockroachdb/cockroach#87277: filtering keys within leaf iterators can force the leaf iterators to iterate through more keys than necessary. If many mutations are made to a batch since the iterator was opened, the batch iterator may need to iterate over many entries before finding a visible key. This iteration may extend well beyond the next or previous key in the merging iterator.

Using the SeekLT benchmark from cockroachdb/cockroach#87373, relative to master:

```
name                          old time/op  new time/op  delta
SeekLTNext/useBatch=false-10   185ms ± 1%   185ms ± 2%     ~     (p=0.841 n=5+5)
SeekLTNext/useBatch=true-10    26.1s ± 1%    0.2s ± 2%  -99.34%  (p=0.008 n=5+5)
```

And relative to 22.1:

```
name                          old time/op  new time/op  delta
SeekLTNext/useBatch=false-10   182ms ± 6%   185ms ± 2%     ~     (p=0.690 n=5+5)
SeekLTNext/useBatch=true-10    2.71s ± 7%   0.17s ± 2%  -93.68%  (p=0.008 n=5+5)
```
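To make the mechanism concrete, here is a toy model contrasting the two filtering strategies. This is my own invented sketch (not Pebble's actual implementation; all names are hypothetical); it only counts entries touched per leaf step.

```go
package main

import "fmt"

type entry struct {
	key     int
	visible bool // false: written after the iterator was opened
}

// nextFilteringLeaf mimics the pre-fix batch iterator: a single Next skips
// every invisible entry before returning, however many there are.
func nextFilteringLeaf(leaf []entry, pos int) (newPos, touched int) {
	for pos++; pos < len(leaf) && !leaf[pos].visible; pos++ {
		touched++
	}
	return pos, touched + 1
}

// nextPlainLeaf mimics the post-fix batch iterator: it advances exactly one
// entry and lets the merging iterator check visibility, stepping further
// only if this leaf actually has to surface a key.
func nextPlainLeaf(leaf []entry, pos int) (newPos, touched int) {
	return pos + 1, 1
}

func main() {
	// One visible entry followed by 100k entries written after the iterator
	// was opened, approximating a batch that keeps growing under the GC loop.
	leaf := []entry{{key: 0, visible: true}}
	for i := 1; i <= 100000; i++ {
		leaf = append(leaf, entry{key: i, visible: false})
	}

	// A single merged step that happens to advance this leaf (for example,
	// the min-heap switch after a SeekLT).
	_, touched := nextFilteringLeaf(leaf, 0)
	fmt.Println("leaf-level filtering touched", touched, "entries") // 100001

	_, touched = nextPlainLeaf(leaf, 0)
	fmt.Println("merge-level filtering touched", touched, "entries") // 1
}
```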
Fixed by #87714.
Running BenchmarkMVCCGarbageCollect (modified to include MVCCStats, to make it more comparable to real runs):
If we compare the highlighted (with >) case, we see a 16x degradation, which looks scary.
Flame graph of rerunning only this test:
(running the 22.1 release branch with just an enginepb.MVCCStats pointer added to the GC invocation)
And on master:
(now with extra options for range tombstones, but they are disabled)
Jira issue: CRDB-19249
Epic CRDB-2624