kvclient: regression in BenchmarkColBatchScan due to dist sender metrics #111142
Labels
branch-release-23.2
Used to mark GA and release blockers, technical advisories, and bugs for 23.2
C-performance
Perf of queries or internals. Solution not expected to change functional behavior.
GA-blocker
regression
Regression from a release.
T-kv
KV Team
Comments
yuzefovich added the C-performance, regression, release-blocker (Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.), T-kv, and branch-release-23.2 labels on Sep 22, 2023
This regression can also be observed on
nvanbenschoten added the GA-blocker label and removed the release-blocker label on Oct 25, 2023
craig bot pushed a commit that referenced this issue on Oct 30, 2023
113069: kvserver: add BenchmarkNodeLivenessScanStorage to measure liveness scan r=andrewbaptist,jbowens a=sumeerbhola

Node liveness scans, like the one done in MaybeGossipNodeLivenessRaftMuLocked, while holding raftMu, are performance sensitive, and slowness has caused production issues (https://github.com/cockroachlabs/support/issues/2665, https://github.com/cockroachlabs/support/issues/2107).

This benchmark measures the scan performance both when DELs (due to GC) have not been compacted away, and when they have. It also sets up a varying number of live versions since decommissioned nodes will have a single live version.

Results on M1 macbook on master with dead-keys=false and compacted=true:
```
NodeLivenessScanStorage/num-live=2/compacted=true-10       26.80µ ± 9%
NodeLivenessScanStorage/num-live=5/compacted=true-10       30.34µ ± 3%
NodeLivenessScanStorage/num-live=10/compacted=true-10      38.88µ ± 8%
NodeLivenessScanStorage/num-live=1000/compacted=true-10    861.5µ ± 3%
```

When compacted=false the scan takes ~10ms, which is > 100x slower, but probably acceptable for this workload.
```
NodeLivenessScanStorage/num-live=2/compacted=false-10      9.430m ± 5%
NodeLivenessScanStorage/num-live=5/compacted=false-10      9.534m ± 4%
NodeLivenessScanStorage/num-live=10/compacted=false-10     9.456m ± 2%
NodeLivenessScanStorage/num-live=1000/compacted=false-10   10.34m ± 7%
```

dead-keys=true (and compacted=false) defeats the NextPrefix optimization, since the next prefix can have all its keys deleted and the iterator has to step through all of them (it can't be sure that all the keys for that next prefix are deleted). This case should not occur in the liveness range, as we don't remove decommissioned entries, but is included for better understanding.
```
NodeLivenessScanStorage/num-live=2/dead-keys=true/compacted=false-10    58.33m
```

Compared to v22.2, the results are sometimes > 10x faster, when the pebbleMVCCScanner seek optimization in v22.2 was defeated.
```
                                                           │     sec/op     │     sec/op      vs base                │
NodeLivenessScanStorage/num-live=2/compacted=false-10        117.280m ± 2%     9.430m ± 5%   -91.96% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=5/compacted=false-10        117.298m ± 0%     9.534m ± 4%   -91.87% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=10/compacted=false-10        12.009m ± 0%     9.456m ± 2%   -21.26% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=1000/compacted=false-10       13.04m ± 0%     10.34m ± 7%   -20.66% (p=0.002 n=6)

                                                           │ block-bytes/op │ block-bytes/op  vs base                │
NodeLivenessScanStorage/num-live=2/compacted=false-10        14.565Mi ± 0%    8.356Mi ± 0%   -42.63% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=5/compacted=false-10        14.570Mi ± 0%    8.361Mi ± 0%   -42.61% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=10/compacted=false-10       11.094Mi ± 0%    8.368Mi ± 0%   -24.57% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=1000/compacted=false-10     12.235Mi ± 0%    8.990Mi ± 0%   -26.53% (p=0.002 n=6)

                                                           │      B/op      │      B/op       vs base                │
NodeLivenessScanStorage/num-live=2/compacted=false-10         42.83Ki ± 4%    41.87Ki ± 0%    -2.22% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=5/compacted=false-10         43.28Ki ± 3%    41.84Ki ± 0%    -3.32% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=10/compacted=false-10        37.59Ki ± 0%    41.92Ki ± 0%   +11.51% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=1000/compacted=false-10      37.67Ki ± 1%    42.66Ki ± 0%   +13.23% (p=0.002 n=6)

                                                           │   allocs/op    │   allocs/op     vs base                │
NodeLivenessScanStorage/num-live=2/compacted=false-10          105.00 ± 8%     85.00 ± 0%    -19.05% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=5/compacted=false-10          107.00 ± 5%     85.00 ± 0%    -20.56% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=10/compacted=false-10          74.00 ± 1%     85.00 ± 0%    +14.86% (p=0.002 n=6)
NodeLivenessScanStorage/num-live=1000/compacted=false-10        79.00 ± 1%     92.00 ± 1%    +16.46% (p=0.002 n=6)
```

Relates to https://github.com/cockroachlabs/support/issues/2665

Epic: none

Release note: None

113229: kv,server,roachpb: avoid error overhead for x-locality comparison r=pavelkalinnikov a=kvoli

Cross locality traffic instrumentation was added to raft, snapshots and batch requests to quantify the amount of cross region/zone traffic. Errors would be returned from `CompareWithLocality` when the region, or zone locality flags were set in an unsupported manner according to our documentation. These error allocations added overhead (cpu/mem) when hit.

Alter `CompareWithLocality` to return booleans in place of an error to reduce overhead.

Resolves: #111148
Resolves: #111142
Informs: #111561

Release note: None

Co-authored-by: sumeerbhola <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
blathers-crl bot pushed a commit that referenced this issue on Oct 30, 2023
Cross locality traffic instrumentation was added to raft, snapshots and batch requests to quantify the amount of cross region/zone traffic. Errors would be returned from `CompareWithLocality` when the region, or zone locality flags were set in an unsupported manner according to our documentation. These error allocations added overhead (cpu/mem) when hit. Alter `CompareWithLocality` to return booleans in place of an error to reduce overhead. Resolves: #111148 Resolves: #111142 Informs: #111561 Release note: None
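To illustrate the kind of change the commit message describes, here is a minimal Go sketch of replacing an error return with boolean results on a hot path. The `Locality` and `Tier` types and both `compare...` functions are hypothetical stand-ins written for this example; they are not the actual `roachpb` types and do not reproduce the real `CompareWithLocality` signature.

```go
package main

import "fmt"

// Tier and Locality are simplified, hypothetical stand-ins for a node's
// locality description (for example region=us-east, zone=a).
type Tier struct{ Key, Value string }

type Locality struct{ Tiers []Tier }

// find returns the value stored under key and whether the tier exists.
func (l Locality) find(key string) (string, bool) {
	for _, t := range l.Tiers {
		if t.Key == key {
			return t.Value, true
		}
	}
	return "", false
}

// compareWithError sketches the error-returning shape: a locality without a
// region tier causes a fresh error value to be formatted and allocated on
// every call, which is costly if this runs once per request.
func compareWithError(a, b Locality) (crossRegion bool, err error) {
	ra, okA := a.find("region")
	rb, okB := b.find("region")
	if !okA || !okB {
		return false, fmt.Errorf("region tier missing: %v vs %v", a.Tiers, b.Tiers)
	}
	return ra != rb, nil
}

// compareWithBools sketches the boolean-returning shape: the "unsupported
// configuration" case is reported through regionValid, with no allocation.
func compareWithBools(a, b Locality) (regionValid, crossRegion bool) {
	ra, okA := a.find("region")
	rb, okB := b.find("region")
	if !okA || !okB {
		return false, false
	}
	return true, ra != rb
}

func main() {
	east := Locality{Tiers: []Tier{{"region", "us-east"}}}
	unset := Locality{} // no locality flags configured

	_, err := compareWithError(east, unset)
	fmt.Println("error variant:", err)

	valid, cross := compareWithBools(east, unset)
	fmt.Println("bool variant: valid =", valid, "cross =", cross)
}
```

Callers that only need to decide whether to bump a cross-region counter can branch on the boolean directly, so a misconfigured locality no longer costs an allocation per call.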
Will be closed on #113302 merging.
Closed on #113302.
I bisected most of the regression on `BenchmarkColBatchScan` from `pkg/sql/colflow` between 23.1 and master to 3502ca2.

My guess (based on a quick glance at the code) is that this is due to having to compute `ba.Size()` and `br.Size()`, so we might not be able to do much about it, but will defer to KV.

Jira issue: CRDB-31787
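The guess above, that a per-request size computation for metrics could add measurable cost, can be explored in isolation with a small Go program. This is only a sketch under stated assumptions: the `request` type, its `size` method, and `send` are hypothetical and are not the `BatchRequest`/`BatchResponse` types or the dist sender code. It uses the standard `testing.Benchmark` helper so it runs as an ordinary program.

```go
package main

import (
	"fmt"
	"testing"
)

// request is a hypothetical stand-in for a batch of keys sent over the wire.
type request struct {
	keys [][]byte
}

// size walks every key to total its length, loosely analogous to a
// generated Size() method that must visit each field of a message.
func (r *request) size() int {
	n := 0
	for _, k := range r.keys {
		n += len(k)
	}
	return n
}

// sink keeps the compiler from optimizing the work away.
var sink int

// send models a request path that optionally records a size metric.
func send(r *request, recordMetrics bool) {
	if recordMetrics {
		sink += r.size()
	}
	sink++
}

// newRequest builds a request with n keys of keyLen bytes each.
func newRequest(n, keyLen int) *request {
	r := &request{keys: make([][]byte, n)}
	for i := range r.keys {
		r.keys[i] = make([]byte, keyLen)
	}
	return r
}

func main() {
	r := newRequest(1024, 16)
	for _, record := range []bool{false, true} {
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				send(r, record)
			}
		})
		fmt.Printf("recordMetrics=%t: %s\n", record, res)
	}
}
```

Comparing the two printed results gives a rough feel for how much a size walk adds per call in this toy setup; it says nothing about the real code path, where the referenced commits attribute the overhead to error allocations in `CompareWithLocality`.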