kvserver: evaluate cpu time as a rebalancing signal in place of qps #90590

kvoli · 2022-10-25T00:30:07Z

Is your feature request related to a problem? Please describe.
QPS is currently used as the primary metric of load on a replica and, when summed over leaseholder replicas, on a store.

However, QPS does not accurately account for resource usage when requests are not uniform in cost. In such cases, 1,000 queries per second on a store could be enough to saturate CPU, while 10k on the same hardware would not.
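As a rough illustration of the point above, the sketch below models CPU utilization as request rate times per-request CPU cost. The per-request costs (0.2 ms and 4 ms) and the 4-core store are assumed values chosen for illustration, not measurements from this issue.

```python
def cpu_utilization(qps: float, cpu_ms_per_request: float, cores: int) -> float:
    """Fraction of total CPU capacity consumed, capped at 1.0 (saturated)."""
    cpu_seconds_per_second = qps * cpu_ms_per_request / 1000.0
    return min(cpu_seconds_per_second / cores, 1.0)


# Hypothetical workloads on the same 4-core store.
cheap_requests = cpu_utilization(qps=10_000, cpu_ms_per_request=0.2, cores=4)
costly_requests = cpu_utilization(qps=1_000, cpu_ms_per_request=4.0, cores=4)

print(f"10k cheap requests/s: {cheap_requests:.0%} CPU")
print(f"1k costly requests/s: {costly_requests:.0%} CPU")
```

Under these assumptions the store serving ten times fewer queries is the saturated one, which is why QPS alone can mislead the allocator.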

This issue is to prototype and evaluate replacing QPS within the allocation system (store_rebalancer) with CPU instead.

The evaluation of the two approaches should be with respect to a standardized benchmark such as in #86661.

The questions which should be answered upon completion of this issue:

1. Does using CPU perform better than QPS in balancing CPU usage within a cluster?
2. Does using CPU perform better than QPS in balancing disk resources (primarily write bandwidth) within a cluster?
3. To what extent does using CPU prevent admission control overload (io admission.io.overload and queuing latency)?
4. To what extent does using CPU disperse admission control overload (same as above)?

Jira issue: CRDB-20851

Epic CRDB-20845


kvoli · 2023-01-30T15:20:21Z

Summary

Using CPU in place of QPS for allocation rebalancing showed significant improvements in CPU balance for skewed workloads, and no improvement beyond the margin of error for uniform workloads. Likewise, for disk write bandwidth there were moderate improvements for skewed workloads and no improvement for uniform workloads.
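A toy sketch of why the choice of signal matters for skewed workloads. It places replicas greedily (heaviest first, onto the least-loaded store) by either QPS or CPU and compares the resulting CPU spread. The replica loads and the greedy placement are assumptions for illustration only; this is not the store_rebalancer algorithm.

```python
from typing import Dict, List, Tuple

# Hypothetical replicas: name -> (qps, CPU cores consumed).
REPLICAS: Dict[str, Tuple[float, float]] = {
    "a": (1000.0, 0.2),
    "b": (100.0, 1.0),
    "c": (100.0, 1.0),
    "d": (800.0, 0.2),
}


def place(signal_index: int, num_stores: int = 2) -> List[List[str]]:
    """Greedy placement: heaviest replica first, onto the least-loaded store.

    signal_index selects the balancing signal: 0 for QPS, 1 for CPU.
    """
    stores: List[List[str]] = [[] for _ in range(num_stores)]
    load = [0.0] * num_stores
    ordered = sorted(REPLICAS, key=lambda r: REPLICAS[r][signal_index], reverse=True)
    for name in ordered:
        target = min(range(num_stores), key=lambda i: load[i])
        stores[target].append(name)
        load[target] += REPLICAS[name][signal_index]
    return stores


def cpu_spread(stores: List[List[str]]) -> float:
    """Max-min CPU (in cores) across stores for a given placement."""
    per_store = [sum(REPLICAS[r][1] for r in s) for s in stores]
    return max(per_store) - min(per_store)


by_qps = place(signal_index=0)
by_cpu = place(signal_index=1)
print("balanced on QPS:", by_qps, "cpu spread:", round(cpu_spread(by_qps), 2))
print("balanced on CPU:", by_cpu, "cpu spread:", round(cpu_spread(by_cpu), 2))
```

In this example balancing on QPS equalizes QPS across the two stores but leaves one store with nearly all the CPU load, while balancing on CPU equalizes CPU at the cost of a modest QPS imbalance.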

Results

TPCE


TPCC


| Rebalancing objective | tpccbench/nodes=9/cpu=4/multi-region | tpccbench/nodes=6/cpu=16/multi-az |
|---|---|---|
| QPS | 2007 | 5048 |
| CPU | 2037 | 4984 |

Allocbench


kv/r=0/access=skew
master
    median cost(gb):05.81 cpu(%):14.97 write(%):37.83
    range  cost(gb):05.20 cpu(%):10.48 write(%):20.74
    stddev cost(gb):01.87 cpu(%):03.98 write(%):07.01
cpu rebalancing
    median cost(gb):08.76 cpu(%):14.42 write(%):36.61
    range  cost(gb):07.78 cpu(%):04.88 write(%):13.04
    stddev cost(gb):02.66 cpu(%):01.85 write(%):04.80

kv/r=0/ops=skew
master
    median cost(gb):06.23 cpu(%):26.05 write(%):57.33
    range  cost(gb):07.96 cpu(%):16.96 write(%):19.67
    stddev cost(gb):02.92 cpu(%):05.83 write(%):08.20
cpu rebalancing
    median cost(gb):04.28 cpu(%):11.45 write(%):31.28
    range  cost(gb):06.71 cpu(%):07.03 write(%):16.47
    stddev cost(gb):02.25 cpu(%):02.51 write(%):06.68


kv/r=50/ops=skew
master
    median cost(gb):04.36 cpu(%):22.84 write(%):48.09
    range  cost(gb):03.33 cpu(%):08.13 write(%):15.22
    stddev cost(gb):01.12 cpu(%):02.71 write(%):05.51
cpu rebalancing
    median cost(gb):04.64 cpu(%):13.49 write(%):43.05
    range  cost(gb):02.87 cpu(%):03.73 write(%):24.16
    stddev cost(gb):01.07 cpu(%):01.26 write(%):08.58

kv/r=95/access=skew
master
    median cost(gb):00.00 cpu(%):09.51 write(%):01.24
    range  cost(gb):00.00 cpu(%):04.70 write(%):00.72
    stddev cost(gb):00.00 cpu(%):01.74 write(%):00.27
cpu rebalancing
    median cost(gb):00.00 cpu(%):05.66 write(%):01.31
    range  cost(gb):00.00 cpu(%):04.05 write(%):00.77
    stddev cost(gb):00.00 cpu(%):01.56 write(%):00.26


kv/r=95/ops=skew
master
    median cost(gb):0.00 cpu(%):47.29 write(%):00.93
    range  cost(gb):0.24 cpu(%):11.67 write(%):00.51
    stddev cost(gb):0.09 cpu(%):04.30 write(%):00.17
cpu rebalancing
    median cost(gb):0.00 cpu(%):08.16 write(%):01.30
    range  cost(gb):0.03 cpu(%):12.78 write(%):00.50
    stddev cost(gb):0.01 cpu(%):04.59 write(%):00.20

Evaluation Questions

1. Does using CPU perform better than QPS in balancing CPU usage within a cluster?

Using CPU does perform better at balancing CPU usage within a cluster when the
CPU cost per operation is not uniform. The results from TPCC bench show no
meaningful difference in performance between CPU and QPS balancing, whereas
allocbench with skewed read operations (r=95/ops=skew) shows a significant
difference (8% max-min CPU with CPU balancing vs 47% with QPS balancing).
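For readers unfamiliar with the max-min figures quoted here, the sketch below shows one way such a summary could be computed: take the max-min spread of per-store utilization in each run, then report the median, range, and stddev of that spread across runs. The per-store numbers are made up, and the use of the sample standard deviation is an assumption; the benchmark's exact aggregation is not stated in this issue.

```python
import statistics
from typing import Dict, List


def max_min_spread(per_store_pct: List[float]) -> float:
    """Difference between the most and least utilized store in one run."""
    return max(per_store_pct) - min(per_store_pct)


def summarize(spreads: List[float]) -> Dict[str, float]:
    """Aggregate the per-run spreads across runs."""
    return {
        "median": statistics.median(spreads),
        "range": max(spreads) - min(spreads),
        "stddev": statistics.stdev(spreads),  # sample stddev (assumed)
    }


# Hypothetical CPU utilization (%) per store, for five runs of one benchmark.
runs = [
    [50.0, 45.0, 40.0],
    [52.0, 46.0, 40.0],
    [48.0, 44.0, 40.0],
    [54.0, 47.0, 40.0],
    [46.0, 43.0, 40.0],
]

spreads = [max_min_spread(run) for run in runs]
summary = summarize(spreads)
print(spreads)
print(summary)
```

A lower median spread means the stores were more evenly loaded; a lower stddev means the result was more repeatable across runs.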

When there is significant load that is not attributable to replicas, both
perform equally poorly at balancing CPU usage. An example of this is the TPCE
import, which runs on a single node. In this example, there are very few
replicas/leases to rebalance away from this store, yet it has the highest CPU.
There are no actions available to the allocator here which could meaningfully
impact the CPU balance.
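The limit described above can be sketched numerically. Each store's CPU is split into a part attributable to replicas (movable) and a part that is not (fixed), and the hot store sheds all of its replica load evenly onto its peers. All utilization figures are hypothetical and are not taken from the TPCE run.

```python
from typing import List, Tuple

Store = Tuple[float, float]  # (replica-attributable CPU %, other CPU %)


def spread(cpu_by_store: List[float]) -> float:
    """Max-min CPU across stores."""
    return max(cpu_by_store) - min(cpu_by_store)


def shed_all_replica_load(hot: Store, peers: List[Store]) -> List[float]:
    """CPU per store after the hot store moves all replica load to its peers."""
    hot_replica, hot_other = hot
    share = hot_replica / len(peers)
    return [hot_other] + [replica + other + share for replica, other in peers]


# Hypothetical: the hot store's load is dominated by non-replica work.
hot = (5.0, 85.0)
peers = [(20.0, 5.0), (20.0, 5.0)]

before = spread([sum(hot)] + [sum(p) for p in peers])
after = spread(shed_all_replica_load(hot, peers))
print(f"max-min CPU before: {before:.1f}%, best case after: {after:.1f}%")
```

Even in the best case, where every replica leaves the hot store, the spread barely moves because the fixed portion of the load stays where it is. This holds regardless of whether the allocator balances on QPS or CPU.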

See the profile below of a hot node during the TPCE bulk load phase. Note that
only the highlighted parts are attributable to a replica.

CPU Profile

2. Does using CPU perform better than QPS in balancing disk resources (primarily write bandwidth) within a cluster?

Using CPU does perform marginally better than QPS at balancing disk write
bandwidth among stores within a cluster. In a workload with skewed accesses and
only writes, kv/r=0/ops=skew, CPU balancing is better at balancing disk write
bandwidth (31% vs 57% max-min disk write bandwidth utilization for CPU and QPS
respectively). However, both are above a reasonable target of 30% max-min and
perform weakly in absolute terms.

When there is a mix of read and write operations (kv/r=50/ops=skew), both again
perform poorly, with CPU marginally stronger (43% vs 48% max-min disk write
bandwidth utilization). The stddev is 8.6% for CPU and 5.5% for QPS, meaning
that the difference may not be significant (n=5).
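A back-of-the-envelope check of that caution, using the kv/r=50/ops=skew write figures reported above. This computes a Welch-style t statistic from summary statistics. Note the assumptions: the reported medians stand in for means, the stddevs are treated as sample standard deviations, and n=5 per configuration, so the result is only a rough guide.

```python
import math


def welch_t(mean_a: float, sd_a: float, n_a: int,
            mean_b: float, sd_b: float, n_b: int) -> float:
    """Welch's t statistic computed from summary statistics."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# kv/r=50/ops=skew, write(%): master (QPS) vs cpu rebalancing.
t_stat = welch_t(mean_a=48.09, sd_a=5.51, n_a=5,
                 mean_b=43.05, sd_b=8.58, n_b=5)
print(f"t = {t_stat:.2f}")
```

The statistic comes out at roughly 1.1, well short of the value of about 2 or more that is usually needed to call a difference significant at this sample size, which is consistent with treating the 43% vs 48% gap as inconclusive.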

3. To what extent does using CPU prevent admission control overload?

In cases where there is CPU resource saturation due to foreground operations,
such as the r=95/ops=skew benchmark, CPU balancing does prevent AC
overload. In cases where CPU resource saturation is due to background
operations that are not attributable to a replica, e.g. the TPCE import step,
neither QPS nor CPU balancing prevents AC overload.

In cases where there is write resource saturation, CPU balancing appears
mildly stronger at preventing AC overload.

4. To what extent does using CPU disperse admission control overload (same as above)?

This is much the same as above. In the CPU case, when the CPU contributing to overload is attributed to a replica, it is recognized and acted upon. In the inverted LSM case (IO overload), read operations tended to consume additional CPU, which is again attributed and acted upon, dispersing IO overload.

This commit switches the default load based rebalancing objective from `qps` to `cpu`. A performance comparison can be found on cockroachdb#90590.

resolves: cockroachdb#90582

Release note (ops change): CPU balancing is enabled as the default load based rebalancing objective. This can be reverted by setting `kv.allocator.load_based_rebalancing.objective` to `qps`.

97424: kvserver: enable cpu balancing by default r=nvanbenschoten a=kvoli

Co-authored-by: Austen McClernon

kvoli added the C-enhancement and A-kv-distribution labels Oct 25, 2022

kvoli added this to the 23.1 milestone Oct 25, 2022

blathers-crl bot added the T-kv KV Team label Oct 25, 2022

kvoli mentioned this issue Oct 25, 2022

kvserver: allocator cpu balancing for overload protection #90582

Closed

13 tasks

exalate-issue-sync bot assigned kvoli Nov 7, 2022

kvoli closed this as completed Jan 30, 2023

kvoli mentioned this issue Feb 21, 2023

kvserver: enable cpu balancing by default #97424

Merged
