kvserver: introduce cpu rebalancing #96127
Conversation
Reviewed 1 of 1 files at r1, 3 of 3 files at r2, 20 of 20 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist and @kvoli)
-- commits
line 33 at r3:
", however, it will target"
-- commits
line 51 at r3:
Release note (ops change):
or this will trip up automation.
pkg/kv/kvserver/rebalance_objective.go
line 32 at r3 (raw file):
// qps: which is the original default setting and looks at the number of batch // // requests, on a range and store.
nit: something's off with this indentation.
pkg/kv/kvserver/rebalance_objective.go
line 37 at r3 (raw file):
type LBRebalancingObjective int64 const (
This feels like the right place to add some discussion about the details of each of these approaches. For instance, we could discuss weighted batch requests for QPS-based rebalancing. We could also discuss how CPU is attributed to a given replica and how we decided between the two alternative approaches presented in https://docs.google.com/document/d/1_FJvTV0y3q0nOC-1xjfoyPZfo1yEV2oMriQek7LT7dg/edit#heading=h.g5iv0fx0qnlf. Also, you could mention the details of how multi-store rebalancing works in each case.
Let's not let all of your hard-earned learning be lost to the sands of time.
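(For readers following the thread, a minimal sketch of the enum under discussion. The constant names and comment text are assumptions inferred from the setting's `qps`/`cpu` values, not quoted from the diff.)

```go
// A sketch of the rebalancing objective enum (names assumed, not from the diff).
type LBRebalancingObjective int64

const (
	// LBRebalancingQueries balances the number of batch requests (QPS)
	// observed on each range and store.
	LBRebalancingQueries LBRebalancingObjective = iota
	// LBRebalancingCPU balances the sum of replica CPU time observed on
	// each store.
	LBRebalancingCPU
)
```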
pkg/kv/kvserver/rebalance_objective.go
line 68 at r3 (raw file):
// uniquely to a load.Dimension. However, in the future it is forseeable that // LBRebalancingObjective could be a value that encompassese many different // dimension within a single objective e.g. bytes written, cpu usage and
"dimensions"
pkg/kv/kvserver/rebalance_objective.go
line 85 at r3 (raw file):
// of the cluster. It is possible that the cluster setting objective may not be // the objective returned, when the cluster environment is unsupported or mixed // mixed versions exist.
"mixed mixed"
pkg/kv/kvserver/rebalance_objective.go
line 86 at r3 (raw file):
// the objective returned, when the cluster environment is unsupported or mixed // mixed versions exist. type RebalanceObjectiveProvider interface {
Does this type need to be exported? Same question for `ResolveLBRebalancingObjective`.
pkg/kv/kvserver/rebalance_objective.go
line 92 at r3 (raw file):
// RebalanceObjectiveManager implements the RebalanceObjectiveProvider // interface and registering a callback at creation time, that will be called
"registers"
pkg/kv/kvserver/rebalance_objective.go
line 165 at r3 (raw file):
// in such cases we cannot balance another objective since the data may not // exist. Fall back to QPS balancing. if !st.Version.IsActive(ctx, clusterversion.V23_1AllocatorCPUBalancing) {
Is there a risk of some nodes in the system recognizing that the cluster has been fully upgraded to v23.1 and others not? The propagation of this information is asynchronous and there's no bound on the delay. Do you have a sense of how bad the thrashing could be if two nodes in the system are using different objectives?
pkg/kv/kvserver/rebalance_objective.go
line 170 at r3 (raw file):
} // When the cpu timekeeping utility is unsupported on this aarch, the cpu // usage cannot be gathered. Fall back to QPS balancing.
What if another node in the system does not support grunning? Will that lead to different nodes using different objectives? We may not need to handle that case well, but we should make sure that we don't handle it catastrophically.
pkg/kv/kvserver/store.go
line 1225 at r3 (raw file):
s.rebalanceObjManager = newRebalanceObjectiveManager(ctx, s.cfg.Settings, func(ctx context.Context, obj LBRebalancingObjective) { s.VisitReplicas(func(r *Replica) bool {
nit: can we name the return arg so that it's clear what it signifies?
pkg/kv/kvserver/store_rebalancer.go
line 268 at r3 (raw file):
hottestRanges := sr.replicaRankings.TopLoad() options := sr.scorerOptions(ctx) rctx := sr.NewRebalanceContext(ctx, options, hottestRanges, sr.RebalanceMode())
`mode`? `RebalanceMode` and `RebalanceObjective` are both volatile. They can change each time they are checked. Do we need to be any more deliberate than we already are about ensuring that a single pass through the allocator only acts on a single mode/objective?
pkg/kv/kvserver/allocator/range_usage_info.go
line 45 at r2 (raw file):
} // TransferImpact returns the impact of transferring the lease for the range,
"impact" is ambiguous. Is it implied here that the outgoing leaseholder will see a reduction in its load by the returned amount and the incoming leaseholder will see an increase in its load by the returned amount? If so, consider spelling that out in the comment.
pkg/kv/kvserver/allocator/allocatorimpl/threshold.go
line 129 at r3 (raw file):
// SetAllDims returns a load vector with all dimensions filled in with the // value given. func SetAllDims(v float64) load.Load {
nit: `WithAllDims` might be a bit more reflective of what the function does. I would expect a function starting with `Set` to accept an object to set the value on. Feel free to ignore.
pkg/kv/kvserver/allocator/storepool/store_pool.go
line 645 at r3 (raw file):
if toDetail := sp.GetStoreDetailLocked(old.StoreID); toDetail != nil { toDetail.Desc.Capacity.RangeCount-- toDetail.Desc.Capacity.CPUPerSecond -= rangeUsageInfo.RaftCPUNanosPerSecond
Should this saturate at 0 instead of going negative as well?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist and @nvanbenschoten)
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
", however, it will target"
Updated.
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Release note (ops change):
or this will trip up automation.
Updated
pkg/kv/kvserver/rebalance_objective.go
line 32 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: something's off with this indentation.
Updated
pkg/kv/kvserver/rebalance_objective.go
line 37 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This feels like the right place to add some discussion about the details of each of these approaches. For instance, we could discuss weighted batch requests for QPS-based rebalancing. We could also discuss how CPU is attributed to a given replica and how we decided between the two alternative approaches presented in https://docs.google.com/document/d/1_FJvTV0y3q0nOC-1xjfoyPZfo1yEV2oMriQek7LT7dg/edit#heading=h.g5iv0fx0qnlf. Also, you could mention the details of how multi-store rebalancing works in each case.
Let's not let all of your hard-earned learning be lost to the sands of time.
Updated.
pkg/kv/kvserver/rebalance_objective.go
line 68 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"dimensions"
Updated
pkg/kv/kvserver/rebalance_objective.go
line 85 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"mixed mixed"
Updated
pkg/kv/kvserver/rebalance_objective.go
line 86 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Does this type need to be exported? Same question for `ResolveLBRebalancingObjective`.
Exported for use in the simulator shortly. It is passed as an argument to create a new store rebalancer.
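(A minimal sketch of the exported surface in question; the single-method shape is an assumption based on the `Objective` signature quoted later in this thread, not the literal diff.)

```go
// RebalanceObjectiveProvider mediates access to the current rebalance
// objective (sketch; exact method set assumed).
type RebalanceObjectiveProvider interface {
	// Objective returns the current rebalance objective.
	Objective(ctx context.Context) LBRebalancingObjective
}
```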
pkg/kv/kvserver/rebalance_objective.go
line 92 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"registers"
Updated.
pkg/kv/kvserver/rebalance_objective.go
line 165 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is there a risk of some nodes in the system recognizing that the cluster has been fully upgraded to v23.1 and others not? The propagation of this information is asynchronous and there's no bound on the delay. Do you have a sense of how bad the thrashing could be if two nodes in the system are using different objectives?
There's a risk: thrashing could occur between the new and old nodes if the different versions lasted >= 2x the store rebalancer interval (1 min default).
Thrashing would only occur, however, if the values for the objectives are fairly different - like `allocbench/nodes=7/cpu=8/kv/r=95/ops=skew`. Even then, thrashing isn't guaranteed, but it is likely.
Are there alternatives to this approach? I don't consider this a massive risk, but I'm curious if there are better approaches that eliminate it.
pkg/kv/kvserver/rebalance_objective.go
line 170 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
What if another node in the system does not support grunning? Will that lead to different nodes using different objectives? We may not need to handle that case well, but we should make sure that we don't handle it catastrophically.
It will lead to nodes using different objectives. However, the impact is much worse: the stores on nodes that don't support grunning will gossip `CPU=0` and every other store will move load onto them.
I've reworked a few things to now gossip -1 CPU when the store is on a node that doesn't support cpu attribution. `ResolveRebalancingObjective` now checks the gossiped store descriptors and will revert to QPS if any of them have `CPUPerSecond` set to -1. I want to avoid adding another field to gossip if possible. Overloading the value itself does complicate things, but it seems logical given that the value would otherwise be 0 and a negative value should never occur otherwise.
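(To make the fallback chain concrete, a sketch under the assumptions named in this thread: the version gate is quoted from the diff, `grunning.Supported()` is referenced later in the review, and the -1 sentinel is described above; the setting accessor name and exact signature are illustrative.)

```go
// Sketch of the objective fallback chain (signature and setting accessor
// are assumptions, not the literal diff).
func ResolveLBRebalancingObjective(
	ctx context.Context, st *cluster.Settings, descs []roachpb.StoreDescriptor,
) LBRebalancingObjective {
	// Pre-23.1 stores don't gossip CPU, so the data to balance on may not
	// exist; fall back to QPS.
	if !st.Version.IsActive(ctx, clusterversion.V23_1AllocatorCPUBalancing) {
		return LBRebalancingQueries
	}
	// This node can't measure replica CPU (no grunning support).
	if !grunning.Supported() {
		return LBRebalancingQueries
	}
	// Some store gossiped the -1 sentinel: its node can't measure CPU, so
	// revert to QPS cluster-wide rather than rebalancing load onto it.
	for _, desc := range descs {
		if desc.Capacity.CPUPerSecond < 0 {
			return LBRebalancingQueries
		}
	}
	return LBRebalancingObjective(LoadBasedRebalancingObjective.Get(&st.SV))
}
```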
pkg/kv/kvserver/store.go
line 1225 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: can we name the return arg so that it's clear what it signifies?
Updated.
pkg/kv/kvserver/store_rebalancer.go
line 268 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
`mode`? `RebalanceMode` and `RebalanceObjective` are both volatile. They can change each time they are checked. Do we need to be any more deliberate than we already are about ensuring that a single pass through the allocator only acts on a single mode/objective?
Updated this to only instantiate in the outer process loop, which feeds the argument to the scorer options and acts as the source of truth.
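(A sketch of the resulting shape: resolve the volatile mode/objective once in the outer loop and thread them through. The method and helper names here are illustrative, not the literal diff.)

```go
// Sketch: a single pass reads the volatile settings exactly once, so a
// concurrent settings change cannot mix objectives within the pass.
func (sr *StoreRebalancer) rebalanceStorePass(ctx context.Context) {
	mode := sr.RebalanceMode()           // read once per pass
	objective := sr.RebalanceObjective() // read once per pass
	options := sr.scorerOptionsForObjective(objective)
	hottestRanges := sr.replicaRankings.TopLoad()
	rctx := sr.NewRebalanceContext(ctx, options, hottestRanges, mode)
	// ... the rest of the pass consults rctx only.
	_ = rctx
}
```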
pkg/kv/kvserver/allocator/range_usage_info.go
line 45 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"impact" is ambiguous. Is it implied here that the outgoing leaseholder will see a reduction in its load by the returned amount and the incoming leaseholder will see an increase in its load by the returned amount? If so, consider spelling that out in the comment.
Updated to spell this out.
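(A sketch of the spelled-out contract; per the commit message, `TransferImpact` is currently identical to `Load`, which is assumed here to be the existing accessor on `RangeUsageInfo`.)

```go
// TransferImpact returns the load that moves with the lease: the outgoing
// leaseholder's store sheds this amount and the store receiving the lease
// gains the same amount (sketch of the clarified comment).
func (r RangeUsageInfo) TransferImpact() load.Load {
	return r.Load()
}
```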
pkg/kv/kvserver/allocator/allocatorimpl/threshold.go
line 129 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: `WithAllDims` might be a bit more reflective of what the function does. I would expect a function starting with `Set` to accept an object to set the value on. Feel free to ignore.
I agree, updated.
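(A sketch of the renamed helper, assuming `load.Vector` is the concrete `load.Load` implementation with one float64 per dimension.)

```go
// WithAllDims returns a load vector with every dimension set to v (sketch).
func WithAllDims(v float64) load.Load {
	var vec load.Vector
	for i := range vec {
		vec[i] = v
	}
	return vec
}
```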
pkg/kv/kvserver/allocator/storepool/store_pool.go
line 645 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should this saturate at 0 instead of going negative as well?
This should saturate at 0 too. Updated.
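(A standalone sketch of the saturating decrement agreed on here; field and value types are illustrative.)

```go
package main

import "fmt"

// saturatingSub decrements cur by delta, clamping at zero so that estimate
// drift can never drive a gossiped capacity counter negative.
func saturatingSub(cur, delta float64) float64 {
	if delta >= cur {
		return 0
	}
	return cur - delta
}

func main() {
	fmt.Println(saturatingSub(100, 30))  // 70
	fmt.Println(saturatingSub(100, 130)) // 0, not -30
}
```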
Reviewed 1 of 3 files at r5, 20 of 20 files at r6, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist and @kvoli)
pkg/kv/kvserver/rebalance_objective.go
line 165 at r3 (raw file):
Previously, kvoli (Austen) wrote…
There's a risk: thrashing could occur between the new and old nodes if the different versions lasted >= 2x the store rebalancer interval (1 min default). Thrashing would only occur, however, if the values for the objectives are fairly different - like `allocbench/nodes=7/cpu=8/kv/r=95/ops=skew`. Even then, thrashing isn't guaranteed, but it is likely. Are there alternatives to this approach? I don't consider this a massive risk, but I'm curious if there are better approaches that eliminate it.
One approach that could be workable here is to include each node's rebalancing objective in its gossiped store descriptor or store capacity. Nodes would then only rebalance to other nodes that are still using the same rebalance objective. Without any synchronous coordination to verify the objective before rebalancing (think causality passing), this is still racy, but I think it would have less chance of thrashing. Once either node noticed that the other was using a different objective than it, it would stop rebalancing and wait for the objectives to converge. Or, if they never converged (see below with the grunning question), this would prevent rebalancing between incompatible nodes entirely.
EDIT: This is similar to what you're doing now with `CPUPerSecond == -1`. What you're doing there makes sense.
pkg/kv/kvserver/rebalance_objective.go
line 52 at r6 (raw file):
// impact of an action by using the QPS of the leaseholder replica invovled // e.g. the impact of lease transfers on the stores invovled is // +leaseholder replica QPS on the store that receives the lease and
The inclusion of the "replica" here but not in the next line looks intentional. Is it?
pkg/kv/kvserver/rebalance_objective.go
line 64 at r6 (raw file):
// LBRebalancingCPU is a rebalance objective that aims balances the store // CPU usage. The store CPU usage is calculated as the sum of replica's cpu
replicas'
pkg/kv/kvserver/rebalance_objective.go
line 69 at r6 (raw file):
the behavior doesn't change
This could mean a few different things. Could you call out that we try to balance CPU across stores on the same node?
pkg/kv/kvserver/rebalance_objective.go
line 77 at r6 (raw file):
and balance each stores' process usage
Consider adding something like ", using the measured replica cpu usage only to determine which replica to rebalance, but not when to rebalance or who to rebalance to". Or something along those lines to better contrast the two approaches.
pkg/kv/kvserver/rebalance_objective.go
line 80 at r6 (raw file):
// process usage. This approach benefits from observing the "true" cpu // usage, rather than just the sum of replica's usage. However, unlike the // implemented approach, the estmated impact of actions was less reliable
"estimated"
pkg/kv/kvserver/store.go
line 2612 at r6 (raw file):
// balancing objective is permitted. If this is not updated, the cpu per // second will be zero and other stores will likely begin rebalancing // towards this store as it will appear underfull.
This is probably not worth it, but if there is a concern about forgetting that -1 is a sentinel value in some code paths and rebalancing towards a node with `!grunning.Supported()`, we could also make `math.MaxInt64` the sentinel value.
pkg/kv/kvserver/allocator/storepool/store_pool.go
line 691 at r6 (raw file):
// When CPU attribution is unsupported, the store will set the // CPUPerSecond of its store capacity to be -1. if fromDetail.Desc.Capacity.CPUPerSecond >= 0 {
These all feel a bit error-prone, but I don't have a good suggestion for how to make it harder to forget to check this.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist and @nvanbenschoten)
pkg/kv/kvserver/rebalance_objective.go
line 52 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
The inclusion of the "replica" here but not in the next line looks intentional. Is it?
Somewhat intentional but now that I think about it, it doesn't seem right to make any distinction. Updated to be leaseholder replica for both.
pkg/kv/kvserver/rebalance_objective.go
line 64 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
replicas'
updated
pkg/kv/kvserver/rebalance_objective.go
line 69 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
the behavior doesn't change
This could mean a few different things. Could you call out that we try to balance CPU across stores on the same node?
Updated with an example.
pkg/kv/kvserver/rebalance_objective.go
line 77 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
and balance each stores' process usage
Consider adding something like ", using the measured replica cpu usage only to determine which replica to rebalance, but not when to rebalance or who to rebalance to". Or something along those lines to better contrast the two approaches.
Added
pkg/kv/kvserver/rebalance_objective.go
line 80 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"estimated"
Updated
pkg/kv/kvserver/store.go
line 2612 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This is probably not worth it, but if there is a concern about forgetting that -1 is a sentinel value in some code paths and rebalancing towards a node with `!grunning.Supported()`, we could also make `math.MaxInt64` the sentinel value.
I prefer the negative value because it feels like a pattern we could follow if we bake it into the store pool updates - see below comment.
pkg/kv/kvserver/allocator/storepool/store_pool.go
line 691 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
These all feel a bit error-prone, but I don't have a good suggestion for how to make it harder to forget to check this.
There's probably some work that could go into the storepool either this release or early next release to make this easier to handle. Something akin to "virtualizing" the store descriptors and keeping the underlying gossiped descriptor as a physical value of sorts.
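(One illustrative way to make the sentinel check harder to forget is a small guard helper that callers consult instead of comparing against -1 directly; this is not from the diff.)

```go
// hasCPUAttribution reports whether a gossiped capacity carries a real CPU
// measurement; -1 is the sentinel gossiped by stores on nodes without CPU
// attribution (sketch).
func hasCPUAttribution(c roachpb.StoreCapacity) bool {
	return c.CPUPerSecond >= 0
}
```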
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist and @nvanbenschoten)
pkg/kv/kvserver/rebalance_objective.go
line 186 at r9 (raw file):
// Objective returns the current rebalance objective. func (rom *RebalanceObjectiveManager) Objective(ctx context.Context) LBRebalancingObjective { rom.maybeUpdateRebalanceObjective(ctx)
I added this `maybeUpdateRebalanceObjective` so that the storepool would be checked for CPU=-1, which makes the check more expensive. That isn't an issue for CPU balancing, since the objective is only checked when calling `Capacity` and every time the store rebalancer wakes up. It is an issue for CPU load splitting, which checks the `RebalanceObjective` much more frequently, including on every request to the replica.
I'm going to change the design of how we get the information. I'll ping when I've updated.
Reviewed 1 of 3 files at r11, 22 of 22 files at r12, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andrewbaptist and @kvoli)
pkg/kv/kvserver/rebalance_objective.go
line 69 at r6 (raw file):
Previously, kvoli (Austen) wrote…
Updated with an example.
Thanks, the added example is helpful.
pkg/kv/kvserver/rebalance_objective.go
line 202 at r12 (raw file):
capacityChangeNotifier.SetOnCapacityChange( func(storeID roachpb.StoreID, old, cur roachpb.StoreCapacity) { if old.CPUPerSecond < 0 && cur.CPUPerSecond >= 0 ||
This might be easier to understand if written as if (old.CPUPerSecond < 0) != (new.CPUPerSecond < 0) {
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andrewbaptist and @nvanbenschoten)
pkg/kv/kvserver/rebalance_objective.go
line 202 at r12 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This might be easier to understand if written as `if (old.CPUPerSecond < 0) != (new.CPUPerSecond < 0) {`
That is easier to understand than what I had, updated to this.
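(The adopted condition, written out as a helper for clarity; the helper name is illustrative.)

```go
// cpuSentinelFlipped reports whether a store's CPUPerSecond crossed the -1
// sentinel boundary in either direction, which is the only capacity change
// that can alter the resolved objective.
func cpuSentinelFlipped(old, cur roachpb.StoreCapacity) bool {
	return (old.CPUPerSecond < 0) != (cur.CPUPerSecond < 0)
}
```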
Previously `Queries` was hardcoded as the dimension to use when creating test ranges for use in store rebalancer tests. This patch enables passing in any `dimension`.

Release note: None

Previously, when estimating the impact a lease transfer would have, we would not differentiate between rebalance/transfer. This commit adds a utility method `TransferImpact` to `RangeUsageInfo` which is now used when making estimations about lease transfers. Currently, this is identical to `Load`, which was previously used instead for transfers.

Release note: None

This patch allows the store rebalancer to use CPU in place of QPS when balancing load on a cluster. This patch adds `cpu` as an option with the cluster setting `kv.allocator.load_based_rebalancing.objective`. When set to `cpu` rather than `qps`, the store rebalancer will perform a mostly identical function, however, it will target balancing the sum of all replicas' cpu time on each store, rather than qps. The default remains as `qps` here. Similar to QPS, the rebalance threshold can be set to control the range above and below the mean at which store CPU is considered imbalanced, either overfull or underfull respectively: `kv.allocator.cpu_rebalance_threshold`: 0.1. In order to manage with mixed versions during upgrade and some architectures not supporting the cpu sampling method, a rebalance objective manager is introduced in `rebalance_objective.go`. The manager mediates access to the rebalance objective and overwrites it in cases where the objective set in the cluster setting cannot be supported.

resolves: cockroachdb#95380

Release note (ops change): Add option to balance cpu time (cpu) instead of queries per second (qps) among stores in a cluster. This is done by setting `kv.allocator.load_based_rebalancing.objective='cpu'`. `kv.allocator.cpu_rebalance_threshold` is also added, similar to `kv.allocator.qps_rebalance_threshold`, to control the target range for store cpu above and below the cluster mean.
TYFTR
bors r=nvanbenschoten
Build succeeded:
96128: kvserver: introduce cpu load based splits r=nvanbenschoten a=kvoli

This PR adds the ability to perform load based splitting with replica cpu usage rather than queries per second. Load based splitting now will use either cpu or qps for deciding split points, depending on the cluster setting `kv.allocator.load_based_rebalancing.objective`. When set to `qps`, qps is used in deciding split points and when splitting should occur; similarly, `cpu` means that request cpu against the leaseholder replica is used to decide split points. The split threshold when using `cpu` is the cluster setting `kv.range_split.load_cpu_threshold`, which defaults to `250ms` of cpu time per second, i.e. a replica consistently using 1/4 of a machine's processor. The merge queue uses the load based splitter to make decisions on whether to merge two adjacent ranges due to low load. This commit also updates the merge queue to be consistent with the load based splitter signal. When switching between `qps` and `cpu`, the load based splitter for every replica is reset to avoid spurious results.

![image](https://user-images.githubusercontent.com/39606633/215580650-b12ff509-5cf5-4ffa-880d-8387e2ef0afa.png)

resolves: #95377
resolves: #83519
depends on: #96127

Release note (ops change): Load based splitter now supports using request cpu usage to split ranges. This is introduced with the previous cluster setting `kv.allocator.load_based_rebalancing.objective`, which when set to `cpu`, will use request cpu usage. The threshold above which CPU usage of a range is considered for splitting is defined in the cluster setting `kv.range_split.load_cpu_threshold`, which has a default value of `250ms`.

96129: bazel: is_dev_linux config setting should be false when cross_flag is true r=rickystewart a=srosenberg

When cross-compiling on a gce worker and invoking bazel directly, the build fails with "cannot find -lresolv_wrapper", owing to the fact that is_dev_linux is not mutually exclusive of cross_linux. That is, when both configs are active, "-lresolv_wrapper" is passed to clinkopts in pkg/ccl/gssapiccl/BUILD.bazel; the cross-compiler doesn't have this lib, nor does it need it. Note, when using ./dev instead of bazel, the above issue is side-stepped because the dev wrapper invokes bazel inside docker, which ignores ~/.bazelrc. The workaround is to make is_dev_linux false when cross_flag is true.

Epic: none
Release note: None

96431: ui: databases shows partial results for size limit error r=j82w a=j82w

The databases page displays partial results instead of just showing an error message. Sorting is disabled if there are more than 2 pages of results, which is currently configured to 40 dbs. This still allows most users to use the sort functionality, but prevents large customers from breaking when it would need to do a network call per database. The database details are now loaded on demand for the first page only. Previously a network call was done for all databases, which would result in 2k network calls. It now only loads the page of details the user is looking at.

part of: #94333
https://www.loom.com/share/31b213b2f1764d0f9868bd967b9388b8

Release note: none

Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Stan Rosenberg <[email protected]>
Co-authored-by: j82w <[email protected]>
In cockroachdb#96127 we added the option to load balance replica CPU instead of QPS across stores in a cluster. It is desirable to view the signal being controlled for rebalancing in the replication dashboard, similar to QPS. This commit adds the `rebalancing.cpunanospersecond` metric to the replication metrics dashboard. Resolves: cockroachdb#98109 Release note (ui change): `rebalancing.cpunanospersecond` is now included in the replication metrics dashboard.
97717: multitenant: add AdminUnsplitRequest capability r=knz a=ecwall

Fixes #97716

1) Add a new `tenantcapabilitiespb.CanAdminUnsplit` capability.
2) Check capability in `Authorizer.HasCapabilityForBatch`.
3) Remove `ExecutorConfig.RequireSystemTenant` call from `execFactory.ConstructAlterTableUnsplit`, `execFactory.ConstructAlterTableUnsplitAll`.

Release note: None

98250: kvserver: add minimum cpu lb split threshold r=andrewbaptist a=kvoli

Previously, `kv.range_split.load_cpu_threshold` had no minimum setting value. It is undesirable to allow users to set this setting too low, as excessive splitting may occur. `kv.range_split.load_cpu_threshold` now has a minimum setting value of `10ms`. See #96869 for additional context on the threshold.

Resolves: #98107

Release note (ops change): `kv.range_split.load_cpu_threshold` now has a minimum setting value of `10ms`.

98270: dashboards: add replica cpu to repl dashboard r=xinhaoz a=kvoli

In #96127 we added the option to load balance replica CPU instead of QPS across stores in a cluster. It is desirable to view the signal being controlled for rebalancing in the replication dashboard, similar to QPS. This pr adds the `rebalancing.cpunanospersecond` metric to the replication metrics dashboard. The avg QPS graph on the replication dashboard previously described the metric as "Exponentially weighted average", however this is not true. This pr updates the description to just be "moving average", which is accurate. Note that follow-the-workload does use an exponentially weighted value, however the metric in the dashboard is not the same. This pr also updates the graph header to include Replica in the title: "Average Replica Queries per Node". QPS is specific to replicas. This is already mentioned in the description.

Resolves: #98109

98289: storage: mvccExportToWriter should always return buffered range keys r=adityamaru a=stevendanna

In #96691, we changed the return type of mvccExportToWriter such that it now indicates when a CPU limit has been reached. As part of that change, when the CPU limit was reached, we would immediately `return` rather than `break`ing out of the loop. This introduced a bug, since the code after the loop that the `break` was taking us to is important. Namely, we may have previously buffered range keys that we still need to write into our response. By replacing the break with a return, these range keys were lost. The end-user impact of this is that a BACKUP that _ought_ to have included range keys (such as a backup of a table with a rolled back IMPORT) would not include those range keys and thus would end up resurrecting deleted keys upon restore. This PR brings back the `break` and adds a test that covers exporting with range keys under CPU exhaustion. This bug only ever existed on master.

Informs #97694
Epic: none
Release note: None

98329: sql: fix iteration conditions in crdb_internal.scan r=ajwerner a=stevendanna

Rather than using the Next() key of the last key in the response when iterating, we should use the resume span. The previous code could result in a failure in the rare case that the end key of our scan exactly matched the successor key of the very last key in the iteration.

Epic: none
Release note: None

Co-authored-by: Evan Wall <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
This patch allows the store rebalancer to use CPU in place of QPS when balancing load on a cluster. This patch adds `cpu` as an option with the cluster setting `kv.allocator.load_based_rebalancing.objective`.

When set to `cpu`, rather than `qps`, the store rebalancer will perform a mostly identical function, however, it will target balancing the sum of all replicas' cpu time on each store, rather than qps. The default remains as `qps` here.

Similar to QPS, the rebalance threshold can be set to control the range above and below the mean at which store CPU is considered imbalanced, either overfull or underfull respectively:

`kv.allocator.cpu_rebalance_threshold`: 0.1

In order to manage with mixed versions during upgrade and some architectures not supporting the cpu sampling method, a rebalance objective manager is introduced in `rebalance_objective.go`. The manager mediates access to the rebalance objective and overwrites it in cases where the objective set in the cluster setting cannot be supported.

The results when using CPU in comparison to QPS can be found here (internal):
Results Summary
Detailed Allocbench Results

resolves: #95380

Release note (ops change): Add option to balance cpu time (cpu) instead of queries per second (qps) among stores in a cluster. This is done by setting `kv.allocator.load_based_rebalancing.objective='cpu'`. `kv.allocator.cpu_rebalance_threshold` is also added, similar to `kv.allocator.qps_rebalance_threshold`, to control the target range for store cpu above and below the cluster mean.