storage: make lease rebalancing decisions at the store level #28340
Conversation
Tested primarily via tpc-c on roachprod and the new roachtest. I should give allocsim a look too, but am going to start integrating replica-rebalancing decisions as a follow-up.
Force-pushed from 05241f9 to b3cbe23.
BramGruneir (Bram Gruneir) reviewed:
But I worry about the test being appropriate with KV.
Reviewed 3 of 3 files at r1, 8 of 8 files at r2, 3 of 3 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/cmd/roachtest/rebalance_load.go, line 16 at r3 (raw file):
// for names of contributors.

package main
So this is still checking with kv to make sure the leases are evenly distributed. But the workload is evenly distributed.
Perhaps KV is not the right load for this; instead, use something that stresses some parts more than others, and check whether QPS is evenly distributed. (This might be tough without load-based rebalancing.)
pkg/storage/allocator.go, line 1037 at r2 (raw file):
// rebalancing is enabled? In happy cases it's nice to keep this working
// to even out the number of leases in addition to the number of replicas,
// but it's certainly a blunt instrument that could undo what we want.
I think this might be a good idea. But of course, requires a lot of testing.
pkg/storage/replicate_queue.go, line 563 at r2 (raw file):
	return false, nil
}
if err := rq.transferLease(ctx, repl, target); err != nil {
I think you can simplify this:

err := rq.transferLease(ctx, repl, target)
return err == nil, err

And reverse the if case above from `target != (roachpb.ReplicaDescriptor{})` to `target == (roachpb.ReplicaDescriptor{})` and early-exit, to avoid nesting.
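A minimal sketch of what that suggested tail could look like (hypothetical rendering; `rq`, `repl`, and `target` come from the surrounding function in `replicate_queue.go`):

```go
// Invert the check and return early so the happy path isn't nested.
if target == (roachpb.ReplicaDescriptor{}) {
	// No suitable transfer target was found.
	return false, nil
}
err := rq.transferLease(ctx, repl, target)
// Report success only if the transfer actually went through.
return err == nil, err
```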
pkg/storage/store_rebalancer.go, line 129 at r2 (raw file):
for localDesc.Capacity.QueriesPerSecond > qpsMaxThreshold {
	log.Infof(ctx, "considering load-based lease transfers for s%d with %.2f qps (mean=%.2f, upperThreshold=%2.f)",
Are these debug messages? Need to be VEvent 3, I think.
pkg/storage/store_rebalancer.go, line 152 at r2 (raw file):
// TODO(a-robinson): Should we take the number of leases on each store into // account here or just continue to let that happen in allocator.go?
How much do lease counts matter if we're balancing on qps anyway?
pkg/storage/store_rebalancer.go, line 187 at r2 (raw file):
}

// TODO: Don't bother moving leases whose QPS is below some fraction of the
nit: format the todo with your name. If we still rebalance leases based on counts, this is needed. If we turn that off, then this is fine.
Force-pushed from b3cbe23 to f1e39e5.
a-robinson (Alex Robinson) commented:
TFTR!
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/cmd/roachtest/rebalance_load.go, line 16 at r3 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
So this is still checking with kv to make sure the leases are evenly distributed. But the workload is evenly distributed.
Perhaps KV is not the right load for this; instead, use something that stresses some parts more than others, and check whether QPS is evenly distributed. (This might be tough without load-based rebalancing.)
Not quite -- this is checking that the 3 leases for the table are evenly distributed, not that all leases are evenly distributed. This was a very simple test that passes with this change and fails most of the time without it, because the total number of ranges in a new cluster greatly exceeds the number of ranges (3) in the table used by `kv`. Without load-based rebalancing, the leases for the `kv` table will typically all just stay on the node that had the initial range lease before the `ALTER TABLE ... SPLIT` command, since the other nodes have enough leases for other various ranges that the cluster is still considered balanced.

I agree that you could get slightly different test coverage with a couple other workloads, but in general all the tests I can think of would be testing that a minority of hot ranges get balanced properly, which this one does as well, just in a very simple way.
pkg/storage/allocator.go, line 1037 at r2 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
I think this might be a good idea. But of course, requires a lot of testing.
Yeah, I don't know that we have the time to fully remove this right now. We'll have to try out a couple antagonistic workloads before sticking with this, though.
pkg/storage/replicate_queue.go, line 563 at r2 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
I think you can simplify this.
err := rq.transferLease(ctx, repl, target)
return err == nil, err

And reverse the if case above from `target != (roachpb.ReplicaDescriptor{})` to `target == (roachpb.ReplicaDescriptor{})` and early-exit, to avoid nesting.
Done.
pkg/storage/store_rebalancer.go, line 129 at r2 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
Are these debug messages? Need to be VEvent 3, I think.
Sorry, I meant to have this outside the loop. Fixed.
I do think it'll be nice to actually log this at info level, since it'll only happen a max of once a minute and even then only on heavily loaded clusters with a significant qps imbalance.
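A sketch of the fix (hypothetical; `qpsMean` is assumed to be in scope, and the `%2.f` verb from the quoted snippet is corrected to `%.2f`):

```go
// Summary line hoisted out of the transfer loop, kept at info level since
// it fires at most about once a minute, and only on imbalanced clusters.
log.Infof(ctx, "considering load-based lease transfers for s%d with %.2f qps (mean=%.2f, upperThreshold=%.2f)",
	localDesc.StoreID, localDesc.Capacity.QueriesPerSecond, qpsMean, qpsMaxThreshold)
for localDesc.Capacity.QueriesPerSecond > qpsMaxThreshold {
	// ... consider transferring individual leases ...
}
```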
pkg/storage/store_rebalancer.go, line 152 at r2 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
How much do lease counts matter if we're balancing on qps anyway?
There's still a (small) fixed cost per lease for things like running the replica through all the various queues. It makes the time it takes to drain a node or recover from a node crash more predictable. Also, it may seem silly, but it looks a lot better to new users if the number of leases on each node is roughly even.
It seems desirable to keep lease count roughly even. I'd be happy to hear counterarguments, though, since that would make things simpler. Maybe simplicity is reason enough.
pkg/storage/store_rebalancer.go, line 187 at r2 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
nit: format the todo with your name. If we still rebalance leases based on counts, this is needed. If we turn that off, then this is fine.
Thanks for calling this out. I create TODOs without a name when I want to make sure they're resolved before merging the code.
Force-pushed from f1e39e5 to a082410.
BramGruneir (Bram Gruneir) reviewed:
This is great, and a big step forward.
Reviewed 11 of 11 files at r4, 9 of 9 files at r5, 3 of 3 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/cmd/roachtest/rebalance_load.go, line 16 at r3 (raw file):
Previously, a-robinson (Alex Robinson) wrote…
Not quite -- this is checking that the 3 leases for the table are evenly distributed, not that all leases are evenly distributed. This was a very simple test that passes with this change and fails most of the time without it, because the total number of ranges in a new cluster greatly exceeds the number of ranges (3) in the table used by `kv`. Without load-based rebalancing, the leases for the `kv` table will typically all just stay on the node that had the initial range lease before the `ALTER TABLE ... SPLIT` command, since the other nodes have enough leases for other various ranges that the cluster is still considered balanced.

I agree that you could get slightly different test coverage with a couple other workloads, but in general all the tests I can think of would be testing that a minority of hot ranges get balanced properly, which this one does as well, just in a very simple way.
SGTM!
pkg/storage/store_rebalancer.go, line 129 at r2 (raw file):
Previously, a-robinson (Alex Robinson) wrote…
Sorry, I meant to have this outside the loop. Fixed.
I do think it'll be nice to actually log this at info level, since it'll only happen a max of once a minute and even then only on heavily loaded clusters with a significant qps imbalance.
SGTM
pkg/storage/store_rebalancer.go, line 152 at r2 (raw file):
Previously, a-robinson (Alex Robinson) wrote…
There's still a (small) fixed cost per lease for things like running the replica through all the various queues. It makes the time it takes to drain a node or recover from a node crash more predictable. Also, it may seem silly, but it looks a lot better to new users if the number of leases on each node is roughly even.
It seems desirable to keep lease count roughly even. I'd be happy to hear counterarguments, though, since that would make things simpler. Maybe simplicity is reason enough.
Less movement of leases is better overall, which I think is a strong argument on its own.
Anyway, not blocking for this PR. But good for a follow-up.
pkg/storage/store_rebalancer.go, line 198 at r5 (raw file):
// just unnecessary churn with no benefit to move leases responsible for,
// for example, 1 qps on a store with 5000 qps.
const minQPSFraction = .001
Is it worth pulling this const up to highlight it outside of the function?
a-robinson (Alex Robinson) commented:
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/storage/store_rebalancer.go, line 198 at r5 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
Is it worth pulling this const up to highlight it outside of the function?
It'd have to be given a more descriptive name if it was pulled outside, so I'd go with no.
cc @nvanbenschoten in case you wanted to take a look at this before submitting. Feel free to pass on doing so, but I wanted to give you a chance if you're interested.
nvanbenschoten (Nathan VanBenschoten) reviewed:
Nice work @a-robinson! I'm really excited that we're getting this in for 2.1.
Reviewed 2 of 3 files at r1, 11 of 11 files at r4, 9 of 9 files at r5, 3 of 3 files at r6.
Reviewable status: complete! 1 of 0 LGTMs obtained
docs/generated/settings/settings.html, line 21 at r5 (raw file):
<tr><td><code>kv.allocator.load_based_lease_rebalancing.enabled</code></td><td>boolean</td><td><code>true</code></td><td>set to enable rebalancing of range leases based on load and latency</td></tr>
<tr><td><code>kv.allocator.range_rebalance_threshold</code></td><td>float</td><td><code>0.05</code></td><td>minimum fraction away from the mean a store's range count can be before it is considered overfull or underfull</td></tr>
<tr><td><code>kv.allocator.stat_based_rebalancing.enabled</code></td><td>boolean</td><td><code>true</code></td><td>set to enable rebalancing of range replicas based on write load and disk usage</td></tr>
Did this get out of sync with the code?
pkg/cmd/roachtest/rebalance_load.go, line 16 at r3 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
SGTM!
Had the same question as @BramGruneir. That reasoning checks out with me.
Probably worth describing what happens without `kv.allocator.stat_based_rebalancing.enabled=true` somewhere in the test.
pkg/storage/allocator.go, line 44 at r5 (raw file):
// baseLoadBasedLeaseRebalanceThreshold is the equivalent of
// leaseRebalanceThreshold for load-based lease rebalance decisions (i.e.
// "follow-the-workload"). Its the base threshold for decisions that gets
decisions that get adjusted
pkg/storage/allocator.go, line 1011 at r5 (raw file):
// TODO(a-robinson): Should this be changed to avoid even thinking about lease
// counts now that we try to spread leases and replicas based on QPS? As is it
// may fight back a little bit against store-level QPS--based rebalancing.
double "-"
pkg/storage/replica_rankings.go, line 62 at r4 (raw file):
rr.mu.Lock()
defer rr.mu.Unlock()
if len(rr.mu.byQPS) == 0 && rr.mu.accumulator.qps.Len() > 0 {
The interaction between `byQPS` and `rrAccumulator` in this function is subtle. It looked very strange to me until I realized why you gave it these semantics (so that all 128 top replicas are considered and we don't end up staring at the top few every time the StoreRebalancer runs, right?). I'd call that out somewhere. Also call out that because of this behavior we have to be careful about increasing `numTopReplicasToTrack`, or we could end up adding large delays between when the hottest few replicas are considered for rebalancing.
pkg/storage/replica_rankings.go, line 75 at r4 (raw file):
func (a *rrAccumulator) addReplica(repl replicaWithStats) {
	heap.Push(&a.qps, repl)
	if a.qps.Len() > numTopReplicasToTrack {
This is almost always going to be the case during steady-state operations, right? In that case, it seems like there's a natural fast-path that falls out where we compare against the head of the priority queue and only pop and push if the new replica will result in the head being evicted.
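A self-contained sketch of that fast path, simplified to bare QPS values rather than the real `replicaWithStats` (`numTopToTrack` stands in for `numTopReplicasToTrack`):

```go
package main

import (
	"container/heap"
	"fmt"
)

// qpsHeap is a min-heap of QPS values, so the head is always the smallest
// of the tracked top K -- i.e., the natural eviction candidate.
type qpsHeap []float64

func (h qpsHeap) Len() int            { return len(h) }
func (h qpsHeap) Less(i, j int) bool  { return h[i] < h[j] }
func (h qpsHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *qpsHeap) Push(x interface{}) { *h = append(*h, x.(float64)) }
func (h *qpsHeap) Pop() interface{} {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

const numTopToTrack = 128

// add keeps only the top numTopToTrack values. Fast path: once the heap
// is full, skip the push/pop entirely unless the new value would evict
// the current minimum.
func add(h *qpsHeap, qps float64) {
	if h.Len() >= numTopToTrack {
		if qps <= (*h)[0] {
			return // wouldn't make the top K; nothing to do
		}
		heap.Pop(h) // evict the current minimum first
	}
	heap.Push(h, qps)
}

func main() {
	var h qpsHeap
	for _, qps := range []float64{512, 64, 2048, 128, 1024} {
		add(&h, qps)
	}
	fmt.Printf("tracking %d replicas; coldest tracked qps: %.0f\n", h.Len(), h[0])
}
```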
pkg/storage/replica_rankings.go, line 80 at r4 (raw file):
}

type rrAccumulator struct {
I'd add a comment to this as well because it is used externally of the `replicaRankings` abstraction. That will help justify why we have this extra layer of structure instead of having `replicaRankings` hold an `rrPriorityQueue` directly.
pkg/storage/store_rebalancer.go, line 45 at r5 (raw file):
// compares to the load on other stores in the cluster and transferring leases
// or replicas away if the local store is overloaded.
type StoreRebalancer struct {
There's an interesting reason why this can't be a queue itself. Might be worth adding that to this comment.
pkg/storage/store_rebalancer.go, line 208 at r5 (raw file):
desc := replWithStats.repl.Desc()
log.VEventf(ctx, 3, "considering lease transfer for r%d with %.2f qps", desc.RangeID, replWithStats.qps)
// TODO(a-robinson): Should we sort these to first examine the stores with lower QPS?
Sorting or even just randomizing sounds like a good idea to me. Otherwise, won't there be a system-wide tendency to transfer leases to replicas at the front of their range descriptors? The comment on `RangeDescriptor.Replicas` says the order is arbitrary, but I'm not sure that that's true in practice.
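For the randomizing option, a sketch (assuming `desc.Replicas` is the range's slice of `roachpb.ReplicaDescriptor`s and `math/rand` is imported):

```go
// Copy the replica slice and shuffle the copy so candidate evaluation
// doesn't always start at the front of the range descriptor.
candidates := append([]roachpb.ReplicaDescriptor(nil), desc.Replicas...)
rand.Shuffle(len(candidates), func(i, j int) {
	candidates[i], candidates[j] = candidates[j], candidates[i]
})
for _, candidate := range candidates {
	// ... evaluate candidate as a lease-transfer target ...
}
```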
Force-pushed from a082410 to 87d8d74.
a-robinson (Alex Robinson) commented:
TFTR! I think the only question that may deserve more thought is the naming of the cluster settings. Using the inherited setting names is convenient, but we could do better if we don't mind throwing them out.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
docs/generated/settings/settings.html, line 21 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Did this get out of sync with the code?
Good catch, looks like I messed that up. Fixed, although I couldn't figure out how to get the file regenerated automatically.
pkg/cmd/roachtest/rebalance_load.go, line 16 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Had the same question as @BramGruneir. That reasoning checks out with me.
Probably worth describing what happens without `kv.allocator.stat_based_rebalancing.enabled=true` somewhere in the test.
Done.
pkg/storage/allocator.go, line 44 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
decisions that get adjusted
Done. After this settles down I'd like to rename all the "load-based lease rebalance" naming to reference follow-the-workload instead.
pkg/storage/allocator.go, line 1011 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
double "-"
Done.
pkg/storage/replica_rankings.go, line 62 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
The interaction between `byQPS` and `rrAccumulator` in this function is subtle. It looked very strange to me until I realized why you gave it these semantics (so that all 128 top replicas are considered and we don't end up staring at the top few every time the StoreRebalancer runs, right?). I'd call that out somewhere. Also call out that because of this behavior we have to be careful about increasing `numTopReplicasToTrack`, or we could end up adding large delays between when the hottest few replicas are considered for rebalancing.
That's one reason, although given that the `storeRebalancer` continues until it either runs out of replicas or determines the store is balanced, I don't think it's a very important one.

The issue that was more on my mind was that we could be (and often will be, given that sufficient rebalancing activity triggers a re-gossiping of the store descriptor) loading a new accumulator while the `storeRebalancer` is partway through iterating over all the replicas, causing us to have to re-inspect a bunch of the same replicas again. I guess that's not a huge deal either, though, as long as it doesn't happen repeatedly.
It seems like maybe the consumer of the rankings should just pull out all the rankings at once (similar to how the producer puts all the rankings in at once via the accumulator).
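One hypothetical shape for that consumer-side change: a method that drains the freshest accumulator into a sorted snapshot and hands the whole thing back under a single lock acquisition (field and type names follow the quoted snippets; the drain logic itself is an assumption):

```go
// topQPS returns all tracked replicas, hottest first, in one call.
func (rr *replicaRankings) topQPS() []replicaWithStats {
	rr.mu.Lock()
	defer rr.mu.Unlock()
	// If the producer installed a fresh accumulator since the last read,
	// drain it into a slice sorted by descending QPS.
	if rr.mu.accumulator.qps.Len() > 0 {
		ranked := make([]replicaWithStats, rr.mu.accumulator.qps.Len())
		// Popping a min-heap yields ascending order, so fill back to front.
		for i := len(ranked) - 1; i >= 0; i-- {
			ranked[i] = heap.Pop(&rr.mu.accumulator.qps).(replicaWithStats)
		}
		rr.mu.byQPS = ranked
	}
	return rr.mu.byQPS
}
```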
pkg/storage/replica_rankings.go, line 75 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This is almost always going to be the case during steady-state operations, right? In that case, it seems like there's a natural fast-path that falls out where we compare against the head of the priority queue and only pop and push if the new replica will result in the head being evicted.
Good point, done.
pkg/storage/replica_rankings.go, line 80 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'd add a comment to this as well because it is used externally of the `replicaRankings` abstraction. That will help justify why we have this extra layer of structure instead of having `replicaRankings` hold an `rrPriorityQueue` directly.
Done.
pkg/storage/store_rebalancer.go, line 45 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
There's an interesting reason why this can't be a queue itself. Might be worth adding that to this comment.
Done.
pkg/storage/store_rebalancer.go, line 208 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Sorting or even just randomizing sounds like a good idea to me. Otherwise, won't there be a system-wide tendency to transfer leases to replicas at the front of their range descriptors? The comment on `RangeDescriptor.Replicas` says the order is arbitrary, but I'm not sure that that's true in practice.
Done.
nvanbenschoten (Nathan VanBenschoten) reviewed:
> I think the only question that may deserve more thought is the naming of the cluster settings. Using the inherited setting names is convenient, but we could do better if we don't mind throwing them out.
I don't have any strong opinions about this. @BramGruneir do you?
Reviewed 1 of 9 files at r8, 1 of 3 files at r9.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale)
pkg/storage/allocator.go, line 44 at r5 (raw file):
Previously, a-robinson (Alex Robinson) wrote…
Done. After this settles down I'd like to rename all the "load-based lease rebalance" naming to reference follow-the-workload instead.
"follow-the-workload" SGTM.
pkg/storage/replica_rankings.go, line 62 at r4 (raw file):
> The issue that was more on my mind was that we could be (and often will be, given that sufficient rebalancing activity triggers a re-gossiping of the store descriptor) loading a new accumulator while the storeRebalancer is partway through iterating over all the replicas, causing us to have to re-inspect a bunch of the same replicas again. I guess that's not a huge deal either, though, as long as it doesn't happen repeatedly.

It's relatively cheap to inspect replicas that don't need to be rebalanced though, right?

> It seems like maybe the consumer of the rankings should just pull out all the rankings at once (similar to how the producer puts all the rankings in at once via the accumulator).

That would be easier to understand. This is kind of prescriptive at the moment, so I think it makes sense to push the specialized complexity to the consumer. Feel free to leave a TODO instead of having that slow down the merge.
This makes it easier to pick replicas whose leases should be transferred when a store is overloaded.

Release note: None
Force-pushed from 87d8d74 to 13d5263.
a-robinson (Alex Robinson) commented:
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale)
pkg/storage/replica_rankings.go, line 62 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
> The issue that was more on my mind was that we could be (and often will be, given that sufficient rebalancing activity triggers a re-gossiping of the store descriptor) loading a new accumulator while the storeRebalancer is partway through iterating over all the replicas, causing us to have to re-inspect a bunch of the same replicas again. I guess that's not a huge deal either, though, as long as it doesn't happen repeatedly.

It's relatively cheap to inspect replicas that don't need to be rebalanced though, right?

> It seems like maybe the consumer of the rankings should just pull out all the rankings at once (similar to how the producer puts all the rankings in at once via the accumulator).

That would be easier to understand. This is kind of prescriptive at the moment, so I think it makes sense to push the specialized complexity to the consumer. Feel free to leave a TODO instead of having that slow down the merge.
Yeah, done.
In order to better balance the QPS being served by each store to avoid overloaded nodes.

Fixes cockroachdb#21419

Release note (performance improvement): Range leases will be automatically rebalanced throughout the cluster to even out the amount of QPS being handled by each node.
It consistently passes with store-level load-based lease rebalancing, but fails more often than not without it.

Release note: None
Force-pushed from 13d5263 to 094bf14.
I have no problem throwing out the old ones. But we have to be sure that we stop a cluster upgrade if someone is using an old one that's not set to the default. They most likely changed it for a reason.
Is there any precedent for that? Using the stats-based rebalancing feature has been pretty strongly discouraged. People may be using it in test clusters, but I'd be pretty surprised if it's being used in any real production environments. Also, I want to remove the old stats-based rebalancing code as part of #28852, so the old setting isn't gonna be very useful.
How do we know that nobody's using it? There's an option for it, and I'd bet people have toyed with it. I have no idea how we go about deprecating a cluster setting.
Reviewed 13 of 13 files at r10, 9 of 9 files at r11, 3 of 3 files at r12.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale)
We've removed cluster settings before (https://github.com/cockroachdb/cockroach/blob/master/pkg/settings/registry.go#L36), we've just never done anything like stopping an upgrade due to them. I'd think doing the same as before (retiring the setting and including a release note) is sufficient.
SGTM
bors r+

I'll follow up with the setting changes.
28340: storage: make lease rebalancing decisions at the store level r=a-robinson a=a-robinson

In order to better balance the QPS being served by each store to avoid overloaded nodes.

Fixes #21419

Release note (performance improvement): Range leases will be automatically rebalanced throughout the cluster to even out the amount of QPS being handled by each node.

28848: opt: Fix rule cycle bug r=andy-kimball a=andy-kimball

The PushFilterIntoJoinLeftAndRight and TryDecorrelateSelect rules can cycle with one another in a rare case:

1. Right side of join has outer column due to being un-decorrelatable.
2. Filter conjunct is pushed down to both left and right side by mapping equivalencies in PushFilterIntoJoinLeftAndRight.
3. Left conjunct is pulled back up to join condition by TryDecorrelateSelect.

Steps 2 and 3 will cycle with one another. Cycle detection is not possible in this case, because the left side keeps changing (because a new conjunct is pushed down to it each time). The fix is simple: don't let PushFilterIntoJoinLeftAndRight push down filters if either the left or right side has outer column(s).

This fixes #28818.

Release note: None

Co-authored-by: Alex Robinson <[email protected]>
Co-authored-by: Andrew Kimball <[email protected]>
Build succeeded
Follow-up to cockroachdb#28340, which did this for just leases.

Fixes cockroachdb#17979

Release note (performance improvement): Range replicas will be automatically rebalanced throughout the cluster to even out the amount of QPS being handled by each node.
28852: storage: make load-based replica rebalancing decisions at the store level r=a-robinson a=a-robinson

Built on top of #28340, which is where the first 3 commits are from.

This is still somewhat incomplete, in that it's missing unit tests and I'm only just now running tpc-c 10k on it. Sending out now to start the discussion of whether to include it in 2.1, since we're obviously very late in the intended development cycle.

Co-authored-by: Alex Robinson <[email protected]>
In order to better balance the QPS being served by each store to avoid
overloaded nodes.
Fixes #21419
Release note (performance improvement): Range leases will be
automatically rebalanced throughout the cluster to even out the amount
of QPS being handled by each node.