-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: rebalance ranges with one voter using joint configurations #124284
kvserver: rebalance ranges with one voter using joint configurations #124284
Conversation
9bb5dff
to
0a12f1c
Compare
d3edeaa
to
e6eaca8
Compare
The existing leaseholder store is never selected as the rebalance target. That is to be expected initially, as scatter only jitters replica count stats, it doesn't completely randomize them. The leaseholder store has 1k+ replicas. However, after the replica counts get closer, we'd expect scatter to also do nothing 1/n of the time but it is still continuing to only choose other stores with this patch. |
e6eaca8
to
79f3177
Compare
Add a simulation test where the replication factor is set to 1. This highlights lease thrashing, which causes replica and lease count convergence to take upwards of 20 minutes with two nodes. Informs: cockroachdb#108420 Release note: None
db23e58
to
0ecb487
Compare
Turns out this is just due to timing. The allocator rebalance targets are returned before any rebalancing activity occurs. The disparity in replica counts initially causes the initial node to not be selected. If we iteratively ran scatter, the results would be closer to expectation w/ an even distribution. The lease/replicate queue will also adjust lease/replica counts as the scatter occurs. |
0ecb487
to
06635fc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r5, 6 of 7 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @kvoli)
pkg/kv/kvserver/asim/tests/testdata/non_rand/one_voter
line 0 at r7 (raw file):
🔥
pkg/kv/kvserver/allocator/plan/util_test.go
line 56 at r7 (raw file):
{ChangeType: roachpb.REMOVE_VOTER, Target: roachpb.ReplicationTarget{NodeID: 1, StoreID: 1}}, }, expectedErrStr: "",
nit: expectedPerformingSwap: false,
pkg/sql/scatter_test.go
line 215 at r7 (raw file):
t, tc.ServerConn(0), "t", "k INT PRIMARY KEY, v INT", 500,
nit: /* numRows */
pkg/sql/scatter_test.go
line 262 at r7 (raw file):
} // Expect that the number of replicas on store 1 to have changed. We can't
It feels like we're leaving test flakiness up to chance here, but I guess (1/3)^50
is a small enough number that we don't think it's at risk of flaking.
The allocator would add a voter, instead of both adding and removing the existing voter when rebalancing ranges with one replica. Removing the leaseholder replica was not possible prior to cockroachdb#74077, so the addition only was necessary. This restriction is no longer necessary. Allow rebalancing a one voter range between stores using joint configurations, where the lease will be transferred to the incoming voter store, from the outgoing demoting voter. Scattering ranges with one voter will now leave the range with exactly one voter, where previously both the leaseholder voter evaluating the scatter, and the new voter would be left. Before this patch, scattering 1000 ranges with RF=1 on a 5 store cluster: ``` store_id | replica_count | replica_distribution | lease_count | lease_distribution -----------+---------------+----------------------+-------------+--------------------- 1 | 1001 | ########## | 500 | ########## 5 | 291 | ### | 147 | ### 4 | 275 | ### | 137 | ### 3 | 229 | ### | 118 | ### 2 | 206 | ### | 99 | ## ``` After: ``` store_id | replica_count | replica_distribution | lease_count | lease_distribution -----------+---------------+----------------------+-------------+--------------------- 2 | 242 | ########## | 241 | ########## 4 | 227 | ########## | 227 | ########## 5 | 217 | ######### | 216 | ######### 3 | 209 | ######### | 208 | ######### 1 | 106 | ##### | 109 | ##### ``` Fixes: cockroachdb#108420 Fixes: cockroachdb#124171 Release note (bug fix): Scattering a range with replication factor=1, no longer erroneously up-replicates the range to two replicas. Leases will also no longer thrash between nodes when perturbed with replication factor=1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TYFTR!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvserver/asim/tests/testdata/non_rand/one_voter
line at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
🔥
Thanks 😄
pkg/kv/kvserver/allocator/plan/util_test.go
line 56 at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit:
expectedPerformingSwap: false,
Updated.
pkg/sql/scatter_test.go
line 215 at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit:
/* numRows */
Updated.
pkg/sql/scatter_test.go
line 262 at r7 (raw file):
Seems reasonably small, in addition to:
Turns out this is just due to timing. The allocator rebalance targets are returned before any rebalancing activity occurs. The disparity in replica counts initially causes the initial node to not be selected. If we iteratively ran scatter, the results would be closer to expectation w/ an even distribution. The lease/replicate queue will also adjust lease/replica counts as the scatter occurs.
06635fc
to
1995005
Compare
bors r=nvanbenschoten |
Build failed (retrying...): |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 1995005 to blathers/backport-release-23.1-124284: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1.x failed. See errors above. error creating merge commit from 1995005 to blathers/backport-release-23.2-124284: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
The allocator would add a voter, instead of both adding and removing the
existing voter when rebalancing ranges with one replica. Removing the
leaseholder replica was not possible prior to #74077, so the addition
only was necessary.
This restriction is no longer necessary. Allow rebalancing a one voter
range between stores using joint configurations, where the lease will be
transferred to the incoming voter store, from the outgoing demoting
voter.
Scattering ranges with one voter will now leave the range with exactly
one voter, where previously both the leaseholder voter evaluating the
scatter, and the new voter would be left.
Before this patch, scattering 1000 ranges with RF=1 on a 5 store
cluster:
After:
Fixes: #108420
Fixes: #124171
Release note (bug fix): Scattering a range with replication factor=1, no
longer erroneously up-replicates the range to two replicas. Leases will
also no longer thrash between nodes when perturbed with replication
factor=1.