Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: synchronize admin commands on the leaseholder #41392

Closed
ajwerner opened this issue Oct 7, 2019 · 1 comment
Closed

kvserver: synchronize admin commands on the leaseholder #41392

ajwerner opened this issue Oct 7, 2019 · 1 comment
Labels
A-kv-distribution Relating to rebalancing and leasing. no-issue-activity T-kv KV Team X-stale
Milestone

Comments

@ajwerner
Copy link
Contributor

ajwerner commented Oct 7, 2019

Is your feature request related to a problem? Please describe.

In 19.2 and beyond, with the addition of learners and atomic rebalancing, our admin commands step generally now step through multiple range descriptor changes all of which are backed by some form of CPut. When issued concurrently these commands can step on each other leading to the need for more retries. We recently (#41385) just increased the maximum number of retries to infinity from 10 retries [1] before boiling up to the client for splits.

[1]

retryOpts.MaxRetries = 10

Not all admin commands can be retried as many run with a client-provided range descriptor. When these commands interfere they always boil back up to the caller which often is a queue. These queues then act as another retry loop above the admin command. The simple retry loop worked when it was just a single round (so once you laid down an intent on the descriptor you'd be able to succeed), but now there's no reason to allow admin commands on the same range (which are all issued by the same leaseholder) to race against each other.

Describe the solution you'd like

Admin commands always attempt to run on the leaseholder. We could eliminate the vast majority of races by using in-memory synchronization on the leaseholder. A simple solution would be to add an adminMu to the Replica which is acquired when processing AdminSplit, AdminUnsplit and AdminChangeReplicas commands. Note that we do not acquire this mutex in AdminRelocateRange or AdminMerge as those commands issue AdminChangeReplicas underneath.

It will be important to unblock these waiters when the lease changes.

Relates to #41028.

Jira issue: CRDB-5453

@ajwerner ajwerner added the A-kv-distribution Relating to rebalancing and leasing. label Oct 7, 2019
@nvanbenschoten nvanbenschoten added this to the Later milestone Mar 12, 2020
@jlinder jlinder added the T-kv KV Team label Jun 16, 2021
@ajwerner ajwerner changed the title storage: synchronize admin commands on the leaseholder kvserver: synchronize admin commands on the leaseholder Nov 8, 2021
@github-actions
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 2, 2023
@github-project-automation github-project-automation bot moved this to Closed in KV Aug 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-distribution Relating to rebalancing and leasing. no-issue-activity T-kv KV Team X-stale
Projects
No open projects
Archived in project
Development

No branches or pull requests

3 participants