Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: include only voters in state replicate when checking if range can make progress on replication changes #114570

Open
kvoli opened this issue Nov 16, 2023 · 1 comment
Labels
A-kv-distribution Relating to rebalancing and leasing. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team

Comments

@kvoli
Copy link
Collaborator

kvoli commented Nov 16, 2023

Is your feature request related to a problem? Please describe.

When making replication changes, the proposer will check that the range can make progress:

if !replicas.CanMakeProgress(
func(rDesc roachpb.ReplicaDescriptor) bool {
for _, inner := range liveReplicas {
if inner.ReplicaID == rDesc.ReplicaID {
return true
}
}
return false
}) {

This check excludes non-live voters from the perspective of node liveness, but doesn't consider the raft status of replicas.

This can lead to replication changes which result in temporary unavailability like seen in #114349.

Where replicas which are behind StateSnapshot are included in the quorum, despite not participating until being caught up.

Describe the solution you'd like

Exclude replicas which are not in StateReplicate when assessing if the range can make progress here:

https://github.com/cockroachdb/cockroach/blob/b06849b9a2a2c0aab6950764ddff1bb125119df5/pkg/kv/kvserver/replica_command.go#L2386C30-L2386C30

Jira issue: CRDB-33570

@kvoli kvoli added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-distribution Relating to rebalancing and leasing. labels Nov 16, 2023
@blathers-crl blathers-crl bot added the T-kv KV Team label Nov 16, 2023
@erikgrinaker
Copy link
Contributor

We should probably check that they're not lagging too much either, for some reasonable value of "too much".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-distribution Relating to rebalancing and leasing. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team
Projects
No open projects
Status: Incoming
Development

No branches or pull requests

2 participants