
kvserver: allocating replacement replicas needs to consider fully satisfying constraints #94809

Closed · Fixed by #94810

AlexTalks opened this issue Jan 6, 2023 · 3 comments

Labels: A-kv-decom-rolling-restart (Decommission and Rolling Restarts), A-kv-distribution (Relating to rebalancing and leasing), C-bug (Code not up to spec/doc, specs & docs deemed correct; solution expected to change code/behavior), T-kv (KV Team)

Comments

AlexTalks (Contributor) commented Jan 6, 2023

In a recent investigation, we discovered that when constraints apply to some but not all of the replicas needed for a range, a replacement operation (such as during decommission) may fail to consider that the full set of constraints would no longer be satisfied. This occurs with configurations such as num_replicas = 3, constraints = '{<some constraint>: 1}', where two replicas are expected to not need to satisfy any constraint (known as "unconstrained replicas"). In this case, replacing the one replica that satisfies the constraint with a replica that does not should not be possible, yet it is.

This can be reproduced simply with the following:

roachprod create local -n4
roachprod stage local release v22.1.12
roachprod start local --racks=4
roachprod ssh local:1 -- './cockroach workload init kv --splits=100'
roachprod sql local:1 -- -e "alter database kv configure zone using num_replicas=3, constraints='{+rack=3: 1}';"
# wait for rebalancing
roachprod ssh local:1 -- './cockroach node decommission 4 --insecure'
# this should not succeed, but it does

Jira issue: CRDB-23152

@AlexTalks AlexTalks added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-kv-distribution Relating to rebalancing and leasing. A-kv-decom-rolling-restart Decommission and Rolling Restarts T-kv KV Team labels Jan 6, 2023
@AlexTalks AlexTalks self-assigned this Jan 6, 2023
AlexTalks (Contributor, Author) commented:

This comment on allocateConstraintsCheck(..) in allocator_scorer.go is interesting:

// NB: This assumes that the sum of all constraints.NumReplicas is equal to
// configured number of replicas for the range, or that there's just one set of
// constraints with NumReplicas set to 0. This is meant to be enforced in the
// config package.

Except, clearly, we don't enforce this, and we allow there to be "unconstrained" replicas. It seems that those need to be accounted for.
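The gap is simple arithmetic: nothing forces the sum of constraints.NumReplicas to equal num_replicas, so some replicas fall outside every conjunction. A minimal sketch of that accounting (the `conjunction` type and `unconstrainedReplicas` helper are illustrative, not CockroachDB's actual code):

```go
package main

import "fmt"

// conjunction models one entry in a constraints list, e.g. `{+rack=3: 1}`.
// Illustrative only; not the allocator's actual type.
type conjunction struct {
	numReplicas int
}

// unconstrainedReplicas returns how many replicas of a range are not covered
// by any constraint conjunction. The comment in allocator_scorer.go assumes
// this is always zero, but nothing in the config package enforces that.
func unconstrainedReplicas(numReplicas int, constraints []conjunction) int {
	constrained := 0
	for _, c := range constraints {
		constrained += c.numReplicas
	}
	return numReplicas - constrained
}

func main() {
	// num_replicas = 3, constraints = '{+rack=3: 1}' leaves 2 replicas
	// that no conjunction accounts for.
	fmt.Println(unconstrainedReplicas(3, []conjunction{{numReplicas: 1}}))
}
```

With the config from the reproduction above, this prints 2, i.e. two replicas the constraint analysis never attributes to any conjunction.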

irfansharif (Contributor) commented:

Nice find!

nvanbenschoten (Member) commented:

cc. @joshimhoff

AlexTalks added a commit to AlexTalks/cockroach that referenced this issue Feb 6, 2023
This change adds tests using `store.AllocatorCheckRange(..)` to
investigate the behavior of attempting to decommission a node when doing
so would cause constraints to be broken. This test is needed because it
was recently uncovered that a partially constrained range (i.e. a range
configured with `num_replicas = 3, constraint = '{<some constraint>:
1}'`) may not have the constraint fully respected if the only node that
satisfies said constraint is being decommissioned.

Part of cockroachdb#94809

Depends on cockroachdb#94024.

Release note: None
AlexTalks added a commit to AlexTalks/cockroach that referenced this issue Feb 6, 2023
This changes the allocator to be fully aware of the replica that is
being replaced when allocating new target replicas due to a dead or
decommissioning node. This ensures that if constraints were respected
before the dead or decommissioning node left the cluster, they will
still be respected afterwards, particularly in the case at issue:
partial constraints set on a range (e.g. `num_replicas = 3,
<some_constraint>: 1`). This is achieved by rejecting candidate
stores for the replacement when they do not satisfy a constraint that
was satisfied by the existing store. As a result, some node
decommissions that previously would have executed, violating the
configured constraints, will now be blocked.

Fixes cockroachdb#94809.

Release note (bug fix): Decommissions that would violate constraints set
on a subset of replicas for a range (e.g. `num_replicas = 3,
<constraint>: 1`) will no longer be able to execute, respecting
constraints during and after the decommission.
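The shape of the fix described above, rejecting a candidate that fails a constraint the outgoing store satisfied, can be sketched as follows. The `store` type, the single key/value constraint model, and the `validReplacement` helper are all illustrative assumptions, not the allocator's actual API:

```go
package main

import "fmt"

// store models a store's locality attributes, e.g. rack=3.
// Illustrative only; not CockroachDB's actual types.
type store map[string]string

// satisfies reports whether the store matches a required constraint,
// modeled here as a single key=value attribute (like `+rack=3`).
func satisfies(s store, key, value string) bool {
	return s[key] == value
}

// validReplacement captures the idea of the fix: a candidate store may only
// replace the outgoing store if it satisfies every constraint the outgoing
// store satisfied. Here we check one hypothetical constraint.
func validReplacement(outgoing, candidate store, key, value string) bool {
	if satisfies(outgoing, key, value) && !satisfies(candidate, key, value) {
		return false
	}
	return true
}

func main() {
	outgoing := store{"rack": "3"}  // decommissioning node; satisfies +rack=3
	candidate := store{"rack": "1"} // candidate in a different rack
	fmt.Println(validReplacement(outgoing, candidate, "rack", "3"))
}
```

Under this sketch, the candidate in rack 1 is rejected (prints false), which is exactly the decommission scenario from the reproduction: the only +rack=3 replica cannot be replaced by a store that breaks the constraint.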
AlexTalks added a commit to AlexTalks/cockroach that referenced this issue Feb 8, 2023
AlexTalks added a commit to AlexTalks/cockroach that referenced this issue Feb 8, 2023
AlexTalks added a commit to AlexTalks/cockroach that referenced this issue Feb 9, 2023
AlexTalks added a commit to AlexTalks/cockroach that referenced this issue Feb 9, 2023
AlexTalks added a commit to AlexTalks/cockroach that referenced this issue Feb 15, 2023
craig bot pushed a commit that referenced this issue Feb 15, 2023
94810: kvserver: fix constraint analysis on replica replacement r=AlexTalks a=AlexTalks


Co-authored-by: Alex Sarkesian <[email protected]>
@craig craig bot closed this as completed in f7db7e3 Feb 15, 2023