asim: add allocator repair actions #90141

Closed
Tracked by #90137
kvoli opened this issue Oct 18, 2022 · 0 comments · Fixed by #101745
Labels
A-kv-simulation Relating to allocation simulation. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team


kvoli commented Oct 18, 2022

The asim package currently mocks the replicate queue and supports only the AllocatorConsiderRebalance action. It is also missing several crucial components of the replicate queue code. This issue tracks adding support for every allocator action in the simulator, via the replicate queue.

AllocatorNoop
AllocatorRemoveVoter
AllocatorRemoveNonVoter
AllocatorAddVoter
AllocatorAddNonVoter
AllocatorReplaceDeadVoter
AllocatorReplaceDeadNonVoter
AllocatorRemoveDeadVoter
AllocatorRemoveDeadNonVoter
AllocatorReplaceDecommissioningVoter
AllocatorReplaceDecommissioningNonVoter
AllocatorRemoveDecommissioningVoter
AllocatorRemoveDecommissioningNonVoter
AllocatorRemoveLearner
+ AllocatorConsiderRebalance
AllocatorRangeUnavailable
AllocatorFinalizeAtomicReplicationChange

The solution can be broken into two parts.

  1. Separate the decision for the replicate queue change (change replicas, transfer lease) and the application of that decision.
  2. Hook the simulator replicate queue into the decision of the real replicate queue, then call the asim application of that decision.
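
A minimal sketch of the split described above, assuming a toy decision type and two appliers. `Decision`, `Applier`, `clusterApplier`, `simApplier` and `decide` are all hypothetical names for illustration and are not taken from the CockroachDB code; the point is only that one decision step can feed either the real application path or the simulated one.

```go
package main

import "fmt"

// Decision is a hypothetical, side-effect-free description of what the
// replicate queue wants to do for a range.
type Decision struct {
	Action  string // e.g. "AllocatorAddVoter"
	RangeID int
	Target  int // store the action applies to
}

// Applier carries out a Decision. The decision logic never needs to know
// which implementation it is talking to.
type Applier interface {
	Apply(d Decision) string
}

// clusterApplier stands in for the production path. Here it only formats
// a message instead of changing replicas.
type clusterApplier struct{}

func (clusterApplier) Apply(d Decision) string {
	return fmt.Sprintf("cluster: %s r%d s%d", d.Action, d.RangeID, d.Target)
}

// simApplier mutates in-memory simulator state instead.
type simApplier struct {
	replicas map[int][]int // range ID -> stores holding a replica
}

func (s *simApplier) Apply(d Decision) string {
	if d.Action == "AllocatorAddVoter" {
		s.replicas[d.RangeID] = append(s.replicas[d.RangeID], d.Target)
	}
	return fmt.Sprintf("sim: %s r%d s%d", d.Action, d.RangeID, d.Target)
}

// decide is the shared decision step (part 1). It reports false when there
// is nothing to do.
func decide(rangeID int, stores []int, wantReplicas, candidate int) (Decision, bool) {
	if len(stores) < wantReplicas {
		return Decision{Action: "AllocatorAddVoter", RangeID: rangeID, Target: candidate}, true
	}
	return Decision{Action: "AllocatorNoop", RangeID: rangeID}, false
}

func main() {
	sim := &simApplier{replicas: map[int][]int{1: {1, 2}}}
	// Part 2: the same decision is handed to whichever applier is in use.
	if d, ok := decide(1, sim.replicas[1], 3, 3); ok {
		for _, a := range []Applier{sim, clusterApplier{}} {
			fmt.Println(a.Apply(d))
		}
	}
}
```

Keeping `decide` free of side effects is what lets the simulator reuse it unchanged.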

Jira issue: CRDB-20611

@kvoli kvoli added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-simulation Relating to allocation simulation. labels Oct 18, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Oct 18, 2022
@kvoli kvoli added this to the 23.1 milestone Oct 18, 2022
@kvoli kvoli changed the title asim: add allocator repair actions asim: add allocator repair actions Oct 18, 2022
@kvoli kvoli removed this from the 23.1 milestone Dec 8, 2022
craig bot pushed a commit that referenced this issue Jan 3, 2023
94023: kvserver: add shed lease target to repl queue r=andrewbaptist a=kvoli

Previously, a call to `shedLease` was made within the process loop and outside of planning, in the replicate queue. This patch moves the shed lease into consider rebalance, where it originally was, and converts it into a plannable action.

Part of #90141

Release note: None

Co-authored-by: Austen McClernon <[email protected]>
kvoli added a commit to kvoli/cockroach that referenced this issue Apr 8, 2023
Benign error is used to declare an error that is not serious enough to
be logged as an error when returned in queue processing. As part of a
refactor to move decision logic out of the replicate queue, this patch
moves benign error into its own, separate package.

Part of: cockroachdb#90141

Release note: None
kvoli added a commit to kvoli/cockroach that referenced this issue Apr 10, 2023
The replicate queue processes replicas that require replication changes,
including recovery and rebalancing. Previously, a lot of the logic
associated with these replication changes existed within replicate
queue, as well as the allocator. This is not ideal as it blurs the
line between decision logic and application, making testing and
simulation much more difficult than necessary.

This patch introduces the `ReplicationPlanner`. The previous decision
logic that existed in the replicate queue is wholesale moved into the
`ReplicationPlanner`, with no behavior changes. The `AllocatorReplica`
interface is also introduced, to enable mocking a replica in testing and
simulation.
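
As a rough illustration of why a replica interface helps here, the sketch below has a toy planner read replica state only through an interface, so a mock can be passed in. `plannerReplica`, `mockReplica` and `planAction` are hypothetical and much smaller than the real `AllocatorReplica` and `ReplicationPlanner`; the leaseholder check is an assumption made for the example.

```go
package main

import "fmt"

// plannerReplica is a hypothetical, reduced stand-in for a replica
// interface: the planner only ever reads replica state through it.
type plannerReplica interface {
	RangeID() int
	VoterStores() []int
	OwnsLease() bool
}

// mockReplica is an in-memory implementation for tests and simulation.
type mockReplica struct {
	id     int
	voters []int
	lease  bool
}

func (m mockReplica) RangeID() int       { return m.id }
func (m mockReplica) VoterStores() []int { return m.voters }
func (m mockReplica) OwnsLease() bool    { return m.lease }

// planAction is a toy planner. It assumes only leaseholders plan changes
// and compares the voter count against the desired count.
func planAction(r plannerReplica, wantVoters int) string {
	if !r.OwnsLease() {
		return "AllocatorNoop"
	}
	switch n := len(r.VoterStores()); {
	case n < wantVoters:
		return "AllocatorAddVoter"
	case n > wantVoters:
		return "AllocatorRemoveVoter"
	default:
		return "AllocatorConsiderRebalance"
	}
}

func main() {
	r := mockReplica{id: 7, voters: []int{1, 2}, lease: true}
	fmt.Printf("r%d: %s\n", r.RangeID(), planAction(r, 3))
}
```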

Part of: cockroachdb#90141

Release note: None
kvoli added a commit to kvoli/cockroach that referenced this issue May 17, 2023
This commit adds `SetNodeLiveness` and `SetNodeLocality` as methods on
the `State` interface. These methods can be used to update a simulated
node's properties and liveness status. The liveness status introduced is
the end "status", rather than separated liveness and membership.
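
A small sketch of what setting a single combined status per node could look like. Only the method names `SetNodeLiveness` and `SetNodeLocality` come from the commit message; the signatures, the `NodeStatus` values, the string locality and the `liveNodes` helper are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// NodeStatus is a hypothetical single "end" status, rather than separate
// liveness and membership fields.
type NodeStatus int

const (
	StatusLive NodeStatus = iota
	StatusDead
	StatusDecommissioning
)

type simNode struct {
	status   NodeStatus
	locality string // e.g. "region=a,zone=a1" (format assumed)
}

type simState struct {
	nodes map[int]*simNode
}

func newSimState(nodeIDs ...int) *simState {
	s := &simState{nodes: map[int]*simNode{}}
	for _, id := range nodeIDs {
		s.nodes[id] = &simNode{status: StatusLive}
	}
	return s
}

// SetNodeLiveness updates a node's status; it reports false for unknown nodes.
func (s *simState) SetNodeLiveness(nodeID int, status NodeStatus) bool {
	n, ok := s.nodes[nodeID]
	if ok {
		n.status = status
	}
	return ok
}

// SetNodeLocality updates a node's locality; it reports false for unknown nodes.
func (s *simState) SetNodeLocality(nodeID int, locality string) bool {
	n, ok := s.nodes[nodeID]
	if ok {
		n.locality = locality
	}
	return ok
}

// liveNodes returns the IDs of nodes whose status is live, in sorted order.
func (s *simState) liveNodes() []int {
	var ids []int
	for id, n := range s.nodes {
		if n.status == StatusLive {
			ids = append(ids, id)
		}
	}
	sort.Ints(ids)
	return ids
}

func main() {
	s := newSimState(1, 2, 3)
	s.SetNodeLocality(1, "region=a,zone=a1")
	s.SetNodeLiveness(2, StatusDead)
	fmt.Println(s.liveNodes())
}
```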

Note that the output of `example_rebalancing` changes due to generated
clusters now having a region and zone set. Even though the region and
zone are identical, the change reorders allocator targets. The same
applies for the store rebalancer range rebalancing test.

Part of: cockroachdb#90141

Release note: None
kvoli added a commit to kvoli/cockroach that referenced this issue May 17, 2023
Previously, a span config could only be set for a range by its ID
in the allocation simulator. This commit adds the `SetSpanConfig`
command which allows setting a span config for an arbitrary key span.
The key span is split accordingly to apply the span config.
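
The split-then-apply behavior can be sketched as follows, assuming integer keys and a contiguous, sorted list of ranges. `keyspace`, `simRange` and the string-valued config are hypothetical simplifications; only the `SetSpanConfig` name comes from the commit message, and its real signature is not shown here.

```go
package main

import "fmt"

// simRange covers the half-open key interval [start, end).
type simRange struct {
	start, end int
	config     string
}

// keyspace holds contiguous ranges sorted by start key.
type keyspace struct {
	ranges []simRange
}

// split divides the range containing key into two at key. It does nothing
// when key already falls on a range boundary or outside the keyspace.
func (k *keyspace) split(key int) {
	for i, r := range k.ranges {
		if key > r.start && key < r.end {
			next := make([]simRange, 0, len(k.ranges)+1)
			next = append(next, k.ranges[:i]...)
			next = append(next,
				simRange{start: r.start, end: key, config: r.config},
				simRange{start: key, end: r.end, config: r.config})
			next = append(next, k.ranges[i+1:]...)
			k.ranges = next
			return
		}
	}
}

// SetSpanConfig splits at both ends of [start, end) and then applies the
// config to every range that lies fully inside the span.
func (k *keyspace) SetSpanConfig(start, end int, config string) {
	k.split(start)
	k.split(end)
	for i := range k.ranges {
		if k.ranges[i].start >= start && k.ranges[i].end <= end {
			k.ranges[i].config = config
		}
	}
}

func main() {
	k := &keyspace{ranges: []simRange{{start: 0, end: 100, config: "default"}}}
	k.SetSpanConfig(20, 50, "replicas=5")
	for _, r := range k.ranges {
		fmt.Printf("[%d,%d) %s\n", r.start, r.end, r.config)
	}
}
```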

Part of: cockroachdb#90141

Release note: None
kvoli added a commit to kvoli/cockroach that referenced this issue May 17, 2023
This commit adds an extra argument to `State.AddReplica`, `ReplicaType`.
`ReplicaType` may be specified to add specific replica types to the
simulator state. Note that support is still limited for actually acting
upon replication changes involving non-voting replicas in the state
changer.

Part of: cockroachdb#90141

Release note: None
kvoli added a commit to kvoli/cockroach that referenced this issue May 17, 2023
The `state.Changer` previously had no support for non-voting replicas.
This commit adds support for promotions, demotions, adding and removing
non-voting replicas.
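
The four operations listed above could be modeled along these lines. This is a sketch under assumptions: `ReplicaType`, `simRange` and the method names are hypothetical and do not reflect the actual `state.Changer` API.

```go
package main

import (
	"errors"
	"fmt"
)

// ReplicaType is a hypothetical two-valued replica type.
type ReplicaType int

const (
	Voter ReplicaType = iota
	NonVoter
)

// simRange tracks which store holds which type of replica.
type simRange struct {
	replicas map[int]ReplicaType // store ID -> replica type
}

// add places a replica of the given type on a store that has none.
func (r *simRange) add(store int, t ReplicaType) error {
	if _, ok := r.replicas[store]; ok {
		return errors.New("store already holds a replica")
	}
	r.replicas[store] = t
	return nil
}

// remove deletes the replica on a store.
func (r *simRange) remove(store int) error {
	if _, ok := r.replicas[store]; !ok {
		return errors.New("store holds no replica")
	}
	delete(r.replicas, store)
	return nil
}

// retype changes a replica's type, but only from the expected current type.
func (r *simRange) retype(store int, from, to ReplicaType) error {
	if t, ok := r.replicas[store]; !ok || t != from {
		return errors.New("replica missing or of unexpected type")
	}
	r.replicas[store] = to
	return nil
}

// promote turns a non-voter into a voter; demote does the reverse.
func (r *simRange) promote(store int) error { return r.retype(store, NonVoter, Voter) }
func (r *simRange) demote(store int) error  { return r.retype(store, Voter, NonVoter) }

// count returns how many replicas of type t the range has.
func (r *simRange) count(t ReplicaType) int {
	n := 0
	for _, rt := range r.replicas {
		if rt == t {
			n++
		}
	}
	return n
}

func main() {
	r := &simRange{replicas: map[int]ReplicaType{1: Voter, 2: Voter, 3: Voter}}
	_ = r.add(4, NonVoter)
	_ = r.promote(4)
	_ = r.demote(1)
	fmt.Println("voters:", r.count(Voter), "non-voters:", r.count(NonVoter))
}
```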

Additionally, the `new_state.go` functions are refactored to enable
better re-use of state generation by segmenting range creation and
cluster (node,store) creation.

Part of: cockroachdb#90141

Release note: None
kvoli added a commit to kvoli/cockroach that referenced this issue May 17, 2023
Previously, the `State.ReplicaLoad` method returned the load interface
for a range. To make updating range size easier, this commit removes
that direct access: the method now returns only `RangeUsageInfo`, which
is all callers require, so the logical bytes can be updated separately.
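
One way to picture the narrower accessor is a method that returns a plain value snapshot while size is updated through its own setter. The `usageInfo` struct, its fields and the method names below are hypothetical stand-ins and not the real `RangeUsageInfo` definition.

```go
package main

import "fmt"

// usageInfo is a hypothetical value type holding only what callers need.
type usageInfo struct {
	LogicalBytes     int64
	QueriesPerSecond float64
}

// simState keeps load and size separately, keyed by range ID.
type simState struct {
	qps   map[int]float64
	bytes map[int]int64
}

// rangeUsage returns a snapshot by value, so callers cannot reach into the
// underlying load tracking.
func (s *simState) rangeUsage(rangeID int) usageInfo {
	return usageInfo{LogicalBytes: s.bytes[rangeID], QueriesPerSecond: s.qps[rangeID]}
}

// setLogicalBytes updates range size independently of load.
func (s *simState) setLogicalBytes(rangeID int, b int64) {
	s.bytes[rangeID] = b
}

func main() {
	s := &simState{qps: map[int]float64{1: 250}, bytes: map[int]int64{1: 1 << 20}}
	s.setLogicalBytes(1, 2<<20)
	fmt.Printf("%+v\n", s.rangeUsage(1))
}
```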

Part of: cockroachdb#90141

Release note: None
craig bot pushed a commit that referenced this issue May 17, 2023
101745: asim: simulate recovery actions r=sumeerbhola a=kvoli

Previously, it was not possible to simulate recovery actions such as
replacing a dead or decommissioning (non-)voter, as the simulator
replicate queue supported only rebalancing. With
the changes introduced in #99247, it is now possible to simulate all
actions returned from the Allocator's `ComputeChange`.

This commit updates the simulator replicate queue to call
`ShouldPlanChange` and `PlanOneChange`, in the same manner that the
actual replicate queue does.
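
The gate-then-plan order can be sketched with a toy planner that only knows how to replace a dead voter. The lowercase `shouldPlan` and `planOne` below are simplified, hypothetical stand-ins for `ShouldPlanChange` and `PlanOneChange`; the real methods very likely take different arguments and return richer types.

```go
package main

import (
	"errors"
	"fmt"
)

type simRange struct {
	id     int
	voters []int // store IDs, assumed to start at 1
}

// change is a hypothetical planned replica swap.
type change struct {
	action   string
	rangeID  int
	add, rem int
}

// toyPlanner knows which stores exist and which are dead.
type toyPlanner struct {
	dead   map[int]bool
	stores []int
}

// shouldPlan is the cheap gate: is any voter on a dead store?
func (p toyPlanner) shouldPlan(r simRange) bool {
	for _, s := range r.voters {
		if p.dead[s] {
			return true
		}
	}
	return false
}

// planOne picks the first dead voter and the first live store without a replica.
func (p toyPlanner) planOne(r simRange) (change, error) {
	held := map[int]bool{}
	deadStore := 0 // 0 means "none found"
	for _, s := range r.voters {
		held[s] = true
		if p.dead[s] && deadStore == 0 {
			deadStore = s
		}
	}
	if deadStore == 0 {
		return change{}, errors.New("nothing to plan")
	}
	for _, s := range p.stores {
		if !held[s] && !p.dead[s] {
			return change{action: "AllocatorReplaceDeadVoter", rangeID: r.id, add: s, rem: deadStore}, nil
		}
	}
	return change{}, errors.New("no live replacement store")
}

// processOnce checks the gate, plans one change, and applies it to the
// simulated range. It reports false when nothing was applied.
func processOnce(p toyPlanner, r *simRange) (change, bool) {
	if !p.shouldPlan(*r) {
		return change{}, false
	}
	c, err := p.planOne(*r)
	if err != nil {
		return change{}, false
	}
	for i, s := range r.voters {
		if s == c.rem {
			r.voters[i] = c.add
		}
	}
	return c, true
}

func main() {
	p := toyPlanner{dead: map[int]bool{2: true}, stores: []int{1, 2, 3, 4}}
	r := &simRange{id: 1, voters: []int{1, 2, 3}}
	if c, ok := processOnce(p, r); ok {
		fmt.Println(c.action, "voters now:", r.voters)
	}
}
```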

Note that the output of `example_rebalancing` has changed. This is
expected, as the replicate queue now returns lease transfer changes,
which it previously did not.

Resolves: #90141

Release note: None

103467: sql: check schema privileges when dropping role r=postamar a=postamar

Previously, dropping a role which has privileges on a schema did not result in an error. This patch fixes this bug by adding the missing logic which performs this check.

Fixes #102962.

Release note (bug fix): DROP ROLE now correctly returns a 2BP01 error when the given role has been granted privileges on a schema.

Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Marius Posta <[email protected]>
@craig craig bot closed this as completed in e1e51a5 May 17, 2023
raggar pushed a commit to raggar/cockroach that referenced this issue May 23, 2023
This commit adds `SetNodeLiveness` and `SetNodeLocality` as methods on
the `State` interface. These methods can be used to update a simulated
node's properties and liveness status. The liveness status introduced is
the end "status", rather than separated liveness and membership.

Note that the output of `example_rebalancing` changes due to generated
clusters now having a region and zone set. Even though the region and
zone are identical, the change reorders allocator targets. The same
applies for the store rebalancer range rebalancing test.

Part of: cockroachdb#90141

Release note: None
raggar pushed a commit to raggar/cockroach that referenced this issue May 23, 2023
Previously, a span config could only be set for a range by its ID
in the allocation simulator. This commit adds the `SetSpanConfig`
command which allows setting a span config for an arbitrary key span.
The key span is split accordingly to apply the span config.

Part of: cockroachdb#90141

Release note: None
raggar pushed a commit to raggar/cockroach that referenced this issue May 23, 2023
This commit adds an extra argument to `State.AddReplica`, `ReplicaType`.
`ReplicaType` may be specified to add specific replica types to the
simulator state. Note that support is still limited for actually acting
upon replication changes involving non-voting replicas in the state
changer.

Part of: cockroachdb#90141

Release note: None
raggar pushed a commit to raggar/cockroach that referenced this issue May 23, 2023
The `state.Changer` previously had no support for non-voting replicas.
This commit adds support for promotions, demotions, adding and removing
non-voting replicas.

Additionally, the `new_state.go` functions are refactored to enable
better re-use of state generation by segmenting range creation and
cluster (node,store) creation.

Part of: cockroachdb#90141

Release note: None
raggar pushed a commit to raggar/cockroach that referenced this issue May 23, 2023
Previously, the `State.ReplicaLoad` method returned the load interface
for a range. To make updating range size easier, this commit removes
that direct access: the method now returns only `RangeUsageInfo`, which
is all callers require, so the logical bytes can be updated separately.

Part of: cockroachdb#90141

Release note: None
raggar pushed a commit to raggar/cockroach that referenced this issue May 23, 2023
Previously, it was not possible to simulate recovery actions such as
replacing a dead or decommissioning (non-)voter, as the simulator
replicate queue supported only rebalancing. With
the changes introduced in cockroachdb#99247, it is now possible to simulate all
actions returned from the Allocator's `ComputeChange`.

This commit updates the simulator replicate queue to call
`ShouldPlanChange` and `PlanOneChange`, in the same manner that the
actual replicate queue does.

Note that the output of `example_rebalancing` has changed. This is
expected, as the replicate queue now returns lease transfer changes,
which it previously did not.

Resolves: cockroachdb#90141

Release note: None