Skip to content

Commit

Permalink
kv: retry AdminSplit on LeaseTransferRejectedBecauseTargetMayNeedSnap…
Browse files Browse the repository at this point in the history
…shot

Informs cockroachdb#84635
Informs cockroachdb#84162

This commit considers the `LeaseTransferRejectedBecauseTargetMayNeedSnapshot`
status to be a form of a retriable replication change error. It then hooks
`Replica.executeAdminCommandWithDescriptor` up to consult this status in its
retry loop.

This avoids spurious errors when a split gets blocked behind a lateral replica
move like we see in the following situation:
1. issue AdminSplit
2. range in joint config, first needs to leave (maybeLeaveAtomicChangeReplicas)
3. to leave, needs to transfer lease from voter_outgoing to voter_incoming
4. can’t transfer lease because doing so is deemed to be potentially unsafe

Release note: None
  • Loading branch information
nvanbenschoten committed Aug 1, 2022
1 parent b83e83b commit db2d653
Show file tree
Hide file tree
Showing 2 changed files with 6 additions and 4 deletions.
4 changes: 3 additions & 1 deletion pkg/kv/kvserver/markers.go
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,9 @@ var errMarkCanRetryReplicationChangeWithUpdatedDesc = errors.New("should retry w
// assumed to have been emitted by the KV layer during a replication change
// operation) is likely to succeed when retried with an updated descriptor.
func IsRetriableReplicationChangeError(err error) bool {
return errors.Is(err, errMarkCanRetryReplicationChangeWithUpdatedDesc) || isSnapshotError(err)
return errors.Is(err, errMarkCanRetryReplicationChangeWithUpdatedDesc) ||
IsLeaseTransferRejectedBecauseTargetMayNeedSnapshotError(err) ||
isSnapshotError(err)
}

const (
Expand Down
6 changes: 3 additions & 3 deletions pkg/kv/kvserver/replica_command.go
Original file line number Diff line number Diff line change
Expand Up @@ -559,9 +559,9 @@ func (r *Replica) executeAdminCommandWithDescriptor(
}

lastErr = updateDesc(r.Desc())
// On seeing a ConditionFailedError or an AmbiguousResultError, retry the
// command with the updated descriptor.
if !errors.HasType(lastErr, (*roachpb.ConditionFailedError)(nil)) &&
// On seeing a retryable replication change or an AmbiguousResultError,
// retry the command with the updated descriptor.
if !IsRetriableReplicationChangeError(lastErr) &&
!errors.HasType(lastErr, (*roachpb.AmbiguousResultError)(nil)) {
break
}
Expand Down

0 comments on commit db2d653

Please sign in to comment.