Skip to content

Commit

Permalink
kvserver: assert that local descriptor "always" contains the replica
Browse files Browse the repository at this point in the history
This is an invariant that was established in [cockroachdb#40892]. We now check
it more aggressively. There is an awkward exception thanks to
uninitialized replicas.

[cockroachdb#40892]: cockroachdb#40892

Release note: None

fixupassert
  • Loading branch information
tbg committed Oct 29, 2021
1 parent 959a9b9 commit e17fc3d
Show file tree
Hide file tree
Showing 2 changed files with 31 additions and 7 deletions.
31 changes: 31 additions & 0 deletions pkg/kv/kvserver/replica.go
Original file line number Diff line number Diff line change
Expand Up @@ -420,6 +420,9 @@ type Replica struct {
// The replica's Raft group "node".
internalRaftGroup *raft.RawNode
// The ID of the replica within the Raft group. This value may never be 0.
// It will not change over the lifetime of this replica. If addressed under
// a newer replicaID, the replica immediately replicaGCs itself to make
// way for the newer incarnation.
replicaID roachpb.ReplicaID
// The minimum allowed ID for this replica. Initialized from
// RangeTombstone.NextReplicaID.
Expand Down Expand Up @@ -1217,6 +1220,34 @@ func (r *Replica) assertStateRaftMuLockedReplicaMuRLocked(
log.Fatalf(ctx, "denormalized start key %s diverged from %s", r.startKey, r.mu.state.Desc.StartKey)
}
}
// A replica is always contained in its descriptor. This is an invariant. When
// the replica applies a ChangeReplicasTrigger that removes it, it will
// eagerly replicaGC itself. Similarly, snapshots that don't contain the
// recipient are refused. In fact, a stronger invariant holds - replicas
// will never change replicaID in-place. When a replica receives a raft
// message addressing it through a higher replicaID, the replica is
// immediately garbage collected as well.
//
// Unfortunately, the invariant does not hold when the descriptor is
// uninitialized, as we are hitting this code during instantiation phase of
// replicas where they can briefly be in an inconsistent state. These calls
// generally go through tryGetOrCreateReplica and first create a replica from
// an uninitialized descriptor that they then populate if on-disk state is
// present. This is all complex and we would be better off if we made sure
// that a Replica is always initialized (i.e. replace uninitialized replicas
// with a different type, similar to ReplicaPlaceholder).
//
// See:
// https://github.com/cockroachdb/cockroach/pull/40892
if r.mu.state.Desc.IsInitialized() {
replDesc, ok := r.mu.state.Desc.GetReplicaDescriptor(r.store.StoreID())
if !ok {
log.Fatalf(ctx, "%+v does not contain local store s%d", r.mu.state.Desc, r.store.StoreID())
}
if replDesc.ReplicaID != r.mu.replicaID {
log.Fatalf(ctx, "replica's replicaID %d diverges from descriptor %+v", r.mu.replicaID, r.mu.state.Desc)
}
}
}

// TODO(nvanbenschoten): move the following 5 methods to replica_send.go.
Expand Down
7 changes: 0 additions & 7 deletions pkg/kv/kvserver/replica_application_state_machine.go
Original file line number Diff line number Diff line change
Expand Up @@ -560,13 +560,6 @@ func (b *replicaAppBatch) stageWriteBatch(ctx context.Context, cmd *replicatedCm
func changeRemovesStore(
desc *roachpb.RangeDescriptor, change *kvserverpb.ChangeReplicas, storeID roachpb.StoreID,
) (removesStore bool) {
_, existsInDesc := desc.GetReplicaDescriptor(storeID)
// NB: if we're catching up from a preemptive snapshot then we won't
// exist in the current descriptor and we can't be removed.
if !existsInDesc {
return false
}

// NB: We don't use change.Removed() because it will include replicas being
// transitioned to VOTER_OUTGOING.

Expand Down

0 comments on commit e17fc3d

Please sign in to comment.