Merge #74073
74073: kv: add to replicaGCQueue in replicaMsgAppDropper, not gcQueue r=tbg a=nvanbenschoten

Fixes #73838.

This commit is the first of the three "next steps" identified in #73838. It fixes a case where we were accidentally adding a replica to the wrong queue. When dropping a MsgApp in `maybeDropMsgApp`, we want to GC the replica on the LHS of the split if it has been removed from its range. However, we were instead passing it to the MVCC GC queue, which was both irrelevant and a no-op because the LHS was not the leaseholder.
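To make the mix-up concrete, here is a minimal sketch of the corrected enqueue path. It is not the kvserver code: `asyncQueue`, `store`, `newStore`, and `shouldDrop` are hypothetical stand-ins, replicas are reduced to strings, and the real `AddAsync` also takes a context and a priority. Only the field names `gcQueue` and `replicaGCQueue` come from the diff.

```go
package main

import "fmt"

// asyncQueue is a hypothetical stand-in for a store queue. It only records
// which replicas were handed to it.
type asyncQueue struct {
	added []string
}

// AddAsync records the replica; the real method also takes a context and a priority.
func (q *asyncQueue) AddAsync(replica string) {
	q.added = append(q.added, replica)
}

// store mirrors only the two queue fields named in the diff.
type store struct {
	gcQueue        *asyncQueue // MVCC GC queue: a no-op when the replica is not the leaseholder
	replicaGCQueue *asyncQueue // replica GC queue: removes replicas that left their range
}

func newStore() *store {
	return &store{gcQueue: &asyncQueue{}, replicaGCQueue: &asyncQueue{}}
}

// shouldDrop follows the shape of ShouldDrop after the fix: when an LHS
// replica exists, queue it for replica GC and report that the MsgApp
// should be dropped. An empty string stands in for a nil replica.
func shouldDrop(s *store, lhsRepl string) bool {
	if lhsRepl == "" {
		return false
	}
	s.replicaGCQueue.AddAsync(lhsRepl) // before the fix this went to s.gcQueue
	return true
}

func main() {
	s := newStore()
	dropped := shouldDrop(s, "lhs")
	fmt.Println(dropped, len(s.replicaGCQueue.added), len(s.gcQueue.added))
}
```

In this sketch the MVCC GC queue stays empty, which is the behavior the fix is after: the removed LHS replica becomes a candidate for replica GC instead of being handed to a queue that cannot act on it.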

It's possible that we have seen the effects of this in roachtests like `splits/largerange`. This bug could have delayed a snapshot to the RHS of a split for up to `maxDelaySplitTriggerTicks * 200ms = 20s` in some rare cases. We've seen the logs corresponding to this issue in a few tests over the past year: https://github.com/cockroachdb/cockroach/issues?q=is%3Aissue+%22would+have+dropped+incoming+MsgApp+to+wait+for+split+trigger%22+is%3Aclosed.

Co-authored-by: Nathan VanBenschoten <[email protected]>
craig[bot] and nvanbenschoten committed Dec 21, 2021
2 parents d6dc05a + f11f912 commit ea6bfb2
Showing 2 changed files with 4 additions and 4 deletions.
4 changes: 2 additions & 2 deletions pkg/kv/kvserver/client_merge_test.go
@@ -2756,7 +2756,7 @@ func TestStoreRangeMergeSlowUnabandonedFollower_WithSplit(t *testing.T) {
t.Fatal(pErr)
}

- // Now split the newly merged range splits back out at exactly the same key.
+ // Now split the newly merged range back out at exactly the same key.
// When the replica GC queue looks in meta2 it will find the new RHS range, of
// which store2 is a member. Note that store2 does not yet have an initialized
// replica for this range, since it would intersect with the old RHS replica.
@@ -2769,7 +2769,7 @@ func TestStoreRangeMergeSlowUnabandonedFollower_WithSplit(t *testing.T) {
tc.RemoveVotersOrFatal(t, lhsDesc.StartKey.AsRawKey(), tc.Target(2))

// Transfer the lease on the new RHS to store2 and wait for it to apply. This
- // will force its replica to of the new RHS to become up to date, which
+ // will force its replica of the new RHS to become up to date, which
// indirectly tests that the replica GC queue cleans up both the LHS replica
// and the old RHS replica.
tc.TransferRangeLeaseOrFatal(t, *newRHSDesc, tc.Target(2))
4 changes: 2 additions & 2 deletions pkg/kv/kvserver/split_trigger_helper.go
@@ -37,7 +37,7 @@ func (rd *replicaMsgAppDropper) ShouldDrop(startKey roachpb.RKey) (fmt.Stringer,
if lhsRepl == nil {
return nil, false
}
- lhsRepl.store.gcQueue.AddAsync(context.Background(), lhsRepl, replicaGCPriorityDefault)
+ lhsRepl.store.replicaGCQueue.AddAsync(context.Background(), lhsRepl, replicaGCPriorityDefault)
return lhsRepl, true
}

@@ -48,7 +48,7 @@ type msgAppDropper interface {

// maybeDropMsgApp returns true if the incoming Raft message should be dropped.
// It does so if the recipient replica is uninitialized (i.e. has no state) and
- // is waiting for a split trigger to apply,in which case delivering the message
+ // is waiting for a split trigger to apply,in which case delivering the message
// in this situation would result in an unnecessary Raft snapshot: the MsgApp
// would be rejected and the rejection would prompt the leader to send a
// snapshot, while the split trigger would likely populate the replica "for