32549: storage: log when rebalance-after-split causes snapshot r=petermattis a=tbg

This will help track down (by means of log inspection) how big a deal
this is.

Release note: None

Co-authored-by: Tobias Schottdorf <[email protected]>
craig[bot] and tbg committed Nov 22, 2018 · commit 9363293 (2 parents: 1e95377 + d0a5ab2)
1 changed file: pkg/storage/replica_gc_queue.go (41 additions, 0 deletions)
@@ -226,6 +226,47 @@ func (rgcq *replicaGCQueue) process(
 	} else if desc.RangeID == replyDesc.RangeID {
 		// We are no longer a member of this range, but the range still exists.
 		// Clean up our local data.
+
+		repl.mu.RLock()
+		replicaID := repl.mu.replicaID
+		ticks := repl.mu.ticks
+		repl.mu.RUnlock()
+
+		if replicaID == 0 {
+			// This is a preemptive replica. GC'ing a preemptive replica is a
+			// good idea if and only if the up-replication that it was a part of
+			// did *NOT* commit. If it *did* commit and we're removing the
+			// preemptive snapshot, the newly added follower will first need a
+			// Raft snapshot to catch up, and that snapshot will be delayed by
+			// #31875.
+			// Log if the replica hasn't been around for very long.
+			//
+			// TODO(tschottdorf): avoid these, ideally without a time-based mechanism.
+			// The replica carrying out the replication change could keep the
+			// snapshot alive until it has either committed or aborted the txn.
+			// Or we try to use Raft learners for this purpose.
+			if ticks < 10 {
+				log.Infof(ctx, "removing young preemptive snapshot (%d ticks)", ticks)
+			}
+		} else if replyDesc.EndKey.Less(desc.EndKey) {
+			// The meta records indicate that the range has split but that this
+			// replica hasn't processed the split trigger yet. By removing this
+			// replica, we're also wiping out the data of what would become the
+			// right hand side of the split (which may or may not still have a
+			// replica on this store), and will need a Raft snapshot. Even worse,
+			// the mechanism introduced in #31875 will artificially delay this
+			// snapshot by seconds, during which time the RHS may see more splits
+			// and incur more snapshots.
+			//
+			// TODO(tschottdorf): we can look up the range descriptor for the
+			// RHS of the split (by querying with replyDesc.EndKey) and fetch
+			// the local replica (which will be uninitialized, i.e. we have to
+			// look it up by RangeID) to disable the mechanism in #31875 for it.
+			// We should be able to use prefetching unconditionally to have this
+			// desc ready whenever we need it.
+			log.Infof(ctx, "removing replica with pending split; will incur Raft snapshot for right hand side")
+		}
+
 		rgcq.metrics.RemoveReplicaCount.Inc(1)
 		log.VEventf(ctx, 1, "destroying local data")
 		if err := repl.store.RemoveReplica(ctx, repl, replyDesc.NextReplicaID, RemoveOptions{
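The commit message says the new messages exist to support log inspection. As a minimal sketch (not part of this commit, and assuming a hypothetical local log path, which varies by deployment), a small Go program could tally how often each of the two events fires on a node:

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	// Hypothetical log path; adjust to wherever the node writes its logs.
	f, err := os.Open("cockroach-data/logs/cockroach.log")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	var preemptive, pendingSplit int
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		line := sc.Text()
		// Match the two messages added by this commit. The first is
		// formatted with a tick count, so match on its prefix only.
		if strings.Contains(line, "removing young preemptive snapshot") {
			preemptive++
		}
		if strings.Contains(line, "removing replica with pending split") {
			pendingSplit++
		}
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("young preemptive snapshots GC'ed: %d\n", preemptive)
	fmt.Printf("pending-split removals (Raft snapshot expected for RHS): %d\n", pendingSplit)
}

Counting both messages over time gives a rough measure of how often replica GC races ahead of an up-replication or an unprocessed split trigger, which is the "how big a deal" question the commit message sets out to answer.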
