Commit

kvserver: recompute stats after mvcc gc
Touched cockroachdb#82920

There is at least one known issue in MVCC stats calculation, and
there may be more. Bad stats can cause the MVCC GC queue to spin on
the affected ranges. To prevent this, the queue should recompute the
stats when it detects that they are wrong. The easiest way to detect
this is to check whether the GC score still wants to queue the range
right after GC has finished; if it does, something is likely wrong
with the stats.

Release note: None
lunevalex committed Jun 28, 2022
1 parent 7366ed4 commit 0aca337
Showing 1 changed file with 23 additions and 3 deletions.
26 changes: 23 additions & 3 deletions pkg/kv/kvserver/mvcc_gc_queue.go
@@ -13,6 +13,7 @@ package kvserver
 import (
 	"context"
 	"fmt"
+	"github.com/cockroachdb/cockroach/pkg/kv"
 	"math"
 	"math/rand"
 	"sync/atomic"
@@ -611,10 +612,29 @@ func (mgcq *mvccGCQueue) process(
 		return false, err
 	}
 
-	log.Eventf(ctx, "MVCC stats after GC: %+v", repl.GetMVCCStats())
-	log.Eventf(ctx, "GC score after GC: %s", makeMVCCGCQueueScore(
-		ctx, repl, repl.store.Clock().Now(), lastGC, conf.TTL(), canAdvanceGCThreshold))
+	scoreAfter := makeMVCCGCQueueScore(
+		ctx, repl, repl.store.Clock().Now(), lastGC, conf.TTL(), canAdvanceGCThreshold)
+	log.VEventf(ctx, 2, "MVCC stats after GC: %+v", repl.GetMVCCStats())
+	log.VEventf(ctx, 2, "GC score after GC: %s", scoreAfter)
 	updateStoreMetricsWithGCInfo(mgcq.store.metrics, info)
+	// If the score after running through the queue indicates that this
+	// replica should be re-queued for GC, it most likely means that there
+	// is something wrong with the stats. One such known issue is
+	// https://github.com/cockroachdb/cockroach/issues/82920. To fix this we
+	// recompute the stats. It is an expensive operation, but it is better to
+	// recompute them than to spin the GC queue.
+	if scoreAfter.ShouldQueue {
+		log.Infof(ctx, "triggering stats re-computation")
+		req := roachpb.RecomputeStatsRequest{
+			RequestHeader: roachpb.RequestHeader{Key: desc.StartKey.AsRawKey()},
+		}
+		var b kv.Batch
+		b.AddRawRequest(&req)
+		err := repl.store.db.Run(ctx, &b)
+		if err != nil {
+			log.Errorf(ctx, "Failed to recompute stats with error=%s", err)
+		}
+	}
 	return true, nil
 }
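
The control flow this change adds can be sketched in isolation. The following is a minimal, self-contained illustration and not the kvserver implementation: `rangeStats`, `shouldQueue`, `runGC`, `recomputeStats`, and `processRange` are hypothetical names invented for this sketch, and the "score" is reduced to a single boolean check. It assumes that stats drift shows up as reported garbage that GC cannot actually remove.

```go
package main

import "fmt"

// rangeStats is a hypothetical stand-in for the MVCC stats of one range.
type rangeStats struct {
	reportedGarbage int64 // what the (possibly wrong) stats claim is collectable
	actualGarbage   int64 // what is really there to collect
}

// shouldQueue is a toy score: queue the range if the stats report any garbage.
func shouldQueue(s rangeStats) bool { return s.reportedGarbage > 0 }

// runGC removes the real garbage and subtracts the same amount from the stats.
func runGC(s *rangeStats) {
	s.reportedGarbage -= s.actualGarbage
	s.actualGarbage = 0
}

// recomputeStats rebuilds the reported figure from the actual data.
func recomputeStats(s *rangeStats) { s.reportedGarbage = s.actualGarbage }

// processRange runs GC once. If the score still asks for GC afterwards, the
// stats are assumed to be wrong and are recomputed so the queue does not spin.
// It reports whether a recomputation was triggered.
func processRange(s *rangeStats) bool {
	runGC(s)
	if shouldQueue(*s) {
		recomputeStats(s)
		return true
	}
	return false
}

func main() {
	healthy := rangeStats{reportedGarbage: 100, actualGarbage: 100}
	drifted := rangeStats{reportedGarbage: 150, actualGarbage: 100}

	fmt.Println(processRange(&healthy), shouldQueue(healthy)) // false false
	fmt.Println(processRange(&drifted), shouldQueue(drifted)) // true false
}
```

In this toy model the drifted range would be re-queued forever without the recomputation step, because GC can never bring its reported garbage down to zero.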
