-
Notifications
You must be signed in to change notification settings - Fork 3.8k
/
split_stats_helper.go
220 lines (207 loc) · 10.1 KB
/
split_stats_helper.go
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
// Copyright 2019 The Cockroach Authors.
//
// Use of this software is governed by the Business Source License
// included in the file licenses/BSL.txt.
//
// As of the Change Date specified in that file, in accordance with
// the Business Source License, use of this software will be governed
// by the Apache License, Version 2.0, included in the file
// licenses/APL.txt.
package batcheval
import "github.com/cockroachdb/cockroach/pkg/storage/enginepb"
// splitStatsHelper codifies and explains the stats computations related to a
// split. The quantities known during a split (i.e. while the split trigger
// is evaluating) are
//
// - AbsPreSplitBothEstimated: the stats of the range before the split trigger,
// i.e. without accounting for any writes in the batch. This can have
// ContainsEstimates set.
// - DeltaBatchEstimated: the writes in the batch, i.e. the stats delta accrued
// from the evaluation of the EndTxn so far (this is mostly the write to the
// transaction record, as well as resolving the intent on the range descriptor,
// but nothing in this code relies on that). Since we have no reason to
// introduce ContainsEstimates in a split trigger, this typically has
// ContainsEstimates unset, but the results will be estimate free either way.
// - AbsPostSplit{Left,Right}: the stats of either the left or right hand side
// range after applying the split, i.e. accounting both for the shrinking as
// well as for the writes in DeltaBatch related to the shrunk keyrange. In
// practice, we obtain this by recomputing the stats using the corresponding
// AbsPostSplit{Left,Right}Fn, and so we don't expect ContainsEstimates to be
// set in them. The choice of which side to scan is controlled by ScanRightFirst.
// - DeltaRangeKey: the stats delta that must be added to the non-computed
// half's stats to account for the splitting of range keys straddling the split
// point. See computeSplitRangeKeyStatsDelta() for details.
//
// We are interested in computing from this the quantities
//
// - AbsPostSplitRight(): the stats of the right hand side created by the split,
// i.e. the data taken over from the left hand side plus whatever was written to
// the right hand side in the process (metadata etc). We can recompute this, but
// try to avoid it unless necessary (when CombinedErrorDelta below is nonzero).
// - DeltaPostSplitLeft(): the stats delta that should be emitted by the split
// trigger itself, i.e. the data which the left hand side (initially comprising
// both halves) loses by moving data into the right hand side (including whatever
// DeltaBatch contained in contributions attributable to the keyspace on the
// left).
// - CombinedErrorDelta: the difference between (AbsPreSplitBoth+DeltaBatch) and
// the recomputation of the pre-split range including the batch. This is zero if
// neither of the inputs contains estimates. If it's not zero, we need to
// recompute from scratch to obtain AbsPostSplitRight. What's interesting about
// this quantity is that we never care what exactly it is, but we do care
// whether it's zero or not because if it's zero we get to do less work.
//
// Moreover, we want both neither of AbsPostSplit{Right,Left} to end up with
// estimates. The way splits are set up right now, we sort of get this "for
// free" for the left hand side (since we recompute that unconditionally; there
// is a guarantee that the left hand side is never too large). We also don't
// want to create new ranges that start out with estimates (just to prevent the
// unbounded proliferation of estimates).
//
// The two unknown quantities can be expressed in terms of the known quantities
// because
//
// (1) AbsPreSplitBoth + DeltaBatch + DeltaRangeKeyRight
// - CombinedErrorDelta = AbsPostSplitLeft + AbsPostSplitRight
//
// In words, this corresponds to "all bytes are accounted for": from the initial
// stats that we have (accounting for the fact that AbsPreSplitBoth+DeltaBatch
// may contain estimates), everything we add/remove during the split ends up
// tracked either on the left and on the right, and nothing is created out of
// thin air.
//
// (2) AbsPreSplitBoth + DeltaPostSplitLeft() = AbsPostSplitLeft
//
// This expresses the fact that is always true whenever a command applies on a
// range without introducing an estimate: the stats before the command plus the
// delta emitted by the command equal the stats after the command. In this case,
// the stats before the command are that of the range before the split (remember
// that the split shrinks the range towards the start key, i.e. the left hand
// side is the same range as the pre-split one).
//
// These two equations are easily solved for the unknowns. First, we can express
// DeltaPostSplitLeft() in known quantities via (2) as
//
// DeltaPostSplitLeft() = AbsPostSplitLeft - AbsPreSplitBothEstimated.
//
// Note that if we start out with estimates, DeltaPostSplitLeft() will wipe out
// those estimates when added to the absolute stats.
//
// For AbsPostSplitRight(), there are two cases. First, due to the identity
//
// CombinedErrorDelta = AbsPreSplitBothEstimated + DeltaBatchEstimated
// -(AbsPostSplitLeft + AbsPostSplitRight)
// + DeltaRangeKeyRight.
//
// and the fact that the second and third lines contain no estimates, we know
// that CombinedErrorDelta is zero if the first line contains no estimates.
// Using this, we can rearrange as
//
// AbsPostSplitRight() = AbsPreSplitBoth + DeltaBatch - AbsPostSplitLeft
// + DeltaRangeKeyRight.
//
// where all quantities on the right are known. If CombinedErrorDelta is
// nonzero, we effectively have one more unknown in our linear system and we
// need to recompute AbsPostSplitRight from scratch. (As fallout, we can in
// principle compute CombinedError, but we don't care).
type splitStatsHelper struct {
in splitStatsHelperInput
absPostSplitLeft *enginepb.MVCCStats
absPostSplitRight *enginepb.MVCCStats
}
// splitStatsScanFn scans a post-split keyspace to compute its stats. The
// computed stats should not contain estimates.
type splitStatsScanFn func() (enginepb.MVCCStats, error)
// splitStatsHelperInput is passed to makeSplitStatsHelper.
type splitStatsHelperInput struct {
AbsPreSplitBothEstimated enginepb.MVCCStats
DeltaBatchEstimated enginepb.MVCCStats
DeltaRangeKey enginepb.MVCCStats
// AbsPostSplitLeftFn returns the stats for the left hand side of the
// split.
AbsPostSplitLeftFn splitStatsScanFn
// AbsPostSplitRightFn returns the stats for the right hand side of the
// split.
AbsPostSplitRightFn splitStatsScanFn
// ScanRightFirst controls whether the left hand side or the right hand
// side of the split is scanned first. In cases where neither of the
// input stats contain estimates, this is the only side that needs to
// be scanned.
ScanRightFirst bool
}
// makeSplitStatsHelper initializes a splitStatsHelper. The values in the input
// are assumed to not change outside of the helper and must no longer be used.
// The provided AbsPostSplitLeftFn and AbsPostSplitRightFn recompute the left
// and right hand sides of the split after accounting for the split trigger
// batch. Each are only invoked at most once, and only when necessary.
func makeSplitStatsHelper(input splitStatsHelperInput) (splitStatsHelper, error) {
h := splitStatsHelper{
in: input,
}
// Scan to compute the stats for the first side.
var absPostSplitFirst enginepb.MVCCStats
var err error
if h.in.ScanRightFirst {
absPostSplitFirst, err = input.AbsPostSplitRightFn()
h.absPostSplitRight = &absPostSplitFirst
} else {
absPostSplitFirst, err = input.AbsPostSplitLeftFn()
h.absPostSplitLeft = &absPostSplitFirst
}
if err != nil {
return splitStatsHelper{}, err
}
if h.in.AbsPreSplitBothEstimated.ContainsEstimates == 0 &&
h.in.DeltaBatchEstimated.ContainsEstimates == 0 {
// We have CombinedErrorDelta zero, so use arithmetic to compute the
// stats for the second side.
ms := h.in.AbsPreSplitBothEstimated
ms.Subtract(absPostSplitFirst)
ms.Add(h.in.DeltaBatchEstimated)
ms.Add(h.in.DeltaRangeKey)
if h.in.ScanRightFirst {
h.absPostSplitLeft = &ms
} else {
h.absPostSplitRight = &ms
}
return h, nil
}
// Estimates are contained in the input, so ask the oracle to scan to compute
// the stats for the second side. We only scan the second side when either of
// the input stats above (AbsPreSplitBothEstimated or DeltaBatchEstimated)
// contains estimates, so that we can guarantee that the post-splits stats
// don't.
var absPostSplitSecond enginepb.MVCCStats
if h.in.ScanRightFirst {
absPostSplitSecond, err = input.AbsPostSplitLeftFn()
h.absPostSplitLeft = &absPostSplitSecond
} else {
absPostSplitSecond, err = input.AbsPostSplitRightFn()
h.absPostSplitRight = &absPostSplitSecond
}
if err != nil {
return splitStatsHelper{}, err
}
return h, nil
}
// AbsPostSplitRight returns the stats of the right hand side created by the
// split. The result is returned as a pointer because the caller can freely
// modify it, assuming they're adding only stats corresponding to mutations that
// they know only affect the right hand side. (If estimates are introduced in
// the process, the right hand side will start out with estimates). Implicitly
// this changes the DeltaBatchEstimated supplied to makeSplitStatsHelper, but
// the contract assumes that that value will no longer be used.
func (h splitStatsHelper) AbsPostSplitRight() *enginepb.MVCCStats {
return h.absPostSplitRight
}
// DeltaPostSplitLeft return the stats delta to be emitted on the left hand side
// as the result of the split. It accounts for the data moved to the right hand
// side as well as any mutations to the left hand side carried out during the
// split, and additionally removes any estimates present in the pre-split stats.
func (h splitStatsHelper) DeltaPostSplitLeft() enginepb.MVCCStats {
// NB: if we ever wanted to also write to the left hand side after init'ing
// the helper, we can make that work, too.
// NB: note how none of this depends on mutations to absPostSplitRight.
ms := *h.absPostSplitLeft
ms.Subtract(h.in.AbsPreSplitBothEstimated)
return ms
}