-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: add RevertSpans and RevertSpansFanout #97845
Conversation
6af38f9
to
40db612
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 6 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @msbutler, @rhu713, @rytaft, and @stevendanna)
pkg/sql/revert.go
line 49 at r1 (raw file):
var maxRevertSpanNumWorkers = settings.RegisterIntSetting( settings.TenantWritable, "max_revert_span_parallelism",
nit: we usually have several parts in a cluster setting name, delimited by periods, with the first part being about the layer, so perhaps name it something like sql.revert.max_span_parallelism
.
pkg/sql/revert.go
line 51 at r1 (raw file):
"max_revert_span_parallelism", "the maximum number of workers used to issue RevertRange request", 8)
Do we want to add NonNegativeInt
or PositiveInt
validation?
pkg/sql/revert.go
line 140 at r1 (raw file):
workerPartitions = spanPartitions } else { workerPartitions = make([]SpanPartition, maxWorkerCount)
If we allow the setting to be set to 0, then we need to handle zero case differently.
pkg/sql/revert.go
line 150 at r1 (raw file):
progressWorker := func(ctx context.Context) error { for sp := range completedSpanCh { if onCompletedCallback != nil {
nit: perhaps take the if
out of the loop and perhaps just short-circuit if the calback is nil.
40db612
to
ff6c3a7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @msbutler, @rhu713, and @yuzefovich)
pkg/sql/revert.go
line 51 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Do we want to add
NonNegativeInt
orPositiveInt
validation?
Thanks! PositiveInt should do.
pkg/sql/revert.go
line 140 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
If we allow the setting to be set to 0, then we need to handle zero case differently.
Added the PositiveInt validation so the zero case should be OK now.
pkg/sql/revert.go
line 150 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: perhaps take the
if
out of the loop and perhaps just short-circuit if the calback is nil.
While addressing this comment, I noticed that I wasn't really handling the error return from this callback correctly. So, perhaps I've gone a bit overboard, but I've done a few things:
- short-circuit the 1 worker case to avoid all planning and just call RevertSpans
- short-circuit the case of a nil onCompletedCallback to avoid the extra layer of goroutines altogether
- Added an errCh that workers get the last onCompletedCallback error back from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is awesome! eager to see benchmarks!
// the cutover process. | ||
type cutoverProgressTracker struct { | ||
minProgressUpdateInterval time.Duration | ||
progMetric *metric.Gauge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rename this rangesToDoMetric
or something a bit more descriptive? It seems hardcoded to update the number of remaining ranges.
span = resp.ResumeSpan | ||
|
||
if resp.ResumeSpan != nil { | ||
if !resp.ResumeSpan.Valid() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: perhaps we add some logging around resume key spans and the number of attempts to revert the original span?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a VEventf above that can be added on each revert range call. Note that a resume span is absolutely expected.
settings.TenantWritable, | ||
"sql.revert.max_span_parallelism", | ||
"the maximum number of workers used to issue RevertRange request", | ||
8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this scale by the number of nodes in the cluster?
ff6c3a7
to
fbc3106
Compare
fbc3106
to
2ade877
Compare
Under the race detector, kvclient uses a transport that disallows mutations of of a request send to the transport. This change ensures we create a new request for each resume span rather than re-using the old request. Epic: none Release note: None
This PR adds RevertSpans and RevertSpansFanout. RevertSpans de-duplicates some code between import rollback and streaming cutover. Along the way, it fixes a small bug that existed in RevertTables: Only a single RevertRangeRequest is permitted in a batch since RevertRangeRequest has the isAlone flag set. As a result, a future caller of RevertTables would have encountered a fatal error from KV. RevertSpansFanout uses DistSQL's PartitionSpans to manually fan out multiple RevertRange requests. Since all users of revert range request currently set a limit on the number of keys touched, dist sender doesn't fanout such request. Release note: None
Release note: None
Epic: none Release note: None
2ade877
to
5adc4f2
Compare
@@ -7568,12 +7568,6 @@ expires until the statement bundle is collected`, | |||
filter = kvpb.MVCCFilter_All | |||
} | |||
|
|||
req := &kvpb.ExportRequest{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this commit merged elsewhere, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh did it? Let me go check, perhaps I hadn't git-pulled before working on this.
bors r=msbutler |
Build failed: |
bors retry |
Build failed (retrying...): |
Build failed (retrying...): |
Build succeeded: |
This PR adds RevertSpans and RevertSpansFanout.
RevertSpans de-duplicates some code between import rollback and streaming cutover.
Along the way, it fixes a small bug that existed in RevertTables: Only a single RevertRangeRequest is permitted in a batch since RevertRangeRequest has the isAlone flag set. As a result, a future caller of RevertTables would have encountered a fatal error from KV.
RevertSpansFanout uses DistSQL's PartitionSpans to manually fan out multiple RevertRange requests. Since all users of revert range request currently set a limit on the number of keys touched, dist sender doesn't fanout such request.
Epic: none
Release note: None