-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
singleflight: spruce up singleflight #91038
singleflight: spruce up singleflight #91038
Conversation
84b59f2
to
446ffb8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1, 7 of 23 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)
pkg/util/syncutil/singleflight/singleflight.go
line 87 at r2 (raw file):
} // Do executes and returns the results of the given function, making
One thing that's not clear is what will happen if ctx
is canceled while the function is running, in particular, while it's running on behalf of another caller. I know you're not changing the behavior, but, by adding a context, I think you're making it more confusing.
446ffb8
to
cf42b22
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)
pkg/util/syncutil/singleflight/singleflight.go
line 87 at r2 (raw file):
Previously, ajwerner wrote…
One thing that's not clear is what will happen if
ctx
is canceled while the function is running, in particular, while it's running on behalf of another caller. I know you're not changing the behavior, but, by adding a context, I think you're making it more confusing.
See now; I've added a note about it.
I would argue that I'm hardly making things worse - virtually every closure passed to these functions was already capturing a ctx. And even if I am making it marginally worse, I think it's worth it for the new logging.
Btw, this function has a single caller; everybody else uses DoChan
. If you want, we can delete it. We could also have it take DoOpts
, if you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1, 22 of 23 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @andreimatei)
pkg/kv/kvserver/replica_rangefeed.go
line 712 at r3 (raw file):
select { case m.RangeFeedSlowClosedTimestampNudgeSem <- struct{}{}: case <-r.store.stopper.ShouldQuiesce():
Should we use the ctx
again?
pkg/kv/kvserver/protectedts/ptcache/cache.go
line 215 at r3 (raw file):
lastReset = timeutil.Now() case <-c.stopper.ShouldQuiesce(): if doneCh != nil {
Should we defer this in a closure up above where we declare var doneCh singleflight.Future
?
pkg/util/syncutil/singleflight/singleflight.go
line 49 at r3 (raw file):
} type waiter struct {
Is this struct used? Was it replaced by futureImpl
?
pkg/util/syncutil/singleflight/singleflight.go
line 143 at r3 (raw file):
// Future represents the result of the DoChan() call. type Future interface { // ReaderClose indicates that the reader is no longer waiting on this Future.
nit: Why ReaderClose
instead of just Close
? Was Close
ambiguous from the external interface of this package? Would Abandon
be?
pkg/util/syncutil/singleflight/singleflight.go
line 196 at r3 (raw file):
f.mu.Lock() if f.mu.sp != nil { f.mu.sp.Recordf("finished waiting on singleflight %s:%s. err: %v", groupName, key, r.Err)
Span.Recordf
is different than the call to log.Eventf
above. Will that be observable? Will we include the filename and line number in this event?
pkg/util/syncutil/singleflight/singleflight.go
line 247 at r3 (raw file):
return waiter, false } waiter := newFutureImpl(nil /* sp - the leader does not keep track of the span because it doesn't need to log*/)
nit: s/log*/log */
pkg/util/syncutil/singleflight/singleflight.go
line 275 at r3 (raw file):
// Copy the log tags and the span. ctx = logtags.AddTags(context.Background(), logtags.FromContext(ctx)) ctx = tracing.ContextWithSpan(ctx, sp)
What is the effect of this? Do the trace events make it to the caller's span? What if the caller's context is canceled but the cancelation is not inherited?
pkg/util/syncutil/singleflight/singleflight.go
line 283 at r3 (raw file):
} if opts.Stop != nil {
nit: two if opts.Stop != nil
branches in a row. Collapse them?
pkg/util/syncutil/singleflight/singleflight.go
line 298 at r3 (raw file):
res := Result{c.val, c.err, c.dups > 0} for _, waiter := range c.waiters { waiter.writerClose(res, g.opName, key)
Did you consider an approach that propagated the tracing events from the leader's span to the followers' so that everyone who waited on the singleflight ended up with it in their trace? Does that run into complications with singleflight operations that started before one of its followers?
cf42b22
to
305a38c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)
pkg/kv/kvserver/replica_rangefeed.go
line 712 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we use the
ctx
again?
yes, done
pkg/kv/kvserver/protectedts/ptcache/cache.go
line 215 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we defer this in a closure up above where we declare
var doneCh singleflight.Future
?
done; please see now
pkg/util/syncutil/singleflight/singleflight.go
line 49 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is this struct used? Was it replaced by
futureImpl
?
yes, removed, thanks
pkg/util/syncutil/singleflight/singleflight.go
line 143 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: Why
ReaderClose
instead of justClose
? WasClose
ambiguous from the external interface of this package? WouldAbandon
be?
This all changed.
pkg/util/syncutil/singleflight/singleflight.go
line 196 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Span.Recordf
is different than the call tolog.Eventf
above. Will that be observable? Will we include the filename and line number in this event?
this all changed
pkg/util/syncutil/singleflight/singleflight.go
line 247 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit:
s/log*/log */
done
pkg/util/syncutil/singleflight/singleflight.go
line 275 at r3 (raw file):
Do the trace events make it to the caller's span?
The events make it to a child span opened at the beginning of this function.
What if the caller's context is canceled but the cancelation is not inherited?
That's why we're creating a child span - to allow it to outlive the caller's span. Added a note about it.
pkg/util/syncutil/singleflight/singleflight.go
line 283 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: two
if opts.Stop != nil
branches in a row. Collapse them?
done
pkg/util/syncutil/singleflight/singleflight.go
line 298 at r3 (raw file):
Did you consider...
I have. But I can consider it again.
I considered it again, and implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 3 files at r5, 27 of 27 files at r7, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @andreimatei)
pkg/util/syncutil/singleflight/singleflight.go
line 58 at r7 (raw file):
///////////////////////////////////////////////////////////////////////////// // These fields are read and written with the singleflight mutex held before // the channel is closed, and are read but not written after the chanel is
s/chanel/channel/
pkg/util/syncutil/singleflight/singleflight.go
line 208 at r7 (raw file):
// then the recording of the flight will be ingested into ctx even if this // caller was not the flight's leader. func (f Future) Result(ctx context.Context) Result {
Consider renaming to use a verb like WaitForResult
or ResolveResult
. I did not find it clear at the callers of this method that this was blocking and would return early on context cancellation.
pkg/util/syncutil/singleflight/singleflight.go
line 227 at r7 (raw file):
log.Eventf(ctx, "waiting for singleflight interrupted: %v", ctx.Err()) } return Result{
Did you consider having Result
return an error, separate from the result, in cases of context cancellation? Is it useful to callers to distinguish between cases where the signleflight call failed and cases where its own context was canceled and so it did not wait for the future to be resolved?
pkg/util/tracing/span.go
line 370 at r7 (raw file):
} var rec Trace if recType != tracingpb.RecordingOff {
Why do we need this condition here but not above in GetTraceRecording
?
pkg/util/tracing/span_inner.go
line 96 at r7 (raw file):
recType tracingpb.RecordingType, finishing bool, ) tracingpb.Recording { if s.isNoop() {
Should this call GetTraceRecording()
to avoid redundancy?
305a38c
to
c9e87d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)
pkg/util/syncutil/singleflight/singleflight.go
line 58 at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
s/chanel/channel/
done
pkg/util/syncutil/singleflight/singleflight.go
line 208 at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Consider renaming to use a verb like
WaitForResult
orResolveResult
. I did not find it clear at the callers of this method that this was blocking and would return early on context cancellation.
renamed to WaitForResult
pkg/util/syncutil/singleflight/singleflight.go
line 227 at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Did you consider having
Result
return an error, separate from the result, in cases of context cancellation? Is it useful to callers to distinguish between cases where the signleflight call failed and cases where its own context was canceled and so it did not wait for the future to be resolved?
That doesn't appear to be useful to any current callers. I can't very clearly imagine one needing to distinguish... But I can do it if you think so.
FWIW, the error message that a leader or follower gets on ctx cancelation says something like "interrupted during singleflight".
pkg/util/tracing/span.go
line 370 at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Why do we need this condition here but not above in
GetTraceRecording
?
it's not needed here either; removed
pkg/util/tracing/span_inner.go
line 96 at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should this call
GetTraceRecording()
to avoid redundancy?
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajwerner do you want to give this another pass as well?
Reviewed 2 of 3 files at r9, 28 of 28 files at r11, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @andreimatei)
pkg/sql/catalog/lease/lease.go
line 537 at r11 (raw file):
if result.Err != nil { return false, result.Err
nit: stray line
pkg/util/tracing/crdbspan.go
line 251 at r11 (raw file):
} func (t Trace) String() string {
Trace
looks huge. Are you sure we want this defined on a value receiver? We also then call Flatten
, which has a ptr receiver.
c9e87d8
to
cc35de9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 3 files at r9, 2 of 28 files at r11.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @nvanbenschoten)
cc35de9
to
5915f5b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner and @nvanbenschoten)
pkg/sql/catalog/lease/lease.go
line 537 at r11 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: stray line
removed
pkg/util/tracing/crdbspan.go
line 251 at r11 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Trace
looks huge. Are you sure we want this defined on a value receiver? We also then callFlatten
, which has a ptr receiver.
reverted
c2c8554
to
fe6dd36
Compare
fe6dd36
to
477e714
Compare
34a52d3
to
f543dfb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajwerner check out the last commit, pls
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner and @nvanbenschoten)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last commit LGTM
Clarify what part of the span.getStructuredRecording() result is shared with the span. Release note: None
AddStructuredRecord() did not have a good reason to take a pointer (the event struct is small enough). Taking a ptr was confusing - it suggested sharing where the function was actually making a copy internally. Release note: None
The nudger tasks were looking at context cancelation, but their context is never canceled. This patch makes them look at stopper quiescence, which is probaly what was always intended. Release note: None
This patch improves singleflight with a couple of features that were done manually by most callers. The singleflight now optionally makes a flight's context not respond to the cancelation of the caller's ctx, the flights now run in stopper tasks, and stopper quiescence cancels the flights' contexts. The callers were doing some of these things, but inconsistently. Also, the flights now get a tracing span and callers that join an existing span get a copy of the recording of the flight leader. We've wanted this multiple times when debugging, for example in the context of table descriptor lease acquisitions and range descriptor resolving. The interface for getting the results out of DoChan() had to change a bit as a result. Release note: None
Since the previous commit, descriptor leasing is correctly included in statement traces. As such, a bunch of rtt tests changed their counters. For these tests, leasing is not actually interesting; it's more representative to benchmark these tests against nodes that already have the respective leases. This patch does that by ensuring that various objects are leased before the test. For the has_<foo>_privilege() tests, the rtt number dropped to 0. These tests had flavors that were working across multiple objects. Since we're now asserting 0 requests for one object, testing across multiple objects is no longer interesting. I've deleted these tests. Release note: None
f543dfb
to
68cd03a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner and @nvanbenschoten)
Build succeeded: |
This patch improves singleflight with a couple of features that were
done manually by most callers. The singleflight now optionally makes a flight's
context not respond to the cancelation of the caller's ctx, the flights
now run in stopper tasks, and stopper quiescence cancels the flights'
contexts. The callers were doing some of these things, but
inconsistently.
Also, the flights now get a tracing span and callers that join an
existing span get log messages demarcating the wait time. We've wanted
this multiple times when debugging, for example in the context of table
descriptor lease acquisitions and range descriptor resolving. To
accomodate for this, the contract of singleflight.DoChan() gets a bit
more complicated: the singleflight captures the tracing spans of all
non-leaders, and these callers have to call a new ReaderClosed() method
on the DoChan() result if they're not going to wait for the flight
result.
Fixes #90854
Release note: None
Epic: None