-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvflowcontrol: annotate/fix perf regressions #109833
kvflowcontrol: annotate/fix perf regressions #109833
Conversation
fe6c9b3
to
a59d5ed
Compare
- Replace the flow controller level mutex-backed kvflowcontrol.Stream => token bucket map with sync.Map. On kv0/enc=false/nodes=3/cpu=96 accessing this map contributed to a high amount of mutex contention. We observe that this bucket is effectively read-only - entries for keys are written once (on creation) and read frequently after. We don't currently GC these buckets, but even if we did, the same access pattern would hold. We'll note that using a sync.Map is slightly more expensive CPU-wise. - Replace various map accesses with individual variables. We were needly using maps to access one of two variables, keyed by work class, for example when maintaining metrics per work class, or tracking token adjustments. The map accesses appeared prominently in CPU profiles and was unnecessary overhead. - Avoid using log.ExpensiveLogEnabled in hot code paths; it shows up in CPU profiles. - Slightly reduce the surface area of kvflowhandle.Handle.mu when returning flow tokens. - We also annotate various other points in the code where peep-hole optimizations exist, as surfaced by kv0/enc=false/nodes=3/cpu=96. Part of cockroachdb#104154. Release note: None
a59d5ed
to
e571ffe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained
I'll bors but if there are any follow ups/need to revert, I'll do it then. bors r+ |
Build succeeded: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 9 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/kv/kvserver/kvflowcontrol/kvflowcontroller/kvflowcontroller.go
line 72 at r1 (raw file):
// or when completely inactive (no tokens deducted/returned over 30+ // minutes), clear these out. buckets sync.Map // kvflowcontrol.Stream => *bucket
could use a code comment that says writes require the mutex.
pkg/kv/kvserver/kvflowcontrol/kvflowcontroller/kvflowcontroller.go
line 113 at r1 (raw file):
// NB: We're holding the controller mutex here, which guards against // new buckets being added, synchronization we don't get out of // sync.Map.Range() directly.
is this the only reason we need updates to the map to hold the mutex?
pkg/kv/kvserver/kvflowcontrol/kvflowcontroller/kvflowcontroller.go
line 376 at r1 (raw file):
func (b *bucket) tokensLocked(wc admissionpb.WorkClass) kvflowcontrol.Tokens { if wc == regular { return b.mu.tokensPerWorkClass.regular
this would be simpler if tokensPerWorkClass
was an array [NumWorkClasses]kvflowcontrol.Tokens
.
pkg/kv/kvserver/kvflowcontrol/kvflowcontroller/kvflowcontroller.go
line 401 at r1 (raw file):
defer b.mu.Unlock() // TODO(irfansharif,aaditya): On kv0/enc=false/nodes=3/cpu=96 this mutex is // responsible for ~1.8% of the mutex contention. Maybe address it as part
We could use some more context before deciding whether to try to optimize this.
For example https://cockroachlabs.slack.com/archives/C01SRKWGHG8/p1667521022234139?thread_ts=1667515074.752629&cid=C01SRKWGHG8
pkg/kv/kvserver/kvflowcontrol/kvflowcontroller/kvflowcontroller_metrics.go
line 133 at r1 (raw file):
type metrics struct { ElasticFlowTokensDeducted *metric.Counter
why this change of not using an array?
Is array indexing really a performance hit?
Part of #104154.
Release note: None