replica_rac2: optimize logTracker.admitted #132140
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am Blathers, a bot for CockroachDB. My owner is dev-inf.
@sumeerbhola Tangentially to #132137, in case more optimizations are needed.
This specific change doesn't seem relevant based on looking at profiles. I will look into more allocation reduction, and reduce mutex acquisitions and measure again.
Reviewable status: complete! 0 of 0 LGTMs obtained
It still separates the concerns a bit; should we merge? The slight micro-cost reduction is an optional bonus here.
Reviewed 1 of 2 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @pav-kv)
pkg/kv/kvserver/kvflowcontrol/replica_rac2/log_tracker.go
line 42 at r1 (raw file):
```go
// admitted returns the current admitted vector, and resets the dirty bit.
func (l *logTracker) admitted() rac2.AdmittedVector {
```
nit: prefer `admittedAndResetDirty` or something akin.
pkg/kv/kvserver/kvflowcontrol/replica_rac2/log_tracker.go
line 56 at r1 (raw file):
```go
// allows the next logAdmitted call to return true and allow scheduling a Ready
// iteration again. This flow avoids unnecessary Ready scheduling events.
func (l *logTracker) admittedDirty() (av rac2.AdmittedVector, dirty bool) {
```
nit: prefer `admittedIfDirtyAndResetDirty` or something akin.
Even if it is not fully self-explanatory, it causes readers to read the method comment.
pkg/kv/kvserver/kvflowcontrol/replica_rac2/log_tracker.go
line 59 at r1 (raw file):
```go
l.Lock()
defer l.Unlock()
l.scheduled = false
```
The thing I worry about with optimizations that don't show up in profiles is that I can't judge the value of a branch reduction, given out-of-order speculative execution in processors.
I don't generally have a good mental model here. https://johnnysswlab.com/how-branches-influence-the-performance-of-your-code-and-what-can-you-do-about-it/ is a decent article and concludes that one shouldn't try to reduce branches, and it is more important to be cache friendly. Unconditional writes violate the latter.
So if one were trying to micro-optimize this, I would suggest changing this to

```go
if l.scheduled {
	l.scheduled = false
}
```

Same for the `l.dirty = false` in the `admitted()` method.
pkg/kv/kvserver/kvflowcontrol/replica_rac2/log_tracker.go
line 61 at r1 (raw file):
```go
l.scheduled = false
if !l.dirty {
	return
```
nit: I believe the preference in CRDB code is to always do an explicit return with the values being returned.
Force-pushed from ad4cef4 to df4b52e.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)
pkg/kv/kvserver/kvflowcontrol/replica_rac2/log_tracker.go
line 42 at r1 (raw file):
Previously, sumeerbhola wrote…
nit: prefer `admittedAndResetDirty` or something akin.
Not relevant in the new version of this PR.
pkg/kv/kvserver/kvflowcontrol/replica_rac2/log_tracker.go
line 56 at r1 (raw file):
Previously, sumeerbhola wrote…
nit: prefer `admittedIfDirtyAndResetDirty` or something akin. Even if it is not fully self-explanatory, it causes readers to read the method comment.
Not relevant in the new version of this PR.
pkg/kv/kvserver/kvflowcontrol/replica_rac2/log_tracker.go
line 59 at r1 (raw file):
Previously, sumeerbhola wrote…
The thing I worry about with optimizations that don't show up in profiles is that I can't judge the value of a branch reduction, given out-of-order speculative execution in processors.
I don't generally have a good mental model here. https://johnnysswlab.com/how-branches-influence-the-performance-of-your-code-and-what-can-you-do-about-it/ is a decent article and concludes that one shouldn't try to reduce branches, and it is more important to be cache friendly. Unconditional writes violate the latter.
So if one were trying to micro-optimize this, I would suggest changing this to
```go
if l.scheduled {
	l.scheduled = false
}
```
Same for the `l.dirty = false` in the `admitted()` method.
Done. Since both methods would be doing these conditionals anyway, I merged them back into one method. Instead, I am now caching the latest admitted vector (see the new `av` field). The idea is to avoid computing it on every call, and instead do it only if `dirty == true` (since we're checking this bit anyway).
Improve cache friendliness by checking the dirty and scheduled bits before resetting them. Cache the latest admitted vector to avoid computing it on every call; it is recomputed only if the dirty bit is true.

Epic: none
Release note: none
Force-pushed from df4b52e to 4483d3a.
Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @pav-kv)
TFTR! bors r=sumeerbhola
Improve cache friendliness by checking the dirty and scheduled bits before resetting them. Cache the latest admitted vector to avoid computing it on every call; it is recomputed only if the dirty bit is true.

Informs #128033