admission: add l0 control metrics #109640
Conversation
Force-pushed 02b4db1 to 4c2ec63
Reviewed 4 of 7 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi, @bananabrick, and @irfansharif)
pkg/util/admission/grant_coordinator.go
line 57 at r1 (raw file):
```go
kvIOTokensAvailable             *metric.Gauge
kvElasticIOTokensAvailable      *metric.Gauge
kvIOTokensTookWithoutPermission *metric.Counter
```
Both the total tokens-taken and the tokens-without-permission metrics are already quite broken:
- The without-permission case is not interesting if one looks at the callers. We grant 1 token in `tryGrant` and then use the return value to subtract the remaining. The latter is counted as taken without permission, but that is not interesting since we are subtracting immediately. What is more interesting is the `bypassed` case in `updateStoreStatsAfterWorkDone`, which could be captured in a metric but currently appears only in the logs. I think we should delete this metric.
- The total tokens-taken metric is broken in that there are many callers of `subtractTokensLocked` that don't add to or subtract from it. This is probably worth fixing. Is the reason for the existence of the tokens-returned metric that we need to keep each `metric.Counter` monotonic?
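To make the grant-then-subtract pattern the comment describes concrete, here is a minimal Go sketch; the type, fields, and method shapes are simplified assumptions for illustration, not the actual granter API:

```go
package admission

// granter is a stand-in for the real token granter; the fields and
// methods here are simplified assumptions, not the actual API.
type granter struct {
	availableTokens          int64
	tokensTakenWithoutPermit int64 // the metric under discussion
}

// tryGrant grants exactly 1 token if any are available.
func (g *granter) tryGrant() bool {
	if g.availableTokens <= 0 {
		return false
	}
	g.availableTokens--
	return true
}

// tookWithoutPermission deducts tokens without checking availability,
// and is what bumps the tokens-without-permission counter.
func (g *granter) tookWithoutPermission(n int64) {
	g.availableTokens -= n
	g.tokensTakenWithoutPermit += n
}

// admitWork illustrates why the counter is uninteresting: every
// admitted work item grants 1 token via tryGrant and then immediately
// deducts the remainder "without permission".
func admitWork(g *granter, actualTokens int64) bool {
	if !g.tryGrant() {
		return false // work waits in the queue instead
	}
	g.tookWithoutPermission(actualTokens - 1)
	return true
}
```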
pkg/util/admission/granter.go
line 468 at r1 (raw file):
```go
sg.coordMu.availableElasticIOTokens -= elasticCount
// Only update when not unlimited. Keep it whatever it was last otherwise.
if sg.coordMu.availableIOTokens != unlimitedTokens {
```
- As mentioned earlier, and as you noticed, the instrumentation for tokens-taken was broken. Doing it in `subtractTokensLocked` is the right idea, except when the caller is `setAvailableTokens`.
- I don't understand this `unlimitedTokens` logic. I don't think it is going to work: even when there are `unlimitedTokens` for a 15s interval, we don't have `unlimitedTokens` for every tick, or for 250ms (since 250ms is what we use for computing the burst tokens). And I don't think it is needed, since even if the available tokens are huge, here we are trying to count deductions and returns from the real work that is consuming tokens.
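A hedged sketch of what centralizing the accounting in `subtractTokensLocked` might look like; the `settingAvailableTokens` flag and the minimal stand-in types are assumptions for illustration, not the PR's actual diff:

```go
package admission

// counter is a minimal stand-in for metric.Counter so the sketch
// compiles; the real types live in pkg/util/admission.
type counter struct{ n int64 }

func (c *counter) Inc(d int64) { c.n += d }

type kvStoreTokenGranter struct {
	coordMu struct {
		availableIOTokens int64
	}
	tokensTakenMetric    counter
	tokensReturnedMetric counter
}

// subtractTokensLocked centralizes the metric accounting: every
// deduction or return flows through here, except the periodic refill
// from setAvailableTokens, which is not work consuming tokens.
func (sg *kvStoreTokenGranter) subtractTokensLocked(count int64, settingAvailableTokens bool) {
	sg.coordMu.availableIOTokens -= count
	if settingAvailableTokens {
		return
	}
	if count > 0 {
		sg.tokensTakenMetric.Inc(count)
	} else {
		sg.tokensReturnedMetric.Inc(-count) // keep the counter monotonic
	}
}
```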
pkg/util/admission/io_load_listener.go
line 113 at r1 (raw file):
```go
// L0CompactionAlpha is the exponential smoothing term used when measuring L0
// compactions, which in turn is used to generate IO tokens.
```
I'm not keen to make either of these cluster settings. What existing experimental evidence do we have for tweaking these?
Can we make this PR only about new metrics?
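For reference, the term these settings would tune is ordinary exponential smoothing; a minimal sketch, assuming the file's existing constant of 0.5 and its 15s adjustment interval (the `L0CompactionAlpha` naming here is the PR's proposed setting, used only for illustration):

```go
package admission

// The 0.5 value mirrors the existing smoothing constant in
// io_load_listener.go; naming it l0CompactionAlpha is an assumption.
const l0CompactionAlpha = 0.5

// smoothedL0CompactedBytes blends each 15s adjustment interval's
// compacted bytes with the running history, so one noisy interval only
// moves the token computation partway toward the new observation.
func smoothedL0CompactedBytes(prevSmoothed, intCompactedBytes float64) float64 {
	return l0CompactionAlpha*intCompactedBytes + (1-l0CompactionAlpha)*prevSmoothed
}
```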
pkg/util/admission/io_load_listener.go
line 1076 at r1 (raw file):
```go
	ib(m/adjustmentInterval))
switch res.aux.tokenKind {
case compactionTokenKind:
```
When `tokenKind` is `compactionTokenKind`, the value of `smoothedCompactionByteTokens` is the same as `totalNumByteTokens`, which we already print above.
Force-pushed 4c2ec63 to 99350f9
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi, @bananabrick, and @sumeerbhola)
pkg/util/admission/grant_coordinator.go
line 57 at r1 (raw file):
The without permission case is not interesting,
Agreed, removed. I've not been using it for the reasons you mentioned (+cc @bananabrick).
Is the reason for the existence of the tokens-returned metric that we need to make each `metric.Counter` monotonic?
Yes.
The total tokens taken is broken in that there are many callers to `subtractTokensLocked` that don't add to or subtract from it.
I don't follow. I'm looking at the callers of `subtractTokensLocked`, and in each case it seems valid to me to either increment the tokens-deducted metric (when subtracting a positive amount, i.e. actually deducting) or the tokens-returned metric (when subtracting a negative amount, i.e. returning).
Edit: Ah, you mean the use in `setAvailableTokens`. I was reading that usage wrong, so I missed it. Fixed.
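For what it's worth, with the two monotonic counters the net movement has to be derived at read time; a trivial sketch of the relationship (names illustrative):

```go
// netTokensDeducted derives the (possibly negative) net token movement
// from the two monotonic counters; a metric.Counter must never go down,
// which is why deductions and returns are tracked separately.
func netTokensDeducted(tokensTaken, tokensReturned int64) int64 {
	return tokensTaken - tokensReturned
}
```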
pkg/util/admission/granter.go
line 468 at r1 (raw file):
Doing it in `subtractTokensLocked` is the right idea, except when the caller is `setAvailableTokens`.
Fixed, though awkwardly. But also not much more awkwardly than before.
I don't understand this `unlimitedTokens` logic.
Hm, I don't know what I was thinking either; I must have been tired. I've removed it. It's annoying to use this metric since the scalar value can be so high, but in Grafana you can just filter out data points below some value, and it works well enough. I'll leave it unchanged.
pkg/util/admission/io_load_listener.go
line 113 at r1 (raw file):
Previously, sumeerbhola wrote…
I'm not keen to make either of these cluster settings. What existing experimental evidence do we have for tweaking these?
Can we make this PR only about new metrics?
These were very handy in the MVCC GC experiments, where we were running close to what we were able to compact out of the LSM pre-MVCC GC, and post-MVCC GC we had enough compaction variability to affect foreground load. I can't find the experiment links to prove what I'm saying, but I've used this a lot. I'm inclined to keep it, since these are private settings and I'm not changing the defaults. It does help in experiments when wanting to smooth over token calculations to paper over the current L0 compaction variability issues.
pkg/util/admission/io_load_listener.go
line 1076 at r1 (raw file):
Previously, sumeerbhola wrote…
When `tokenKind` is `compactionTokenKind`, the value of `smoothedCompactionByteTokens` is the same as `totalNumByteTokens`, which we already print above.
Ack, removed but added a comment.
modulo the comment about the settings.
Reviewed 6 of 6 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aadityasondhi, @bananabrick, and @irfansharif)
pkg/util/admission/granter.go
line 378 at r2 (raw file):
```go
// granter was previously exhausted but is no longer so, we're allowed to
// admit other waiting requests.
additionalTokensTaken := cg.parent.storeReplicatedWorkAdmittedLocked(
```
nit: this change can be undone
pkg/util/admission/granter.go
line 470 at r2 (raw file):
```go
	sg.tokensTakenMetric.Inc(count)
} else {
	sg.tokensReturnedMetric.Inc(count)
```
shouldn't this be `-count`?
pkg/util/admission/io_load_listener.go
line 902 at r2 (raw file):
```go
// Let C be smoothedIntL0CompactedBytes.
//
// Underload: Score is less than 0.5, which means sublevels is less than 5.
```
The commentary here and in the if-else blocks is based on these being constants, and would need updating.
I very much prefer taking out the settings change from this PR since it is ready to merge with the metrics changes, and I'm not quite convinced we need to make these configurable yet.
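For context, a hedged sketch of the score normalization this commentary assumes; the threshold value of 10 is inferred from "score is less than 0.5, which means sublevels is less than 5" and is an assumption, not the file's actual constant:

```go
package admission

// l0SubLevelThreshold is inferred from the quoted commentary; treat it
// as an illustrative assumption rather than the real constant.
const l0SubLevelThreshold = 10

// l0Score normalizes the sub-level count so that 1.0 means "at the
// overload threshold". The if-else commentary keys off fixed cutoffs of
// this score (e.g. underload below 0.5), which is why making the
// threshold a cluster setting would also require rewriting that prose.
func l0Score(subLevels int) float64 {
	return float64(subLevels) / l0SubLevelThreshold
}
```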
pkg/util/admission/io_load_listener.go
line 1078 at r2 (raw file):
```go
case compactionTokenKind:
	// NB: res.smoothedCompactionByteTokens (printed above) is the same
	// as res.ioLoadListenerState.totalNumByteTokens when
```
nit: what we are printing above is `res.ioLoadListenerState.totalNumByteTokens`.
pkg/util/admission/testdata/io_load_listener
line 665 at r2 (raw file):
```
----
compaction score 1.000 (10 ssts, 20 sub-levels), L0 growth 0 B (write 0 B ingest 0 B ignored 0 B): requests 0 (0 bypassed) with 0 B acc-write (0 B bypassed) + 0 B acc-ingest (0 B bypassed) + write-model 0.00x+0 B (smoothed 1.75x+1 B) + ingested-model 0.00x+0 B (smoothed 0.75x+1 B) + at-admission-tokens 1 B, compacted 0 B [≈30 KiB], flushed 0 B [≈0 B]; admitting 65 KiB (rate 4.3 KiB/s) (elastic 1 B rate 0 B/s) due to L0 growth (used total: 0 B elastic 0 B); write stalls 0
{ioLoadListenerState:{cumL0AddedBytes:501000 curL0Bytes:10000 cumWriteStallCount:0 cumFlushWriteThroughput:{Bytes:0 WorkDuration:0 IdleDuration:0} diskBW:{bytesRead:0 bytesWritten:0 incomingLSMBytes:501000} smoothedIntL0CompactedBytes:31250 smoothedCompactionByteTokens:66406.375 smoothedNumFlushTokens:0 flushUtilTargetFraction:1.5 totalNumByteTokens:66406 byteTokensAllocated:0 byteTokensUsed:0 byteTokensUsedByElasticWork:0 totalNumElasticByteTokens:1 elasticByteTokensAllocated:0 elasticDiskBWTokens:9223372036854775807 elasticDiskBWTokensAllocated:0} requestEstimates:{writeTokens:1} l0WriteLM:{multiplier:1.75 constant:1} l0IngestLM:{multiplier:0.7505 constant:1} ingestLM:{multiplier:1 constant:1} aux:{intL0AddedBytes:0 intL0CompactedBytes:0 intFlushTokens:0 intFlushUtilization:0 intWriteStalls:0 prevTokensUsed:0 prevTokensUsedByElasticWork:0 tokenKind:0 perWorkTokensAux:{intWorkCount:0 intL0WriteBytes:0 intL0IngestedBytes:0 intLSMIngestedBytes:0 intL0WriteAccountedBytes:0 intIngestedAccountedBytes:0 intL0WriteLinearModel:{multiplier:0 constant:0} intL0IngestedLinearModel:{multiplier:0 constant:0} intIngestedLinearModel:{multiplier:0 constant:0} intBypassedWorkCount:0 intL0WriteBypassedAccountedBytes:0 intIngestedBypassedAccountedBytes:0 intL0IgnoredIngestedBytes:0} doLogFlush:true diskBW:{intervalDiskLoadInfo:{readBandwidth:0 writeBandwidth:0 provisionedBandwidth:0} intervalLSMInfo:{incomingBytes:0 regularTokensUsed:0 elasticTokensUsed:0}}} ioThreshold:<nil>}
```
did something about the switch from the const 0.5 to a float64 cause these diffs?
Part of cockroachdb#82743. We introduce metrics for l0 compacted bytes, generated l0 tokens, and l0 tokens returned.

Release note: None
Force-pushed 99350f9 to 05f0064
Yanked the cluster settings diffs out.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aadityasondhi, @bananabrick, and @sumeerbhola)
pkg/util/admission/granter.go
line 470 at r2 (raw file):
Previously, sumeerbhola wrote…
shouldn't this be `-count`?
Oops, done.
pkg/util/admission/io_load_listener.go
line 902 at r2 (raw file):
Previously, sumeerbhola wrote…
The commentary here and in the if-else blocks is based on these being constants, and would need updating.
I very much prefer taking out the settings change from this PR since it is ready to merge with the metrics changes, and I'm not quite convinced we need to make these configurable yet.
Done.
pkg/util/admission/testdata/io_load_listener
line 665 at r2 (raw file):
Previously, sumeerbhola wrote…
did something about the switch from const 0.5 to a float64 caused these diffs?
Yes, but it's reverted now that I've pulled those changes out.
Build succeeded.
Part of cockroachdb#82743. We introduce an admission.granter.io_tokens_bypassed.kv metric that tracks the total number of tokens taken by work bypassing admission control, for example follower writes without flow control.

Aside: cockroachdb#109640 ripped out a tokens-taken-without-permission metric that was supposed to capture some of this, but even for standard admission work we'd routinely exercise that code path: when admitting work, we take 1 token and later take the remaining without permission.

Release note: None
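A hedged sketch of what that bypass accounting might look like; the types and method shape are illustrative stand-ins, with only the metric name and the `updateStoreStatsAfterWorkDone` bypassed case coming from the thread:

```go
package admission

// counter is a minimal stand-in for metric.Counter.
type counter struct{ n int64 }

func (c *counter) Inc(d int64) { c.n += d }

type storeGranter struct {
	// ioTokensBypassed backs admission.granter.io_tokens_bypassed.kv.
	ioTokensBypassed counter
}

// updateStoreStatsAfterWorkDone is a simplified stand-in: when work
// bypassed admission control (e.g. follower writes without flow
// control), its token cost is attributed to the bypassed metric rather
// than the regular taken/returned pair.
func (sg *storeGranter) updateStoreStatsAfterWorkDone(tokens int64, bypassed bool) {
	if bypassed {
		sg.ioTokensBypassed.Inc(tokens)
	}
}
```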
109833: kvflowcontrol: annotate/fix perf regressions r=irfansharif a=irfansharif

- Replace the flow-controller-level mutex-backed kvflowcontrol.Stream => token bucket map with sync.Map. On kv0/enc=false/nodes=3/cpu=96, accessing this map contributed a high amount of mutex contention. We observe that this bucket map is effectively read-only: entries for keys are written once (on creation) and read frequently after. We don't currently GC these buckets, but even if we did, the same access pattern would hold. We'll note that using a sync.Map is slightly more expensive CPU-wise.
- Replace various map accesses with individual variables. We were needlessly using maps to access one of two variables, keyed by work class, for example when maintaining metrics per work class or tracking token adjustments. The map accesses appeared prominently in CPU profiles and were unnecessary overhead.
- Avoid using log.ExpensiveLogEnabled in hot code paths; it shows up in CPU profiles.
- Slightly reduce the surface area of kvflowhandle.Handle.mu when returning flow tokens.
- We also annotate various other points in the code where peephole optimizations exist, as surfaced by kv0/enc=false/nodes=3/cpu=96.

Part of #104154. Release note: None

110088: privilege: automate generation of ByName map r=ecwall a=andyyang890

This patch automates the process of generating the `ByName` map so that any newly added privileges will automatically be included.

Epic: None Release note: None

110110: admission: add metric for bypassed IO admission work r=irfansharif a=irfansharif

Part of #82743. We introduce an admission.granter.io_tokens_bypassed.kv metric that tracks the total number of tokens taken by work bypassing admission control, for example follower writes without flow control.

Aside: #109640 ripped out a tokens-taken-without-permission metric that was supposed to capture some of this, but even for standard admission work we'd routinely exercise that code path: when admitting work, we take 1 token and later take the remaining without permission.

Release note: None

Co-authored-by: irfan sharif <[email protected]>
Co-authored-by: Andy Yang <[email protected]>
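To illustrate the first bullet in 109833, a minimal sketch of the sync.Map pattern under the stated read-mostly assumption; the type shapes here are illustrative, not the actual kvflowcontrol API:

```go
package kvflowcontrol

import "sync"

// Stream and bucket are simplified stand-ins for the real types.
type Stream struct{ StoreID, TenantID uint64 }
type bucket struct{ tokens int64 }

// Controller demonstrates the read-mostly pattern described above:
// buckets are created once per stream and read frequently afterwards,
// so sync.Map avoids the mutex contention of a locked map on hot paths.
type Controller struct {
	buckets sync.Map // Stream -> *bucket
}

func (c *Controller) getBucket(s Stream) *bucket {
	if b, ok := c.buckets.Load(s); ok {
		return b.(*bucket) // common case: lock-free read
	}
	// First access for this stream: LoadOrStore handles racing
	// creators, returning whichever bucket won.
	b, _ := c.buckets.LoadOrStore(s, &bucket{})
	return b.(*bucket)
}
```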
Part of #82743.
We introduce metrics for l0 compacted bytes, generated l0 tokens, and l0 tokens returned.
Release note: None