admission: add l0 control metrics #109640
Conversation
Force-pushed 02b4db1 to 4c2ec63
Reviewed 4 of 7 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi, @bananabrick, and @irfansharif)
pkg/util/admission/grant_coordinator.go
line 57 at r1 (raw file):
```go
kvIOTokensAvailable             *metric.Gauge
kvElasticIOTokensAvailable      *metric.Gauge
kvIOTokensTookWithoutPermission *metric.Counter
```
Both the total tokens-taken and the tokens-without-permission metrics are already quite broken:
- The without-permission case is not interesting if one looks at the callers. We grant 1 token in `tryGrant` and then use the return value to subtract the remaining. The latter is counted as taken without permission, but that is not interesting since we are subtracting immediately. What is more interesting is the `bypassed` case in `updateStoreStatsAfterWorkDone`, which could be captured in a metric but currently appears only in the logs. I think we should delete this metric.
- The total tokens-taken metric is broken in that there are many callers of `subtractTokensLocked` that don't add to or subtract from it. This is probably worth fixing. Is the reason for the existence of the tokens-returned metric that we need to keep each `metric.Counter` monotonic?
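To make the grant-then-subtract pattern the comment describes concrete, here is a minimal Go sketch; the type, fields, and method shapes are simplified assumptions for illustration, not the actual granter API:

```go
package admission

// granter is a stand-in for the real token granter; the fields and
// methods here are simplified assumptions, not the actual API.
type granter struct {
	availableTokens          int64
	tokensTakenWithoutPermit int64 // the metric under discussion
}

// tryGrant grants exactly 1 token if any are available.
func (g *granter) tryGrant() bool {
	if g.availableTokens <= 0 {
		return false
	}
	g.availableTokens--
	return true
}

// tookWithoutPermission deducts tokens without checking availability,
// and is what bumps the tokens-without-permission counter.
func (g *granter) tookWithoutPermission(n int64) {
	g.availableTokens -= n
	g.tokensTakenWithoutPermit += n
}

// admitWork illustrates why the counter is uninteresting: every
// admitted work item grants 1 token via tryGrant and then immediately
// deducts the remainder "without permission".
func admitWork(g *granter, actualTokens int64) bool {
	if !g.tryGrant() {
		return false // work waits in the queue instead
	}
	g.tookWithoutPermission(actualTokens - 1)
	return true
}
```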
pkg/util/admission/granter.go
line 468 at r1 (raw file):
```go
sg.coordMu.availableElasticIOTokens -= elasticCount
// Only update when not unlimited. Keep it whatever it was last otherwise.
if sg.coordMu.availableIOTokens != unlimitedTokens {
```
- As mentioned earlier, and as you noticed, the instrumentation for tokens-taken was broken. Doing it in `subtractTokensLocked` is the right idea, except when the caller is `setAvailableTokens`.
- I don't understand this `unlimitedTokens` logic. I don't think it is going to work: even when there are `unlimitedTokens` for a 15s interval, we don't have `unlimitedTokens` for every tick, or for 250ms (since 250ms is what we use for computing the burst tokens). And I don't think it is needed, since even if the available tokens are huge, here we are trying to count deductions and returns from the real work that is consuming tokens.
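A hedged sketch of what centralizing the accounting in `subtractTokensLocked` might look like; the `settingAvailableTokens` flag and the minimal stand-in types are assumptions for illustration, not the PR's actual diff:

```go
package admission

// counter is a minimal stand-in for metric.Counter so the sketch
// compiles; the real types live in pkg/util/admission.
type counter struct{ n int64 }

func (c *counter) Inc(d int64) { c.n += d }

type kvStoreTokenGranter struct {
	coordMu struct {
		availableIOTokens int64
	}
	tokensTakenMetric    counter
	tokensReturnedMetric counter
}

// subtractTokensLocked centralizes the metric accounting: every
// deduction or return flows through here, except the periodic refill
// from setAvailableTokens, which is not work consuming tokens.
func (sg *kvStoreTokenGranter) subtractTokensLocked(count int64, settingAvailableTokens bool) {
	sg.coordMu.availableIOTokens -= count
	if settingAvailableTokens {
		return
	}
	if count > 0 {
		sg.tokensTakenMetric.Inc(count)
	} else {
		sg.tokensReturnedMetric.Inc(-count) // keep the counter monotonic
	}
}
```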
pkg/util/admission/io_load_listener.go
line 113 at r1 (raw file):
```go
// L0CompactionAlpha is the exponential smoothing term used when measuring L0
// compactions, which in turn is used to generate IO tokens.
```
I'm not keen to make either of these cluster settings. What existing experimental evidence do we have for tweaking these?
Can we make this PR only about new metrics?
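For reference, the term these settings would tune is ordinary exponential smoothing; a minimal sketch, assuming the file's existing constant of 0.5 and its 15s adjustment interval (the `L0CompactionAlpha` naming here is the PR's proposed setting, used only for illustration):

```go
package admission

// The 0.5 value mirrors the existing smoothing constant in
// io_load_listener.go; naming it l0CompactionAlpha is an assumption.
const l0CompactionAlpha = 0.5

// smoothedL0CompactedBytes blends each 15s adjustment interval's
// compacted bytes with the running history, so one noisy interval only
// moves the token computation partway toward the new observation.
func smoothedL0CompactedBytes(prevSmoothed, intCompactedBytes float64) float64 {
	return l0CompactionAlpha*intCompactedBytes + (1-l0CompactionAlpha)*prevSmoothed
}
```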
pkg/util/admission/io_load_listener.go
line 1076 at r1 (raw file):
```go
	ib(m/adjustmentInterval))
switch res.aux.tokenKind {
case compactionTokenKind:
```
When `tokenKind` is `compactionTokenKind`, the value of `smoothedCompactionByteTokens` is the same as `totalNumByteTokens`, which we already print above.
Force-pushed 4c2ec63 to 99350f9
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi, @bananabrick, and @sumeerbhola)
pkg/util/admission/grant_coordinator.go
line 57 at r1 (raw file):
The without permission case is not interesting,
Agreed, removed. I've not been using it for the reasons you mentioned (+cc @bananabrick).
Is the reason for the existence of the tokens-returned metric that we need to make each `metric.Counter` monotonic?
Yes.
The total tokens taken is broken in that there are many callers to `subtractTokensLocked` that don't add to or subtract from it.
I don't follow. I'm looking at the callers of `subtractTokensLocked`, and in each case it seems valid to me to either increment the tokens-deducted metric (when subtracting a positive amount, i.e. actually deducting) or the tokens-returned metric (when subtracting a negative amount, i.e. returning).
Edit: Ah, you mean the use in `setAvailableTokens`. I was reading that usage wrong, so I missed it. Fixed.
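For what it's worth, with the two monotonic counters the net movement has to be derived at read time; a trivial sketch of the relationship (names illustrative):

```go
// netTokensDeducted derives the (possibly negative) net token movement
// from the two monotonic counters; a metric.Counter must never go down,
// which is why deductions and returns are tracked separately.
func netTokensDeducted(tokensTaken, tokensReturned int64) int64 {
	return tokensTaken - tokensReturned
}
```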
pkg/util/admission/granter.go
line 468 at r1 (raw file):
Doing it in `subtractTokensLocked` is the right idea, except when the caller is `setAvailableTokens`.
Fixed, though awkwardly. But also not much more awkwardly than before.
I don't understand this `unlimitedTokens` logic.
Hm, I don't know what I was thinking either; I must have been tired. I've removed it. It's annoying to use this metric since the scalar value can be so high, but in Grafana you can just filter out data points below some value, and it works well enough. I'll leave it unchanged.
pkg/util/admission/io_load_listener.go
line 113 at r1 (raw file):
Previously, sumeerbhola wrote…
I'm not keen to make either of these cluster settings. What existing experimental evidence do we have for tweaking these?
Can we make this PR only about new metrics?
These were very handy in the MVCC GC experiments, where we were running close to what we were able to compact out of the LSM pre-MVCC GC, and post-MVCC GC we had enough compaction variability to affect foreground load. I can't find the experiment links to prove what I'm saying, but I've used this a lot. I'm inclined to keep it, since these are private settings and I'm not changing the defaults. It does help in experiments when wanting to smooth over token calculations to paper over the current L0 compaction variability issues.
pkg/util/admission/io_load_listener.go
line 1076 at r1 (raw file):
Previously, sumeerbhola wrote…
When `tokenKind` is `compactionTokenKind`, the value of `smoothedCompactionByteTokens` is the same as `totalNumByteTokens`, which we already print above.
Ack, removed but added a comment.
modulo the comment about the settings.
Reviewed 6 of 6 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aadityasondhi, @bananabrick, and @irfansharif)
pkg/util/admission/granter.go
line 378 at r2 (raw file):
```go
// granter was previously exhausted but is no longer so, we're allowed to
// admit other waiting requests.
additionalTokensTaken := cg.parent.storeReplicatedWorkAdmittedLocked(
```
nit: this change can be undone
pkg/util/admission/granter.go
line 470 at r2 (raw file):
```go
	sg.tokensTakenMetric.Inc(count)
} else {
	sg.tokensReturnedMetric.Inc(count)
```
shouldn't this be `-count`?
pkg/util/admission/io_load_listener.go
line 902 at r2 (raw file):
```go
// Let C be smoothedIntL0CompactedBytes.
//
// Underload: Score is less than 0.5, which means sublevels is less than 5.
```
The commentary here and in the if-else blocks is based on these being constants, and would need updating.
I very much prefer taking out the settings change from this PR since it is ready to merge with the metrics changes, and I'm not quite convinced we need to make these configurable yet.
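For context, a hedged sketch of the score normalization this commentary assumes; the threshold value of 10 is inferred from "score is less than 0.5, which means sublevels is less than 5" and is an assumption, not the file's actual constant:

```go
package admission

// l0SubLevelThreshold is inferred from the quoted commentary; treat it
// as an illustrative assumption rather than the real constant.
const l0SubLevelThreshold = 10

// l0Score normalizes the sub-level count so that 1.0 means "at the
// overload threshold". The if-else commentary keys off fixed cutoffs of
// this score (e.g. underload below 0.5), which is why making the
// threshold a cluster setting would also require rewriting that prose.
func l0Score(subLevels int) float64 {
	return float64(subLevels) / l0SubLevelThreshold
}
```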
pkg/util/admission/io_load_listener.go
line 1078 at r2 (raw file):
```go
case compactionTokenKind:
	// NB: res.smoothedCompactionByteTokens (printed above) is the same
	// as res.ioLoadListenerState.totalNumByteTokens when
```
nit: what we are printing above is `res.ioLoadListenerState.totalNumByteTokens`.
pkg/util/admission/testdata/io_load_listener
line 665 at r2 (raw file):
```
----
compaction score 1.000 (10 ssts, 20 sub-levels), L0 growth 0 B (write 0 B ingest 0 B ignored 0 B): requests 0 (0 bypassed) with 0 B acc-write (0 B bypassed) + 0 B acc-ingest (0 B bypassed) + write-model 0.00x+0 B (smoothed 1.75x+1 B) + ingested-model 0.00x+0 B (smoothed 0.75x+1 B) + at-admission-tokens 1 B, compacted 0 B [≈30 KiB], flushed 0 B [≈0 B]; admitting 65 KiB (rate 4.3 KiB/s) (elastic 1 B rate 0 B/s) due to L0 growth (used total: 0 B elastic 0 B); write stalls 0
{ioLoadListenerState:{cumL0AddedBytes:501000 curL0Bytes:10000 cumWriteStallCount:0 cumFlushWriteThroughput:{Bytes:0 WorkDuration:0 IdleDuration:0} diskBW:{bytesRead:0 bytesWritten:0 incomingLSMBytes:501000} smoothedIntL0CompactedBytes:31250 smoothedCompactionByteTokens:66406.375 smoothedNumFlushTokens:0 flushUtilTargetFraction:1.5 totalNumByteTokens:66406 byteTokensAllocated:0 byteTokensUsed:0 byteTokensUsedByElasticWork:0 totalNumElasticByteTokens:1 elasticByteTokensAllocated:0 elasticDiskBWTokens:9223372036854775807 elasticDiskBWTokensAllocated:0} requestEstimates:{writeTokens:1} l0WriteLM:{multiplier:1.75 constant:1} l0IngestLM:{multiplier:0.7505 constant:1} ingestLM:{multiplier:1 constant:1} aux:{intL0AddedBytes:0 intL0CompactedBytes:0 intFlushTokens:0 intFlushUtilization:0 intWriteStalls:0 prevTokensUsed:0 prevTokensUsedByElasticWork:0 tokenKind:0 perWorkTokensAux:{intWorkCount:0 intL0WriteBytes:0 intL0IngestedBytes:0 intLSMIngestedBytes:0 intL0WriteAccountedBytes:0 intIngestedAccountedBytes:0 intL0WriteLinearModel:{multiplier:0 constant:0} intL0IngestedLinearModel:{multiplier:0 constant:0} intIngestedLinearModel:{multiplier:0 constant:0} intBypassedWorkCount:0 intL0WriteBypassedAccountedBytes:0 intIngestedBypassedAccountedBytes:0 intL0IgnoredIngestedBytes:0} doLogFlush:true diskBW:{intervalDiskLoadInfo:{readBandwidth:0 writeBandwidth:0 provisionedBandwidth:0} intervalLSMInfo:{incomingBytes:0 regularTokensUsed:0 elasticTokensUsed:0}}} ioThreshold:<nil>}
```
did something about the switch from the const 0.5 to a float64 cause these diffs?
Part of cockroachdb#82743. We introduce metrics for l0 compacted bytes, generated l0 tokens, and l0 tokens returned.

Release note: None
Force-pushed 99350f9 to 05f0064
Yanked the cluster settings diffs out.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aadityasondhi, @bananabrick, and @sumeerbhola)
pkg/util/admission/granter.go
line 470 at r2 (raw file):
Previously, sumeerbhola wrote…
shouldn't this be `-count`?
Oops, done.
pkg/util/admission/io_load_listener.go
line 902 at r2 (raw file):
Previously, sumeerbhola wrote…
The commentary here and in the if-else blocks is based on these being constants, and would need updating.
I very much prefer taking out the settings change from this PR since it is ready to merge with the metrics changes, and I'm not quite convinced we need to make these configurable yet.
Done.
pkg/util/admission/testdata/io_load_listener
line 665 at r2 (raw file):
Previously, sumeerbhola wrote…
did something about the switch from const 0.5 to a float64 caused these diffs?
Yes, but it's reverted now that I've pulled those changes out.
Build succeeded.
Part of cockroachdb#82743. We introduce an admission.granter.io_tokens_bypassed.kv metric that tracks the total number of tokens taken by work bypassing admission control, for example follower writes without flow control.

Aside: cockroachdb#109640 ripped out a tokens-taken-without-permission metric that was supposed to capture some of this, but even for standard admission work we'd routinely exercise that code path: when admitting work, we take 1 token and later take the remaining without permission.

Release note: None
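A hedged sketch of what that bypass accounting might look like; the types and method shape are illustrative stand-ins, with only the metric name and the `updateStoreStatsAfterWorkDone` bypassed case coming from the thread:

```go
package admission

// counter is a minimal stand-in for metric.Counter.
type counter struct{ n int64 }

func (c *counter) Inc(d int64) { c.n += d }

type storeGranter struct {
	// ioTokensBypassed backs admission.granter.io_tokens_bypassed.kv.
	ioTokensBypassed counter
}

// updateStoreStatsAfterWorkDone is a simplified stand-in: when work
// bypassed admission control (e.g. follower writes without flow
// control), its token cost is attributed to the bypassed metric rather
// than the regular taken/returned pair.
func (sg *storeGranter) updateStoreStatsAfterWorkDone(tokens int64, bypassed bool) {
	if bypassed {
		sg.ioTokensBypassed.Inc(tokens)
	}
}
```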
109833: kvflowcontrol: annotate/fix perf regressions r=irfansharif a=irfansharif

- Replace the flow-controller-level mutex-backed kvflowcontrol.Stream => token bucket map with sync.Map. On kv0/enc=false/nodes=3/cpu=96, accessing this map contributed a high amount of mutex contention. We observe that this bucket map is effectively read-only: entries for keys are written once (on creation) and read frequently after. We don't currently GC these buckets, but even if we did, the same access pattern would hold. We'll note that using a sync.Map is slightly more expensive CPU-wise.
- Replace various map accesses with individual variables. We were needlessly using maps to access one of two variables, keyed by work class, for example when maintaining metrics per work class or tracking token adjustments. The map accesses appeared prominently in CPU profiles and were unnecessary overhead.
- Avoid using log.ExpensiveLogEnabled in hot code paths; it shows up in CPU profiles.
- Slightly reduce the surface area of kvflowhandle.Handle.mu when returning flow tokens.
- We also annotate various other points in the code where peephole optimizations exist, as surfaced by kv0/enc=false/nodes=3/cpu=96.

Part of #104154. Release note: None

110088: privilege: automate generation of ByName map r=ecwall a=andyyang890

This patch automates the process of generating the `ByName` map so that any newly added privileges will automatically be included.

Epic: None Release note: None

110110: admission: add metric for bypassed IO admission work r=irfansharif a=irfansharif

Part of #82743. We introduce an admission.granter.io_tokens_bypassed.kv metric that tracks the total number of tokens taken by work bypassing admission control, for example follower writes without flow control.

Aside: #109640 ripped out a tokens-taken-without-permission metric that was supposed to capture some of this, but even for standard admission work we'd routinely exercise that code path: when admitting work, we take 1 token and later take the remaining without permission.

Release note: None

Co-authored-by: irfan sharif <[email protected]>
Co-authored-by: Andy Yang <[email protected]>
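To illustrate the first bullet in 109833, a minimal sketch of the sync.Map pattern under the stated read-mostly assumption; the type shapes here are illustrative, not the actual kvflowcontrol API:

```go
package kvflowcontrol

import "sync"

// Stream and bucket are simplified stand-ins for the real types.
type Stream struct{ StoreID, TenantID uint64 }
type bucket struct{ tokens int64 }

// Controller demonstrates the read-mostly pattern described above:
// buckets are created once per stream and read frequently afterwards,
// so sync.Map avoids the mutex contention of a locked map on hot paths.
type Controller struct {
	buckets sync.Map // Stream -> *bucket
}

func (c *Controller) getBucket(s Stream) *bucket {
	if b, ok := c.buckets.Load(s); ok {
		return b.(*bucket) // common case: lock-free read
	}
	// First access for this stream: LoadOrStore handles racing
	// creators, returning whichever bucket won.
	b, _ := c.buckets.LoadOrStore(s, &bucket{})
	return b.(*bucket)
}
```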
Part of #82743.
We introduce metrics for l0 compacted bytes, generated l0 tokens, and l0 tokens returned.
Release note: None