admission: change store write admission control to use byte tokens #80480

Merged
merged 1 commit into cockroachdb:master on May 3, 2022

Conversation

sumeerbhola
Collaborator

This switch to byte tokens will result in better accounting for
large writes, including ingests based on whether their bytes land
in L0 or elsewhere. It is also a precurson to taking into account
flush capacity (in bytes).

The store write admission control path now uses a StoreWorkQueue
which wraps a WorkQueue and provides additional functionality:

  • Work can specify WriteBytes and whether it is an IngestRequest.
    This is used to decide how many byte tokens to consume.
  • Done work specifies how many bytes were ingested into L0, so
    token consumption can be adjusted.
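The two bullet points above can be sketched as a minimal model. This is not the actual CockroachDB implementation; `StoreWriteWorkInfo` and the `ingestedIntoL0Bytes` parameter appear in the PR, while `tokensForWork` and `adjustOnDone` are hypothetical helpers that only illustrate the accounting idea:

```go
package main

import "fmt"

// StoreWriteWorkInfo mirrors the fields the PR describes; the real struct
// in pkg/util/admission carries more context (tenant, priority, etc.).
type StoreWriteWorkInfo struct {
	WriteBytes    int64 // size the request claims it will write; 0 if unknown
	IngestRequest bool  // sstable ingest, whose bytes may land below L0
}

// tokensForWork decides how many byte tokens to consume up front.
// estimatedBytes stands in for the "usual estimation" the PR mentions,
// used when the request supplies no WriteBytes.
func tokensForWork(w StoreWriteWorkInfo, estimatedBytes int64) int64 {
	if w.WriteBytes > 0 {
		return w.WriteBytes
	}
	return estimatedBytes
}

// adjustOnDone returns the token correction to apply once the work
// reports how many of its bytes actually landed in L0 (an ingest may
// land entirely in lower levels, returning all of its tokens).
func adjustOnDone(consumed, ingestedIntoL0Bytes int64) int64 {
	return ingestedIntoL0Bytes - consumed
}

func main() {
	w := StoreWriteWorkInfo{WriteBytes: 1 << 20, IngestRequest: true}
	consumed := tokensForWork(w, 4096)
	// The ingest landed entirely below L0, so every token is given back.
	fmt.Println(consumed, adjustOnDone(consumed, 0))
}
```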

The main framework change is that a single work item can consume
multiple (byte) tokens, which ripples through the various
interfaces including requester, granter. There is associated
cleanup: kvGranter that was handling both slots and tokens is
eliminated since in practice it was only doing one or the other.
Instead, for the slot case the slotGranter is reused. For the token
case there is a new kvStoreTokenGranter.

The main logic change is in ioLoadListener which computes byte
tokens and various estimates. The change is (mostly) neutral if
no write provides WriteBytes, since the usual estimation will
take over. The integration changes in this PR are superficial in
that the requests don't provide WriteBytes. Improvements to the
integration, along with experimental results, will happen in
future PRs.
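As a rough intuition for the byte-token mechanism, here is a toy token bucket. All names are hypothetical: the real ioLoadListener derives each interval's budget from observed LSM state (compaction and flush throughput), which this sketch replaces with a constant:

```go
package main

import "fmt"

// byteTokenBucket is a toy stand-in for the granter's byte-token pool:
// each adjustment interval grants a budget of byte tokens, and admitted
// work deducts from it; once it runs out, admission would queue.
type byteTokenBucket struct {
	available int64
}

func (b *byteTokenBucket) refill(budget int64) { b.available = budget }

// tryConsume deducts tokens and reports whether the work could be
// admitted immediately. The balance is allowed to go negative, since
// up-front estimates are corrected after the work completes.
func (b *byteTokenBucket) tryConsume(bytes int64) bool {
	b.available -= bytes
	return b.available >= 0
}

func main() {
	var b byteTokenBucket
	b.refill(1000)
	fmt.Println(b.tryConsume(600), b.tryConsume(600)) // second call exhausts the budget
}
```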

Informs #79092
Informs #77357

Release note: None

@sumeerbhola sumeerbhola requested review from ajwerner and a team April 25, 2022 14:36
@sumeerbhola sumeerbhola requested a review from a team as a code owner April 25, 2022 14:36
@cockroach-teamcity
Member

This change is Reviewable

Contributor

@ajwerner ajwerner left a comment


Just nits

Reviewed 10 of 10 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


-- commits line 6 at r1:
nit: precursor

Code quote:

precurson

pkg/util/admission/work_queue.go line 1581 at r1 (raw file):

}

type StoreWriteWorkInfo struct {

you'll need a comment here


pkg/util/admission/work_queue.go line 1654 at r1 (raw file):

}

func (q *StoreWorkQueue) AdmittedWorkDone(h StoreWorkHandle, ingestedIntoL0Bytes int64) error {

the linter will complain about the lack of commentary on the public methods.


pkg/util/admission/work_queue.go line 1726 at r1 (raw file):

) storeRequester {
	q := &StoreWorkQueue{
		q: *(makeWorkQueue(ambientCtx, KVWork, granter, settings, opts).(*WorkQueue)),

our linter, for better or for worse, doesn't like this because it copied a mutex. you'll need to rework the constructor a bit. Probably you'll need to add some sort of start method or function so you can separate the initialization from kicking off the goroutine.
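The copylocks problem the comment describes can be illustrated with a minimal sketch (simplified hypothetical types, not the real ones): instead of building a `WorkQueue` value and copying it into the outer struct, which copies its embedded mutex, the queue is initialized in place through a pointer:

```go
package main

import (
	"fmt"
	"sync"
)

// WorkQueue stands in for the real type; it embeds a mutex, so it must
// not be copied after construction (go vet's copylocks check flags this).
type WorkQueue struct {
	mu      sync.Mutex
	started bool
}

// initWorkQueue fills in an already-allocated queue in place, instead of
// returning a value that would later be copied into an outer struct.
func initWorkQueue(q *WorkQueue) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.started = true
	// In a real queue this is also where a background goroutine could be
	// kicked off, separated from (or folded into) initialization.
}

type StoreWorkQueue struct {
	q WorkQueue
}

func makeStoreWorkQueue() *StoreWorkQueue {
	sq := &StoreWorkQueue{}
	initWorkQueue(&sq.q) // initialize in place: no mutex is copied
	return sq
}

func main() {
	sq := makeStoreWorkQueue()
	fmt.Println(sq.q.started)
}
```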


pkg/util/admission/testdata/io_load_listener line 120 at r1 (raw file):

# don't limit the tokens.
#
# TODO: Remove this temp comment for reviewers. All of the earlier test cases

ack


pkg/util/admission/granter_test.go line 122 at r1 (raw file):

		buf.Reset()
		buf.Reset()
		buf.Reset()

extra reset calls?

Code quote:

		buf.Reset()
		buf.Reset()

Collaborator Author

@sumeerbhola sumeerbhola left a comment


TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


-- commits line 6 at r1:

Previously, ajwerner wrote…

nit: precursor

Done


pkg/util/admission/work_queue.go line 1581 at r1 (raw file):

Previously, ajwerner wrote…

you'll need a comment here

Done


pkg/util/admission/work_queue.go line 1654 at r1 (raw file):

Previously, ajwerner wrote…

the linter will complain about the lack of commentary on the public methods.

Done


pkg/util/admission/work_queue.go line 1726 at r1 (raw file):

Previously, ajwerner wrote…

our linter, for better or for worse, doesn't like this because it copied a mutex. you'll need to rework the constructor a bit. Probably you'll need to add some sort of start method or function so you can separate the initialization from kicking off the goroutine.

Done, by doing all the work in initWorkQueue. I've kept the goroutine starting in the init method since there is no reason to delay it.


pkg/util/admission/testdata/io_load_listener line 120 at r1 (raw file):

Previously, ajwerner wrote…

ack

Removed


pkg/util/admission/granter_test.go line 122 at r1 (raw file):

Previously, ajwerner wrote…

extra reset calls?

ghost, methinks.
removed.

Contributor

@ajwerner ajwerner left a comment


:lgtm:

Reviewed 6 of 6 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)

@sumeerbhola
Collaborator Author

bors r=ajwerner

@craig
Contributor

craig bot commented May 3, 2022

Build succeeded:

@craig craig bot merged commit 91aa127 into cockroachdb:master May 3, 2022
tbg added a commit to tbg/cockroach that referenced this pull request May 3, 2022
To the uninitiated, they were previously difficult to understand.
Now it's clearer what each number means just from looking at the
output.

While rebasing this PR on top of cockroachdb#80480, I realized how hard it
was (at least for me) to keep track of the many quantities involved. The
way the updates to the state were computed (where the old struct is
updated incrementally in-place) additionally made me worried
that bugs could easily be introduced. Some quantities are cumulative,
some are for the current interval but contain other quantities, etc - it
all melted my brain with ease.

I decided to at least try to make it easier for the next newcomer to
this codebase to get their bearings by constructing all of the updates
first, and then overwriting the old state. I additionally renamed all
of the variables according to a fixed pattern so that it is now clear
which ones are for the current interval or cumulative, and added
commentary to (hopefully) clarify which quantities contain each other.
Additionally, all quantities are now stored for posterity, so in
principle it's now straightforward to write a test case on any of the
intermediate quantities for any desired inputs.

More could be done here but I feel that this leaves things in a good
place. There's a chance that I have written equally intricate code but
happen to like it more because I wrote it. I trust my reviewers to keep
me honest.

Release note: None
tbg added a commit to tbg/cockroach that referenced this pull request May 11, 2022
tbg added a commit to tbg/cockroach that referenced this pull request May 17, 2022
craig bot pushed a commit that referenced this pull request May 17, 2022
80859: admission: improve granter logs r=sumeerbhola a=tbg


Co-authored-by: Tobias Grieger <[email protected]>
now := timeutil.Now()
exhaustedMicros := now.Sub(sg.exhaustedStart).Microseconds()
sg.ioTokensExhaustedDurationMetric.Inc(exhaustedMicros)
if sg.availableIOTokens <= 0 {
Contributor


@sumeerbhola would there be any interest in backporting just this change from == 0 to <= 0 to v21.2?

The ioTokensExhaustedDurationMetric appears somewhat busted on v21.2 because every time the function containing this code (setAvailableIOTokensLocked in v21.2) is called when sg.availableIOTokens is less than 0, it re-adds now() - sg.exhaustedStart to the metric without resetting the value of sg.exhaustedStart.

On a node that's constantly overloaded, it ends up with a ioTokensExhaustedDurationMetric that increases roughly quadratically, such that taking its rate gives a linearly increasing graph over time, which doesn't seem to be the metric's intent? (especially given that this commit appears to have fixed it)
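The quadratic growth described above can be reproduced with a toy simulation (hypothetical helper, not the real granter code): the buggy variant re-adds the full elapsed time on every tick without resetting the start of the exhausted period, while the fixed variant resets it after each increment:

```go
package main

import "fmt"

// simulate runs `ticks` one-second metric updates while tokens stay
// exhausted. The buggy variant (fixed=false) re-adds now-exhaustedStart
// each tick without resetting exhaustedStart, so the metric grows
// quadratically; the fixed variant resets it and tracks wall time.
func simulate(ticks int, fixed bool) int64 {
	var metric int64
	var exhaustedStart int64 // "time" the exhausted period began
	now := int64(0)
	for i := 0; i < ticks; i++ {
		now++ // one second passes
		metric += now - exhaustedStart
		if fixed {
			exhaustedStart = now
		}
	}
	return metric
}

func main() {
	// Fixed: 4 seconds exhausted reports 4. Buggy: 1+2+3+4 = 10.
	fmt.Println(simulate(4, true), simulate(4, false))
}
```

Taking the rate of the buggy counter therefore yields a linearly increasing graph under sustained overload, exactly as observed.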

Contributor


Sorry, I mean v22.1, not v21.2

Contributor


yo @a-robinson, nice to see you around! A few of us are out on PTO and others are swarming to shove stuff in for the impending release cut. Filed #86963 to make sure we fix it in 22.1; what you're saying looks right to me.

Contributor


Thanks @irfansharif ! Clearly no rush since it's just a metric, just thought I'd mention it.
