admission: change store write admission control to use byte tokens #80480

Merged
merged 1 commit into cockroachdb:master on May 3, 2022

Conversation

sumeerbhola
Collaborator

This switch to byte tokens will result in better accounting for
large writes, including ingests based on whether their bytes land
in L0 or elsewhere. It is also a precurson to taking into account
flush capacity (in bytes).

The store write admission control path now uses a StoreWorkQueue
which wraps a WorkQueue and provides additional functionality:

  • Work can specify WriteBytes and whether it is an IngestRequest.
    This is used to decide how many byte tokens to consume.
  • Done work specifies how many bytes were ingested into L0, so
    token consumption can be adjusted.
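The two bullet points above can be sketched as a minimal model. This is not the actual CockroachDB implementation; `StoreWriteWorkInfo` and the `ingestedIntoL0Bytes` parameter appear in the PR, while `tokensForWork` and `adjustOnDone` are hypothetical helpers that only illustrate the accounting idea:

```go
package main

import "fmt"

// StoreWriteWorkInfo mirrors the fields the PR describes; the real struct
// in pkg/util/admission carries more context (tenant, priority, etc.).
type StoreWriteWorkInfo struct {
	WriteBytes    int64 // size the request claims it will write; 0 if unknown
	IngestRequest bool  // sstable ingest, whose bytes may land below L0
}

// tokensForWork decides how many byte tokens to consume up front.
// estimatedBytes stands in for the "usual estimation" the PR mentions,
// used when the request supplies no WriteBytes.
func tokensForWork(w StoreWriteWorkInfo, estimatedBytes int64) int64 {
	if w.WriteBytes > 0 {
		return w.WriteBytes
	}
	return estimatedBytes
}

// adjustOnDone returns the token correction to apply once the work
// reports how many of its bytes actually landed in L0 (an ingest may
// land entirely in lower levels, returning all of its tokens).
func adjustOnDone(consumed, ingestedIntoL0Bytes int64) int64 {
	return ingestedIntoL0Bytes - consumed
}

func main() {
	w := StoreWriteWorkInfo{WriteBytes: 1 << 20, IngestRequest: true}
	consumed := tokensForWork(w, 4096)
	// The ingest landed entirely below L0, so every token is given back.
	fmt.Println(consumed, adjustOnDone(consumed, 0))
}
```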

The main framework change is that a single work item can consume
multiple (byte) tokens, which ripples through the various
interfaces including requester, granter. There is associated
cleanup: kvGranter that was handling both slots and tokens is
eliminated since in practice it was only doing one or the other.
Instead, for the slot case the slotGranter is reused. For the token
case there is a new kvStoreTokenGranter.

The main logic change is in ioLoadListener which computes byte
tokens and various estimates. The change is (mostly) neutral if
no write provides WriteBytes, since the usual estimation will
take over. The integration changes in this PR are superficial in
that the requests don't provide WriteBytes. Improvements to the
integration, along with experimental results, will happen in
future PRs.
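As a rough intuition for the byte-token mechanism, here is a toy token bucket. All names are hypothetical: the real ioLoadListener derives each interval's budget from observed LSM state (compaction and flush throughput), which this sketch replaces with a constant:

```go
package main

import "fmt"

// byteTokenBucket is a toy stand-in for the granter's byte-token pool:
// each adjustment interval grants a budget of byte tokens, and admitted
// work deducts from it; once it runs out, admission would queue.
type byteTokenBucket struct {
	available int64
}

func (b *byteTokenBucket) refill(budget int64) { b.available = budget }

// tryConsume deducts tokens and reports whether the work could be
// admitted immediately. The balance is allowed to go negative, since
// up-front estimates are corrected after the work completes.
func (b *byteTokenBucket) tryConsume(bytes int64) bool {
	b.available -= bytes
	return b.available >= 0
}

func main() {
	var b byteTokenBucket
	b.refill(1000)
	fmt.Println(b.tryConsume(600), b.tryConsume(600)) // second call exhausts the budget
}
```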

Informs #79092
Informs #77357

Release note: None

@sumeerbhola sumeerbhola requested review from ajwerner and a team April 25, 2022 14:36
@sumeerbhola sumeerbhola requested a review from a team as a code owner April 25, 2022 14:36
@cockroach-teamcity
Member

This change is Reviewable

Contributor

@ajwerner ajwerner left a comment


Just nits

Reviewed 10 of 10 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


-- commits line 6 at r1:
nit: precursor

Code quote:

precurson

pkg/util/admission/work_queue.go line 1581 at r1 (raw file):

}

type StoreWriteWorkInfo struct {

you'll need a comment here


pkg/util/admission/work_queue.go line 1654 at r1 (raw file):

}

func (q *StoreWorkQueue) AdmittedWorkDone(h StoreWorkHandle, ingestedIntoL0Bytes int64) error {

the linter will complain about the lack of commentary on the public methods.


pkg/util/admission/work_queue.go line 1726 at r1 (raw file):

) storeRequester {
	q := &StoreWorkQueue{
		q: *(makeWorkQueue(ambientCtx, KVWork, granter, settings, opts).(*WorkQueue)),

our linter, for better or for worse, doesn't like this because it copied a mutex. you'll need to rework the constructor a bit. Probably you'll need to add some sort of start method or function so you can separate the initialization from kicking off the goroutine.
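The copylocks problem the comment describes can be illustrated with a minimal sketch (simplified hypothetical types, not the real ones): instead of building a `WorkQueue` value and copying it into the outer struct, which copies its embedded mutex, the queue is initialized in place through a pointer:

```go
package main

import (
	"fmt"
	"sync"
)

// WorkQueue stands in for the real type; it embeds a mutex, so it must
// not be copied after construction (go vet's copylocks check flags this).
type WorkQueue struct {
	mu      sync.Mutex
	started bool
}

// initWorkQueue fills in an already-allocated queue in place, instead of
// returning a value that would later be copied into an outer struct.
func initWorkQueue(q *WorkQueue) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.started = true
	// In a real queue this is also where a background goroutine could be
	// kicked off, separated from (or folded into) initialization.
}

type StoreWorkQueue struct {
	q WorkQueue
}

func makeStoreWorkQueue() *StoreWorkQueue {
	sq := &StoreWorkQueue{}
	initWorkQueue(&sq.q) // initialize in place: no mutex is copied
	return sq
}

func main() {
	sq := makeStoreWorkQueue()
	fmt.Println(sq.q.started)
}
```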


pkg/util/admission/testdata/io_load_listener line 120 at r1 (raw file):

# don't limit the tokens.
#
# TODO: Remove this temp comment for reviewers. All of the earlier test cases

ack


pkg/util/admission/granter_test.go line 122 at r1 (raw file):

		buf.Reset()
		buf.Reset()
		buf.Reset()

extra reset calls?

Code quote:

		buf.Reset()
		buf.Reset()

Collaborator Author

@sumeerbhola sumeerbhola left a comment


TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


-- commits line 6 at r1:

Previously, ajwerner wrote…

nit: precursor

Done


pkg/util/admission/work_queue.go line 1581 at r1 (raw file):

Previously, ajwerner wrote…

you'll need a comment here

Done


pkg/util/admission/work_queue.go line 1654 at r1 (raw file):

Previously, ajwerner wrote…

the linter will complain about the lack of commentary on the public methods.

Done


pkg/util/admission/work_queue.go line 1726 at r1 (raw file):

Previously, ajwerner wrote…

our linter, for better or for worse, doesn't like this because it copied a mutex. you'll need to rework the constructor a bit. Probably you'll need to add some sort of start method or function so you can separate the initialization from kicking off the goroutine.

Done, by doing all the work in initWorkQueue. I've kept the goroutine starting in the init method since there is no reason to delay it.


pkg/util/admission/testdata/io_load_listener line 120 at r1 (raw file):

Previously, ajwerner wrote…

ack

Removed


pkg/util/admission/granter_test.go line 122 at r1 (raw file):

Previously, ajwerner wrote…

extra reset calls?

ghost, methinks.
removed.

Contributor

@ajwerner ajwerner left a comment


:lgtm:

Reviewed 6 of 6 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)

@sumeerbhola
Collaborator Author

bors r=ajwerner

@craig
Contributor

craig bot commented May 3, 2022

Build succeeded:

@craig craig bot merged commit 91aa127 into cockroachdb:master May 3, 2022
tbg added a commit to tbg/cockroach that referenced this pull request May 3, 2022
To the uninitiated, they were previously difficult to understand.
Now it's clearer what each number means just from looking at the
output.

While rebasing this PR on top of cockroachdb#80480, I realized how hard it
was (at least for me) to keep track of the many quantities involved. The
way the updates to the state were computed (where the old struct is
updated incrementally in-place) additionally made me worried
that bugs could easily be introduced. Some quantities are cumulative,
some are for the current interval but contain other quantities, etc - it
all melted my brain with ease.

I decided to at least try to make it easier for the next newcomer to
this codebase to get their bearings by constructing all of the updates
first, and then overwriting the old state. I additionally renamed all
of the variables according to a fixed pattern so that it is now clear
which ones are for the current interval or cumulative, and added
commentary to (hopefully) clarify which quantities contain each other.
Additionally, all quantities are now stored for posterity, so in
principle it's now straightforward to write a test case on any of the
intermediate quantities for any desired inputs.

More could be done here but I feel that this leaves things in a good
place. There's a chance that I have written equally intricate code but
happen to like it more because I wrote it. I trust my reviewers to keep
me honest.

Release note: None
tbg added a commit to tbg/cockroach that referenced this pull request May 11, 2022
tbg added a commit to tbg/cockroach that referenced this pull request May 17, 2022
craig bot pushed a commit that referenced this pull request May 17, 2022
80859: admission: improve granter logs r=sumeerbhola a=tbg


Co-authored-by: Tobias Grieger <[email protected]>
now := timeutil.Now()
exhaustedMicros := now.Sub(sg.exhaustedStart).Microseconds()
sg.ioTokensExhaustedDurationMetric.Inc(exhaustedMicros)
if sg.availableIOTokens <= 0 {
Contributor


@sumeerbhola would there be any interest in backporting just this change from == 0 to <= 0 to v21.2?

The ioTokensExhaustedDurationMetric appears somewhat busted on v21.2 because every time the function containing this code (setAvailableIOTokensLocked in v21.2) is called when sg.availableIOTokens is less than 0, it re-adds now() - sg.exhaustedStart to the metric without resetting the value of sg.exhaustedStart.

On a node that's constantly overloaded, it ends up with a ioTokensExhaustedDurationMetric that increases roughly quadratically, such that taking its rate gives a linearly increasing graph over time, which doesn't seem to be the metric's intent? (especially given that this commit appears to have fixed it)
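The quadratic growth described above can be reproduced with a toy simulation (hypothetical helper, not the real granter code): the buggy variant re-adds the full elapsed time on every tick without resetting the start of the exhausted period, while the fixed variant resets it after each increment:

```go
package main

import "fmt"

// simulate runs `ticks` one-second metric updates while tokens stay
// exhausted. The buggy variant (fixed=false) re-adds now-exhaustedStart
// each tick without resetting exhaustedStart, so the metric grows
// quadratically; the fixed variant resets it and tracks wall time.
func simulate(ticks int, fixed bool) int64 {
	var metric int64
	var exhaustedStart int64 // "time" the exhausted period began
	now := int64(0)
	for i := 0; i < ticks; i++ {
		now++ // one second passes
		metric += now - exhaustedStart
		if fixed {
			exhaustedStart = now
		}
	}
	return metric
}

func main() {
	// Fixed: 4 seconds exhausted reports 4. Buggy: 1+2+3+4 = 10.
	fmt.Println(simulate(4, true), simulate(4, false))
}
```

Taking the rate of the buggy counter therefore yields a linearly increasing graph under sustained overload, exactly as observed.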

Contributor


Sorry, I mean v22.1, not v21.2

Contributor


yo @a-robinson, nice to see you around! A few of us are out on PTO and others are swarming to shove stuff in for the impending release cut. Filed #86963 to make sure we fix it in 22.1; what you're saying looks right to me.

Contributor


Thanks @irfansharif ! Clearly no rush since it's just a metric, just thought I'd mention it.
