admission: change store write admission control to use byte tokens #80480
Conversation
Just nits
Reviewed 10 of 10 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)
-- commits
line 6 at r1:
nit: precursor
Code quote:
precurson
pkg/util/admission/work_queue.go
line 1581 at r1 (raw file):
}

type StoreWriteWorkInfo struct {
you'll need a comment here
pkg/util/admission/work_queue.go
line 1654 at r1 (raw file):
}

func (q *StoreWorkQueue) AdmittedWorkDone(h StoreWorkHandle, ingestedIntoL0Bytes int64) error {
the linter will complain about the lack of commentary on the public methods.
pkg/util/admission/work_queue.go
line 1726 at r1 (raw file):
) storeRequester {
	q := &StoreWorkQueue{
		q: *(makeWorkQueue(ambientCtx, KVWork, granter, settings, opts).(*WorkQueue)),
our linter, for better or for worse, doesn't like this because it copies a mutex. You'll need to rework the constructor a bit. Probably you'll need to add some sort of start method or function so you can separate the initialization from kicking off the goroutine.
pkg/util/admission/testdata/io_load_listener
line 120 at r1 (raw file):
# don't limit the tokens.
#
# TODO: Remove this temp comment for reviewers. All of the earlier test cases
ack
pkg/util/admission/granter_test.go
line 122 at r1 (raw file):
buf.Reset()
buf.Reset()
buf.Reset()
extra reset calls?
Code quote:
buf.Reset()
buf.Reset()
c4fb5f5 to 81aa87d
TFTR!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
Previously, ajwerner wrote…
nit: precursor
Done
pkg/util/admission/work_queue.go
line 1581 at r1 (raw file):
Previously, ajwerner wrote…
you'll need a comment here
Done
pkg/util/admission/work_queue.go
line 1654 at r1 (raw file):
Previously, ajwerner wrote…
the linter will complain about the lack of commentary on the public methods.
Done
pkg/util/admission/work_queue.go
line 1726 at r1 (raw file):
Previously, ajwerner wrote…
our linter, for better or for worse, doesn't like this because it copied a mutex. you'll need to rework the constructor a bit. Probably you'll need to add some sort of start method or function so you can separate the initialization from kicking off the goroutine.
Done, by doing all the work in initWorkQueue. I've kept the goroutine starting in the init method since there is no reason to delay it.
pkg/util/admission/testdata/io_load_listener
line 120 at r1 (raw file):
Previously, ajwerner wrote…
ack
Removed
pkg/util/admission/granter_test.go
line 122 at r1 (raw file):
Previously, ajwerner wrote…
extra reset calls?
ghost, methinks.
removed.
Reviewed 6 of 6 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)
bors r=ajwerner
Build succeeded
80859: admission: improve granter logs r=sumeerbhola a=tbg

To the uninitiated, they were previously difficult to understand. Now it's clearer what each number means just from looking at the output.

While rebasing this PR on top of #80480, I realized how hard it was (at least for me) to keep track of the many quantities involved. The way the updates to the state were computed (where the old struct is updated incrementally in-place) additionally made me worried that bugs could easily be introduced. Some quantities are cumulative, some are for the current interval but contain other quantities, etc - it all melted my brain with ease.

I decided to at least try to make it easier for the next newcomer to this codebase to get their bearings by constructing all of the updates first, and then overwriting the old state. I additionally renamed all of the variables according to a fixed pattern so that it is now clear which ones are for the current interval or cumulative, and added commentary to (hopefully) clarify which quantities contain each other. Additionally, all quantities are now stored for posterity, so in principle it's now straightforward to write a test case on any of the intermediate quantities for any desired inputs.

More could be done here but I feel that this leaves things in a good place. There's a chance that I have written equally intricate code but happen to like it more because I wrote it. I trust my reviewers to keep me honest.

Release note: None

Co-authored-by: Tobias Grieger <[email protected]>
now := timeutil.Now()
exhaustedMicros := now.Sub(sg.exhaustedStart).Microseconds()
sg.ioTokensExhaustedDurationMetric.Inc(exhaustedMicros)
if sg.availableIOTokens <= 0 {
@sumeerbhola would there be any interest in backporting just this change from == 0 to <= 0 to v21.2?
The ioTokensExhaustedDurationMetric appears somewhat busted on v21.2 because every time the function containing this code (setAvailableIOTokensLocked in v21.2) is called when sg.availableIOTokens is less than 0, it re-adds now() - sg.exhaustedStart to the metric without resetting the value of sg.exhaustedStart.
On a node that's constantly overloaded, it ends up with an ioTokensExhaustedDurationMetric that increases roughly quadratically, such that taking its rate gives a linearly increasing graph over time, which doesn't seem to be the metric's intent? (especially given that this commit appears to have fixed it)
Sorry, I mean v22.1, not v21.2
yo @a-robinson, nice see you around! Few of us are out on PTO and others are swarming to shove stuff in for the impending release cut. Filed #86963 to make sure we fix it in 22.1, what you're saying looks right to me.
Thanks @irfansharif ! Clearly no rush since it's just a metric, just thought I'd mention it.
This switch to byte tokens will result in better accounting for
large writes, including ingests based on whether their bytes land
in L0 or elsewhere. It is also a precursor to taking into account
flush capacity (in bytes).
The store write admission control path now uses a StoreWorkQueue
which wraps a WorkQueue and provides additional functionality:
- Work can specify WriteBytes and whether it is an IngestRequest.
  This is used to decide how many byte tokens to consume.
- Done work specifies how many bytes were ingested into L0, so
  token consumption can be adjusted.
The main framework change is that a single work item can consume
multiple (byte) tokens, which ripples through the various
interfaces including requester, granter. There is associated
cleanup: kvGranter that was handling both slots and tokens is
eliminated since in practice it was only doing one or the other.
Instead, for the slot case the slotGranter is reused. For the token
case there is a new kvStoreTokenGranter.
The main logic change is in ioLoadListener which computes byte
tokens and various estimates. The change is (mostly) neutral if
no write provides WriteBytes, since the usual estimation will
take over. The integration changes in this PR are superficial in
that the requests don't provide WriteBytes. Improvements to the
integration, along with experimental results, will happen in
future PRs.
Informs #79092
Informs #77357
Release note: None