*: enables elastic CPU limiter for all users of ExportRequest #96691
Conversation
Force-pushed d6a94d1 to d46edea.
Force-pushed d1d4314 to 2033f54.
friendly ping @stevendanna / @irfansharif, I have a couple of export-related diffs queued up behind this, so I just wanted to put this on your radar. Thanks!
Overall this looks good to me. I've left some possible suggestions but none of them are blocking in my opinion.
db *kv.DB,
startKey, endKey roachpb.Key,
startTime, endTime hlc.Timestamp,
allRevs chan []VersionedValues,
There are only two callers, so I don't feel strongly about this, but you could pass something like a `func(batch []VersionedValues)`, which would give the caller a bit more flexibility.
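To make the suggestion concrete, here is a minimal Go sketch of the two shapes side by side. The placeholder types and function names are illustrative assumptions, not CockroachDB's actual identifiers; `kv.DB`, `roachpb.Key`, and `hlc.Timestamp` are replaced with local stand-ins so the sketch is self-contained:

```go
package example

import "context"

// Placeholder stand-ins for kv.DB, roachpb.Key, hlc.Timestamp, and the
// exported per-key revisions; the real types live in CockroachDB.
type (
	DB              struct{}
	Key             []byte
	Timestamp       struct{ WallTime int64 }
	VersionedValues struct{ Key Key }
)

// Channel-based shape (the current signature): the producer sends batches
// on allRevs and the caller has to run a consumer goroutine to drain it.
func getAllRevisionsChan(
	ctx context.Context, db *DB,
	startKey, endKey Key,
	startTime, endTime Timestamp,
	allRevs chan []VersionedValues,
) error {
	defer close(allRevs)
	// ... evaluate paginated ExportRequests, sending each batch on allRevs ...
	return nil
}

// Callback-based shape (the suggestion): the caller decides what to do with
// each batch, so a synchronous consumer needs no extra goroutine or channel.
func getAllRevisions(
	ctx context.Context, db *DB,
	startKey, endKey Key,
	startTime, endTime Timestamp,
	onBatch func(batch []VersionedValues) error,
) error {
	// ... for each batch: if err := onBatch(batch); err != nil { return err } ...
	return nil
}
```

The callback shape also composes naturally with pagination: the producer can invoke `onBatch` once per ExportResponse without the consumer needing to know how the span was chunked.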
Force-pushed dddc968 to c0e5a41.
Previously, there was a strange coupling between the elastic CPU limiter and the `header.TargetBytes` DistSender limit set on each ExportRequest. Even if a request was preempted on exhausting its allotted CPU tokens, it would only return from kvserver by virtue of its `header.TargetBytes` being set to a non-zero value. Out of the four users of ExportRequest, only backup set this field to a sentinel value of 1 to limit the number of SSTs we send back in an ExportResponse. The remaining callers of ExportRequest would not return from the kvserver. Instead they would evaluate the request from the resume key immediately, not giving the scheduler a chance to take the goroutine off CPU.

This change breaks this coupling by introducing a `resumeInfo` object that indicates whether the resumption was because we were over our CPU limit. If it was, we return an ExportResponse with our progress so far. This change shifts the burden of handling pagination to the client. This seems better than having the server sleep or wait around until its CPU tokens are replenished, as the client would be left wondering why a request is taking so long.

To that effect, this change adds pagination support to the other callers of ExportRequest. Note that we do not set `SplitMidKey` at these other callsites yet. Thus, all pagination will happen at key boundaries in the ExportRequest. A follow-up will add support for `SplitMidKey` to these callers.

Informs: cockroachdb#96684

Release note: None
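As a rough sketch of the client-side pagination this commit message describes (simplified stand-in types, not CockroachDB's actual client code), a caller re-issues the ExportRequest from the resume key until the span is exhausted:

```go
package example

import "context"

// Key is a simplified stand-in for roachpb.Key.
type Key []byte

// ExportRequest/ExportResponse are pared down to the fields the loop needs.
type ExportRequest struct{ StartKey, EndKey Key }

type ExportResponse struct {
	// ResumeKey is non-nil when the server returned early, e.g. because the
	// elastic CPU limiter preempted evaluation before reaching EndKey.
	ResumeKey Key
	Files     [][]byte // exported SSTs
}

// send stands in for issuing the request through the DistSender.
func send(ctx context.Context, req ExportRequest) (ExportResponse, error) {
	return ExportResponse{}, nil
}

// exportSpan pages through [startKey, endKey), restarting from the resume
// key after each partial response, so the server never has to sleep waiting
// for its CPU tokens to be replenished.
func exportSpan(ctx context.Context, startKey, endKey Key) ([][]byte, error) {
	var files [][]byte
	for start := startKey; start != nil; {
		resp, err := send(ctx, ExportRequest{StartKey: start, EndKey: endKey})
		if err != nil {
			return nil, err
		}
		files = append(files, resp.Files...)
		start = resp.ResumeKey // nil once the span is fully exported
	}
	return files, nil
}
```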
Force-pushed c0e5a41 to 4f9ec34.
flake is #97320

bors r=stevendanna
Build succeeded.
97916: kv: opt into Elastic CPU limiting resume spans r=adityamaru,arulajmani a=stevendanna

In #96691, we changed ExportRequest such that it returns resume spans that result from the elastic CPU limiter all the way to the caller. This has at least two problems:

1) In a mixed-version state, the caller might not yet know how to handle resume spans. This could result in incomplete responses erroneously being used as if they were full responses.

2) The DistSender inspects a request to determine whether it may stop early. If it shouldn't be able to stop early, then the request is split up, possibly sent in parallel, and all responses are combined. The code which combines responses asserts that neither side has a resume span. As a result, since the change was made we've seen failures such as `crdb_internal.fingerprint(): combining /Tenant/2/Table/106/1/-{8403574544142222370/0-37656332809536692} with /Tenant/{2/Table/106/1/436440321206557763/0-3}`.

Here, we add a new request header field to allow callers to indicate whether they are prepared to accept resume spans. Further, we add that new field into the logic in the DistSender which decides how to process requests. The downside here is that crdb_internal.fingerprint won't have its requests sent in parallel.

Release note: None

Fixes #97886, #97903

Epic: none

Co-authored-by: Steven Danna <[email protected]>
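A minimal sketch of the opt-in this description outlines. The field name `ReturnElasticCPUResumeSpans` and the helper below are assumptions for illustration; the actual PR may use different identifiers:

```go
package example

// Header sketches the new opt-in request header field; the name
// ReturnElasticCPUResumeSpans is hypothetical, not necessarily the
// identifier the PR introduces.
type Header struct {
	// Set only by callers that know how to stitch partial responses back
	// together, which also covers the mixed-version concern: old callers
	// never set it, so they never receive resume spans.
	ReturnElasticCPUResumeSpans bool
}

type BatchRequest struct{ Header Header }

// canStopEarly feeds the DistSender's routing decision: a request that may
// stop early must be sent serially, because splitting it and combining the
// partial responses is what tripped the resume-span assertion above.
func canStopEarly(ba BatchRequest) bool {
	return ba.Header.ReturnElasticCPUResumeSpans
}
```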
97717: multitenant: add AdminUnsplitRequest capability r=knz a=ecwall

Fixes #97716

1) Add a new `tenantcapabilitiespb.CanAdminUnsplit` capability.
2) Check capability in `Authorizer.HasCapabilityForBatch`.
3) Remove `ExecutorConfig.RequireSystemTenant` call from `execFactory.ConstructAlterTableUnsplit`, `execFactory.ConstructAlterTableUnsplitAll`.

Release note: None

98250: kvserver: add minimum cpu lb split threshold r=andrewbaptist a=kvoli

Previously, `kv.range_split.load_cpu_threshold` had no minimum setting value. It is undesirable to allow users to set this setting too low, as excessive splitting may occur. `kv.range_split.load_cpu_threshold` now has a minimum setting value of `10ms`. See #96869 for additional context on the threshold.

Resolves: #98107

Release note (ops change): `kv.range_split.load_cpu_threshold` now has a minimum setting value of `10ms`.

98270: dashboards: add replica cpu to repl dashboard r=xinhaoz a=kvoli

In #96127 we added the option to load balance replica CPU instead of QPS across stores in a cluster. It is desirable to view the signal being controlled for rebalancing in the replication dashboard, similar to QPS. This PR adds the `rebalancing.cpunanospersecond` metric to the replication metrics dashboard.

The avg QPS graph on the replication dashboard previously described the metric as "Exponentially weighted average"; however, this is not true. This PR updates the description to just be "moving average", which is accurate. Note that follow-the-workload does use an exponentially weighted value; however, the metric in the dashboard is not the same.

This PR also updates the graph header to include Replica in the title: "Average Replica Queries per Node". QPS is specific to replicas; this is already mentioned in the description.

Resolves: #98109

98289: storage: mvccExportToWriter should always return buffered range keys r=adityamaru a=stevendanna

In #96691, we changed the return type of mvccExportToWriter such that it now indicates when a CPU limit has been reached. As part of that change, when the CPU limit was reached, we would immediately `return` rather than `break`ing out of the loop. This introduced a bug, since the code after the loop that the `break` was taking us to is important. Namely, we may have previously buffered range keys that we still need to write into our response. By replacing the break with a return, these range keys were lost.

The end-user impact of this is that a BACKUP that _ought_ to have included range keys (such as a backup of a table with a rolled back IMPORT) would not include those range keys and thus would end up resurrecting deleted keys upon restore.

This PR brings back the `break` and adds a test that covers exporting with range keys under CPU exhaustion.

This bug only ever existed on master.

Informs #97694

Epic: none

Release note: None

98329: sql: fix iteration conditions in crdb_internal.scan r=ajwerner a=stevendanna

Rather than using the Next() key of the last key in the response when iterating, we should use the resume span. The previous code could result in a failure in the rare case that the end key of our scan exactly matched the successor key of the very last key in the iteration.

Epic: none

Release note: None

Co-authored-by: Evan Wall <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
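To illustrate the break-vs-return bug fixed in 98289, here is a simplified sketch (not the real `mvccExportToWriter`) of why the post-loop flush of buffered range keys only runs when the loop exits via `break`:

```go
package example

type rangeKey struct{ start, end string }

// exportWriter buffers range keys while point keys are iterated and flushes
// them after the iteration loop finishes.
type exportWriter struct {
	buffered []rangeKey
	written  []rangeKey
}

func (w *exportWriter) flushBuffered() {
	w.written = append(w.written, w.buffered...)
	w.buffered = w.buffered[:0]
}

// export stops early when overCPULimit reports true. A `return` inside the
// loop would skip flushBuffered and silently drop the buffered range keys;
// `break` falls through to the flush, which is the behavior the fix restores.
func export(pointKeys []string, overCPULimit func() bool) []rangeKey {
	w := &exportWriter{buffered: []rangeKey{{"a", "c"}}}
	for range pointKeys {
		if overCPULimit() {
			break // not `return`: the post-loop flush must still run
		}
		// ... write the point key, possibly buffering more range keys ...
	}
	w.flushBuffered()
	return w.written
}
```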