
release-22.2: kv,rangefeed: integrate catchup scans with elastic cpu #92439

Merged 5 commits on Nov 28, 2022

Conversation

irfansharif (Contributor)

Backport 4/4 commits from #89709.

/cc @cockroachdb/release


Part of #65957. First commit is from #89656.

Changefeed backfills, given their scan-heavy nature, can be fairly
CPU-intensive. In #89656 we introduced a roachtest demonstrating the
latency impact backfills can have on a moderately CPU-saturated cluster.
Similar to what we saw for backups, this CPU-heavy nature can elevate Go
scheduling latencies, which in turn translate into foreground latency
impact. This commit integrates rangefeed catchup scans with the elastic
CPU limiter we introduced in #86638; this is one of the two halves of
changefeed backfills. The second half is the initial scan -- scan
requests issued over some keyspan as of some timestamp. For that we
simply rely on the existing slots mechanism, but now set a lower
priority bit (BulkNormalPri). Experimentally we observed that during
initial scans the encoding routines in changefeedccl are the most
impactful CPU-wise, something #89589 can help with. We leave admission
integration of parallel worker goroutines to future work.
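
To make that second half concrete, an initial-scan batch would carry a
below-normal admission priority so the existing slot-based queue orders it
behind foreground traffic. The snippet below is a sketch of the idea, not
code from this PR; the helper name is made up and field usage is approximate:

    // Sketch only: tag an initial-scan batch with BulkNormalPri so the
    // slot-based admission queue orders it behind NormalPri foreground work.
    func initialScanAdmissionHeader() roachpb.AdmissionHeader {
        return roachpb.AdmissionHeader{
            Priority:   int32(admissionpb.BulkNormalPri), // below NormalPri
            CreateTime: timeutil.Now().UnixNano(),
            Source:     roachpb.AdmissionHeader_FROM_SQL,
        }
    }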

Unlike export requests, rangefeed catchup scans are non-preemptible. The
rangefeed RPC is a streaming one, and the catchup scan is done during
stream setup, so we don't have resumption tokens to propagate up to the
caller like we did for backups. We still want CPU-bound work going
through admission control to use at most 100ms of CPU time (it makes
for an ineffective controller otherwise), and to that end we introduce
the following component, used within the rangefeed catchup iterator:

// Pacer is used in tight loops (CPU-bound) for non-preemptible
// elastic work. Callers are expected to invoke Pace() every loop
// iteration and Close() once done. Internally this type integrates
// with the elastic CPU work queue, acquiring tokens for the CPU work
// being done, and blocking if tokens are unavailable. This allows
// for a form of cooperative scheduling with elastic CPU granters.
type Pacer struct
func (p *Pacer) Pace(ctx context.Context) error { ... }
func (p *Pacer) Close() { ... }
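
For illustration, a catchup-scan loop would cooperate with the Pacer roughly
as follows; the iterator and emit step are placeholders, not the actual
rangefeed code:

    func runCatchupScan(ctx context.Context, it catchupIterator, pacer *Pacer) error {
        defer pacer.Close()
        for it.Next() {
            // Each iteration asks the pacer for CPU tokens; this may block
            // if the elastic CPU granter has none, yielding to other work.
            if err := pacer.Pace(ctx); err != nil {
                return err
            }
            if err := emitEvent(it.Event()); err != nil { // placeholder emit step
                return err
            }
        }
        return nil
    }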

Release note: None
Release justification: Non-production code. In 22.2 this machinery is all switched off by default.

Pure code movement/renaming. We also renamed the cluster setting
kv.store.admission.provisioned_bandwidth to
kvadmission.store.provisioned_bandwidth.

Release note (general note): The cluster setting
kv.store.admission.provisioned_bandwidth has been renamed to
kvadmission.store.provisioned_bandwidth.

There's only ever one URL.

Release note: None
Part of cockroachdb#65957.

Changefeed backfills, given their scan-heavy nature, can be fairly
CPU-intensive. In cockroachdb#89656 we introduced a roachtest demonstrating the
latency impact backfills can have on a moderately CPU-saturated cluster.
Similar to what we saw for backups, this CPU-heavy nature can elevate Go
scheduling latencies, which in turn translate into foreground latency
impact. This commit integrates rangefeed catchup scans with the elastic
CPU limiter we introduced in cockroachdb#86638; this is one of the two halves of
changefeed backfills. The second half is the initial scan -- scan
requests issued over some keyspan as of some timestamp. For that we
simply rely on the existing slots mechanism, but now set a lower
priority bit (BulkNormalPri) -- cockroachdb#88733. Experimentally we observed that
during initial scans the encoding routines in changefeedccl are the most
impactful CPU-wise, something cockroachdb#89589 can help with. We leave admission
integration of parallel worker goroutines to future work (cockroachdb#90089).

Unlike export requests, rangefeed catchup scans are non-preemptible. The
rangefeed RPC is a streaming one, and the catchup scan is done during
stream setup, so we don't have resumption tokens to propagate up to the
caller like we did for backups. We still want CPU-bound work going
through admission control to use at most 100ms of CPU time, to exercise
better control over scheduling latencies. To that end, we introduce the
following component, used within the rangefeed catchup iterator.

    // Pacer is used in tight loops (CPU-bound) for non-preemptible
    // elastic work. Callers are expected to invoke Pace() every loop
    // iteration and Close() once done. Internally this type integrates
    // with the elastic CPU work queue, acquiring tokens for the CPU work
    // being done, and blocking if tokens are unavailable. This allows
    // for a form of cooperative scheduling with elastic CPU granters.
    type Pacer struct
    func (p *Pacer) Pace(ctx context.Context) error { ... }
    func (p *Pacer) Close() { ... }
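
Internally, the pacing amounts to something like the following loose sketch
(not the actual implementation; the helper methods are hypothetical): the
pacer holds on to a small admitted slice of CPU time and only goes back to
the elastic CPU work queue, possibly blocking, once that slice is used up.

    // Loose sketch of the pacing logic; helper names are hypothetical.
    func (p *Pacer) Pace(ctx context.Context) error {
        if p == nil {
            return nil // pacing disabled; calls are no-ops
        }
        if !p.sliceExhausted() {
            return nil // still within the previously admitted slice of CPU time
        }
        // Slice used up: block until the elastic CPU work queue admits
        // another one, keeping each admission well under 100ms of CPU time.
        return p.admitNewSlice(ctx)
    }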

Release note: None
.. catch-up scans, introduce a private cluster setting
(kvadmission.rangefeed_catchup_scan_elastic_control.enabled) to
selectively switch off catch-up scan integration if needed, and plumb
kvadmission.Pacer explicitly into the rangefeed catchup scan loop instead
of opaquely through the surrounding context.
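
A loose sketch of that gating (only the setting name comes from the commit;
the Go identifiers below are hypothetical): a pacer is constructed only when
the setting is enabled, and a nil pacer no-ops so callers need no special
casing.

    // Hypothetical sketch: hand out a pacer only when the private setting is on.
    func maybeNewCatchupScanPacer(st *cluster.Settings, q elasticCPUWorkQueue) *Pacer {
        if !rangefeedCatchupScanElasticControlEnabled.Get(&st.SV) {
            // kvadmission.rangefeed_catchup_scan_elastic_control.enabled = false
            return nil // nil *Pacer: Pace()/Close() become no-ops
        }
        return &Pacer{wq: q}
    }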

Release note: None
@irfansharif requested reviews from a team as code owners November 24, 2022 01:52
@irfansharif requested review from herkolategan and removed request for a team November 24, 2022 01:52
blathers-crl bot commented Nov 24, 2022

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues or test-only changes.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user who doesn’t know or care about this backport has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@irfansharif requested reviews from smg260 and miretskiy and removed request for a team November 24, 2022 01:52
cockroach-teamcity (Member)

This change is Reviewable

We want this to be disabled by default in 22.2, used only selectively.

Release note: None
miretskiy (Contributor)

This is a fairly large PR; even though this functionality is turned off, do we still want to merge it? I think having a particular client in mind who would turn on this setting would be useful. Otherwise, if there is no such client, why bother?

shermanCRL (Contributor)

@miretskiy Well, we agree that CPU spikes, and therefore SQL latency, during changefeed backfill are a fairly universal problem. Your question is about the value/risk of the backport, I assume?

We could give more bake time on v23.1 and then re-evaluate the backport. Or @irfansharif can make the case on risk/return.

irfansharif (Contributor, Author)

  • Large is in the eye of the beholder; I've backported bigger things, and this to me is relatively small -- it's not adding new machinery, it's adding a straightforward integration. There's a commit in this PR carving out a separate package (purely code movement) that's contributing to the high line count.
  • We were deliberate about backporting "changefeed,kvcoord: populate AC headers for backfill work" #90093 earlier into the .0 release to pave the way for just this PR.
  • It's all switched off, as you can see. KV L2 pains have taught me it's better to have something available that's switched off than to scramble for a subsequent release/custom binary for an already-fixed problem.
  • There is a customer who's asked for it; see this internal document.
  • This has had almost four weeks of baking time on master now with no fallout. We're not going to learn much more at this point.

miretskiy (Contributor)

Cool -- I just want to make sure we have existing pains we are trying to solve.
