
kvserver: distribute COCKROACH_SCHEDULER_CONCURRENCY across stores #102859

Merged
merged 4 commits into from
May 10, 2023

Conversation

@erikgrinaker (Contributor) commented May 7, 2023

kvserver: ensure at least 1 Raft scheduler worker

Previously, it was possible to create a Raft scheduler with no workers by setting `COCKROACH_SCHEDULER_CONCURRENCY=0`. This patch ensures we always create at least 1 worker per scheduler.

Release note: None
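As a minimal sketch of the clamp described above (the function name here is illustrative, not the actual kvserver identifier):

```go
package main

import "fmt"

// newSchedulerWorkers clamps the configured concurrency to at least 1,
// so COCKROACH_SCHEDULER_CONCURRENCY=0 can no longer produce a Raft
// scheduler with no workers. (Hypothetical name; not the real code.)
func newSchedulerWorkers(configured int) int {
	if configured < 1 {
		return 1
	}
	return configured
}

func main() {
	fmt.Println(newSchedulerWorkers(0)) // 1
	fmt.Println(newSchedulerWorkers(8)) // 8
}
```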

kvserver: pass Raft scheduler workers via StoreConfig

Release note: None

kvserver: move StoreConfig.SetDefaults() out of NewStore()

Release note: None

kvserver: distribute COCKROACH_SCHEDULER_CONCURRENCY across stores

`COCKROACH_SCHEDULER_CONCURRENCY` defaults to 8 workers per CPU core, capped at 96, to avoid putting excessive pressure on the Go scheduler. However, it was applied individually to each store, so a node with e.g. 10 stores could run up to 960 workers. This can lead to scheduler thrashing, as well as excessive memory usage, since the worker count also bounds the amount of data pulled into memory for concurrent Raft ready processing.

This patch instead divides the worker count across stores.
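A hedged sketch of the division described above, under the assumption that the cap applies per node and each store keeps at least 1 worker (function names are illustrative, not the actual kvserver identifiers):

```go
package main

import "fmt"

// defaultTotalWorkers mirrors the documented default: 8 workers per
// CPU core, capped at 96 — now interpreted as a per-node budget.
// (Hypothetical name; not the real kvserver code.)
func defaultTotalWorkers(numCPU int) int {
	workers := 8 * numCPU
	if workers > 96 {
		workers = 96
	}
	return workers
}

// workersPerStore divides the node-wide budget evenly across stores,
// keeping at least 1 worker per store.
func workersPerStore(totalWorkers, numStores int) int {
	workers := totalWorkers / numStores
	if workers < 1 {
		workers = 1
	}
	return workers
}

func main() {
	// A 32-core node: the budget is capped at 96 workers for the node.
	total := defaultTotalWorkers(32)
	fmt.Println(total)                      // 96
	fmt.Println(workersPerStore(total, 1))  // 96: single-store nodes unaffected
	fmt.Println(workersPerStore(total, 10)) // 9: instead of 960 workers node-wide
}
```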

Resolves #102838.

Epic: none
Release note (ops change): the default Raft scheduler concurrency, controlled by `COCKROACH_SCHEDULER_CONCURRENCY` and defaulting to 8 per CPU core capped at 96, is now divided evenly across stores instead of applying individually per store. This avoids excessive Go scheduler pressure and memory usage on nodes with many stores. The common case of 1 store per node is not affected.

@erikgrinaker erikgrinaker added backport-22.1.x backport-23.1.x Flags PRs that need to be backported to 23.1 labels May 7, 2023
@erikgrinaker erikgrinaker requested review from pav-kv, tbg and a team May 7, 2023 15:37
@erikgrinaker erikgrinaker self-assigned this May 7, 2023
@erikgrinaker erikgrinaker requested a review from a team as a code owner May 7, 2023 15:37
@cockroach-teamcity (Member): This change is Reviewable

@erikgrinaker erikgrinaker changed the title Raft scheduler concurrency kvserver: distribute COCKROACH_SCHEDULER_CONCURRENCY across stores May 7, 2023
@erikgrinaker (Contributor, Author) commented May 7, 2023

I feel like this is okay to backport, since I think this was always the intended behavior, and we've seen the existing behavior cause problems on nodes with many stores. But we can consider gating this behind an environment variable for the backports.

@erikgrinaker erikgrinaker force-pushed the raft-scheduler-concurrency branch 2 times, most recently from 67a1b1f to a9ed7be Compare May 7, 2023 16:02
```go
if storeTestingKnobs := cfg.TestingKnobs.Store; storeTestingKnobs != nil {
	storeCfg.TestingKnobs = *storeTestingKnobs.(*kvserver.StoreTestingKnobs)
}
storeCfg.SetDefaults()
```
Member

Could imagine this would break a few tests. I guess it hasn't, since CI looks green. Nice!

@erikgrinaker (Contributor, Author)

Yeah, most tests use `testStoreConfig()`, which already calls `SetDefaults()`.

@erikgrinaker (Contributor, Author)

Any thoughts on gating this behind an environment variable for backports, or shipping it as-is? Alternatively, scaling up the worker count somewhat with multiple stores?

@irfansharif irfansharif added the O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs label May 10, 2023
@erikgrinaker (Contributor, Author) commented May 10, 2023

Because the outage here seemed to be related to quadratic behavior of Raft probes due to reproposals (#102956) and the inefficient application of these probes (#102953), those backports seem like a safer and more effective fix.

Backporting this PR to 23.1 seems fine, because 23.1 already shards the Raft scheduler, which caps the number of workers available to a range (16 per shard). Backporting to 22.2 is riskier, because we'd be significantly reducing the available workers per range.

I'm therefore only backporting to 23.1 for now. If some users need this in 22.2, they can achieve the same effect by setting `COCKROACH_SCHEDULER_CONCURRENCY` to an appropriate per-store value.

bors r+

@erikgrinaker (Contributor, Author)

Oops, forgot to address Pavel's comment.

bors r-

@craig (bot) commented May 10, 2023

Canceled.

@erikgrinaker erikgrinaker force-pushed the raft-scheduler-concurrency branch from a9ed7be to a448edc Compare May 10, 2023 19:57
@erikgrinaker (Contributor, Author)

bors r+

@craig (bot) commented May 10, 2023

Build succeeded:


Successfully merging this pull request may close these issues.

kvserver: limit Raft scheduler concurrency across all stores
5 participants