
distsql: make the number of DistSQL runners dynamic #84946

Merged 2 commits into cockroachdb:master on Aug 1, 2022

Conversation

@yuzefovich (Member) commented Jul 22, 2022

distsql: make the number of DistSQL runners dynamic

This commit improves the infrastructure around the pool of "DistSQL
runners" that are used for issuing SetupFlow RPCs in parallel.
Previously, we had a hard-coded pool of 16 goroutines, which was
probably insufficient in many cases. This commit changes the default
pool size to 4 x N(CPUs) so that it is proportional to how beefy the
node is (under the expectation that the larger the node, the more
distributed queries it will be handling). Four was chosen as the
multiple so that we keep the previous default of 16 on machines with
4 CPUs.

Additionally, this commit introduces a mechanism to dynamically adjust
the number of runners based on a cluster setting. Whenever the setting
is reduced, some of the workers are stopped; whenever it is increased,
new workers are spun up accordingly. A coordinator goroutine listens on
two channels: one signaling server quiescence and another carrying the
new target pool size. Whenever a new target size is received, the
coordinator spins up or shuts down one worker at a time until that
target size is reached. The workers, however, never access the server
quiescence channel directly; instead, each relies on the coordinator to
tell it to exit (either by closing a shared channel on quiescence or by
sending a single message on it when the target size is decreased).
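A minimal sketch of that coordinator/worker shape (all names here are
hypothetical illustrations, not the actual CockroachDB code):

package runnerpool

type coordinator struct {
	// quiescing is closed when the server shuts down. Only the
	// coordinator selects on it; workers never see it.
	quiescing chan struct{}
	// newTarget carries the desired pool size whenever the cluster
	// setting changes.
	newTarget chan int
	// stopWorker is how the coordinator tells workers to exit: one
	// message per worker on a downsize, or a close() on quiescence.
	stopWorker chan struct{}
	// work carries the queued SetupFlow requests.
	work chan func()
	// numWorkers is only touched by the coordinator goroutine.
	numWorkers int
}

func (c *coordinator) run() {
	for {
		select {
		case <-c.quiescing:
			// Tell every worker to exit at once.
			close(c.stopWorker)
			return
		case target := <-c.newTarget:
			// Adjust the pool one worker at a time until the new
			// target size is reached.
			for c.numWorkers < target {
				go c.worker()
				c.numWorkers++
			}
			for c.numWorkers > target {
				c.stopWorker <- struct{}{}
				c.numWorkers--
			}
		}
	}
}

func (c *coordinator) worker() {
	for {
		select {
		case <-c.stopWorker:
			// Exits both on a downsize (single message received) and
			// on quiescence (channel closed).
			return
		case fn := <-c.work:
			fn()
		}
	}
}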

Fixes: #84459.

Release note: None

distsql: change the flow setup code a bit

Previously, when setting up a distributed plan, we would wait for all
SetupFlow RPCs to come back before setting up the flow on the gateway.
Most likely (in the happy scenario) all of those RPCs will be
successful, so we can parallelize the happy path a bit by setting up
the local flow while the RPCs are in flight, which is what this commit
does. This seems especially beneficial given the change in the previous
commit to increase the number of DistSQL runners on beefy machines: we
are now more likely to issue SetupFlow RPCs asynchronously.

Release note: None
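A hedged sketch of that reordering, with hypothetical stand-ins for the
real SetupFlow RPC and the gateway's local setup:

package sketch

import "context"

type flowSpec struct{ nodeID int }

// issueSetupFlowRPC and setupLocalFlow are hypothetical stand-ins for
// the real SetupFlow RPC and the gateway's local flow setup.
func issueSetupFlowRPC(ctx context.Context, f flowSpec) error { return nil }
func setupLocalFlow(ctx context.Context) error                { return nil }

func setupAllFlows(ctx context.Context, remote []flowSpec) error {
	// Kick off the SetupFlow RPCs without waiting for them.
	resultCh := make(chan error, len(remote))
	for _, f := range remote {
		f := f
		go func() { resultCh <- issueSetupFlowRPC(ctx, f) }()
	}
	// Set up the flow on the gateway while the RPCs are in flight; in
	// the happy path this overlaps with the network round trips.
	if err := setupLocalFlow(ctx); err != nil {
		return err
	}
	// Only now wait for the remote setups to come back.
	for range remote {
		if err := <-resultCh; err != nil {
			return err
		}
	}
	return nil
}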

@cockroach-teamcity (Member) commented: This change is Reviewable

@yuzefovich force-pushed the distsql-runners branch 2 times, most recently from b26e334 to b8f1343 on July 22, 2022 23:53
@yuzefovich (Member, Author) commented:

cc @dt I'm curious if you could take a look at the first commit - this is what I was thinking about when asking on Slack last week.

var settingDistSQLRunnersCPUMultiple = settings.RegisterIntSetting(
settings.TenantWritable,
"sql.distsql.runners_cpu_multiple",
"the value multiplied by the number of CPUs on a node to determine "+
@dt (Member) commented on pkg/sql/distsql_running.go line 65 at r1:

Do we ever imagine wanting to set this to less than numCores?

You could have the value just be the desired worker-per-node count, and
then have the default value be 4*numCores?
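Taking this suggestion together with the default that appears in a
later hunk (4*int64(runtime.GOMAXPROCS(0))), the resulting shape would
look roughly like the following sketch; the setting name, description,
and validation function here are guesses, not the exact merged code:

package sql

import (
	"runtime"

	"github.com/cockroachdb/cockroach/pkg/settings"
	"github.com/cockroachdb/errors"
)

var settingDistSQLNumRunners = settings.RegisterIntSetting(
	settings.TenantWritable,
	"sql.distsql.num_runners",
	"determines the number of DistSQL runner goroutines used for "+
		"issuing SetupFlow RPCs",
	// The default of 4x the number of CPUs keeps the previous
	// hard-coded value of 16 on machines with 4 CPUs.
	4*int64(runtime.GOMAXPROCS(0)),
	func(v int64) error {
		if v < 0 {
			return errors.New("cannot be set to a negative value")
		}
		return nil
	},
)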

return
}
}
// Whenever the corresponding setting is updated, we need to notify the
@dt (Member) commented on pkg/sql/distsql_running.go line 160 at r2:

If you were so inclined, you could expand this comment with a note of
the fact that initRunners is only called once per server lifetime, so
this won't leak an unbounded number of onChange callbacks, since in
general an onChange not in an init() is suspect.
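As a sketch of why that holds (all names hypothetical): initRunners is
called exactly once per server lifetime, so exactly one callback is
ever registered.

func (dsp *DistSQLPlanner) initRunners(ctx context.Context) {
	// ... spin up the initial pool of workers ...

	// Safe: initRunners is called once per server lifetime, so this
	// registers exactly one onChange callback rather than leaking an
	// unbounded number of them.
	settingDistSQLNumRunners.SetOnChange(&dsp.st.SV, func(ctx context.Context) {
		newSize := int(settingDistSQLNumRunners.Get(&dsp.st.SV))
		dsp.runnerCoordinator.newTarget <- newSize
	})
}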

@yuzefovich (Member, Author) left a comment:

Thanks for taking a look!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt)


pkg/sql/distsql_running.go line 65 at r1 (raw file):

Previously, dt (David Taylor) wrote…

Do we ever imagine wanting to set this to less than numCores?

You could have the value just be the desired worker-per-node count, and
then have the default value be 4*numCores?

Yeah, good point, done.


pkg/sql/distsql_running.go line 160 at r2 (raw file):

Previously, dt (David Taylor) wrote…

If you were so inclined, you could expand this comment with a note of
the fact that initRunners is only called once per server lifetime, so
this won't leak an unbounded number of onChange callbacks, since in
general an onChange not in an init() is suspect.

Done.

@yuzefovich marked this pull request as ready for review July 25, 2022 22:22
@yuzefovich requested a review from a team as a code owner July 25, 2022 22:22
@yuzefovich requested reviews from michae2 and cucaroach July 25, 2022 22:23
@yuzefovich requested a review from a team as a code owner July 26, 2022 23:22
@yuzefovich removed the request for review from a team July 26, 2022 23:22
@yuzefovich (Member, Author) commented:

The last commit in this PR exposed a possible data race which is now fixed in the first commit. That first commit is in #85101 and should be ignored here.

@yuzefovich (Member, Author) commented:

Rebased on top of #85101, so now all commits in the PR are relevant.

@cucaroach (Contributor) left a comment:

Reviewed 2 of 3 files at r5, 3 of 3 files at r7.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @dt, @michae2, and @yuzefovich)


pkg/sql/distsql_running.go line 72 at r7 (raw file):

	// original value of 16 on machines with 4 CPUs.
	4*int64(runtime.GOMAXPROCS(0)), /* defaultValue */
	func(v int64) error {

What happens if this is set to 0? Does the setup flow phase fail or something, with an error getting sent back to the gateway?

@yuzefovich (Member, Author) left a comment:

TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @dt, and @michae2)


pkg/sql/distsql_running.go line 72 at r7 (raw file):

Previously, cucaroach (Tommy Reilly) wrote…

What happens if this is set to 0? Does the setup flow phase fail or something, with an error getting sent back to the gateway?

In that case, the setup goroutine on the gateway issues all SetupFlow RPCs for remote nodes sequentially, followed by performing the setup on the gateway itself. The pool of DistSQL runners allows those RPCs to be issued in parallel, but if the pool is used up, the setup goroutine doesn't wait for a worker to free up and instead does the "work" itself.
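A sketch of that hand-off with hypothetical names: the send to the
runner channel is non-blocking, so a pool size of zero simply means the
default branch always runs.

package sketch

// runnerRequest stands in for a queued SetupFlow request.
type runnerRequest struct {
	// run issues the SetupFlow RPC.
	run func()
}

// dispatch hands req to the runner pool without ever blocking on it.
func dispatch(runnerCh chan<- runnerRequest, req runnerRequest) {
	select {
	case runnerCh <- req:
		// An idle runner picked up the request; the RPC is issued in
		// parallel with the rest of the setup.
	default:
		// No idle runner is available (including when the pool size is
		// 0): the setup goroutine does the "work" itself, issuing the
		// RPC synchronously.
		req.run()
	}
}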

@craig (bot) merged commit 314baa5 into cockroachdb:master on Aug 1, 2022

@craig (bot) commented Aug 1, 2022:

Build succeeded.
