Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: throttling row sampling is based on potentially dubious CPU signal #103472

Closed
kvoli opened this issue May 16, 2023 · 0 comments · Fixed by #104153
Closed

sql: throttling row sampling is based on potentially dubious CPU signal #103472

kvoli opened this issue May 16, 2023 · 0 comments · Fixed by #104153
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team

Comments

@kvoli
Copy link
Collaborator

kvoli commented May 16, 2023

The SQL stats row sampler uses the normalized CPU to pace and throttle sampling:

// Look at CRDB's average CPU usage in the last 10 seconds:
// - if it is lower than cpuUsageMinThrottle, we do not throttle;
// - if it is higher than cpuUsageMaxThrottle, we throttle all the way;
// - in-between, we scale the idle time proportionally.
usage := s.FlowCtx.Cfg.RuntimeStats.GetCPUCombinedPercentNorm()

However due to #101633, the normalized CPU signal used may be under reported if GOMAXPROCS is set to a lower number than the number of CPUs available for the process - num_cpus.

This may lead to a lack of throttling due to the underreported normalized CPU.

Jira issue: CRDB-28020

@kvoli kvoli added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label May 16, 2023
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label May 16, 2023
craig bot pushed a commit that referenced this issue Jun 1, 2023
104153: status: account for gomaxprocs in cpu utilization r=kvoli a=kvoli

When a CRDB process had GOMAXPROCS set lower than the number of CPUs
available, the normalized CPU utilization metric would only account for
the number of processors. However, the process could never use more than
GOMAXPROCS processors in parallel, capping CPU capacity. As a result,
the normalized CPU utilization would be under reported.

e.g. When the number of CPUs available is 4, GOMAXPROCS is 2 and the
usage is 1, utilization would be reported as 25%, whilst the real
utilized capacity is 50%.

Update the normalized CPU calculation to take the GOMAXPROCS into
account, using the smallest capacity for utilization calculation. This
affects the `sys.cpu.combined.percent-normalized` metric.

Fixes: #101633
Fixes: #103472

Release note (bug fix): `sys.cpu.combined.percent-normalized` now
uses `GOMAXPROCS`, if lower than the number of CPU shares when
calculating CPU utilization.

Co-authored-by: Austen McClernon <[email protected]>
@craig craig bot closed this as completed in df55958 Jun 1, 2023
blathers-crl bot pushed a commit that referenced this issue Jun 1, 2023
When a CRDB process had GOMAXPROCS set lower than the number of CPUs
available, the normalized CPU utilization metric would only account for
the number of processors. However, the process could never use more than
GOMAXPROCS processors in parallel, capping CPU capacity. As a result,
the normalized CPU utilization would be under reported.

e.g. When the number of CPUs available is 4, GOMAXPROCS is 2 and the
usage is 1, utilization would be reported as 25%, whilst the real
utilized capacity is 50%.

Update the normalized CPU calculation to take the GOMAXPROCS into
account, using the smallest capacity for utilization calculation. This
affects the `sys.cpu.combined.percent-normalized` metric.

Fixes: #101633
Fixes: #103472

Release note (bug fix): `sys.cpu.combined.percent-normalized` now
uses `GOMAXPROCS`, if lower than the number of CPU shares when
calculating CPU utilization.
kvoli added a commit to kvoli/cockroach that referenced this issue Jun 7, 2023
When a CRDB process had GOMAXPROCS set lower than the number of CPUs
available, the normalized CPU utilization metric would only account for
the number of processors. However, the process could never use more than
GOMAXPROCS processors in parallel, capping CPU capacity. As a result,
the normalized CPU utilization would be under reported.

e.g. When the number of CPUs available is 4, GOMAXPROCS is 2 and the
usage is 1, utilization would be reported as 25%, whilst the real
utilized capacity is 50%.

Update the normalized CPU calculation to take the GOMAXPROCS into
account, using the smallest capacity for utilization calculation. This
affects the `sys.cpu.combined.percent-normalized` metric.

Fixes: cockroachdb#101633
Fixes: cockroachdb#103472

Release note (bug fix): `sys.cpu.combined.percent-normalized` now
uses `GOMAXPROCS`, if lower than the number of CPU shares when
calculating CPU utilization.
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

1 participant