-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: fine-grained cpu attribution #82625
Comments
82013: storage: add MVCC point synthesizing iterator r=jbowens a=erikgrinaker This patch adds `pointSynthesizingIter`, an MVCC iterator which wraps an arbitrary MVCC iterator and synthesizes point keys for range keys at their start key and where they overlap point keys. It can optionally synthesize around the SeekGE seek key too, which is useful for point operations like `MVCCGet` where we may want to return a synthetic tombstone for an MVCC range tombstone if there is no existing point key. This will primarily be used to handle MVCC range tombstones in MVCC scans and gets, as well as during MVCC conflict checks, which allows much of this logic to remain unchanged and simplified (in particular, `pebbleMVCCScanner` will not need any changes). However, this patch does not make use of the iterator yet, since both it and Pebble will need further performance optimizations for use in hot paths. For now, correctness is sufficient, and only basic attempts at performance optimization have been made. Touches #70412. Release note: None 82547: opt: fix inverted key column type in testcat r=mgartner a=mgartner Previously, inverted key columns in the test catalog were incorrectly given the type of their source column. Now they have the correct type, `types.EncodedKey`. Release note: None 82635: bazel: unlist unsupported os+arch combos r=irfansharif a=irfansharif We don't support these for CRDB builds. Even windows is experimental, but lets keep that around anyway since we publish binaries for it. +cc #82625. Release note: None Co-authored-by: Erik Grinaker <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: irfan sharif <[email protected]>
Connecting some dots. From an internal doc around admission control, which would benefit from:
I was also curious about how the Go runtime is able to pace its GC to use only 30% CPU, and came across https://github.com/golang/proposal/blob/master/design/44167-gc-pacer-redesign.md#a-note-about-cpu-utilization (this whole document is fantastic). This was in the context of trying to understand how we could possibly do similar per-node CPU control for tenants and/or ranges, or get more accurate resource use signals. Turns out Go internally just reads off of nanotime(), using it to compute running time for individual goroutines doing GC assist work, without trying to be accurate about OS threads being descheduled. +cc golang/go#41554 (comment). |
Package grunning is a library that's able to retrieve on-CPU running time for individual goroutines. It relies on using a patched Go and provides a primitive for fine-grained CPU attribution and control through a single API: package grunning // Time returns the time spent by the current goroutine in the // running state. func Time() time.Duration The motivating RFC is over at cockroachdb#82356. Informs cockroachdb#82625. We build CRDB using use the patched Go runtime for all officially supported platforms when built using Bazel (cockroachdb#84867). Engineers commonly building CRDB also use happen to use two platforms we don't use a patched Go for: - FreeBSD (we don't have cross-compilers setup), and - M1/M2 Macs (we don't have a code-signing pipeline, yet). We use '(darwin && arm64) || freebsd || !bazel' as the build tag to exclude such platforms. See cockroachdb#84867 for more details. This package tests various properties we should expect over the running time value. It does not make assertions given the CI environments we run these under (CPU-starved, lot of OS thread pre-emption, dissimilar to healthy CRDB deployments). This is also why they're skipped under stress. Still, these tests are useful to understand the properties we expect running time to have: === RUN TestEquivalentGoroutines thread=03 expected≈10.00% got= 9.98% of on-cpu time thread=06 expected≈10.00% got=10.00% of on-cpu time thread=02 expected≈10.00% got=10.01% of on-cpu time thread=10 expected≈10.00% got=10.01% of on-cpu time thread=07 expected≈10.00% got= 9.99% of on-cpu time thread=04 expected≈10.00% got= 9.99% of on-cpu time thread=09 expected≈10.00% got=10.00% of on-cpu time thread=01 expected≈10.00% got= 9.99% of on-cpu time thread=08 expected≈10.00% got=10.02% of on-cpu time thread=05 expected≈10.00% got=10.02% of on-cpu time --- PASS: TestEquivalentGoroutines (0.56s) === RUN TestProportionalGoroutines thread=01 got 1.82% of on-cpu time: expected≈ 1.00x got=1.00x thread=02 got 3.64% of on-cpu time: expected≈ 2.00x got=2.00x thread=03 got 5.47% of on-cpu time: expected≈ 3.00x got=3.00x thread=04 got 7.28% of on-cpu time: expected≈ 4.00x got=4.00x thread=05 got 9.09% of on-cpu time: expected≈ 5.00x got=4.99x thread=06 got 10.91% of on-cpu time: expected≈ 6.00x got=5.99x thread=07 got 12.73% of on-cpu time: expected≈ 7.00x got=6.99x thread=08 got 14.54% of on-cpu time: expected≈ 8.00x got=7.99x thread=09 got 16.36% of on-cpu time: expected≈ 9.00x got=8.99x thread=10 got 18.16% of on-cpu time: expected≈10.00x got=9.97x --- PASS: TestProportionalGoroutines (1.72s) === RUN TestPingPongHog pinger/ponger expected≈1.00x got=0.96x --- PASS: TestPingPongHog (0.91s) Release note: None
Package grunning is a library that's able to retrieve on-CPU running time for individual goroutines. It relies on using a patched Go and provides a primitive for fine-grained CPU attribution and control through a single API: package grunning // Time returns the time spent by the current goroutine in the // running state. func Time() time.Duration The motivating RFC is over at cockroachdb#82356. Informs cockroachdb#82625. We build CRDB using use the patched Go runtime for all officially supported platforms when built using Bazel (cockroachdb#84867). Engineers commonly building CRDB also use happen to use two platforms we don't use a patched Go for: - FreeBSD (we don't have cross-compilers setup), and - M1/M2 Macs (we don't have a code-signing pipeline, yet). We use '(darwin && arm64) || freebsd || !bazel' as the build tag to exclude such platforms. See cockroachdb#84867 for more details. This package tests various properties we should expect over the running time value. It does not make assertions given the CI environments we run these under (CPU-starved, lot of OS thread pre-emption, dissimilar to healthy CRDB deployments). This is also why they're skipped under stress. Still, these tests are useful to understand the properties we expect running time to have: === RUN TestEquivalentGoroutines thread=03 expected≈10.00% got= 9.98% of on-cpu time thread=06 expected≈10.00% got=10.00% of on-cpu time thread=02 expected≈10.00% got=10.01% of on-cpu time thread=10 expected≈10.00% got=10.01% of on-cpu time thread=07 expected≈10.00% got= 9.99% of on-cpu time thread=04 expected≈10.00% got= 9.99% of on-cpu time thread=09 expected≈10.00% got=10.00% of on-cpu time thread=01 expected≈10.00% got= 9.99% of on-cpu time thread=08 expected≈10.00% got=10.02% of on-cpu time thread=05 expected≈10.00% got=10.02% of on-cpu time --- PASS: TestEquivalentGoroutines (0.56s) === RUN TestProportionalGoroutines thread=01 got 1.82% of on-cpu time: expected≈ 1.00x got=1.00x thread=02 got 3.64% of on-cpu time: expected≈ 2.00x got=2.00x thread=03 got 5.47% of on-cpu time: expected≈ 3.00x got=3.00x thread=04 got 7.28% of on-cpu time: expected≈ 4.00x got=4.00x thread=05 got 9.09% of on-cpu time: expected≈ 5.00x got=4.99x thread=06 got 10.91% of on-cpu time: expected≈ 6.00x got=5.99x thread=07 got 12.73% of on-cpu time: expected≈ 7.00x got=6.99x thread=08 got 14.54% of on-cpu time: expected≈ 8.00x got=7.99x thread=09 got 16.36% of on-cpu time: expected≈ 9.00x got=8.99x thread=10 got 18.16% of on-cpu time: expected≈10.00x got=9.97x --- PASS: TestProportionalGoroutines (1.72s) === RUN TestPingPongHog pinger/ponger expected≈1.00x got=0.96x --- PASS: TestPingPongHog (0.91s) Release note: None
85850: importer: support {} format for arrays in CSV r=ecwall a=rafiss fixes #84631 Release note (sql change): Arrays can now be imported in a CSV file using the {} format, similar to COPY FROM. Importing array expressions (e.g. ARRAY[1, 2, 3]) is still supported as well. 85854: clusterversion,storage: remove 22.1 PebbleFormat version gates r=jbowens a=celiala This commit removes the following 22.1 version gates: - PebbleFormatBlockPropertyCollector - PebbleFormatSplitUserKeysMarked Cleanup was done following guidance from [21.2 cleanup](#74270 (comment)): > For the most part, if the gates were just simple if !version.IsActive { return x } or something, I just removed the block, and even if it was a little more complicated, like args = [x]; if version { args = append(args, y) }; foo(args) I still tried to mostly inline it such that it looked natural (i.e. remove that append and make it args = [x, y]). > However for just a couple more complicated cases that were referring to <21.2 versions that needed to be replaced when those were deleted, I added a placeholder clusterversion.TODOPre21_2 alias for 21.2. Replacing those calls with this alias shouldn't change their behavior -- it was already always true, since the code today should never run in a <21.2 cluster -- but means we can delete those older versions in the meantime and then the owners of these bits can decide how to update them. Partially addresses #80663 Release note: none 85909: kvserver: instrument RaftTransport workers with pprof labels r=tbg,erikgrinaker a=pavelkalinnikov The unused arguments in the method signature were used to identify goroutines in traces. This no longer works after Go 1.17 started passing arguments via registers. This commit adds pprof labels when starting these goroutines, to have a cleaner code, more readable traces, and to work around the new Go convention. Release note: None 85977: grunning: add library for precise on-CPU time measurement r=irfansharif a=irfansharif Package grunning is a library that's able to retrieve on-CPU running time for individual goroutines. It relies on using a patched Go and provides a primitive for fine-grained CPU attribution and control through a single API: package grunning // Time returns the time spent by the current goroutine in the // running state. func Time() time.Duration The motivating RFC is over at #82356. Informs #82625. We build CRDB using use the patched Go runtime for all officially supported platforms when built using Bazel (#84867). Engineers commonly building CRDB also use happen to use two platforms we don't use a patched Go for: - FreeBSD (we don't have cross-compilers setup), and - M1/M2 Macs (we don't have a code-signing pipeline, yet). We use '(darwin && arm64) || freebsd || !bazel' as the build tag to exclude such platforms. See #84867 for more details. This package tests various properties we should expect over the running time value. It does not make assertions given the CI environments we run these under (CPU-starved, lot of OS thread pre-emption, dissimilar to healthy CRDB deployments). This is also why they're skipped under stress. Still, these tests are useful to understand the properties we expect running time to have: === RUN TestEquivalentGoroutines thread=03 expected≈10.00% got= 9.98% of on-cpu time thread=06 expected≈10.00% got=10.00% of on-cpu time thread=02 expected≈10.00% got=10.01% of on-cpu time thread=10 expected≈10.00% got=10.01% of on-cpu time thread=07 expected≈10.00% got= 9.99% of on-cpu time thread=04 expected≈10.00% got= 9.99% of on-cpu time thread=09 expected≈10.00% got=10.00% of on-cpu time thread=01 expected≈10.00% got= 9.99% of on-cpu time thread=08 expected≈10.00% got=10.02% of on-cpu time thread=05 expected≈10.00% got=10.02% of on-cpu time --- PASS: TestEquivalentGoroutines (0.56s) === RUN TestProportionalGoroutines thread=01 got 1.82% of on-cpu time: expected≈ 1.00x got=1.00x thread=02 got 3.64% of on-cpu time: expected≈ 2.00x got=2.00x thread=03 got 5.47% of on-cpu time: expected≈ 3.00x got=3.00x thread=04 got 7.28% of on-cpu time: expected≈ 4.00x got=4.00x thread=05 got 9.09% of on-cpu time: expected≈ 5.00x got=4.99x thread=06 got 10.91% of on-cpu time: expected≈ 6.00x got=5.99x thread=07 got 12.73% of on-cpu time: expected≈ 7.00x got=6.99x thread=08 got 14.54% of on-cpu time: expected≈ 8.00x got=7.99x thread=09 got 16.36% of on-cpu time: expected≈ 9.00x got=8.99x thread=10 got 18.16% of on-cpu time: expected≈10.00x got=9.97x --- PASS: TestProportionalGoroutines (1.72s) === RUN TestPingPongHog pinger/ponger expected≈1.00x got=0.96x --- PASS: TestPingPongHog (0.91s) Release note: None 85987: sql: fix aggregation of statistics r=maryliag a=maryliag Previously, because we were using a join, we were double counting statistics when we had the same fingerprint in memory and persisted. This commit adds a `DISTINCT` so we only count them once. Fixes #85958 Release note: None 86008: sql: logic test to inform maximum builtin function oid change r=chengxiong-ruan a=chengxiong-ruan This commit adds a logic test to let engineers who added a new builtin function know that the new builtin function is constructed earlier than some existing builtin functions at init time. Release note: None Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Celia La <[email protected]> Co-authored-by: Pavel Kalinnikov <[email protected]> Co-authored-by: irfan sharif <[email protected]> Co-authored-by: Marylia Gutierrez <[email protected]> Co-authored-by: Chengxiong Ruan <[email protected]>
The necessary work is done. |
This is super cool. I didn't know we actually implemented this until now. @irfansharif, where do we use this capability today? |
Admission control for now, #86638, coming to a 22.2 CC cluster near you. It's pretty sick, check out #89709 (comment) for ex. In 23.1 I hope we start using it in the allocator: #83490. |
So great! |
This is the tracking issue for #82356. We expect the work here to break down into the following deliverables.
Must have:
Nice to have:
Surfacing per-statement cluster-wide CPU usage as part of EXPLAIN ANALYZE. This would likely require integrating the library above into the Stopper and recording as tracing events total running CPU time for a given request (propagated all the way up to the gateway).Tracked in sql: record CPU seconds per statement #87213.Jira issue: CRDB-16580
The text was updated successfully, but these errors were encountered: