Merge #123172
123172: cli: include cpu profiles into debug.zip r=yuzefovich a=yuzefovich

This commit extends the `GetFiles` API to support sending CPU profiles (collected by the CPU profiler) and uses this in debug.zip to automatically include all relevant CPU profiles (in addition to the fresh CPU profiles collected at the time debug.zip is taken). We enabled the CPU profiler by default in the 24.1 time frame, so this is a nice addition to that. (Note that on CC clusters the CPU profiler was already enabled, and SREs had to fetch the CPU profiles manually since they weren't included in debug.zip. This removes that extra overhead.)
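
To illustrate the idea of a file-listing API that gains a new "CPU profile" file type, here is a minimal, self-contained sketch. It is not the actual `GetFiles` implementation: the `fileType` enum, the `fileStore` map, the `list` method, the glob patterns, and the file names are all hypothetical stand-ins chosen for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// fileType is a hypothetical stand-in for the kind of file a
// GetFiles-style request selects; it is not the real enum.
type fileType int

const (
	heapProfile fileType = iota
	goroutineDump
	cpuProfile // the kind of file this commit adds support for
)

// fileStore maps each supported file type to the file names available on a
// node. A real server would read a directory; a map keeps the sketch small.
type fileStore map[fileType][]string

// list returns the names of the given type that match at least one glob
// pattern, sorted. Unknown types yield an error, which a caller could record
// in a per-node error file such as cpuprof.err.txt.
func (s fileStore) list(typ fileType, patterns []string) ([]string, error) {
	names, ok := s[typ]
	if !ok {
		return nil, fmt.Errorf("unsupported file type %d", typ)
	}
	var out []string
	for _, name := range names {
		for _, p := range patterns {
			matched, err := filepath.Match(p, name)
			if err != nil {
				return nil, err
			}
			if matched {
				out = append(out, name)
				break
			}
		}
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	// Hypothetical file names; the real naming scheme is not shown here.
	store := fileStore{
		cpuProfile: {"cpuprof.b.pprof", "cpuprof.a.pprof", "README.txt"},
	}
	files, err := store.list(cpuProfile, []string{"*.pprof"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(files), "cpu profiles found")
}
```

In this sketch a request for a type the store does not know about fails with an error instead of returning an empty list, so the caller can tell "no profiles" apart from "not supported".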

Fixes: #105012.

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
craig[bot] and yuzefovich committed Apr 29, 2024
2 parents 20749b8 + 6f09a3b commit 7cd5b54
Showing 22 changed files with 315 additions and 248 deletions.
7 changes: 7 additions & 0 deletions pkg/base/test_server_args.go
@@ -638,6 +638,13 @@ type TestTenantArgs struct {
// If set, this directory should be cleaned up after the test completes.
HeapProfileDirName string

+// CPUProfileDirName is used to initialize the same named field on the
+// SQLServer.BaseConfig field. It is the directory name for cpu profiles
+// using cpuprofiler. If empty, no cpu profiles will be collected during the
+// test. If set, this directory should be cleaned up after the test
+// completes.
+CPUProfileDirName string
+
// StartDiagnosticsReporting checks cluster.TelemetryOptOut(), and
// if not disabled starts the asynchronous goroutine that checks for
// CockroachDB upgrades and periodically reports diagnostics to
27 changes: 18 additions & 9 deletions pkg/cli/testdata/zip/partial1
@@ -119,12 +119,15 @@ debug zip --concurrency=1 --cpu-profile-duration=0s /dev/null
[node 1] requesting stacks... received response... writing binary output: debug/nodes/1/stacks.txt... done
[node 1] requesting stacks with labels... received response... writing binary output: debug/nodes/1/stacks_with_labels.txt... done
[node 1] requesting heap profile... received response... writing binary output: debug/nodes/1/heap.pprof... done
-[node 1] requesting heap file list... received response...
-[node 1] requesting heap file list: last request failed: rpc error: ...
-[node 1] requesting heap file list: creating error output: debug/nodes/1/heapprof.err.txt... done
+[node 1] requesting heap profile list... received response...
+[node 1] requesting heap profile list: last request failed: rpc error: ...
+[node 1] requesting heap profile list: creating error output: debug/nodes/1/heapprof.err.txt... done
[node 1] requesting goroutine dump list... received response...
[node 1] requesting goroutine dump list: last request failed: rpc error: ...
[node 1] requesting goroutine dump list: creating error output: debug/nodes/1/goroutines.err.txt... done
+[node 1] requesting cpu profile list... received response...
+[node 1] requesting cpu profile list: last request failed: rpc error: ...
+[node 1] requesting cpu profile list: creating error output: debug/nodes/1/cpuprof.err.txt... done
[node 1] requesting log files list... received response... done
[node ?] ? log files found
[node 1] requesting ranges... received response... done
@@ -218,12 +221,15 @@ debug zip --concurrency=1 --cpu-profile-duration=0s /dev/null
[node 2] requesting heap profile... received response...
[node 2] requesting heap profile: last request failed: rpc error: ...
[node 2] requesting heap profile: creating error output: debug/nodes/2/heap.pprof.err.txt... done
-[node 2] requesting heap file list... received response...
-[node 2] requesting heap file list: last request failed: rpc error: ...
-[node 2] requesting heap file list: creating error output: debug/nodes/2/heapprof.err.txt... done
+[node 2] requesting heap profile list... received response...
+[node 2] requesting heap profile list: last request failed: rpc error: ...
+[node 2] requesting heap profile list: creating error output: debug/nodes/2/heapprof.err.txt... done
[node 2] requesting goroutine dump list... received response...
[node 2] requesting goroutine dump list: last request failed: rpc error: ...
[node 2] requesting goroutine dump list: creating error output: debug/nodes/2/goroutines.err.txt... done
+[node 2] requesting cpu profile list... received response...
+[node 2] requesting cpu profile list: last request failed: rpc error: ...
+[node 2] requesting cpu profile list: creating error output: debug/nodes/2/cpuprof.err.txt... done
[node 2] requesting log files list... received response...
[node 2] requesting log files list: last request failed: rpc error: ...
[node 2] requesting log files list: creating error output: debug/nodes/2/logs.err.txt... done
@@ -261,12 +267,15 @@ debug zip --concurrency=1 --cpu-profile-duration=0s /dev/null
[node 3] requesting stacks... received response... writing binary output: debug/nodes/3/stacks.txt... done
[node 3] requesting stacks with labels... received response... writing binary output: debug/nodes/3/stacks_with_labels.txt... done
[node 3] requesting heap profile... received response... writing binary output: debug/nodes/3/heap.pprof... done
-[node 3] requesting heap file list... received response...
-[node 3] requesting heap file list: last request failed: rpc error: ...
-[node 3] requesting heap file list: creating error output: debug/nodes/3/heapprof.err.txt... done
+[node 3] requesting heap profile list... received response...
+[node 3] requesting heap profile list: last request failed: rpc error: ...
+[node 3] requesting heap profile list: creating error output: debug/nodes/3/heapprof.err.txt... done
[node 3] requesting goroutine dump list... received response...
[node 3] requesting goroutine dump list: last request failed: rpc error: ...
[node 3] requesting goroutine dump list: creating error output: debug/nodes/3/goroutines.err.txt... done
+[node 3] requesting cpu profile list... received response...
+[node 3] requesting cpu profile list: last request failed: rpc error: ...
+[node 3] requesting cpu profile list: creating error output: debug/nodes/3/cpuprof.err.txt... done
[node 3] requesting log files list... received response... done
[node ?] ? log files found
[node 3] requesting ranges... received response... done
18 changes: 12 additions & 6 deletions pkg/cli/testdata/zip/partial1_excluded
@@ -119,12 +119,15 @@ debug zip /dev/null --concurrency=1 --exclude-nodes=2 --cpu-profile-duration=0
[node 1] requesting stacks... received response... writing binary output: debug/nodes/1/stacks.txt... done
[node 1] requesting stacks with labels... received response... writing binary output: debug/nodes/1/stacks_with_labels.txt... done
[node 1] requesting heap profile... received response... writing binary output: debug/nodes/1/heap.pprof... done
-[node 1] requesting heap file list... received response...
-[node 1] requesting heap file list: last request failed: rpc error: ...
-[node 1] requesting heap file list: creating error output: debug/nodes/1/heapprof.err.txt... done
+[node 1] requesting heap profile list... received response...
+[node 1] requesting heap profile list: last request failed: rpc error: ...
+[node 1] requesting heap profile list: creating error output: debug/nodes/1/heapprof.err.txt... done
[node 1] requesting goroutine dump list... received response...
[node 1] requesting goroutine dump list: last request failed: rpc error: ...
[node 1] requesting goroutine dump list: creating error output: debug/nodes/1/goroutines.err.txt... done
+[node 1] requesting cpu profile list... received response...
+[node 1] requesting cpu profile list: last request failed: rpc error: ...
+[node 1] requesting cpu profile list: creating error output: debug/nodes/1/cpuprof.err.txt... done
[node 1] requesting log files list... received response... done
[node ?] ? log files found
[node 1] requesting ranges... received response... done
@@ -161,12 +164,15 @@ debug zip /dev/null --concurrency=1 --exclude-nodes=2 --cpu-profile-duration=0
[node 3] requesting stacks... received response... writing binary output: debug/nodes/3/stacks.txt... done
[node 3] requesting stacks with labels... received response... writing binary output: debug/nodes/3/stacks_with_labels.txt... done
[node 3] requesting heap profile... received response... writing binary output: debug/nodes/3/heap.pprof... done
-[node 3] requesting heap file list... received response...
-[node 3] requesting heap file list: last request failed: rpc error: ...
-[node 3] requesting heap file list: creating error output: debug/nodes/3/heapprof.err.txt... done
+[node 3] requesting heap profile list... received response...
+[node 3] requesting heap profile list: last request failed: rpc error: ...
+[node 3] requesting heap profile list: creating error output: debug/nodes/3/heapprof.err.txt... done
[node 3] requesting goroutine dump list... received response...
[node 3] requesting goroutine dump list: last request failed: rpc error: ...
[node 3] requesting goroutine dump list: creating error output: debug/nodes/3/goroutines.err.txt... done
+[node 3] requesting cpu profile list... received response...
+[node 3] requesting cpu profile list: last request failed: rpc error: ...
+[node 3] requesting cpu profile list: creating error output: debug/nodes/3/cpuprof.err.txt... done
[node 3] requesting log files list... received response... done
[node ?] ? log files found
[node 3] requesting ranges... received response... done
18 changes: 12 additions & 6 deletions pkg/cli/testdata/zip/partial2
@@ -119,12 +119,15 @@ debug zip --concurrency=1 --cpu-profile-duration=0 /dev/null
[node 1] requesting stacks... received response... writing binary output: debug/nodes/1/stacks.txt... done
[node 1] requesting stacks with labels... received response... writing binary output: debug/nodes/1/stacks_with_labels.txt... done
[node 1] requesting heap profile... received response... writing binary output: debug/nodes/1/heap.pprof... done
-[node 1] requesting heap file list... received response...
-[node 1] requesting heap file list: last request failed: rpc error: ...
-[node 1] requesting heap file list: creating error output: debug/nodes/1/heapprof.err.txt... done
+[node 1] requesting heap profile list... received response...
+[node 1] requesting heap profile list: last request failed: rpc error: ...
+[node 1] requesting heap profile list: creating error output: debug/nodes/1/heapprof.err.txt... done
[node 1] requesting goroutine dump list... received response...
[node 1] requesting goroutine dump list: last request failed: rpc error: ...
[node 1] requesting goroutine dump list: creating error output: debug/nodes/1/goroutines.err.txt... done
+[node 1] requesting cpu profile list... received response...
+[node 1] requesting cpu profile list: last request failed: rpc error: ...
+[node 1] requesting cpu profile list: creating error output: debug/nodes/1/cpuprof.err.txt... done
[node 1] requesting log files list... received response... done
[node ?] ? log files found
[node 1] requesting ranges... received response... done
@@ -160,12 +163,15 @@ debug zip --concurrency=1 --cpu-profile-duration=0 /dev/null
[node 3] requesting stacks... received response... writing binary output: debug/nodes/3/stacks.txt... done
[node 3] requesting stacks with labels... received response... writing binary output: debug/nodes/3/stacks_with_labels.txt... done
[node 3] requesting heap profile... received response... writing binary output: debug/nodes/3/heap.pprof... done
-[node 3] requesting heap file list... received response...
-[node 3] requesting heap file list: last request failed: rpc error: ...
-[node 3] requesting heap file list: creating error output: debug/nodes/3/heapprof.err.txt... done
+[node 3] requesting heap profile list... received response...
+[node 3] requesting heap profile list: last request failed: rpc error: ...
+[node 3] requesting heap profile list: creating error output: debug/nodes/3/heapprof.err.txt... done
[node 3] requesting goroutine dump list... received response...
[node 3] requesting goroutine dump list: last request failed: rpc error: ...
[node 3] requesting goroutine dump list: creating error output: debug/nodes/3/goroutines.err.txt... done
+[node 3] requesting cpu profile list... received response...
+[node 3] requesting cpu profile list: last request failed: rpc error: ...
+[node 3] requesting cpu profile list: creating error output: debug/nodes/3/cpuprof.err.txt... done
[node 3] requesting log files list... received response... done
[node ?] ? log files found
[node 3] requesting ranges... received response... done
4 changes: 3 additions & 1 deletion pkg/cli/testdata/zip/testzip
@@ -122,10 +122,12 @@ debug zip --concurrency=1 --cpu-profile-duration=1s /dev/null
[node 1] requesting stacks... received response... writing binary output: debug/nodes/1/stacks.txt... done
[node 1] requesting stacks with labels... received response... writing binary output: debug/nodes/1/stacks_with_labels.txt... done
[node 1] requesting heap profile... received response... writing binary output: debug/nodes/1/heap.pprof... done
-[node 1] requesting heap file list... received response... done
+[node 1] requesting heap profile list... received response... done
[node ?] ? heap profiles found
[node 1] requesting goroutine dump list... received response... done
[node ?] ? goroutine dumps found
+[node 1] requesting cpu profile list... received response... done
+[node ?] ? cpu profiles found
[node 1] requesting log files list... received response... done
[node ?] ? log files found
[node 1] requesting ranges... received response... done
45 changes: 30 additions & 15 deletions pkg/cli/testdata/zip/testzip_concurrent
@@ -272,6 +272,11 @@ zip
[node 1] node status...
[node 1] node status: done
[node 1] node status: writing JSON output: debug/nodes/1/status.json...
+[node 1] requesting cpu profile list...
+[node 1] requesting cpu profile list: creating error output: debug/nodes/1/cpuprof.err.txt...
+[node 1] requesting cpu profile list: done
+[node 1] requesting cpu profile list: last request failed: rpc error: ...
+[node 1] requesting cpu profile list: received response...
[node 1] requesting data for debug/nodes/1/details...
[node 1] requesting data for debug/nodes/1/details: done
[node 1] requesting data for debug/nodes/1/details: received response...
@@ -289,11 +294,11 @@ zip
[node 1] requesting goroutine dump list: done
[node 1] requesting goroutine dump list: last request failed: rpc error: ...
[node 1] requesting goroutine dump list: received response...
-[node 1] requesting heap file list...
-[node 1] requesting heap file list: creating error output: debug/nodes/1/heapprof.err.txt...
-[node 1] requesting heap file list: done
-[node 1] requesting heap file list: last request failed: rpc error: ...
-[node 1] requesting heap file list: received response...
+[node 1] requesting heap profile list...
+[node 1] requesting heap profile list: creating error output: debug/nodes/1/heapprof.err.txt...
+[node 1] requesting heap profile list: done
+[node 1] requesting heap profile list: last request failed: rpc error: ...
+[node 1] requesting heap profile list: received response...
[node 1] requesting heap profile...
[node 1] requesting heap profile: done
[node 1] requesting heap profile: received response...
@@ -388,6 +393,11 @@ zip
[node 2] node status...
[node 2] node status: done
[node 2] node status: writing JSON output: debug/nodes/2/status.json...
+[node 2] requesting cpu profile list...
+[node 2] requesting cpu profile list: creating error output: debug/nodes/2/cpuprof.err.txt...
+[node 2] requesting cpu profile list: done
+[node 2] requesting cpu profile list: last request failed: rpc error: ...
+[node 2] requesting cpu profile list: received response...
[node 2] requesting data for debug/nodes/2/details...
[node 2] requesting data for debug/nodes/2/details: done
[node 2] requesting data for debug/nodes/2/details: received response...
@@ -405,11 +415,11 @@ zip
[node 2] requesting goroutine dump list: done
[node 2] requesting goroutine dump list: last request failed: rpc error: ...
[node 2] requesting goroutine dump list: received response...
-[node 2] requesting heap file list...
-[node 2] requesting heap file list: creating error output: debug/nodes/2/heapprof.err.txt...
-[node 2] requesting heap file list: done
-[node 2] requesting heap file list: last request failed: rpc error: ...
-[node 2] requesting heap file list: received response...
+[node 2] requesting heap profile list...
+[node 2] requesting heap profile list: creating error output: debug/nodes/2/heapprof.err.txt...
+[node 2] requesting heap profile list: done
+[node 2] requesting heap profile list: last request failed: rpc error: ...
+[node 2] requesting heap profile list: received response...
[node 2] requesting heap profile...
[node 2] requesting heap profile: done
[node 2] requesting heap profile: received response...
@@ -504,6 +514,11 @@ zip
[node 3] node status...
[node 3] node status: done
[node 3] node status: writing JSON output: debug/nodes/3/status.json...
+[node 3] requesting cpu profile list...
+[node 3] requesting cpu profile list: creating error output: debug/nodes/3/cpuprof.err.txt...
+[node 3] requesting cpu profile list: done
+[node 3] requesting cpu profile list: last request failed: rpc error: ...
+[node 3] requesting cpu profile list: received response...
[node 3] requesting data for debug/nodes/3/details...
[node 3] requesting data for debug/nodes/3/details: done
[node 3] requesting data for debug/nodes/3/details: received response...
@@ -521,11 +536,11 @@ zip
[node 3] requesting goroutine dump list: done
[node 3] requesting goroutine dump list: last request failed: rpc error: ...
[node 3] requesting goroutine dump list: received response...
-[node 3] requesting heap file list...
-[node 3] requesting heap file list: creating error output: debug/nodes/3/heapprof.err.txt...
-[node 3] requesting heap file list: done
-[node 3] requesting heap file list: last request failed: rpc error: ...
-[node 3] requesting heap file list: received response...
+[node 3] requesting heap profile list...
+[node 3] requesting heap profile list: creating error output: debug/nodes/3/heapprof.err.txt...
+[node 3] requesting heap profile list: done
+[node 3] requesting heap profile list: last request failed: rpc error: ...
+[node 3] requesting heap profile list: received response...
[node 3] requesting heap profile...
[node 3] requesting heap profile: done
[node 3] requesting heap profile: received response...
4 changes: 3 additions & 1 deletion pkg/cli/testdata/zip/testzip_exclude_goroutine_stacks
@@ -121,10 +121,12 @@ debug zip --concurrency=1 --cpu-profile-duration=1s --include-goroutine-stacks=f
[node 1] requesting data for debug/nodes/1/enginestats... received response... writing JSON output: debug/nodes/1/enginestats.json... done
[node 1] Skipping fetching goroutine stacks. Enable via the --include-goroutine-stacks flag.
[node 1] requesting heap profile... received response... writing binary output: debug/nodes/1/heap.pprof... done
-[node 1] requesting heap file list... received response... done
+[node 1] requesting heap profile list... received response... done
[node ?] ? heap profiles found
[node 1] requesting goroutine dump list... received response... done
[node ?] ? goroutine dumps found
+[node 1] requesting cpu profile list... received response... done
+[node ?] ? cpu profiles found
[node 1] requesting log files list... received response... done
[node ?] ? log files found
[node 1] requesting ranges... received response... done
4 changes: 3 additions & 1 deletion pkg/cli/testdata/zip/testzip_exclude_range_info
@@ -118,10 +118,12 @@ debug zip --concurrency=1 --cpu-profile-duration=1s --include-range-info=false /
[node 1] requesting stacks... received response... writing binary output: debug/nodes/1/stacks.txt... done
[node 1] requesting stacks with labels... received response... writing binary output: debug/nodes/1/stacks_with_labels.txt... done
[node 1] requesting heap profile... received response... writing binary output: debug/nodes/1/heap.pprof... done
-[node 1] requesting heap file list... received response... done
+[node 1] requesting heap profile list... received response... done
[node ?] ? heap profiles found
[node 1] requesting goroutine dump list... received response... done
[node ?] ? goroutine dumps found
+[node 1] requesting cpu profile list... received response... done
+[node ?] ? cpu profiles found
[node 1] requesting log files list... received response... done
[node ?] ? log files found
[cluster] pprof summary script... writing binary output: debug/pprof-summary.sh... done
4 changes: 3 additions & 1 deletion pkg/cli/testdata/zip/testzip_external_process_virtualization
@@ -144,10 +144,12 @@ debug zip --concurrency=1 --cpu-profile-duration=1s /dev/null
[node 1] requesting stacks... received response... writing binary output: debug/nodes/1/stacks.txt... done
[node 1] requesting stacks with labels... received response... writing binary output: debug/nodes/1/stacks_with_labels.txt... done
[node 1] requesting heap profile... received response... writing binary output: debug/nodes/1/heap.pprof... done
-[node 1] requesting heap file list... received response... done
+[node 1] requesting heap profile list... received response... done
[node ?] ? heap profiles found
[node 1] requesting goroutine dump list... received response... done
[node ?] ? goroutine dumps found
+[node 1] requesting cpu profile list... received response... done
+[node ?] ? cpu profiles found
[node 1] requesting log files list... received response... done
[node ?] ? log files found
[node 1] requesting ranges... received response...