forked from cockroachdb/cockroach
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
server,tracing: integrate on-demand profiling with CRDB tracing
This change introduces a BackgroundProfiler service that is started during server startup on each node in the cluster. The BackgroundProfiler is responsible for collecting on-demand CPU profiles and runtime traces for a particular operation. The profiler can be subscribed to by an in-process listener. The first Subscriber initializes the collection of the CPU profile and the execution trace. While the profiles are being collected, only Subscribers carrying the same `profileID` are allowed to join the running profiles. The profiles are stopped and persisted to local storage when the last Subscriber unsubscribes. The `profileID` is a unique identifier of the operation that is being traced. Since only one profile can be running in a process at a time, any Subscriber whose `profileID` differs from the current one will be rejected.

The in-process listeners described above will be CRDB's internal tracing spans. This change introduces a `WithBackgroudProfiling` option that can be used to instruct a tracing span to subscribe to the BackgroundProfiler. This option is propagated to all local and remote child spans created as part of the trace. Only local root spans that have background profiling enabled will subscribe to the profiler on creation. As mentioned above, only one operation can be profiled at a time. We use the first root span's `TraceID` as the BackgroundProfiler's `profileID`. All subsequent root spans that are part of the same trace will be able to join the running profile. Tracing spans unsubscribe from the profile on Finish().

Every Subscriber is returned a wrapped ctx with pprof labels that tie its execution to the profile being collected by the BackgroundProfiler. These labels are used to post-process the collected CPU profile and keep only the samples that correspond to our subscribers. The end result is a filtered CPU profile prefixed `cpuprofiler.` and a process-wide execution trace prefixed `runtimetrace.`, both persisted to local storage.
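The subscription rules above can be sketched as a small state machine. This is only an illustrative model, not the code from this commit: the `gate` type and its methods are hypothetical names, and no CPU profile or execution trace is actually collected. It assumes that a zero `profileID` means "no profile running", in line with `ProfileID.IsSet` in the diff.

```go
package main

import (
	"fmt"
	"sync"
)

// profileID and subscriberID mirror the identifiers described in the commit
// message. The concrete types are assumptions made for this sketch.
type profileID int
type subscriberID int

// gate is a hypothetical, simplified model of the subscription rules. It only
// tracks which profile is running and who is subscribed to it.
type gate struct {
	mu          sync.Mutex
	current     profileID // 0 means no profile is running
	subscribers map[subscriberID]struct{}
}

func newGate() *gate {
	return &gate{subscribers: make(map[subscriberID]struct{})}
}

// subscribe admits the subscriber if no profile is running (which starts one)
// or if it carries the profileID of the running profile. It returns false
// when the subscriber is rejected.
func (g *gate) subscribe(p profileID, s subscriberID) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if p == 0 {
		return false // an unset profileID cannot start or join a profile
	}
	if g.current != 0 && g.current != p {
		return false // a different operation is already being profiled
	}
	g.current = p // for the first subscriber, collection would start here
	g.subscribers[s] = struct{}{}
	return true
}

// unsubscribe removes the subscriber and reports whether it was the last one,
// which is the point where the profiles would be stopped and persisted.
func (g *gate) unsubscribe(s subscriberID) (finished bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if _, ok := g.subscribers[s]; !ok {
		return false
	}
	delete(g.subscribers, s)
	if len(g.subscribers) == 0 {
		g.current = 0
		return true
	}
	return false
}

func main() {
	g := newGate()
	fmt.Println(g.subscribe(7, 1)) // first subscriber starts profile 7
	fmt.Println(g.subscribe(7, 2)) // same profileID joins
	fmt.Println(g.subscribe(9, 3)) // different profileID is rejected
	fmt.Println(g.unsubscribe(1))  // one subscriber remains
	fmt.Println(g.unsubscribe(2))  // last subscriber finishes the profile
}
```

Guarding both methods with a single mutex keeps the check on the running `profileID` and the update of the subscriber set atomic, which matters when root spans are created concurrently.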
This change only introduces the infrastructure to enable on-demand profiling. The test in `profiler_test.go` results in a CPU profile with information about each labelled root operation collected on-demand:

    ❯ go tool pprof cpuprofiler.2023-03-08T14_51_52.402
    Type: cpu
    Time: Mar 8, 2023 at 9:51am (EST)
    Duration: 10.11s, Total samples = 8.57s (84.77%)
    Entering interactive mode (type "help" for commands, "o" for options)
    (pprof) tags
     9171346462634118014: Total 8.6s
                          906.0ms (10.57%): op2
                          902.0ms (10.53%): op1
                          890.0ms (10.39%): op0
                          886.0ms (10.34%): op7
                          866.0ms (10.11%): op4
                          866.0ms (10.11%): op5
                          854.0ms ( 9.96%): op3
                          806.0ms ( 9.40%): op8
                          804.0ms ( 9.38%): op6
                          790.0ms ( 9.22%): op9

Execution traces do not surface pprof labels in Go yet, but a future patch could consider cherry-picking https://go-review.googlesource.com/c/go/+/446975, which allows the user to focus on goroutines run with the specified pprof labels.

With this framework in place one could envision the following use cases:

- stmt diagnostics requests get a new option to request profiling. When requested, any local root trace span (i.e. while any part of the trace is active on a given node) subscribes to profiles, and references to the profiles collected are stored as payloads in the span. They're then included in the stmt bundle.
- even outside of diagnostics, one could mark traces as wanting to capture debug info for "slow spans". Such spans could set a timer on creation that, once it fires, periodically subscribes to (short) execution traces as a way to snapshot the goroutine's actions. These could be referenced in the span for later retrieval.

Informs: cockroachdb#97215
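The label-based post-processing mentioned in the commit message can be modelled roughly as follows. This is a sketch under stated assumptions, not the commit's implementation: `sample` is a hypothetical stand-in for a CPU profile sample (the real code would presumably work on the pprof profile format), and `totalsByLabelValue` is an illustrative helper. It keeps only samples that carry the label key (the profile ID) and totals CPU time per label value, similar in spirit to the `tags` output above.

```go
package main

import (
	"fmt"
	"sort"
)

// sample is a hypothetical stand-in for one CPU profile sample: the pprof
// labels attached to the sampled goroutine and the CPU time it accounts for.
type sample struct {
	labels    map[string]string
	cpuMillis int64
}

// totalsByLabelValue drops every sample that does not carry the given label
// key and sums the CPU time of the remaining samples per label value.
func totalsByLabelValue(samples []sample, key string) map[string]int64 {
	totals := make(map[string]int64)
	for _, s := range samples {
		value, ok := s.labels[key]
		if !ok {
			continue // unlabelled or unrelated work: not part of the profile
		}
		totals[value] += s.cpuMillis
	}
	return totals
}

func main() {
	const key = "42" // hypothetical profile ID rendered as the label key
	samples := []sample{
		{labels: map[string]string{key: "op0"}, cpuMillis: 30},
		{labels: map[string]string{key: "op1"}, cpuMillis: 20},
		{labels: nil, cpuMillis: 500}, // background work outside the trace
		{labels: map[string]string{key: "op0"}, cpuMillis: 10},
	}
	totals := totalsByLabelValue(samples, key)

	// Sort the label values so the output order is deterministic.
	values := make([]string, 0, len(totals))
	for v := range totals {
		values = append(values, v)
	}
	sort.Strings(values)
	for _, v := range values {
		fmt.Printf("%s: %dms\n", v, totals[v])
	}
}
```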
1 parent b84f10c, commit b12cbd1
Showing 24 changed files with 854 additions and 16 deletions.
    @@ -0,0 +1,20 @@
    +load("//build/bazelutil/unused_checker:unused.bzl", "get_x_data")
    +load("@io_bazel_rules_go//go:def.bzl", "go_library")
    +
    +go_library(
    +    name = "executiontracer",
    +    srcs = ["executiontracer.go"],
    +    importpath = "github.com/cockroachdb/cockroach/pkg/server/executiontracer",
    +    visibility = ["//visibility:public"],
    +    deps = ["//pkg/util/protoutil"],
    +)
    +
    +go_library(
    +    name = "backgroundprofiler",
    +    srcs = ["background_profiler.go"],
    +    importpath = "github.com/cockroachdb/cockroach/pkg/server/backgroundprofiler",
    +    visibility = ["//visibility:public"],
    +    deps = ["//pkg/util/protoutil"],
    +)
    +
    +get_x_data(name = "get_x_data")
    @@ -0,0 +1,59 @@
    +// Copyright 2023 The Cockroach Authors.
    +//
    +// Use of this software is governed by the Business Source License
    +// included in the file licenses/BSL.txt.
    +//
    +// As of the Change Date specified in that file, in accordance with
    +// the Business Source License, use of this software will be governed
    +// by the Apache License, Version 2.0, included in the file
    +// licenses/APL.txt.
    +
    +package backgroundprofiler
    +
    +import (
    +	"context"
    +
    +	"github.com/cockroachdb/cockroach/pkg/util/protoutil"
    +)
    +
    +// ProfileID is a unique identifier of the operation being profiled by the
    +// Profiler.
    +type ProfileID int
    +
    +// SubscriberID is a unique identifier of the Subscriber subscribing to the
    +// background profile collection.
    +type SubscriberID int
    +
    +// IsSet returns true if the BackgroundProfiler is currently associated with a
    +// profileID.
    +func (r ProfileID) IsSet() bool {
    +	return r != 0
    +}
    +
    +// Subscriber is the interface that describes an object that can subscribe to
    +// the background profiler.
    +type Subscriber interface {
    +	// LabelValue returns the value that will be used when setting the pprof
    +	// labels of the Subscriber. The key of the label will always be the ProfileID
    +	// thereby allowing us to identify all samples that describe the operation
    +	// being profiled.
    +	LabelValue() string
    +	// Identifier returns the unique identifier of the Subscriber.
    +	Identifier() SubscriberID
    +	// ProfileID returns the unique identifier of the operation that the
    +	// Subscriber is executing on behalf of.
    +	ProfileID() ProfileID
    +}
    +
    +// Profiler is the interface that exposes methods to subscribe and unsubscribe
    +// from a background profiler.
    +type Profiler interface {
    +	// Subscribe registers the subscriber with the background profiler. This
    +	// method returns a context wrapped with pprof labels along with a closure to
    +	// restore the original labels of the context.
    +	Subscribe(ctx context.Context, subscriber Subscriber) (context.Context, func())
    +	// Unsubscribe unregisters the subscriber from the background profiler. If the
    +	// subscriber is responsible for finishing the profile the method will also
    +	// return metadata describing the collected profile.
    +	Unsubscribe(subscriber Subscriber) (finishedProfile bool, msg protoutil.Message)
    +}
    @@ -0,0 +1,64 @@
    +load("//build/bazelutil/unused_checker:unused.bzl", "get_x_data")
    +load("@rules_proto//proto:defs.bzl", "proto_library")
    +load("@io_bazel_rules_go//go:def.bzl", "go_library", "go_test")
    +load("@io_bazel_rules_go//proto:def.bzl", "go_proto_library")
    +
    +proto_library(
    +    name = "profiler_proto",
    +    srcs = ["profiler.proto"],
    +    strip_import_prefix = "/pkg",
    +    visibility = ["//visibility:public"],
    +    deps = ["@com_github_gogo_protobuf//gogoproto:gogo_proto"],
    +)
    +
    +go_proto_library(
    +    name = "profiler_go_proto",
    +    compilers = ["//pkg/cmd/protoc-gen-gogoroach:protoc-gen-gogoroach_compiler"],
    +    importpath = "github.com/cockroachdb/cockroach/pkg/server/backgroundprofiler/profiler",
    +    proto = ":profiler_proto",
    +    visibility = ["//visibility:public"],
    +    deps = ["@com_github_gogo_protobuf//gogoproto"],
    +)
    +
    +go_library(
    +    name = "profiler",
    +    srcs = ["profiler.go"],
    +    embed = [":profiler_go_proto"],
    +    importpath = "github.com/cockroachdb/cockroach/pkg/server/backgroundprofiler/profiler",
    +    visibility = ["//visibility:public"],
    +    deps = [
    +        "//pkg/server/backgroundprofiler",
    +        "//pkg/server/dumpstore",
    +        "//pkg/settings",
    +        "//pkg/settings/cluster",
    +        "//pkg/util/log",
    +        "//pkg/util/pprofutil",
    +        "//pkg/util/protoutil",
    +        "//pkg/util/stop",
    +        "//pkg/util/syncutil",
    +        "//pkg/util/timeutil",
    +        "@com_github_cockroachdb_errors//:errors",
    +        "@com_github_google_pprof//profile",
    +    ],
    +)
    +
    +go_test(
    +    name = "profiler_test",
    +    srcs = ["profiler_test.go"],
    +    args = ["-test.timeout=295s"],
    +    deps = [
    +        ":profiler",
    +        "//pkg/settings/cluster",
    +        "//pkg/testutils",
    +        "//pkg/util/ctxgroup",
    +        "//pkg/util/log",
    +        "//pkg/util/stop",
    +        "//pkg/util/tracing",
    +        "//pkg/util/tracing/tracingpb",
    +        "@com_github_gogo_protobuf//types",
    +        "@com_github_google_pprof//profile",
    +        "@com_github_stretchr_testify//require",
    +    ],
    +)
    +
    +get_x_data(name = "get_x_data")