Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add pprof labels based on the statement type #30930

Closed
wants to merge 1 commit into from

Conversation

jordanlewis
Copy link
Member

pprof profiles will now be enhanced with the type of the statement that
caused the creation of each stack. statement types are things like
'SELECT', 'INSERT', 'ALTER TABLE', etc.

This example output is produced based on a kv95 workload:

(pprof) tags
 stmt: Total 27.7s
       25.4s (91.77%): SELECT
        2.3s ( 8.23%): INSERT

Release note: None

@jordanlewis jordanlewis requested review from nvanbenschoten, petermattis and a team October 3, 2018 18:43
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@jordanlewis
Copy link
Member Author

Here's a sample profile created with the new tags:

pprof.samples.cpu.007.pb.gz

@jordanlewis
Copy link
Member Author

https://rakyll.org/profiler-labels/ has more details on how the labels work, btw.

@tbg
Copy link
Member

tbg commented Oct 3, 2018

Just a drive by question, what's the cost of a label? Are they cheap enough to throw in basically as we please?

@jordanlewis
Copy link
Member Author

They seem very cheap, but I don't know about using them more than on the order of once per query. They do a context.Value call and call a runtime function that sets a field on a pointer.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as Tobi. This is really neat, but I'd like to experimentally confirm that the overhead is completely negligible before making the change.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/conn_executor_exec.go, line 26 at r1 (raw file):

	"github.com/pkg/errors"

	pprof "runtime/pprof"

Weird import.

@jordanlewis
Copy link
Member Author

Switching to pprof-labels~
+ make bench PKG=./pkg/bench BENCHTIMEOUT=5m 'BENCHES=BenchmarkSelect1/^Cockroach$$' 'TESTFLAGS=-count 10 -benchmem'
Switching to pprof-labels
+ make bench PKG=./pkg/bench BENCHTIMEOUT=5m 'BENCHES=BenchmarkSelect1/^Cockroach$$' 'TESTFLAGS=-count 10 -benchmem'
name                 old time/op    new time/op    delta
Select1/Cockroach-8     126µs ± 1%     140µs ±13%  +10.74%  (p=0.000 n=9+9)

name                 old alloc/op   new alloc/op   delta
Select1/Cockroach-8    18.8kB ± 0%    19.2kB ± 0%   +2.22%  (p=0.000 n=7+9)

name                 old allocs/op  new allocs/op  delta
Select1/Cockroach-8       164 ± 0%       169 ± 0%   +3.05%  (p=0.000 n=10+10)

Pretty disappointing. Not sure this is worth it. Though the Do call is high level and always allocates something - let me see if I can improve the performance by doing things with lower level calls.

pprof profiles will now be enhanced with the type of the statement that
caused the creation of each stack. statement types are things like
'SELECT', 'INSERT', 'ALTER TABLE', etc.

Release note: None
@jordanlewis
Copy link
Member Author

Replacing the Do call makes this seem ok again.

name                 old time/op    new time/op    delta
Select1/Cockroach-8     126µs ± 2%     127µs ± 0%    ~     (p=0.156 n=10+9)

name                 old alloc/op   new alloc/op   delta
Select1/Cockroach-8    18.8kB ± 0%    19.2kB ± 0%  +2.26%  (p=0.000 n=10+10)

name                 old allocs/op  new allocs/op  delta
Select1/Cockroach-8       164 ± 0%       169 ± 0%  +3.05%  (p=0.000 n=10+10)

@nvanbenschoten
Copy link
Member

Can we statically allocate the labels for all statement types to avoid the increase in memory usage?

@jordanlewis
Copy link
Member Author

We'll still have to allocate a map in WithLabels, since the labelContextKey isn't exported for us to muck with, but I think you're right that we can avoid making LabelSets over and over. Good suggestion.

@jordanlewis
Copy link
Member Author

Nevermind, there's no way we can reduce allocations here. However, I think the increase of allocations is negligible. The query benchmarked here is a worst-case - it's SELECT 1 which does almost no work at all. Normal queries do a lot more. Thoughts?

@nvanbenschoten
Copy link
Member

Nevermind, there's no way we can reduce allocations here.

Can't we at least avoid the pprof.Labels allocation?

@jordanlewis
Copy link
Member Author

That doesn't get allocated.

@nvanbenschoten
Copy link
Member

Really? Aren't there varargs and a slice in pprof.Labels?

@jordanlewis
Copy link
Member Author

I ran this change through pprof and didn't see any allocations on those lines. Maybe I did something wrong?

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM, so up to you whether you think the added debuggability from this change is worth the minor increase in allocations.

@jordanlewis
Copy link
Member Author

I'm not really sure. I'm going to let this sit a while longer and revisit when I have more headspace for it. Thanks for the reviews so far.

@tbg
Copy link
Member

tbg commented Feb 6, 2019

Per our discussion yesterday, I think that if you gated this behind a cluster setting and used the anonymized statement string (on top of a tag label) this would be a very useful addition that we could use in the 2.2 cycle to find out whether we'll want to make larger investments here like the ones you're dreaming of (like adding general byte counter profiles etc that help us profile I/O). WDYT?

PS: for the cluster setting, perhaps the right pattern is to make the setting a deadline at which the labels deactivate -- intended usage would be something like SET CLUSTER SETTING server.pprof.statements = NOW() + INTERVAL '10 minutes';.

tbg added a commit to tbg/cockroach that referenced this pull request Feb 22, 2019
When pprofui CPU profiling is active, add the statement tag and
anonymized statement string to the goroutine labels.

For example, this is what you can see when running

    ./bin/workload run kv --read-percent 50 --init

```
$ pprof -seconds 10 http://localhost:8080/debug/pprof/ui/profile
[...]
(pprof) tags
 stmt.anonymized: Total 7.9s
                  4.0s (50.57%): UPSERT INTO kv(k, v) VALUES ($1, $2)
                  3.9s (49.43%): SELECT k, v FROM kv WHERE k IN ($1,)

 stmt.tag: Total 7.9s
           4.0s (50.57%): INSERT
           3.9s (49.43%): SELECT
```

The dot graphs are similarly annotated, though they require `dot` to be
installed on the machine and thus won't be as useful on the pprofui
itself.

Profile tags are not propagated across RPC boundaries. That is, a node
may have high CPU as a result of SQL queries not originating at the
node itself, and no labels will be available.

But perusing this diff, you may notice that any moving part in the
system can sniff whether profiling is active, and can add labels in
itself, so in principle we could add the application name or any
other information that is propagated along with the transaction
on the recipient node and track down problems that way.

We may also be able to add tags based on RangeIDs to identify ranges
which cause high CPU load. The possibilities are endless, and with this
infra in place, it's trivial to quickly iterate on what's useful.

Closes cockroachdb#30930.

Release note (admin ui change): Running nodes can now be CPU profiled in
a way that breaks down CPU usage by query (some restrictions apply).
tbg added a commit to tbg/cockroach that referenced this pull request Feb 25, 2019
When pprofui CPU profiling is active, add the statement tag and
anonymized statement string to the goroutine labels.

For example, this is what you can see when running

    ./bin/workload run kv --read-percent 50 --init

```
$ pprof -seconds 10 http://localhost:8080/debug/pprof/ui/profile
[...]
(pprof) tags
 stmt.anonymized: Total 7.9s
                  4.0s (50.57%): UPSERT INTO kv(k, v) VALUES ($1, $2)
                  3.9s (49.43%): SELECT k, v FROM kv WHERE k IN ($1,)

 stmt.tag: Total 7.9s
           4.0s (50.57%): INSERT
           3.9s (49.43%): SELECT
```

The dot graphs are similarly annotated, though they require `dot` to be
installed on the machine and thus won't be as useful on the pprofui
itself.

Profile tags are not propagated across RPC boundaries. That is, a node
may have high CPU as a result of SQL queries not originating at the
node itself, and no labels will be available.

But perusing this diff, you may notice that any moving part in the
system can sniff whether profiling is active, and can add labels in
itself, so in principle we could add the application name or any
other information that is propagated along with the transaction
on the recipient node and track down problems that way.

We may also be able to add tags based on RangeIDs to identify ranges
which cause high CPU load. The possibilities are endless, and with this
infra in place, it's trivial to quickly iterate on what's useful.

Closes cockroachdb#30930.

Release note (admin ui change): Running nodes can now be CPU profiled in
a way that breaks down CPU usage by query (some restrictions apply).
craig bot pushed a commit that referenced this pull request Feb 26, 2019
35147: sql: add profiler labels during CPU profiling r=jordanlewis a=tbg

When pprofui CPU profiling is active, add the statement tag and anonymized
statement string to the goroutine labels.

For example, this is what you can see when running

    ./bin/workload run kv --read-percent 50 --init

```
$ pprof -seconds 10 http://localhost:8080/debug/pprof/ui/profile
[...]
(pprof) tags
stmt.anonymized: Total 7.9s
                 4.0s (50.57%): UPSERT INTO kv(k, v) VALUES ($1, $2)
                 3.9s (49.43%): SELECT k, v FROM kv WHERE k IN ($1,)

 stmt.tag: Total 7.9s
          4.0s (50.57%): INSERT
          3.9s (49.43%): SELECT
```

The dot graphs are similarly annotated, though they require `dot` to be
installed on the machine and thus won't be as useful on the pprofui itself.

Profile tags are not propagated across RPC boundaries. That is, a node may
have high CPU as a result of SQL queries not originating at the node
itself, and no labels will be available.

But perusing this diff, you may notice that any moving part in the system
can sniff whether profiling is active, and can add labels in itself, so in
principle we could add the application name or any other information that
is propagated along with the transaction on the recipient node and track
down problems that way.

We may also be able to add tags based on RangeIDs to identify ranges which
cause high CPU load. The possibilities are endless, and with this infra in
place, it's trivial to quickly iterate on what's useful.

Closes #30930.

Release note (admin ui change): Running nodes can now be CPU profiled in a
way that breaks down CPU usage by query (some restrictions apply).

Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
@craig craig bot closed this in #35147 Feb 26, 2019
@jordanlewis jordanlewis deleted the pprof-labels branch March 15, 2019 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants