sql: add pprof labels based on the statement type #30930

jordanlewis · 2018-10-03T18:43:58Z

pprof profiles will now be enhanced with the type of the statement that
caused the creation of each stack. statement types are things like
'SELECT', 'INSERT', 'ALTER TABLE', etc.

This example output is produced based on a kv95 workload:

(pprof) tags
 stmt: Total 27.7s
       25.4s (91.77%): SELECT
        2.3s ( 8.23%): INSERT

Release note: None

cockroach-teamcity · 2018-10-03T18:44:06Z

This change is

jordanlewis · 2018-10-03T18:44:25Z

Here's a sample profile created with the new tags:

pprof.samples.cpu.007.pb.gz

jordanlewis · 2018-10-03T18:56:59Z

https://rakyll.org/profiler-labels/ has more details on how the labels work, btw.

tbg · 2018-10-03T19:01:35Z

Just a drive by question, what's the cost of a label? Are they cheap enough to throw in basically as we please?

jordanlewis · 2018-10-03T19:30:04Z

They seem very cheap, but I don't know about using them more than on the order of once per query. They do a context.Value call and call a runtime function that sets a field on a pointer.

nvanbenschoten

Same question as Tobi. This is really neat, but I'd like to experimentally confirm that the overhead is completely negligible before making the change.

Reviewable status: complete! 0 of 0 LGTMs obtained

pkg/sql/conn_executor_exec.go, line 26 at r1 (raw file):

	"github.com/pkg/errors"

	pprof "runtime/pprof"

Weird import.

jordanlewis · 2018-10-03T19:43:22Z

Switching to pprof-labels~
+ make bench PKG=./pkg/bench BENCHTIMEOUT=5m 'BENCHES=BenchmarkSelect1/^Cockroach$$' 'TESTFLAGS=-count 10 -benchmem'
Switching to pprof-labels
+ make bench PKG=./pkg/bench BENCHTIMEOUT=5m 'BENCHES=BenchmarkSelect1/^Cockroach$$' 'TESTFLAGS=-count 10 -benchmem'
name                 old time/op    new time/op    delta
Select1/Cockroach-8     126µs ± 1%     140µs ±13%  +10.74%  (p=0.000 n=9+9)

name                 old alloc/op   new alloc/op   delta
Select1/Cockroach-8    18.8kB ± 0%    19.2kB ± 0%   +2.22%  (p=0.000 n=7+9)

name                 old allocs/op  new allocs/op  delta
Select1/Cockroach-8       164 ± 0%       169 ± 0%   +3.05%  (p=0.000 n=10+10)

Pretty disappointing. Not sure this is worth it. Though the Do call is high level and always allocates something - let me see if I can improve the performance by doing things with lower level calls.

pprof profiles will now be enhanced with the type of the statement that caused the creation of each stack. statement types are things like 'SELECT', 'INSERT', 'ALTER TABLE', etc. Release note: None

jordanlewis · 2018-10-03T19:52:33Z

Replacing the Do call makes this seem ok again.

name                 old time/op    new time/op    delta
Select1/Cockroach-8     126µs ± 2%     127µs ± 0%    ~     (p=0.156 n=10+9)

name                 old alloc/op   new alloc/op   delta
Select1/Cockroach-8    18.8kB ± 0%    19.2kB ± 0%  +2.26%  (p=0.000 n=10+10)

name                 old allocs/op  new allocs/op  delta
Select1/Cockroach-8       164 ± 0%       169 ± 0%  +3.05%  (p=0.000 n=10+10)

nvanbenschoten · 2018-10-03T20:07:22Z

Can we statically allocate the labels for all statement types to avoid the increase in memory usage?

jordanlewis · 2018-10-03T20:14:36Z

We'll still have to allocate a map in WithLabels, since the labelContextKey isn't exported for us to muck with, but I think you're right that we can avoid making LabelSets over and over. Good suggestion.

jordanlewis · 2018-10-03T20:25:14Z

Nevermind, there's no way we can reduce allocations here. However, I think the increase of allocations is negligible. The query benchmarked here is a worst-case - it's SELECT 1 which does almost no work at all. Normal queries do a lot more. Thoughts?

nvanbenschoten · 2018-10-04T15:32:47Z

Nevermind, there's no way we can reduce allocations here.

Can't we at least avoid the pprof.Labels allocation?

jordanlewis · 2018-10-04T16:40:50Z

That doesn't get allocated.

nvanbenschoten · 2018-10-04T16:48:45Z

Really? Aren't there varargs and a slice in pprof.Labels?

jordanlewis · 2018-10-05T22:10:53Z

I ran this change through pprof and didn't see any allocations on those lines. Maybe I did something wrong?

nvanbenschoten

Code LGTM, so up to you whether you think the added debuggability from this change is worth the minor increase in allocations.

jordanlewis · 2018-10-11T13:50:44Z

I'm not really sure. I'm going to let this sit a while longer and revisit when I have more headspace for it. Thanks for the reviews so far.

tbg · 2019-02-06T08:19:11Z

Per our discussion yesterday, I think that if you gated this behind a cluster setting and used the anonymized statement string (on top of a tag label) this would be a very useful addition that we could use in the 2.2 cycle to find out whether we'll want to make larger investments here like the ones you're dreaming of (like adding general byte counter profiles etc that help us profile I/O). WDYT?

PS: for the cluster setting, perhaps the right pattern is to make the setting a deadline at which the labels deactivate -- intended usage would be something like SET CLUSTER SETTING server.pprof.statements = NOW() + INTERVAL '10 minutes';.

When pprofui CPU profiling is active, add the statement tag and anonymized statement string to the goroutine labels. For example, this is what you can see when running ./bin/workload run kv --read-percent 50 --init ``` $ pprof -seconds 10 http://localhost:8080/debug/pprof/ui/profile [...] (pprof) tags stmt.anonymized: Total 7.9s 4.0s (50.57%): UPSERT INTO kv(k, v) VALUES ($1, $2) 3.9s (49.43%): SELECT k, v FROM kv WHERE k IN ($1,) stmt.tag: Total 7.9s 4.0s (50.57%): INSERT 3.9s (49.43%): SELECT ``` The dot graphs are similarly annotated, though they require `dot` to be installed on the machine and thus won't be as useful on the pprofui itself. Profile tags are not propagated across RPC boundaries. That is, a node may have high CPU as a result of SQL queries not originating at the node itself, and no labels will be available. But perusing this diff, you may notice that any moving part in the system can sniff whether profiling is active, and can add labels in itself, so in principle we could add the application name or any other information that is propagated along with the transaction on the recipient node and track down problems that way. We may also be able to add tags based on RangeIDs to identify ranges which cause high CPU load. The possibilities are endless, and with this infra in place, it's trivial to quickly iterate on what's useful. Closes cockroachdb#30930. Release note (admin ui change): Running nodes can now be CPU profiled in a way that breaks down CPU usage by query (some restrictions apply).

35147: sql: add profiler labels during CPU profiling r=jordanlewis a=tbg When pprofui CPU profiling is active, add the statement tag and anonymized statement string to the goroutine labels. For example, this is what you can see when running ./bin/workload run kv --read-percent 50 --init ``` $ pprof -seconds 10 http://localhost:8080/debug/pprof/ui/profile [...] (pprof) tags stmt.anonymized: Total 7.9s 4.0s (50.57%): UPSERT INTO kv(k, v) VALUES ($1, $2) 3.9s (49.43%): SELECT k, v FROM kv WHERE k IN ($1,) stmt.tag: Total 7.9s 4.0s (50.57%): INSERT 3.9s (49.43%): SELECT ``` The dot graphs are similarly annotated, though they require `dot` to be installed on the machine and thus won't be as useful on the pprofui itself. Profile tags are not propagated across RPC boundaries. That is, a node may have high CPU as a result of SQL queries not originating at the node itself, and no labels will be available. But perusing this diff, you may notice that any moving part in the system can sniff whether profiling is active, and can add labels in itself, so in principle we could add the application name or any other information that is propagated along with the transaction on the recipient node and track down problems that way. We may also be able to add tags based on RangeIDs to identify ranges which cause high CPU load. The possibilities are endless, and with this infra in place, it's trivial to quickly iterate on what's useful. Closes #30930. Release note (admin ui change): Running nodes can now be CPU profiled in a way that breaks down CPU usage by query (some restrictions apply). Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>

jordanlewis requested review from nvanbenschoten, petermattis and a team October 3, 2018 18:43

nvanbenschoten requested changes Oct 3, 2018

View reviewed changes

sql: add pprof labels based on the statement type

7f5e563

pprof profiles will now be enhanced with the type of the statement that caused the creation of each stack. statement types are things like 'SELECT', 'INSERT', 'ALTER TABLE', etc. Release note: None

jordanlewis force-pushed the pprof-labels branch from 533025c to 7f5e563 Compare October 3, 2018 19:52

nvanbenschoten approved these changes Oct 11, 2018

View reviewed changes

tbg mentioned this pull request Feb 21, 2019

sql: infrastructure to get visibility into the sql memory budget #35097

Closed

tbg mentioned this pull request Feb 22, 2019

sql: add profiler labels during CPU profiling #35147

Merged

craig bot closed this in #35147 Feb 26, 2019

jordanlewis deleted the pprof-labels branch March 15, 2019 15:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: add pprof labels based on the statement type #30930

sql: add pprof labels based on the statement type #30930

jordanlewis commented Oct 3, 2018

cockroach-teamcity commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

tbg commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

nvanbenschoten left a comment

jordanlewis commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

nvanbenschoten commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

nvanbenschoten commented Oct 4, 2018

jordanlewis commented Oct 4, 2018

nvanbenschoten commented Oct 4, 2018

jordanlewis commented Oct 5, 2018

nvanbenschoten left a comment

jordanlewis commented Oct 11, 2018

tbg commented Feb 6, 2019

sql: add pprof labels based on the statement type #30930

sql: add pprof labels based on the statement type #30930

Conversation

jordanlewis commented Oct 3, 2018

cockroach-teamcity commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

tbg commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

nvanbenschoten left a comment

Choose a reason for hiding this comment

jordanlewis commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

nvanbenschoten commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

jordanlewis commented Oct 3, 2018

nvanbenschoten commented Oct 4, 2018

jordanlewis commented Oct 4, 2018

nvanbenschoten commented Oct 4, 2018

jordanlewis commented Oct 5, 2018

nvanbenschoten left a comment

Choose a reason for hiding this comment

jordanlewis commented Oct 11, 2018

tbg commented Feb 6, 2019