server: support tagging SQL statements in pprof results #14312
LGTM
LGTM
/run-all-tests
cherry pick to release-3.0 failed
/run-cherry-picker
cherry pick to release-3.0 failed
Benchmark report with pprof-sql-cpu = true
@@ Benchmark Diff @@
================================================================================
--- tidb: b398ef4d2ead6ae98287d9489976a7dc8dba8e49
+++ tidb: 91144ca41fcd86801e85ca8be2d748f3b3885568
tikv: 1211ed161c093ab180e42879146dedd05c4c6fb5
pd: 4498b6f203588f556cb6f6ebcdcb55f0afff0efa
================================================================================
oltp_update_index:
* QPS: 4246.60 ± 0.04% (std=1.15) delta: 0.06% (p=0.363)
* Latency p50: 30.14 ± 0.03% (std=0.01) delta: -0.07%
* Latency p99: 54.84 ± 3.64% (std=1.22) delta: -1.81%
oltp_insert:
* QPS: 7567.92 ± 0.04% (std=2.21) delta: 0.47% (p=0.029)
* Latency p50: 16.91 ± 0.04% (std=0.00) delta: -0.49%
* Latency p99: 29.07 ± 4.09% (std=0.78) delta: -1.76%
oltp_read_write:
* QPS: 15694.90 ± 0.25% (std=27.78) delta: -0.52% (p=0.072)
* Latency p50: 163.45 ± 0.21% (std=0.25) delta: 0.55%
* Latency p99: 326.02 ± 1.81% (std=4.79) delta: 0.90%
oltp_point_select:
* QPS: 41161.17 ± 0.17% (std=53.14) delta: -3.90% (p=0.020)
* Latency p50: 3.11 ± 0.21% (std=0.00) delta: 4.02%
* Latency p99: 10.09 ± 0.00% (std=0.00) delta: 3.70%
oltp_update_non_index:
* QPS: 4715.66 ± 0.83% (std=28.71) delta: -0.12% (p=0.341)
* Latency p50: 27.14 ± 0.84% (std=0.17) delta: 0.14%
* Latency p99: 43.92 ± 2.41% (std=0.75) delta: 2.54%
Originally posted by @sre-bot in #14362 (comment)
@lysu No command or invalid command
What problem does this PR solve?
Related PR: *: replace conf item `pprof_sql_cpu` with srv var `tidb_pprof_sql_cpu`
In TiDB, we often use the Go profiler (pprof) to analyze which parts of the code consume most of the CPU resources. But in actual user scenarios, when we are troubleshooting a 100% CPU usage issue, it would also be of great help to know which SQL statements are over-consuming CPU.
Since v1.9, Go supports profiler labels, which let users tag pprof results with key-value pairs, so we think tagging SQL statements in pprof would help. (ref: https://rakyll.org/profiler-labels/)
This PR adds SQL labels to TiDB's Go profiler.
What is changed and how it works?
Profiler labels are not free: setting goroutine labels through the runtime has some cost. This PR therefore adds a config item, `pprof-sql-cpu`, which can be hot reloaded through `http://0.0.0.0:10080/reload-config`.
The config item's default value is `false`. You can set it to `true` and run `curl http://0.0.0.0:10080/reload-config` to analyze which SQL statements consume most of the CPU time.
Then, run `curl http://0.0.0.0:10080/debug/pprof/profile --output p` or `curl http://0.0.0.0:10080/debug/pprof/zip --output tidb_debug.zip` to get the pprof results.
In the pprof results, you can observe the SQL tags.
NOTE:
You need to set `pprof-sql-cpu` back to `false` and curl `http://0.0.0.0:10080/reload-config` after checking the pprof result.
TiDB only tags SQL statements that are executed after `pprof-sql-cpu` is set to `true`.
Check List
Tests
Code changes
Side effects
Related changes
Release note