x/bank: CPU: exorcise app.Deliver->runTx as it consumes more CPU cycles than app.Commit per BenchmarkOneBankMultiSendTxPerBlock #8697
Comments
Yes, this seems like a reasonable conclusion. Commit may have more I/O (it has all the writes, while Deliver has the possibly cached reads). It would be interesting to have a graph of I/O time here as well. Also, do you use LevelDB with disk-backed storage? That is what makes Commit and Deliver reads slower. |
Interesting real-world graph. Thank you, @alexanderbez. Are the commit times maxing out at 100ms, or is that averaged? I ask because I have heard of multi-second end blocks for the distribution (but that might be end block and not commit). |
This is an average. I'll post the Prom query shortly. |
Averages can be misleading. Imagine that my sister, my 8 friends, Bill Gates, and I all walk into a room and someone asks for the average wealth: a single outlier skews the result badly. I'd highly encourage using percentiles rather than averages. For example, here is how to get the 95th percentile for a metric named "metric" in Prometheus: histogram_quantile(0.95, sum(rate(metric_bucket[5m])) by (job, le)) cc @kirbyquerby and @cuonglm |
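To make the averages-versus-percentiles point concrete, here is a minimal Go sketch. The latency samples are hypothetical (not measurements from Gaia or from this benchmark), and `mean` and `percentile` are illustrative helpers, not cosmos-sdk or Prometheus APIs. The percentile uses the simple nearest-rank method on raw samples, which differs from the bucket interpolation that `histogram_quantile` performs.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// mean returns the arithmetic mean of xs (0 for an empty slice).
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	var sum float64
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// percentile returns the nearest-rank p-th percentile (0 < p <= 100) of xs.
// It sorts a copy so the caller's slice is left untouched.
func percentile(xs []float64, p float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sorted := append([]float64(nil), xs...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

func main() {
	// Hypothetical commit latencies in milliseconds: nine fast blocks and one slow one.
	latencies := []float64{80, 85, 90, 90, 95, 95, 100, 100, 105, 3000}
	fmt.Printf("mean=%.0fms p50=%.0fms p95=%.0fms\n",
		mean(latencies), percentile(latencies, 50), percentile(latencies, 95))
}
```

With these made-up samples, the mean lands far from both the typical block (the median) and the slow block (the p95), which is why a single averaged line can hide multi-second outliers.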
Sure, but that's not at all what's happening here. The query is:
But this thread isn't about Prom, so use this how you wish. |
@alexanderbez I raised that because we are dealing with latencies. Could you help us with a bucketed p95 latency graph? |
Yup. I'll run the query in a bit! I also have a public Gaia instance if you want to run your own Prom queries. |
Gentle ping. Still relevant? |
Summary of Bug
As part of cosmos-sdk benchmarking, this issue aims to provide a guide for figuring out what the culprits are and what needs to be investigated and improved. Inside x/bank/bench_test.go there is a looming need to figure out what consumes more CPU per
cosmos-sdk/x/bank/bench_test.go
Lines 86 to 101 in 149bed4
The goal here is to figure out how to improve throughput and which parts are worth optimizing. We've worked on continuous benchmarking infrastructure that will extract a git commit from a PR, run benchmarks, and post the results to show cosmos-sdk engineers what could have changed.
Results
app.Deliver->runTx consumes much more CPU time (10.30s) than app.Commit (1.16s), roughly 9x as much.
Commit CPU graph
Deliver CPU graph
Version
Latest at 90d799f
Steps to Reproduce
To reproduce, please run
go test -run=^$ -bench=OneBankMultiSendTxPerBlock -run=60 -cpuprofile=ms.cpu -memprofile=ms.mem
and when completed, run pprof to examine the respective files
/cc @ethanfrey @cuonglm @okwme