backend: get raw values for the various dimensions and present them to the UI for rendering #14
Of the current dimensions we return (cosmosloadtester/proto/orijtech/cosmosloadtester/v1/loadtest_service.proto, lines 83 to 99 in cf9cd20), there are three totals and two average rates. QPounder only shows percentiles for request latency. What metrics are you looking for, and which ones do you want percentiles for? Do you want a new metric added to track percentiles for request latency? Do you want graphs over time for certain metrics as well? If we want to track latency, does it still make sense to track it for broadcast_tx_async, which returns immediately?
This change returns more informative stats with p50, p75, p90, p95, p99 values for the latencies which massively help in seeing the actual performance of a node. These values are useful to properly visualize the processing power maximums and minimums plus load tapers. Updates orijtech/cosmosloadtester#14
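As a rough illustration of what p50 through p99 mean here, the following is a minimal sketch that computes nearest-rank percentiles over a slice of latency samples. The names `percentile` and `demoLatencies` are hypothetical and are not taken from tm-load-test, whose actual computation may differ.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// percentile returns the nearest-rank p-th percentile (0 < p <= 100) of
// samples. Hypothetical helper for illustration; not the tm-load-test code.
func percentile(samples []time.Duration, p float64) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := make([]time.Duration, len(samples))
	copy(sorted, samples)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	// Nearest rank: the smallest value with at least p% of samples <= it.
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// demoLatencies builds synthetic latencies of 1µs, 2µs, ..., 100µs.
func demoLatencies() []time.Duration {
	out := make([]time.Duration, 0, 100)
	for i := 1; i <= 100; i++ {
		out = append(out, time.Duration(i)*time.Microsecond)
	}
	return out
}

func main() {
	latencies := demoLatencies()
	for _, p := range []float64{50, 75, 90, 95, 99} {
		fmt.Printf("p%.0f = %s\n", p, percentile(latencies, p))
	}
}
```

Unlike an average, a high percentile such as p99 is dominated by the slowest requests, which is why it is more useful for spotting the limits of a node under load.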
@kirbyquerby @willpoint before my flight to Canada today, while on a layover in Los Angeles, I sat down and explored a bunch of options like using OpenCensus/OpenTelemetry, but their APIs are now too convoluted, so I instead rolled out simple stats in https://github.com/orijtech/tm-load-test/releases/tag/vorijtech-1.0.0 per orijtech/tm-load-test@d2a5d18. If we use this diff, we can see more informative stats:
diff --git a/go.mod b/go.mod
index 74a202f..ae6d99b 100644
--- a/go.mod
+++ b/go.mod
@@ -9,7 +9,7 @@ require (
github.com/informalsystems/tm-load-test v1.0.0
github.com/lib/pq v1.10.6
github.com/sirupsen/logrus v1.9.0
- go.opencensus.io v0.23.0
+ go.opencensus.io v0.24.0
google.golang.org/genproto v0.0.0-20221207170731-23e4bf6bdc37
google.golang.org/grpc v1.51.0
google.golang.org/protobuf v1.28.2-0.20220831092852-f930b1dc76e8
@@ -96,7 +96,7 @@ require (
github.com/spf13/jwalterweatherman v1.1.0 // indirect
github.com/spf13/pflag v1.0.5 // indirect
github.com/spf13/viper v1.13.0 // indirect
- github.com/stretchr/testify v1.8.0 // indirect
+ github.com/stretchr/testify v1.8.1 // indirect
github.com/subosito/gotenv v1.4.1 // indirect
github.com/syndtr/goleveldb v1.0.1-0.20210819022825-2ae1ddf74ef7 // indirect
github.com/tendermint/btcd v0.1.1 // indirect
@@ -121,4 +121,6 @@ require (
replace github.com/gogo/protobuf => github.com/regen-network/protobuf v1.3.3-alpha.regen.1
+replace github.com/informalsystems/tm-load-test => github.com/orijtech/tm-load-test v1.0.1-0.20221218023019-d2a5d1861a00
+
// replace github.com/informalsystems/tm-load-test => /home/nathan/Documents/tm-load-test
diff --git a/server/server.go b/server/server.go
index 621bd33..5c416d8 100644
--- a/server/server.go
+++ b/server/server.go
@@ -3,6 +3,7 @@ package server
import (
"context"
"encoding/csv"
+ "encoding/json"
"fmt"
"os"
"strconv"
@@ -75,11 +76,19 @@ func (s *Server) RunLoadtest(ctx context.Context, req *loadtestpb.RunLoadtestReq
return nil, status.Errorf(codes.InvalidArgument, "invalid input: %v", err)
}
- err = loadtest.ExecuteStandalone(cfg)
+ psL, err := loadtest.ExecuteStandaloneWithStats(cfg)
if err != nil {
return nil, err
}
+ // TODO: Send over the actual values of psL to the UI
+ // instead of the CSV parsing down below.
+ blob, err := json.MarshalIndent(psL, "", " ")
+ if err != nil {
+ return nil, err
+ }
+ println(string(blob))
+
f, err := os.Open(statsOutputFilePath)
if err != nil {
return nil, fmt.Errorf("failed to open stats output file: %w", err)
UI request
Result
[
{
"avg_bytes_per_sec": 11521.651142520457,
"avg_tx_per_sec": 2880.4127856301143,
"total_time": 39032947139,
"total_bytes": 449724,
"total_txs": 112431,
"p50": {
"at_ns": 10581297990,
"at_str": "10.58129799s",
"latency": 7037
},
"p75": {
"at_ns": 28128359648,
"at_str": "28.128359648s",
"latency": 11295
},
"p90": {
"at_ns": 20108712942,
"at_str": "20.108712942s",
"latency": 21267
},
"p95": {
"at_ns": 31586764579,
"at_str": "31.586764579s",
"latency": 32603
},
"p99": {
"at_ns": 4631386768,
"at_str": "4.631386768s",
"latency": 90098
},
"per_sec": [
{
"sec": 0,
"qps": 8838,
"bytes": 35352
},
{
"sec": 1,
"qps": 2911,
"bytes": 11644
},
{
"sec": 2,
"qps": 5342,
"bytes": 21368
},
{
"sec": 3,
"qps": 3074,
"bytes": 12296
},
{
"sec": 4,
"qps": 9218,
"bytes": 36872
},
{
"sec": 5,
"qps": 0,
"bytes": 0
},
{
"sec": 6,
"qps": 0,
"bytes": 0
},
{
"sec": 7,
"qps": 10001,
"bytes": 40004
},
{
"sec": 8,
"qps": 5403,
"bytes": 21612
},
{
"sec": 9,
"qps": 3244,
"bytes": 12976
},
{
"sec": 10,
"qps": 3407,
"bytes": 13628
},
{
"sec": 11,
"qps": 3730,
"bytes": 14920
},
{
"sec": 12,
"qps": 2433,
"bytes": 9732
},
{
"sec": 13,
"qps": 2921,
"bytes": 11684
},
{
"sec": 14,
"qps": 2595,
"bytes": 10380
},
{
"sec": 15,
"qps": 2433,
"bytes": 9732
},
{
"sec": 16,
"qps": 2758,
"bytes": 11032
},
{
"sec": 17,
"qps": 2596,
"bytes": 10384
},
{
"sec": 18,
"qps": 2108,
"bytes": 8432
},
{
"sec": 19,
"qps": 2271,
"bytes": 9084
},
{
"sec": 20,
"qps": 2758,
"bytes": 11032
},
{
"sec": 21,
"qps": 3568,
"bytes": 14272
},
{
"sec": 22,
"qps": 2434,
"bytes": 9736
},
{
"sec": 23,
"qps": 2109,
"bytes": 8436
},
{
"sec": 24,
"qps": 2595,
"bytes": 10380
},
{
"sec": 25,
"qps": 2271,
"bytes": 9084
},
{
"sec": 26,
"qps": 1946,
"bytes": 7784
},
{
"sec": 27,
"qps": 1947,
"bytes": 7788
},
{
"sec": 28,
"qps": 1460,
"bytes": 5840
},
{
"sec": 29,
"qps": 1135,
"bytes": 4540
},
{
"sec": 30,
"qps": 1623,
"bytes": 6492
},
{
"sec": 31,
"qps": 1459,
"bytes": 5836
},
{
"sec": 32,
"qps": 1947,
"bytes": 7788
},
{
"sec": 33,
"qps": 1622,
"bytes": 6488
},
{
"sec": 34,
"qps": 2109,
"bytes": 8436
},
{
"sec": 35,
"qps": 2271,
"bytes": 9084
},
{
"sec": 36,
"qps": 1622,
"bytes": 6488
},
{
"sec": 37,
"qps": 2272,
"bytes": 9088
}
]
}
]
With this we can now graph time-series data as suggested above, showing both QPS and bytes/second as well as the latency progressions across the various percentiles.
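To turn that output into chartable series, a minimal sketch could decode the JSON and split `per_sec` into parallel slices. The JSON field names follow the output above; the Go type and function names (`runStats`, `decodeStats`, `series`) are hypothetical, and only a subset of the fields is declared since `encoding/json` ignores unknown fields by default.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// percentilePoint mirrors one "pNN" object from the JSON output.
type percentilePoint struct {
	AtNs    int64  `json:"at_ns"`
	AtStr   string `json:"at_str"`
	Latency int64  `json:"latency"`
}

// perSecond mirrors one entry of the "per_sec" array.
type perSecond struct {
	Sec   int `json:"sec"`
	QPS   int `json:"qps"`
	Bytes int `json:"bytes"`
}

// runStats declares only a subset of the fields shown in the output.
type runStats struct {
	AvgTxPerSec float64         `json:"avg_tx_per_sec"`
	TotalTxs    int64           `json:"total_txs"`
	P50         percentilePoint `json:"p50"`
	P99         percentilePoint `json:"p99"`
	PerSec      []perSecond     `json:"per_sec"`
}

// decodeStats parses the JSON array of per-run stats.
func decodeStats(blob []byte) ([]runStats, error) {
	var out []runStats
	if err := json.Unmarshal(blob, &out); err != nil {
		return nil, err
	}
	return out, nil
}

// series splits the per-second buckets into parallel slices for charting.
func series(stats runStats) (secs, qps, byteCounts []int) {
	for _, b := range stats.PerSec {
		secs = append(secs, b.Sec)
		qps = append(qps, b.QPS)
		byteCounts = append(byteCounts, b.Bytes)
	}
	return secs, qps, byteCounts
}

// sample is a trimmed copy of the output above (first two seconds only).
const sample = `[{
  "avg_tx_per_sec": 2880.4127856301143,
  "total_txs": 112431,
  "p50": {"at_ns": 10581297990, "at_str": "10.58129799s", "latency": 7037},
  "p99": {"at_ns": 4631386768, "at_str": "4.631386768s", "latency": 90098},
  "per_sec": [
    {"sec": 0, "qps": 8838, "bytes": 35352},
    {"sec": 1, "qps": 2911, "bytes": 11644}
  ]
}]`

func main() {
	stats, err := decodeStats([]byte(sample))
	if err != nil {
		panic(err)
	}
	secs, qps, byteCounts := series(stats[0])
	fmt.Println(secs, qps, byteCounts)
}
```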
I've mailed out a much better modification in our fork of tm-load-test that now shows rankings per second and thus allows visualizing percentiles: https://github.com/orijtech/tm-load-test/releases/tag/vorijtech-1.1.0. For example:
[
{
"avg_bytes_per_sec": 422.15393825983676,
"avg_tx_per_sec": 105.53848456495919,
"total_time": 18002911524,
"total_bytes": 7600,
"total_txs": 1900,
"p50": {
"at_ns": 12000740850,
"at_str": "12.00074085s",
"latency": 16942,
"size": 4
},
"p75": {
"at_ns": 2001010940,
"at_str": "2.00101094s",
"latency": 25159,
"size": 4
},
"p90": {
"at_ns": 15002840984,
"at_str": "15.002840984s",
"latency": 40838,
"size": 4
},
"p95": {
"at_ns": 16000796152,
"at_str": "16.000796152s",
"latency": 54770,
"size": 4
},
"p99": {
"at_ns": 3000639229,
"at_str": "3.000639229s",
"latency": 88433,
"size": 4
},
"per_sec": [
{
"sec": 0,
"qps": 100,
"bytes": 400,
"bytes_rankings": {
"p50": {
"at_ns": 3025761,
"at_str": "3.025761ms",
"size": 4
},
"p75": {
"at_ns": 3126893,
"at_str": "3.126893ms",
"size": 4
},
"p90": {
"at_ns": 710362,
"at_str": "710.362µs",
"size": 4
},
"p95": {
"at_ns": 2010957,
"at_str": "2.010957ms",
"size": 4
},
"p99": {
"at_ns": 633480,
"at_str": "633.48µs",
"size": 4
}
},
"latency_rankings": {
"p50": {
"at_ns": 3025761,
"at_str": "3.025761ms",
"latency": 17800
},
"p75": {
"at_ns": 3126893,
"at_str": "3.126893ms",
"latency": 26296
},
"p90": {
"at_ns": 710362,
"at_str": "710.362µs",
"latency": 42240
},
"p95": {
"at_ns": 2010957,
"at_str": "2.010957ms",
"latency": 56953
},
"p99": {
"at_ns": 633480,
"at_str": "633.48µs",
"latency": 242929
}
}
},
{
"sec": 1,
"qps": 100,
"bytes": 400,
"bytes_rankings": {
"p50": {
"at_ns": 1002111655,
"at_str": "1.002111655s",
"size": 4
},
"p75": {
"at_ns": 1002353024,
"at_str": "1.002353024s",
"size": 4
},
"p90": {
"at_ns": 1002326247,
"at_str": "1.002326247s",
"size": 4
},
"p95": {
"at_ns": 1002397596,
"at_str": "1.002397596s",
"size": 4
},
"p99": {
"at_ns": 1001288678,
"at_str": "1.001288678s",
"size": 4
}
},
"latency_rankings": {
"p50": {
"at_ns": 1002111655,
"at_str": "1.002111655s",
"latency": 14099
},
"p75": {
"at_ns": 1002353024,
"at_str": "1.002353024s",
"latency": 23419
},
"p90": {
"at_ns": 1002326247,
"at_str": "1.002326247s",
"latency": 39388
},
"p95": {
"at_ns": 1002397596,
"at_str": "1.002397596s",
"latency": 42895
},
"p99": {
"at_ns": 1001288678,
"at_str": "1.001288678s",
"latency": 81521
}
}
}
]
}
]
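The per-second rankings can be pictured with a small sketch: group samples by the whole second in which they were recorded, then pick the sample at a given latency rank within each bucket. This assumes that `at_ns` is the time at which the ranked sample was observed, which would explain why it does not increase with the percentile. The names `sample`, `bucketBySecond`, `rank`, and `demoSamples` are hypothetical and do not come from the fork.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// sample is one observed request (hypothetical shape).
type sample struct {
	At      time.Duration // offset from the start of the run
	Latency time.Duration
}

// bucketBySecond groups samples by the whole second in which they occurred.
func bucketBySecond(samples []sample) map[int][]sample {
	buckets := make(map[int][]sample)
	for _, s := range samples {
		sec := int(s.At / time.Second)
		buckets[sec] = append(buckets[sec], s)
	}
	return buckets
}

// rank returns the sample at the nearest-rank p-th percentile by latency.
// The bool is false when the bucket is empty.
func rank(bucket []sample, p float64) (sample, bool) {
	if len(bucket) == 0 {
		return sample{}, false
	}
	sorted := append([]sample(nil), bucket...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Latency < sorted[j].Latency })
	idx := int(math.Ceil(p*float64(len(sorted))/100)) - 1
	if idx < 0 {
		idx = 0
	}
	if idx >= len(sorted) {
		idx = len(sorted) - 1
	}
	return sorted[idx], true
}

// demoSamples returns synthetic data spanning two seconds.
func demoSamples() []sample {
	return []sample{
		{At: 100 * time.Millisecond, Latency: 30 * time.Microsecond},
		{At: 700 * time.Millisecond, Latency: 10 * time.Microsecond},
		{At: 900 * time.Millisecond, Latency: 20 * time.Microsecond},
		{At: 1200 * time.Millisecond, Latency: 50 * time.Microsecond},
		{At: 1500 * time.Millisecond, Latency: 40 * time.Microsecond},
	}
}

func main() {
	buckets := bucketBySecond(demoSamples())
	// Map iteration order is random, so sort the seconds before printing.
	secs := make([]int, 0, len(buckets))
	for sec := range buckets {
		secs = append(secs, sec)
	}
	sort.Ints(secs)
	for _, sec := range secs {
		for _, p := range []float64{50, 99} {
			if s, ok := rank(buckets[sec], p); ok {
				fmt.Printf("sec=%d p%.0f latency=%s at=%s\n", sec, p, s.Latency, s.At)
			}
		}
	}
}
```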
Updates #14
TODO(uzo): complete the flow using the protobuf definitions and server response
descPercentile and bucketizedPerSecond need to be exported for this change to compile. I don't have push access to the tm-load-test fork, but here's where the files are: https://github.com/orijtech/tm-load-test/blob/d37154798c88c311e880eb23ec799d09e04cb44b/pkg/loadtest/transactor.go#L271-L285 Updates #14 Updates #16
* proto/orijtech: define per point data
  Updates the protobuf definitions to have per point data that can then be used by the user interface.
  Updates #14
  Updates #16
* Document proto fields + regenerate protos
* server: populate new RunLoadtestResponse fields
  descPercentile and bucketizedPerSecond need to be exported for this change to compile. I don't have push access to the tm-load-test fork, but here's where the files are: https://github.com/orijtech/tm-load-test/blob/d37154798c88c311e880eb23ec799d09e04cb44b/pkg/loadtest/transactor.go#L271-L285
  Updates #14
  Updates #16
* fix cosmos-sdk version
  The PR this change is based on changed the cosmos-sdk version, which causes a build failure
* regenerate protos + update go modules + fix warnings
* use correct value for percentile latency and bytes sent
Co-authored-by: Nathan Dias <[email protected]>
Right now the informalsystems/tm-load-test code computes averages of the values and displays only a summary, due to the limited user interface of a CLI. We need to be able to show percentiles like we do for QPounder.