Expose load test ID Prometheus gauge #42

Merged: 4 commits, Jan 21, 2020
10 changes: 9 additions & 1 deletion CHANGELOG.md
@@ -1,7 +1,15 @@
# Changelog

## v0.8.0
* [\#42](https://github.com/interchainio/tm-load-test/pull/42) - Add Prometheus
gauge that indicates when a load test is underway. The gauge exposes a
customizable load test ID.

## v0.7.1
* Re-released due to v0.7.0 being incorrectly tagged

## v0.7.0
- * [\#39](https://github.com/interchainio/tm-load-test/issues/39) - Add basic
+ * [\#39](https://github.com/interchainio/tm-load-test/pull/40) - Add basic
aggregate statistics output to CSV file.
* Added integration test for standalone execution happy path.

2 changes: 2 additions & 0 deletions README.md
@@ -146,6 +146,8 @@ The following kinds of metrics are made available here:
* 5 = Slave completed load testing successfully
* Standard Prometheus-provided metrics about the garbage collector in
`tm-load-test`
* The ID of the load test currently underway (-1 when no test is underway). The
ID defaults to 0 and is set via the `--load-test-id` flag on the master
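
As a rough illustration of how a consumer might read this metric, the sketch below scans Prometheus text-format output for the `tmloadtest_master_test_underway` gauge. The helper `parseTestUnderway` and the sample payload are hypothetical and not part of `tm-load-test`; the sketch also assumes the gauge is exposed without labels.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// testUnderwayMetricName is the gauge name registered by the master.
const testUnderwayMetricName = "tmloadtest_master_test_underway"

// parseTestUnderway is a hypothetical helper (not part of tm-load-test). It
// scans Prometheus text-format output for the unlabelled gauge and returns
// its value. The second return value is false if the gauge is missing or its
// value cannot be parsed.
func parseTestUnderway(metrics string) (float64, bool) {
	scanner := bufio.NewScanner(strings.NewReader(metrics))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		// Skip blank lines and "# HELP" / "# TYPE" comment lines.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 2 || fields[0] != testUnderwayMetricName {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

func main() {
	// Made-up sample payload, for illustration only.
	sample := "# TYPE tmloadtest_master_test_underway gauge\n" +
		"tmloadtest_master_test_underway 7\n"

	if id, ok := parseTestUnderway(sample); ok && id >= 0 {
		fmt.Printf("load test %d is underway\n", int(id))
	} else {
		fmt.Println("no load test underway")
	}
}
```

A value of -1 means no test is running, so a dashboard or alert rule can treat any value of 0 or above as "test in progress".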

## Aggregate Statistics
As of `tm-load-test` v0.7.0, one can now write simple aggregate statistics to
3 changes: 2 additions & 1 deletion pkg/loadtest/cli.go
@@ -12,7 +12,7 @@ import (
)

// CLIVersion must be manually updated as new versions are released.
- const CLIVersion = "v0.7.0"
+ const CLIVersion = "v0.8.0"

// cliVersionCommitID must be set through linker settings. See
// https://stackoverflow.com/a/11355611/1156132 for details.
@@ -91,6 +91,7 @@ func buildCLI(cli *CLIConfig, logger logging.Logger) *cobra.Command {
masterCmd.PersistentFlags().IntVar(&masterCfg.ExpectSlaves, "expect-slaves", 2, "The number of slaves to expect to connect to the master before starting load testing")
masterCmd.PersistentFlags().IntVar(&masterCfg.SlaveConnectTimeout, "connect-timeout", 180, "The maximum number of seconds to wait for all slaves to connect")
masterCmd.PersistentFlags().IntVar(&masterCfg.ShutdownWait, "shutdown-wait", 0, "The number of seconds to wait after testing completes prior to shutting down the web server")
masterCmd.PersistentFlags().IntVar(&masterCfg.LoadTestID, "load-test-id", 0, "The ID of the load test currently underway")

var slaveCfg SlaveConfig
slaveCmd := &cobra.Command{
4 changes: 4 additions & 0 deletions pkg/loadtest/config.go
@@ -44,6 +44,7 @@ type MasterConfig struct {
ExpectSlaves int `json:"expect_slaves"` // The number of slaves to expect before starting the load test.
SlaveConnectTimeout int `json:"connect_timeout"` // The number of seconds to wait for all slaves to connect.
ShutdownWait int `json:"shutdown_wait"` // The number of seconds to wait at shutdown (while keeping the HTTP server running - primarily to allow Prometheus to keep polling).
LoadTestID int `json:"load_test_id"` // A non-negative integer (default 0) that will be exposed via a Prometheus gauge while the load test is underway.
}

// SlaveConfig is the configuration options specific to a slave node.
@@ -126,6 +127,9 @@ func (c MasterConfig) Validate() error {
if c.SlaveConnectTimeout < 1 {
return fmt.Errorf("master connect-timeout must be at least 1 second")
}
if c.LoadTestID < 0 {
return fmt.Errorf("master load-test-id must be 0 or greater")
}
return nil
}
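
To make the effect of the new field and check concrete, here is a minimal standalone sketch. The `masterConfig` type is a trimmed, hypothetical copy holding only the fields visible in this diff, `validate` mirrors only the two checks shown in this hunk, and `parseMasterConfig` is an illustrative helper; how `tm-load-test` itself consumes the JSON tags is not shown here and is not assumed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// masterConfig is a trimmed, hypothetical copy of MasterConfig containing
// only the fields visible in this diff.
type masterConfig struct {
	ExpectSlaves        int `json:"expect_slaves"`
	SlaveConnectTimeout int `json:"connect_timeout"`
	ShutdownWait        int `json:"shutdown_wait"`
	LoadTestID          int `json:"load_test_id"`
}

// validate mirrors only the two checks visible in this hunk. The real
// Validate method may perform additional checks that are not shown.
func (c masterConfig) validate() error {
	if c.SlaveConnectTimeout < 1 {
		return fmt.Errorf("master connect-timeout must be at least 1 second")
	}
	if c.LoadTestID < 0 {
		return fmt.Errorf("master load-test-id must be 0 or greater")
	}
	return nil
}

// parseMasterConfig is a hypothetical helper that decodes a JSON document
// into masterConfig and then validates it.
func parseMasterConfig(raw string) (masterConfig, error) {
	var cfg masterConfig
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return cfg, err
	}
	return cfg, cfg.validate()
}

func main() {
	inputs := []string{
		`{"connect_timeout": 180, "load_test_id": 42}`, // explicit ID
		`{"connect_timeout": 180}`,                     // ID omitted, stays 0
		`{"connect_timeout": 180, "load_test_id": -5}`, // rejected
	}
	for _, raw := range inputs {
		cfg, err := parseMasterConfig(raw)
		fmt.Printf("id=%d err=%v\n", cfg.LoadTestID, err)
	}
}
```

Because an omitted `load_test_id` decodes to Go's zero value, 0 is accepted as a valid ID, which matches the flag default and the "0 or greater" check.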

12 changes: 12 additions & 0 deletions pkg/loadtest/master.go
@@ -60,6 +60,7 @@ type Master struct {
txRateMetric prometheus.Gauge // The transaction throughput rate (tx/sec) as measured by the master since the last metrics update.
overallTxRateMetric prometheus.Gauge // The overall transaction throughput rate (tx/sec) as measured by the master since the beginning of the load test.
slavesCompletedMetric prometheus.Gauge // The total number of slaves that have completed their testing.
testUnderwayMetric prometheus.Gauge // The ID of the load test currently underway (-1 if none).

mtx sync.Mutex
cancelled bool
@@ -113,6 +114,10 @@ func NewMaster(cfg *Config, masterCfg *MasterConfig) *Master {
Name: "tmloadtest_master_slaves_completed",
Help: "The total number of slaves that have completed their testing so far",
}),
testUnderwayMetric: promauto.NewGauge(prometheus.GaugeOpts{
Name: "tmloadtest_master_test_underway",
Help: "The ID of the load test currently underway (-1 if none)",
}),
}
mux := http.NewServeMux()
mux.HandleFunc("/", master.newWebSocketHandler())
@@ -123,6 +128,7 @@ func NewMaster(cfg *Config, masterCfg *MasterConfig) *Master {
}
master.svr = svr
master.stateMetric.Set(masterStarting)
master.testUnderwayMetric.Set(-1)
return master
}

@@ -229,6 +235,12 @@ func (m *Master) receiveTestingUpdates() error {
m.logger.Info("Watching for slave updates")
m.stateMetric.Set(masterTesting)

// Expose the configured load test ID through the gauge for the duration of
// the test.
m.testUnderwayMetric.Set(float64(m.masterCfg.LoadTestID))
// Reset the gauge to -1 when this function returns, whether all slaves
// completed or an error occurred.
defer m.testUnderwayMetric.Set(-1)

completed := 0

progressTicker := time.NewTicker(masterProgressUpdateInterval)