Add understanding results page #6984
---
layout: default
title: Understanding results
nav_order: 22
parent: User guide
---
At the end of each test run, a summary table is produced that includes metrics such as service time, throughput, and latency. These metrics provide insight into how the selected workload performed on the benchmarked OpenSearch cluster.
This guide explains how to interpret the results in the summary report.
## OpenSearch Benchmark runs
OpenSearch Benchmark runs a series of nightly tests targeting the OpenSearch development cluster. These runs, available at https://opensearch.org/benchmarks, compare several metrics across test runs targeting both recent and upcoming versions of OpenSearch.
## Selecting metrics to compare
While an OpenSearch Benchmark summary report provides many metrics related to the performance of your cluster, how you compare and use those metrics depends on your use case. Some users might be interested in the number of documents their cluster can index, while others might be interested in the latency or service time it takes for a query to complete. For example, during the OpenSearch Benchmark nightly runs, the OpenSearch team pulls metrics similar to the following summary report:
```bash
------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / / /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/  \___/
------------------------------------------------------

| Metric | Task | Value | Unit |
|---------------------------------------------------------------:|-------------------------------------------:|------------:|-------:|
| Cumulative indexing time of primary shards | | 0.02655 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 0.00176667 | min |
| Max cumulative indexing time across primary shards | | 0.0140333 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 0.0102333 | min |
| Cumulative merge count of primary shards | | 3 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0 | min |
| Max cumulative merge time across primary shards | | 0.0102333 | min |
| Cumulative merge throttle time of primary shards | | 0 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0 | min |
| Cumulative refresh time of primary shards | | 0.0709333 | min |
| Cumulative refresh count of primary shards | | 118 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0.00186667 | min |
| Max cumulative refresh time across primary shards | | 0.0511667 | min |
| Cumulative flush time of primary shards | | 0.00963333 | min |
| Cumulative flush count of primary shards | | 4 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0 | min |
| Max cumulative flush time across primary shards | | 0.00398333 | min |
| Total Young Gen GC time | | 0 | s |
| Total Young Gen GC count | | 0 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 0.000485923 | GB |
| Translog size | | 2.01873e-05 | GB |
| Heap used for segments | | 0 | MB |
| Heap used for doc values | | 0 | MB |
| Heap used for terms | | 0 | MB |
| Heap used for norms | | 0 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0 | MB |
| Segment count | | 32 | |
| Min Throughput | index | 3008.97 | docs/s |
| Mean Throughput | index | 3008.97 | docs/s |
| Median Throughput | index | 3008.97 | docs/s |
| Max Throughput | index | 3008.97 | docs/s |
| 50th percentile latency | index | 351.059 | ms |
| 100th percentile latency | index | 365.058 | ms |
| 50th percentile service time | index | 351.059 | ms |
| 100th percentile service time | index | 365.058 | ms |
| error rate | index | 0 | % |
| Min Throughput | wait-until-merges-finish | 28.41 | ops/s |
| Mean Throughput | wait-until-merges-finish | 28.41 | ops/s |
| Median Throughput | wait-until-merges-finish | 28.41 | ops/s |
| Max Throughput | wait-until-merges-finish | 28.41 | ops/s |
| 100th percentile latency | wait-until-merges-finish | 34.7088 | ms |
| 100th percentile service time | wait-until-merges-finish | 34.7088 | ms |
| error rate | wait-until-merges-finish | 0 | % |
| Min Throughput | match_all | 36.09 | ops/s |
| Mean Throughput | match_all | 36.09 | ops/s |
| Median Throughput | match_all | 36.09 | ops/s |
| Max Throughput | match_all | 36.09 | ops/s |
| 100th percentile latency | match_all | 35.9822 | ms |
| 100th percentile service time | match_all | 7.93048 | ms |
| error rate | match_all | 0 | % |

[...]

| Min Throughput | term | 16.1 | ops/s |
| Mean Throughput | term | 16.1 | ops/s |
| Median Throughput | term | 16.1 | ops/s |
| Max Throughput | term | 16.1 | ops/s |
| 100th percentile latency | term | 131.798 | ms |
| 100th percentile service time | term | 69.5237 | ms |
| error rate | term | 0 | % |
```
Metrics specific to each task in the workload begin at the `index` task line. The following two use cases can give you an idea of which metrics might be relevant to you:
- To assess how much load your cluster can handle, the `index` task metrics provide the number of documents ingested during the workload run as well as the ingestion error rate.
- To assess the latency and service time of the queries in the workload, the `match_all` and `term` task metrics provide the number of query operations performed per second, the measured latency of each query, and the error rate for query operations (see the sketch following this list).
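
If you save the summary report to a file, you can pull out just these task rows. The following is a minimal sketch, assuming the report was written to a hypothetical file named `results.md` using the `--results-file` flag described later in this guide:

```bash
# Ingestion metrics (throughput, latency, error rate) for the index task.
# The file name results.md is a placeholder for your own saved report.
grep -w 'index' results.md

# Query metrics for the match_all and term tasks.
grep -wE 'match_all|term' results.md
```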
## Results storage
Results from OpenSearch Benchmark are stored in one of two ways: in memory or in an external metric store.
When stored in memory, results can be found in the `/.benchmark/benchmarks/test_executions/<test_execution_id>` directory. Results are named based on the `test_execution_id` assigned to the workload test during its most recent run.
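
For example, the following commands show how you might locate and inspect a stored result. This is a sketch that assumes the default home-directory location (`~/.benchmark`) and that each run produces a `test_execution.json` file; substitute the `test_execution_id` printed at the end of your own run:

```bash
# List the stored test executions; each directory is named after a test_execution_id.
ls ~/.benchmark/benchmarks/test_executions/

# Inspect the stored results for a specific run (file name assumed to be test_execution.json).
cat ~/.benchmark/benchmarks/test_executions/<test_execution_id>/test_execution.json
```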
While [running a test](https://opensearch.org/docs/latest/benchmark/reference/commands/execute-test/#general-settings), you can also customize where the results are stored using any combination of the following command flags, as shown in the example after this list:
* `--results-file`: When provided a file path, writes the summary report to the file indicated in the path.
* `--results-format`: Defines the output format for the summary report, either `markdown` or `csv`. Default is `markdown`.
* `--show-in-results`: Defines which values are shown in the published summary report, either `available`, `all-percentiles`, or `all`. Default is `available`.
* `--user-tag`: Defines user-specific key-value pairs to be used in the metric record as meta information, for example, `intention:baseline-ticket-12345`. This is useful when storing metrics and results in an external metric store.
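
As a sketch of how these flags can be combined, the following command runs a test and stores the report outside of the terminal output. The workload name, file path, and tag value are placeholders, not requirements:

```bash
# Run a test, write the summary report to a Markdown file, show all percentiles,
# and tag the run so it can be identified later in an external metric store.
# The workload name (geonames) and file path below are placeholder values.
opensearch-benchmark execute-test \
  --workload=geonames \
  --results-file=/tmp/baseline-report.md \
  --results-format=markdown \
  --show-in-results=all-percentiles \
  --user-tag="intention:baseline-ticket-12345"
```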