[Introduce Aggregate Subcommand] Enhance opensearch-benchmark compare command (#630)
Labels: enhancement
This issue is based on one of the proposed priorities in RFC #627.
Background
The existing compare subcommand in OpenSearch Benchmark (OSB) allows users to compare the results of two benchmark test executions by providing the unique IDs (UIDs) of the test executions. However, users have expressed interest in comparing aggregated results from two or more groups of test executions, rather than just two individual tests.
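For reference, today's comparison between two individual test executions looks like this (the IDs are placeholders):

```sh
# Existing behavior: compare two individual test executions by their UIDs
opensearch-benchmark compare \
    --baseline=<baseline_test_execution_id> \
    --contender=<contender_test_execution_id>
```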
Proposed Design
To address this requirement, we propose enhancing the existing compare subcommand to support comparing two aggregated test results.
When comparing two aggregated test results, a validation step will be added to ensure that the underlying workload (the type of operations being performed, the data set being used, etc.) is consistent across the test executions being aggregated. For example, a group that mixes geonames and nyc_taxis test executions would be rejected, since the aggregated numbers would not describe a single workload.
The enhanced compare subcommand will allow users to specify the IDs of two aggregated test results; OSB will perform the necessary validations before comparing them, and the output will display the performance differences between the two groups.
Example Usage:
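A sketch of what the enhanced invocation could look like. The flags are unchanged from today's compare subcommand; the assumption here is that each group of test executions has already been aggregated into a single result that is addressable by its own ID (the aggregate IDs shown are hypothetical):

```sh
# Proposed behavior: pass the IDs of two aggregated results instead of
# individual test execution UIDs. OSB validates that each aggregate was
# built from test executions of the same workload before comparing.
opensearch-benchmark compare \
    --baseline=<aggregated_result_id_1> \
    --contender=<aggregated_result_id_2>
```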
Proposed Priority
The ability to compare two aggregated test results is a feature frequently requested by users. It will enable more accurate and representative performance comparisons by reducing the impact of variability and outliers, particularly when evaluating the effect of changes or optimizations across different configurations or workloads.