Add throughput and latency documentation #6910

Merged
merged 24 commits into from
Apr 18, 2024

Conversation

Naarcha-AWS
Collaborator

Also adds a new concepts section to the OpenSearch Benchmark user guide.

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@Naarcha-AWS
Collaborator Author

@IanHoang: This is ready for your review.


## Core concepts and definitions

- **Workload**: The description of one or more benchmarking scenarios that use a specific document corpus to perform a benchmark against your cluster. The document corpus contains any indexes, data files, and operations invoked when the workflow runs. You can list the available workloads by using `opensearch-benchmark list workloads` or view any included workloads in the [OpenSearch Benchmark Workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). For more information about the elements of a workload, see [Anatomy of a workload]({{site.url}}{{site.baseurl}}/benchmark/user-guide/understanding-workloads/anatomy-of-a-workload/). For information about building a custom workload, see [Creating custom workloads]({{site.url}}{{site.baseurl}}/benchmark/creating-custom-workloads/).
Contributor

"The description of" --> "A collection of"

Contributor

"workflow" --> "workload"

Comment on lines 26 to 29
A workload is a specification of one or more benchmarking scenarios. A workload typically includes the following:

- One or more data streams that are ingested into indexes.
- A set of queries and operations that are invoked as part of the benchmark.
Contributor

This can be migrated to be under the Workload bullet point above. We can also remove the sentence "A workload is a specification of one or more benchmarking scenarios" since it's mentioned in the first sentence of the workload bullet point.


At the end of each test, OpenSearch Benchmark produces a table that summarizes the following:

- [Took time](#took-time)
- [Service time](#service-time)
- Throughput
Contributor

Nit: Would recommend putting throughput first in this list of points

Contributor

@IanHoang IanHoang Apr 5, 2024

That would group the time-based metrics together and order the table of contents in the same order as the headers.

Contributor

Remove throughput on line 16

- The error rate for each completed task or OpenSearch operation.

The following diagram illustrates how each component of the table is measured during the life cycle of a request from an OpenSearch cluster to the OpenSearch client:
Contributor

"request from an OpenSearch cluster to the OpenSearch client" --> "request involving the OpenSearch cluster, the OpenSearch client, and OpenSearch Benchmark"
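
For readers following the thread, here is a minimal sketch of how the summary values listed above (took time, service time, throughput, error rate) could be derived from per-request timestamps. It is not OpenSearch Benchmark's implementation: the `RequestRecord` class, its field names, and the `summarize` function are all hypothetical, and the sketch assumes that service time is measured on the client from send to receive, that latency also includes any time the request waited before being sent, and that took time is reported by the cluster.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RequestRecord:
    """One benchmark request. All field names are hypothetical."""
    scheduled_ms: int  # when the request was supposed to be sent
    sent_ms: int       # when the client actually sent it
    received_ms: int   # when the client received the full response
    took_ms: int       # processing time reported by the cluster
    ok: bool           # whether the operation succeeded


def summarize(records):
    """Compute summary metrics from a list of RequestRecord objects."""
    # Service time: client-side duration from send to receive.
    service_times = [r.received_ms - r.sent_ms for r in records]
    # Latency: service time plus any wait before the request was sent.
    latencies = [r.received_ms - r.scheduled_ms for r in records]
    elapsed_ms = (max(r.received_ms for r in records)
                  - min(r.scheduled_ms for r in records))
    return {
        "mean_took_ms": mean(r.took_ms for r in records),
        "mean_service_time_ms": mean(service_times),
        "mean_latency_ms": mean(latencies),
        "throughput_ops_per_s": len(records) * 1000 / elapsed_ms,
        "error_rate": sum(not r.ok for r in records) / len(records),
    }


# Made-up sample data: four requests scheduled 100 ms apart,
# each taking 200 ms on the client, one of which failed.
sample = [
    RequestRecord(0, 0, 200, 180, True),
    RequestRecord(100, 200, 400, 180, True),
    RequestRecord(200, 400, 600, 180, False),
    RequestRecord(300, 600, 800, 180, True),
]
print(summarize(sample))
```

In this made-up sample, service time stays constant while latency grows with each request, which is why the two are reported separately.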

Contributor

@IanHoang IanHoang left a comment

Left some comments

Signed-off-by: Naarcha-AWS <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
@Naarcha-AWS Naarcha-AWS requested a review from IanHoang April 8, 2024 19:06
@Naarcha-AWS Naarcha-AWS self-assigned this Apr 9, 2024
Contributor

@IanHoang IanHoang left a comment

Suggested two quick changes and overall looks good

Signed-off-by: Naarcha-AWS <[email protected]>
@Naarcha-AWS Naarcha-AWS added 4 - Doc review PR: Doc review in progress and removed 3 - Tech review PR: Tech review in progress labels Apr 15, 2024
Contributor

@vagimeli vagimeli left a comment

Looks good overall. Please see my suggested edits and comments.

@@ -11,7 +13,9 @@ Before using OpenSearch Benchmark, familiarize yourself with the following conce

## Core concepts and definitions

- **Workload**: The description of one or more benchmarking scenarios that use a specific document corpus to perform a benchmark against your cluster. The document corpus contains any indexes, data files, and operations invoked when the workflow runs. You can list the available workloads by using `opensearch-benchmark list workloads` or view any included workloads in the [OpenSearch Benchmark Workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). For more information about the elements of a workload, see [Anatomy of a workload]({{site.url}}{{site.baseurl}}/benchmark/user-guide/understanding-workloads/anatomy-of-a-workload/). For information about building a custom workload, see [Creating custom workloads]({{site.url}}{{site.baseurl}}/benchmark/creating-custom-workloads/).
- **Workload**: A collection of one or more benchmarking scenarios that use a specific document corpus to perform a benchmark against your cluster. The document corpus contains any indexes, data files, and operations invoked when the workload runs. You can list the available workloads by using `opensearch-benchmark list workloads` or view any included workloads in the [OpenSearch Benchmark Workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). For more information about the elements of a workload, see [Anatomy of a workload]({{site.url}}{{site.baseurl}}/benchmark/user-guide/understanding-workloads/anatomy-of-a-workload/). For information about building a custom workload, see [Creating custom workloads]({{site.url}}{{site.baseurl}}/benchmark/creating-custom-workloads/). A workload typically includes the following:
Contributor

I suggest breaking up this paragraph for ease of readability.


<img src="{{site.url}}{{site.baseurl}}/images/benchmark/b-latency-explanation-2.png" alt="">

OpenSearch Benchmark does not account for this and continues to try to achieve the `target-throughput` of 10 operations per second. Because of this, delays for each request begin to cascade, as illustrated in the following diagram.
Contributor

What does "this" refer to? I suggest adding the accompanying noun in both instances.

Collaborator

I believe it refers to the throughput limitation described in the preceding para and shown in the diagram, so we could include "limitation" after the first instance of "this" (but not the second one), but it's otherwise fine for me as written.
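
To illustrate the cascading delay discussed in this thread, here is a small simulation sketch. It keeps the `target-throughput` of 10 operations per second from the quoted text, but the 200 ms service time, the single client that sends one request at a time, and the `simulate` function itself are assumptions made only for illustration. They are not taken from OpenSearch Benchmark.

```python
def simulate(target_throughput, service_time_ms, num_requests):
    """Simulate one client that tries to keep a fixed request schedule.

    Assumption: the client sends one request at a time, so a request
    cannot be sent until the previous response has been received.
    """
    interval_ms = 1000 // target_throughput  # 10 ops/s -> one request per 100 ms
    client_free_at_ms = 0
    rows = []
    for i in range(num_requests):
        scheduled_ms = i * interval_ms
        # The request waits if the client is still busy with the previous one.
        sent_ms = max(scheduled_ms, client_free_at_ms)
        received_ms = sent_ms + service_time_ms
        client_free_at_ms = received_ms
        rows.append({
            "request": i,
            "wait_ms": sent_ms - scheduled_ms,
            "service_time_ms": received_ms - sent_ms,
            "latency_ms": received_ms - scheduled_ms,
        })
    return rows


# Hypothetical scenario: the cluster needs 200 ms per request, so it can
# sustain only 5 operations per second against a target of 10.
for row in simulate(target_throughput=10, service_time_ms=200, num_requests=5):
    print(row)
```

Under these assumptions, each request waits 100 ms longer than the one before it, so latency keeps growing even though service time stays at 200 ms.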

Naarcha-AWS and others added 3 commits April 16, 2024 13:55
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
@Naarcha-AWS Naarcha-AWS added 5 - Editorial review PR: Editorial review in progress and removed 4 - Doc review PR: Doc review in progress labels Apr 16, 2024
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
@natebower natebower added 4 - Doc review PR: Doc review in progress and removed 5 - Editorial review PR: Editorial review in progress labels Apr 17, 2024
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
@Naarcha-AWS Naarcha-AWS added 5 - Editorial review PR: Editorial review in progress and removed 4 - Doc review PR: Doc review in progress labels Apr 17, 2024
Collaborator

@natebower natebower left a comment

@Naarcha-AWS Please see my comments and changes and let me know if you have any questions. Thanks!

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
@Naarcha-AWS Naarcha-AWS merged commit 6b7dc8e into main Apr 18, 2024
6 checks passed
@Naarcha-AWS Naarcha-AWS deleted the benchmark-latency-throughput branch April 18, 2024 14:04
opensearch-trigger-bot bot pushed a commit that referenced this pull request Apr 18, 2024
* Create concepts section

Signed-off-by: Archer <[email protected]>

* Add Throughput and Latency concept page

Signed-off-by: Archer <[email protected]>

* Fix links

Signed-off-by: Archer <[email protected]>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <[email protected]>

* Move workload bullets

Signed-off-by: Naarcha-AWS <[email protected]>

* Update time-latency.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Add additional feedback

Signed-off-by: Archer <[email protected]>

* Fix links

Signed-off-by: Archer <[email protected]>

* Reorder sections, fix more links

Signed-off-by: Archer <[email protected]>

* Fix link. Fix reorder

Signed-off-by: Archer <[email protected]>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <[email protected]>

* Update concepts.md

Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>

---------

Signed-off-by: Archer <[email protected]>
Signed-off-by: Naarcha-AWS <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
(cherry picked from commit 6b7dc8e)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Labels
5 - Editorial review PR: Editorial review in progress backport 2.13 PR: Backport label for 2.13 benchmark
4 participants