Fix --number-of-docs bug in create-workload #659

Merged (6 commits) on Oct 3, 2024

IanHoang
Collaborator

Description

create-workload currently has a bug when users specify the number of documents to extract from an index with --number-of-docs. This PR fixes that bug and adds error handling and warnings to improve the user experience.

Issues Resolved

#658

Testing

  • New functionality includes testing

Tested with these scenarios:

  • Running create-workload with two or more --number-of-docs entries
  • A --number-of-docs entry with zero or fewer documents
  • A --number-of-docs entry with fewer than 1,000 documents
  • A --number-of-docs entry for an index that does not exist in the cluster

Example run with two --number-of-docs entries:

(.venv) hoangia@80a9971b1103 opensearch-benchmark % opensearch-benchmark create-workload --target-hosts=XXXXXX --client-options=basic_auth_user:'XXXXXX',basic_auth_password:'XXXXXX' --indices=movies-1000,movies-2000,nyc_taxis  --output-path=~/Desktop/ --workload=test-workload --number-of-docs="movies-2000:1500 nyc_taxis:1500"

   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
    /_/

[INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
[INFO] Connected to OpenSearch cluster [69622e766ec7eb17f038aed664796847] version [2.5.0].

A workload already exists at /Users/hoangia/Desktop/test-workload. Would you like to remove it? (y/n): y
[INFO] Removing workload of the same name.
Extracting documents for index [movies-1000] for test mode... 1000/1000 docs [100.0% done]
Extracting documents for index [movies-1000]...               1000/1000 docs [100.0% done]
Extracting documents for index [movies-2000] for test mode... 1000/1000 docs [100.0% done]
Extracting documents for index [movies-2000]...               1500/1500 docs [100.0% done]
Extracting documents for index [nyc_taxis] for test mode...   1000/1000 docs [100.0% done]
Extracting documents for index [nyc_taxis]...                 1500/1500 docs [100.0% done]

[INFO] Workload test-workload has been created. Run it with: opensearch-benchmark --workload-path=/Users/hoangia/Desktop/test-workload

-------------------------------
[INFO] SUCCESS (took 4 seconds)
-------------------------------

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Ian Hoang <[email protected]>
Resolved review threads (now outdated): 4 on osbenchmark/workload_generator/helpers.py, 1 on osbenchmark/workload_generator/extractors.py
Comment on lines 128 to 132

# If values contains spaces, user provided 2+ key value pairs
kv_pairs = values[0].split(" ")

for kv in kv_pairs:
Collaborator

Is this the case where the user provides multiple entries in a single argument within shell quotes?

--number-of-docs "x:3 y:4 z:8"

In that case, nargs should be 1, not +.
To handle both cases with +, this code should be activated only when len(values) is 1. In other cases, this should fall back to the earlier mode of processing.

Collaborator Author

Got it. I will implement it so that it handles both use cases.
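
For readers following the thread, here is a minimal sketch of how both invocation styles from the review comment could be handled with nargs="+". It is an illustration under stated assumptions, not the code merged in this PR: the class name KeyValueAction, the helper parse_number_of_docs, and the choice to convert counts to int are all hypothetical.

```python
import argparse


class KeyValueAction(argparse.Action):
    """Hypothetical action that collects <index>:<count> pairs into a dict."""

    def __call__(self, parser, namespace, values, option_string=None):
        # With nargs="+", argparse always passes a list of strings.
        if len(values) == 1:
            # One shell-quoted argument such as "x:3 y:4 z:8":
            # split it on whitespace to recover the individual pairs.
            kv_pairs = values[0].split()
        else:
            # Several separate arguments such as x:3 y:4 z:8:
            # each list element is already one pair.
            kv_pairs = values

        result = {}
        for kv in kv_pairs:
            key, _, value = kv.partition(":")
            try:
                if not key:
                    raise ValueError("missing index name")
                result[key] = int(value)
            except ValueError:
                # parser.error prints a usage message and exits.
                parser.error(f"{option_string}: expected <index>:<count>, got {kv!r}")
        setattr(namespace, self.dest, result)


def parse_number_of_docs(argv):
    """Parse only --number-of-docs from argv (illustrative helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--number-of-docs", nargs="+",
                        action=KeyValueAction, default={})
    return parser.parse_args(argv).number_of_docs
```

With this sketch, `--number-of-docs "movies-2000:1500 nyc_taxis:1500"` and `--number-of-docs movies-2000:1500 nyc_taxis:1500` both produce the same dictionary, and a malformed pair stops parsing with a usage error.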

Signed-off-by: Ian Hoang <[email protected]>
@IanHoang IanHoang requested a review from gkamat October 2, 2024 17:10
@IanHoang IanHoang merged commit a967bfd into opensearch-project:main Oct 3, 2024
10 checks passed
gkamat pushed a commit to gkamat/opensearch-benchmark that referenced this pull request Oct 3, 2024