Support for fixed number of requests #633

Merged
merged 17 commits into from
May 9, 2024

tgerdesnv (Collaborator) commented May 7, 2024

Adds a new CLI argument, --request-count, which, when specified, tells Perf Analyzer (PA) exactly how many requests to issue for the experiment.

It works with concurrency, request-rate, and custom load.
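To illustrate what "exactly how many requests" can mean when several workers send in parallel, here is a minimal sketch. It is not the code from this PR: RequestBudget, TryClaim, and Remaining are hypothetical names, and the shared atomic counter is only one plausible way to enforce a fixed total.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical helper (not Perf Analyzer code): a shared budget of requests
// that worker threads draw from until it is exhausted.
class RequestBudget {
 public:
  explicit RequestBudget(std::size_t total) : remaining_(total) {}

  // Try to claim one request. Returns false once the budget is spent.
  // The compare-exchange loop ensures two workers never claim the same slot.
  bool TryClaim() {
    std::size_t current = remaining_.load();
    while (current > 0) {
      // On failure, `current` is refreshed with the latest value and we retry.
      if (remaining_.compare_exchange_weak(current, current - 1)) {
        return true;
      }
    }
    return false;
  }

  // Number of requests that have not been claimed yet.
  std::size_t Remaining() const { return remaining_.load(); }

 private:
  std::atomic<std::size_t> remaining_;
};
```

In this sketch each worker would loop on `while (budget.TryClaim()) { /* send one request */ }`, so the total number of requests sent equals the budget regardless of how many workers share it.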

Also:

  • Combined thread_config classes into one, so that the base LoadWorker class has access to it
  • Cleaned up some printing

tgerdesnv changed the title from "Support for fixed num requests" to "Support for fixed number of requests" on May 7, 2024
tgerdesnv marked this pull request as ready for review on May 8, 2024 18:15
dyastremsky (Contributor) left a comment

Nice refactors, especially ThreadConfig. This looks splendid!

matthewkotila (Contributor) left a comment

Looks good overall

src/c++/perf_analyzer/request_rate_manager.cc (review thread resolved)
tgerdesnv merged commit c3cf131 into main on May 9, 2024
3 checks passed
tgerdesnv deleted the tgerdes-fixed-num-req branch on May 9, 2024 19:38
ganeshku1 pushed a commit that referenced this pull request May 11, 2024
* first pass. Hardcoded values

* Working for concurrency (hardcoded whenever count windows is used for now)

* working for req rate as well

* Add CLI. Add/fix unit tests

* Remove hack. Restore all normal functionality

* Refactor thread config into one class. Add more testing

* Rename arg to request-count

* Fix request rate bug

* Update info print

* fix corner case

* move fixme to a story tag

* add assert to avoid corner case

* rename variables

* self review #1

* copyright changes

* add doxygen to functions

* Don't allow sweeping over multiple concurrency or request rate with request-count
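
The last commit message above rules out combining request-count with a sweep over several concurrency or request-rate values. A minimal sketch of such a check follows. It is an illustration only: SweepRange and CheckRequestCount are hypothetical names, and treating a request count of 0 as "option not set" is an assumption, not something the PR states.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a parsed start:end range option.
// A range with start == end describes a single value, not a sweep.
struct SweepRange {
  double start;
  double end;
};

// Returns an empty string when the combination is allowed, otherwise an
// error message. Assumption: request_count == 0 means the option is unset.
std::string CheckRequestCount(std::size_t request_count,
                              const SweepRange& concurrency,
                              const SweepRange& request_rate) {
  if (request_count == 0) {
    return "";  // no fixed count requested, so sweeping is unrestricted
  }
  if (concurrency.start != concurrency.end) {
    return "request count cannot be combined with a concurrency sweep";
  }
  if (request_rate.start != request_rate.end) {
    return "request count cannot be combined with a request-rate sweep";
  }
  return "";
}
```

A caller would run this once after argument parsing and abort with the returned message if it is non-empty.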
ganeshku1 pushed a commit that referenced this pull request May 11, 2024
* Fix empty response bug

* Fix unused variable

Fix test

Initialize logger to capture logs

Add unit test

Change to _ instead of removing

Check if args.model is not None

fix artifact path

Support Python 3.8 in GenAI-Perf (#643)

Add automation to run unit tests and check code coverage for GenAI-Perf against Python 3.10 (#640)

Changes to support Ensemble Top Level Response Caching (#560)

Support for fixed number of requests (#633)


fix test (#637)

Support custom artifacts directory and improve default artifacts directory (#636)

* Add artifacts dir option and more descriptive profile export filename

* Clean up

* fix input data path

* Add tests

* create one to one plot dir for each profile run

* change the directory look

* add helper method

Extend genai perf plots to compare across multiple runs (#635)

* Modify PlotManager and plots classes

* Support plots for multiple runs -draft

* Fix default plot visualization

* Remove artifact

* Set default compare directory

* Support generating parquet files

* Remove annotations and fix heatmap

* Fix errors

* Fix pre-commit

* Fix CodeQL warning

* Remove unused comments

* remove x axis tick label for boxplot

* Add logging and label for heatmap subplots

* Allow users to adjust width and height

* fix grammer

---------

Co-authored-by: Hyunjae Woo <[email protected]>

Generate plot configurations for plot manager (#632)

* Introduce PlotConfig and PlotConfigParser class

* Port preprocessing steps and introduce ProfileRunData

* Create plot configs for default plots

* fix minor bug

* Fix comment

* Implement parse method in PlotConfigParser

* refactor

* fix test

* Add test

* Address feedback

* Handle custom endpoint

Add more metadata to profile export JSON file (#627)

* Add more metadata to profile export data

* Fix minor bug

* refactor

Add compare subcommand (#623)

* Move for better visibility

* Add compare subparser

* Add subcommand compare

* Fix test

* Add ticket

* add --files option and minor fix

* Fix tests

* Add unit tests

* Address feedback

* Fix minor error and add section header

Revert "Changes to support Ensemble Top Level Response Caching (#560) (#642)"

This reverts commit cc6a3b2.

Changes to support Ensemble Top Level Response Caching (#560) (#642)
mc-nv pushed a commit that referenced this pull request May 13, 2024
3 participants