Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add compare subcommand #623

Merged
merged 10 commits into from
May 2, 2024
Merged

Conversation

nv-hwoo
Copy link
Contributor

@nv-hwoo nv-hwoo commented May 1, 2024

Add compare subcommand that allows users to generate plots to compare multiple runs.

Users can provide paths to profile export files using --files CLI option as following

genai-perf compare --files [file, ...]

which will generate a set of plots using these runs as well as pre-filled, default yaml configuration file that users can edit to tweak the plotting parameters.

Users can also directly pass the yaml config file to generate customized plots detailed in the yaml file as following

genai-perf compare --config [YAML file]

General workflow of user can be:

  • first run genai-perf compare --files [file, ...] that gives plots as well as the initial yaml config file
  • repeatedly run genai-perf compare --config [YAML file] by tweaking plotting parameters such as title, x and y labels, x and y metrics, and etc.

Copy link
Contributor

@dyastremsky dyastremsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start!

Would you be able to look into what the most Pythonic way of doing this is? It seems to be that having some functionality be under a subcommand and some functionality would not be the generally suggested approach. I'd also want to see if this messes up the help menu.

I think we'd want to start introducing subcommands under GenAi-Perf, but that discussion is more on the design side of things.

@@ -210,9 +210,33 @@ def test_load_level_mutually_exclusive(self, monkeypatch, capsys):
captured = capsys.readouterr()
assert expected_output in captured.err

def test_compare_mutually_exclusive(self, monkeypatch, capsys):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be able to add sections headers for regular vs compare subcommand, if we're going that route?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry what is sections headers?

Copy link
Contributor

@dyastremsky dyastremsky May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for asking for clarification! By a header, I mean something like a multi-line comment at the start of each test header. Anything that can help visually separate the tests. Something like this: https://github.com/triton-inference-server/triton_cli/blob/58ba74c4156b24541fd1fd6f462fe17f019d48d9/src/triton_cli/parser.py#L101-L103

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added section header.

This comment was marked as outdated.

### Handlers ###


def handler(args, extra_args):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def handler(args, extra_args):
def profile_handler(args, extra_args):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to add any more work for you, but if you do make more changes, it would be nice to make this handle_profile to be consistent with Triton CLI and make any integration with it more straightforward.

If we do not do it as part of this ticket, the next person to add handlers should move to this approach to set the standard IMO.

@@ -280,7 +287,7 @@ def _add_endpoint_args(parser):
"-m",
"--model",
type=str,
required=True,
default=None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would break the profile workflow if you did not add the check_model_args method?
Am I understanding that correctly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to make sure we did not break any of the unit testing. Is this currently passing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mainly to make compare subcommand work. If this is required, then we cannot do genai-perf compare ... because we are missing -m option (we will need to call it like genai-perf -m <model> compare ...).

The model name is required because we need model names to prep for PA run and that is being done under _check_model_args.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this leads naturally to making profile a sub command.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you going to do that as a separate ticket?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean adding a profile subcommand?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. It sounds like that is an active discussion, but we're going to run into these types of issues if we don't make each different use of GenAI-Perf a subcommand. I think if we're going to start coding things to work around or duplicate what our libraries do by default (e.g. the required check), it usually means we're doing something wrong design-wise.

Hyunjae, if we do end up making a separate sub-command and you are the person to implement it, please make sure to create a ticket for that extra work so that your time is being tracked. You should also update the relevant design document, since it sounds like this was out of scope for it.

"""
if args.subcommand == "compare":
if not args.config and not args.files:
parser.error("Either --config or --files option must be specified.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
parser.error("Either --config or --files option must be specified.")
parser.error("Either the --config or --files option must be specified when comparing.")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. The error format already specifies that we are in compare so I don't think we need to highlight "when comparing" again.

Example:

$ genai-perf compare
usage: genai-perf compare [-h] [--config CONFIG | -f FILES [FILES ...]]
genai-perf compare: error: Either the --config or --files option must be specified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this approach, though it looks like "the" was not pushed.

@@ -431,6 +438,46 @@ def get_extra_inputs_as_dict(args: argparse.Namespace) -> dict:
return request_inputs


def _parse_compare_args(subparsers) -> argparse.ArgumentParser:
compare = subparsers.add_parser(
"compare", help="Generate plots that compare multiple runs."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"compare", help="Generate plots that compare multiple runs."
"compare", help="Generate plots that compare multiple profile runs."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

This comment was marked as outdated.

@nv-hwoo
Copy link
Contributor Author

nv-hwoo commented May 1, 2024

Would you be able to look into what the most Pythonic way of doing this is? It seems to be that having some functionality be under a subcommand and some functionality would not be the generally suggested approach. I'd also want to see if this messes up the help menu.

Great point. Yeah I am not really satisfied on how the current command/subcommand is structured, and we should definitely look into this (and I think @debermudez has already started the conversation on this). But I think that effort will require more effort than this current PR. I have TMA-1900 ticket to refactor CLI for this.

Copy link
Contributor

@dyastremsky dyastremsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the feedback!

"""
if args.subcommand == "compare":
if not args.config and not args.files:
parser.error("Either --config or --files option must be specified.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this approach, though it looks like "the" was not pushed.

@@ -280,7 +287,7 @@ def _add_endpoint_args(parser):
"-m",
"--model",
type=str,
required=True,
default=None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. It sounds like that is an active discussion, but we're going to run into these types of issues if we don't make each different use of GenAI-Perf a subcommand. I think if we're going to start coding things to work around or duplicate what our libraries do by default (e.g. the required check), it usually means we're doing something wrong design-wise.

Hyunjae, if we do end up making a separate sub-command and you are the person to implement it, please make sure to create a ticket for that extra work so that your time is being tracked. You should also update the relevant design document, since it sounds like this was out of scope for it.

@@ -431,6 +438,46 @@ def get_extra_inputs_as_dict(args: argparse.Namespace) -> dict:
return request_inputs


def _parse_compare_args(subparsers) -> argparse.ArgumentParser:
compare = subparsers.add_parser(
"compare", help="Generate plots that compare multiple runs."

This comment was marked as outdated.

### Handlers ###


def handler(args, extra_args):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to add any more work for you, but if you do make more changes, it would be nice to make this handle_profile to be consistent with Triton CLI and make any integration with it more straightforward.

If we do not do it as part of this ticket, the next person to add handlers should move to this approach to set the standard IMO.

@@ -210,9 +210,33 @@ def test_load_level_mutually_exclusive(self, monkeypatch, capsys):
captured = capsys.readouterr()
assert expected_output in captured.err

def test_compare_mutually_exclusive(self, monkeypatch, capsys):

This comment was marked as outdated.

Copy link
Contributor

@dyastremsky dyastremsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very clean code. Great work!

@nv-hwoo nv-hwoo merged commit 02fafb1 into compare-subcommand May 2, 2024
3 checks passed
@nv-hwoo nv-hwoo deleted the hwoo-add-compare-subcommand branch May 2, 2024 17:36
nv-hwoo added a commit that referenced this pull request May 2, 2024
* Move for better visibility

* Add compare subparser

* Add subcommand compare

* Fix test

* Add ticket

* add --files option and minor fix

* Fix tests

* Add unit tests

* Address feedback

* Fix minor error and add section header
nv-hwoo added a commit that referenced this pull request May 7, 2024
* Move for better visibility

* Add compare subparser

* Add subcommand compare

* Fix test

* Add ticket

* add --files option and minor fix

* Fix tests

* Add unit tests

* Address feedback

* Fix minor error and add section header
nv-hwoo added a commit that referenced this pull request May 8, 2024
* Move for better visibility

* Add compare subparser

* Add subcommand compare

* Fix test

* Add ticket

* add --files option and minor fix

* Fix tests

* Add unit tests

* Address feedback

* Fix minor error and add section header
nv-hwoo added a commit that referenced this pull request May 8, 2024
* Move for better visibility

* Add compare subparser

* Add subcommand compare

* Fix test

* Add ticket

* add --files option and minor fix

* Fix tests

* Add unit tests

* Address feedback

* Fix minor error and add section header
ganeshku1 pushed a commit that referenced this pull request May 11, 2024
* Move for better visibility

* Add compare subparser

* Add subcommand compare

* Fix test

* Add ticket

* add --files option and minor fix

* Fix tests

* Add unit tests

* Address feedback

* Fix minor error and add section header
ganeshku1 pushed a commit that referenced this pull request May 11, 2024
* Fix empty response bug

* Fix unused variable

Fix test

Initialize logger to capture logs

Add unit test

Change to _ instead of removing

Check if args.model is not None

fix artifact path

Support Python 3.8 in GenAI-Perf (#643)

Add automation to run unit tests and check code coverage for GenAI-Perf against Python 3.10 (#640)

Changes to support Ensemble Top Level Response Caching (#560)

Support for fixed number of requests (#633)

* first pass. Hardcoded values

* Working for concurrency (hardcoded whenever count windows is used for now)

* working for req rate as well

* Add CLI. Add/fix unit tests

* Remove hack. Restore all normal functionality

* Refactor thread config into one class. Add more testing

* Rename arg to request-count

* Fix request rate bug

* Update info print

* fix corner case

* move fixme to a story tag

* add assert to avoid corner case

* rename variables

* self review #1

* copyright changes

* add doxygen to functions

* Don't allow sweeping over multiple concurrency or request rate with request-count

fix test (#637)

Support custom artifacts directory and improve default artifacts directory (#636)

* Add artifacts dir option and more descriptive profile export filename

* Clean up

* fix input data path

* Add tests

* create one to one plot dir for each profile run

* change the directory look

* add helper method

Extend genai perf plots to compare across multiple runs (#635)

* Modify PlotManager and plots classes

* Support plots for multiple runs -draft

* Fix default plot visualization

* Remove artifact

* Set default compare directory

* Support generating parquet files

* Remove annotations and fix heatmap

* Fix errors

* Fix pre-commit

* Fix CodeQL warning

* Remove unused comments

* remove x axis tick label for boxplot

* Add logging and label for heatmap subplots

* Allow users to adjust width and height

* fix grammer

---------

Co-authored-by: Hyunjae Woo <[email protected]>

Generate plot configurations for plot manager (#632)

* Introduce PlotConfig and PlotConfigParser class

* Port preprocessing steps and introduce ProfileRunData

* Create plot configs for default plots

* fix minor bug

* Fix comment

* Implement parse method in PlotConfigParser

* refactor

* fix test

* Add test

* Address feedback

* Handle custom endpoint

Add more metadata to profile export JSON file (#627)

* Add more metadata to profile export data

* Fix minor bug

* refactor

Add compare subcommand (#623)

* Move for better visibility

* Add compare subparser

* Add subcommand compare

* Fix test

* Add ticket

* add --files option and minor fix

* Fix tests

* Add unit tests

* Address feedback

* Fix minor error and add section header

Revert "Changes to support Ensemble Top Level Response Caching (#560) (#642)"

This reverts commit cc6a3b2.

Changes to support Ensemble Top Level Response Caching (#560) (#642)
mc-nv pushed a commit that referenced this pull request May 13, 2024
* Move for better visibility

* Add compare subparser

* Add subcommand compare

* Fix test

* Add ticket

* add --files option and minor fix

* Fix tests

* Add unit tests

* Address feedback

* Fix minor error and add section header
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants