[Misc] Make benchmarks use EngineArgs #9529
Conversation
Update the benchmark scripts to directly use the CLI arguments provided by EngineArgs instead of duplicating a subset of these arguments in each benchmark script.

Currently the CLI arguments are duplicated, forcing changes to be made in multiple locations and resulting in some useful vLLM options not being exposed in the scripts. For example, the --num-scheduler-steps option is currently available in benchmark_throughput.py but not benchmark_latency.py, making it difficult to understand the latency impact of this option. As another example, the benchmark_prioritization.py script appears to be broken currently because it was not updated to expose the --scheduling-policy option, which is required to enable priority scheduling.

These maintenance challenges are eliminated by using EngineArgs.add_cli_args to add support for all engine arguments directly, and then passing these options to the engine initialization.

One minor change in behavior is that when benchmark_throughput.py runs in async mode it no longer includes hard-coded settings for worker_use_ray=False (which is deprecated anyway) and disable_log_requests=True (the user now has the option to pass --disable-log-requests on the command line). Similarly, benchmark_prefix_caching.py no longer has hard-coded values for trust_remote_code=True and enforce_eager=True, but these may now be passed on the command line.
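For illustration, the pattern described above boils down to roughly the following minimal sketch. This is not the exact benchmark code: the script name, the --input-len/--output-len flags, and the prompt are placeholders added for the example.

```python
import dataclasses

from vllm import LLM, SamplingParams
from vllm.engine.arg_utils import EngineArgs
from vllm.utils import FlexibleArgumentParser

# Script-specific options live alongside every engine option that
# EngineArgs exposes (e.g. --num-scheduler-steps, --scheduling-policy,
# --seed); the async scripts would use AsyncEngineArgs instead.
parser = FlexibleArgumentParser(description="Minimal latency-style benchmark sketch")
parser.add_argument("--input-len", type=int, default=32)
parser.add_argument("--output-len", type=int, default=128)
parser = EngineArgs.add_cli_args(parser)
args = parser.parse_args()

# Build the engine from the full set of parsed engine arguments instead
# of re-declaring a hand-picked subset in the benchmark script.
engine_args = EngineArgs.from_cli_args(args)
llm = LLM(**dataclasses.asdict(engine_args))

# Fixed-length generation, as the benchmarks typically request.
sampling_params = SamplingParams(max_tokens=args.output_len, ignore_eos=True)
outputs = llm.generate(["Hello, my name is"], sampling_params)
print(outputs[0].outputs[0].text)
```

A hypothetical invocation would then be `python benchmark_sketch.py --model facebook/opt-125m --num-scheduler-steps 8 --output-len 256`, with no engine option needing to be re-declared inside the script.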
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
LGTM. It's pretty clean!
@@ -190,9 +182,7 @@ def main(args):
                        default='128:256',
                        help='Range of input lengths for sampling prompts,'
                        'specified as "min:max" (e.g., "128:256").')
    parser.add_argument("--seed",
Do we have "seed" in the engine arg as well?
Also cc @KuntaiDu
Ahh I've meant to get to this refactor for a while, thank you!
Signed-off-by: charlifu <[email protected]>
Signed-off-by: Alvant <[email protected]>
Signed-off-by: Erkin Sagiroglu <[email protected]>
Signed-off-by: Amit Garg <[email protected]>
Signed-off-by: qishuai <[email protected]>
Signed-off-by: NickLucche <[email protected]>
Signed-off-by: Sumit Dubey <[email protected]>