Benchmark: add H100 suite #6047

simon-mo · 2024-07-01T20:55:31Z

I have recently added an H100 agent which will be online for 12 hours per day. Let's test it out.

Successful build: https://buildkite.com/vllm/performance-benchmark/builds/4493

robertgshaw2-neuralmagic · 2024-07-02T02:02:21Z

Can I use this for some FP8 tests, especially Mixtral?

simon-mo · 2024-07-09T16:13:23Z

@KuntaiDu can you please review this? I think I got it working (see the link in the description) by adding a bunch of cleanup to the shell script.

KuntaiDu

LGTM. I previously killed the vLLM process via pkill pt_main_thread, but I guess pkill -9 -f python3 also works.

KuntaiDu · 2024-07-09T21:16:06Z

.buildkite/nightly-benchmarks/run-benchmarks-suite.sh

-  /workspace/buildkite-agent artifact upload "$RESULTS_FOLDER/*"
+
+  # Use the determined command to annotate and upload artifacts
+  $BUILDKITE_AGENT_COMMAND annotate --style "info" --context "benchmark-results" < $RESULTS_FOLDER/benchmark_results.md


The A100 benchmark uses the same context (benchmark-results), so one run's annotation will overwrite the other's. Maybe add the --append parameter, or annotate in a different context (like "${gpu_name}-benchmark-results").
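The diff above swaps the hard-coded /workspace/buildkite-agent for $BUILDKITE_AGENT_COMMAND, but the thread does not show how that variable is set. A minimal sketch of one way to do it follows. It assumes the agent is either on PATH or at the old /workspace path. The function name locate_buildkite_agent is hypothetical and not taken from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the buildkite-agent command to use, or fail.
# The fallback defaults to the hard-coded path removed in the diff.
locate_buildkite_agent() {
  local fallback="${1:-/workspace/buildkite-agent}"
  if command -v buildkite-agent >/dev/null 2>&1; then
    # The agent is on PATH, so the bare command name is enough.
    echo "buildkite-agent"
  elif [ -x "$fallback" ]; then
    # Otherwise use the executable at the fallback location.
    echo "$fallback"
  else
    return 1
  fi
}

if BUILDKITE_AGENT_COMMAND=$(locate_buildkite_agent); then
  echo "using: $BUILDKITE_AGENT_COMMAND"
else
  echo "buildkite-agent not found; skipping annotation and upload"
fi
```

Skipping the upload when no agent is found, rather than failing, would let the same script run outside CI. Whether that is the desired behavior here is an assumption.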

KuntaiDu · 2024-07-09T21:17:09Z

.buildkite/nightly-benchmarks/run-benchmarks-suite.sh

+  # since we are in container anyway
+  pkill -9 -f python
+  pkill -9 -f python3
+


I tried pkill pt_main_thread, but I guess pkill -9 -f python3 also works.
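The cleanup discussed here can be sketched as a small function. This is an illustration, not the PR's code: the function name cleanup_python and its dry-run argument are hypothetical. Passing echo as the first argument prints the commands instead of running them, which is useful outside a disposable container.

```shell
#!/usr/bin/env bash
# Hypothetical cleanup helper. With no argument it force-kills every
# process whose command line matches "python" or "python3".
# With "echo" as the argument it only prints the commands (dry run).
cleanup_python() {
  local runner="${1:-}"
  local pattern
  for pattern in python python3; do
    # pkill exits with status 1 when nothing matched; ignore that so a
    # script running under `set -e` does not abort here.
    $runner pkill -9 -f "$pattern" || true
  done
}

# Dry run: show what would be executed.
cleanup_python echo
```

Because -f matches the pattern anywhere in the full command line, the python pattern already covers python3 processes, so the second call is redundant but harmless. It also matches any other process with "python" in its command line, which is only acceptable because the script runs inside a container.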

KuntaiDu

Maybe change the annotation part

.buildkite/nightly-benchmarks/run-benchmarks-suite.sh

KuntaiDu · 2024-07-10T02:14:00Z

.buildkite/nightly-benchmarks/run-benchmarks-suite.sh

-  /workspace/buildkite-agent artifact upload "$RESULTS_FOLDER/*"
+
+  # Use the determined command to annotate and upload artifacts
+  $BUILDKITE_AGENT_COMMAND annotate --style "info" --context "h100-benchmark-results" < $RESULTS_FOLDER/benchmark_results.md


--context "${gpu_type}-benchmark-results"
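The per-GPU context suggested above can be derived from a GPU name so that the A100 and H100 runs never share an annotation. The sketch below is hedged: annotation_context is a hypothetical helper, and how the PR actually populates gpu_type is not shown in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a per-GPU annotation context from a GPU name.
annotation_context() {
  local gpu_name="$1"
  local slug
  # Lowercase the name, then replace every character that is not a
  # lowercase letter or digit with "-".
  slug=$(printf '%s' "$gpu_name" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9' '-')
  printf '%s-benchmark-results\n' "$slug"
}

annotation_context "H100"   # prints h100-benchmark-results
annotation_context "A100"   # prints a100-benchmark-results

# Intended use in the benchmark script (commented out: needs a Buildkite job):
# "$BUILDKITE_AGENT_COMMAND" annotate --style "info" \
#   --context "$(annotation_context "$gpu_type")" \
#   < "$RESULTS_FOLDER/benchmark_results.md"
```

The alternative mentioned earlier in the review, keeping one shared context and passing --append, would put both GPUs' results in a single annotation instead of two.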

comaniac

LGTM. We could add FP8 cases in a follow-up PR.

cadedaniel · 2024-07-11T22:27:57Z

this is awesome, thanks for adding

Signed-off-by: Alvant <[email protected]>

Benchmark: add H100 suite

9708b63

simon-mo added the perf-benchmarks label Jul 1, 2024

simon-mo added 2 commits July 1, 2024 16:56

try propogate uid-gid

9eee134

try propogate uid-gid

626871d

simon-mo added 10 commits July 1, 2024 20:29

try fix device_count

431d68e

apply kaichao's patch (to be revertted)

7e85042

better cleanup?

1bb0ae6

better cleanupx2

a4c7b22

Merge branch 'main' of github.com:vllm-project/vllm into h100-bench

e822040

revert

61cd79e

revert

92e74cf

Merge branch 'main' of github.com:vllm-project/vllm into h100-bench

b36115f

properly locate buildkite-agent

022cbe2

add back a100

002dcef

simon-mo requested a review from KuntaiDu July 9, 2024 16:12

KuntaiDu approved these changes Jul 9, 2024

KuntaiDu self-requested a review July 9, 2024 21:13

KuntaiDu reviewed Jul 9, 2024

KuntaiDu requested changes Jul 9, 2024

simon-mo commented Jul 9, 2024

.buildkite/nightly-benchmarks/run-benchmarks-suite.sh (outdated, resolved)

Update .buildkite/nightly-benchmarks/run-benchmarks-suite.sh

329b4f5

KuntaiDu reviewed Jul 10, 2024

fix annotation lable

d1d8443

comaniac approved these changes Jul 11, 2024

simon-mo merged commit 52b7fcb into vllm-project:main Jul 11, 2024
71 checks passed

dtrifiro pushed a commit to opendatahub-io/vllm that referenced this pull request Jul 17, 2024

Benchmark: add H100 suite (vllm-project#6047)

b06d137

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024

Benchmark: add H100 suite (vllm-project#6047)

3bf5f47

Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024

Benchmark: add H100 suite (vllm-project#6047)

de6888b

Signed-off-by: Alvant <[email protected]>
