[Frontend] Add max_tokens prometheus metric #9881
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
🚀
Force-pushed from f842335 to 9267cd0
@robertgshaw2-neuralmagic can you review this?
This PR adds reporting of `max_tokens` as a Prometheus metric (histogram): `vllm:request_params_max_tokens`. The number of generated tokens is already reported (`vllm:request_generation_tokens`), but that is the number of tokens actually generated, which is generally different from the number of tokens requested by the user via `max_tokens`.