[Frontend] Add --logprobs argument to benchmark_serving.py #8191

Merged
7 changes: 2 additions & 5 deletions benchmarks/benchmark_serving.py
@@ -21,10 +21,6 @@
 when using tgi backend, add
     --endpoint /generate_stream
 to the end of the command above.
-
-Use --logprobs <num logprobs> to specify the number of logprobs-per-token
-to return as part of the request (or leave the argument unspecified
-to default to 1 logprob-per-token).
 """
 import argparse
 import asyncio
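The paragraph removed above duplicated the argparse help text updated below; the flag itself is unchanged. For illustration, an invocation might look like this (backend, model, and dataset values are placeholders, not taken from this PR):

    python benchmarks/benchmark_serving.py \
        --backend vllm \
        --model meta-llama/Llama-2-7b-hf \
        --dataset-name sharegpt \
        --dataset-path ./ShareGPT_V3_unfiltered_cleaned_split.json \
        --logprobs 5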
@@ -735,7 +731,8 @@ def main(args: argparse.Namespace):
         "--logprobs",
         type=int,
         default=None,
-        help="Number of logprobs-per-token to return as part of the request.",
+        help=("Number of logprobs-per-token to return as part of the request "
+              "(default 1)"),
     )
     parser.add_argument(
         "--sonnet-prefix-len",
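Note the interplay between `default=None` and the "(default 1)" wording: the script does not substitute 1 itself; leaving the value unset lets the serving endpoint fall back to one logprob per token. A minimal sketch of that pattern (hypothetical helper, not code from this PR):

    # Hypothetical sketch: forward `logprobs` only when the user set it;
    # omitting the key lets the server apply its own default of one
    # logprob per token.
    from typing import Optional

    def build_payload(prompt: str, logprobs: Optional[int]) -> dict:
        payload = {"prompt": prompt, "max_tokens": 128, "stream": True}
        if logprobs is not None:
            payload["logprobs"] = logprobs
        return payload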
2 changes: 1 addition & 1 deletion tests/multi_step/test_correctness_llm.py
@@ -57,7 +57,7 @@ def test_multi_step_llm(
                       GPU -> CPU output transfer
         num_prompts: number of example prompts under test
         num_logprobs: corresponds to the `logprobs` argument to the OpenAI
-                      completions endpoint; `None` -> no logprobs
+                      completions endpoint; `None` -> 1 logprob returned.
     """

     prompts = example_prompts
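For readers without the test file open, a condensed sketch of what `test_multi_step_llm` verifies: single-step and multi-step scheduling should agree on sampled tokens (and, when requested, logprobs). The model name and parameters below are illustrative, not the test's actual fixtures:

    # Illustrative sketch of the multi-step correctness check.
    from vllm import LLM, SamplingParams

    prompts = ["Hello, my name is", "The capital of France is"]
    # logprobs=5 requests 5 logprobs per token; logprobs=None still
    # yields the sampled token's logprob (the "1 logprob" default).
    params = SamplingParams(temperature=0.0, max_tokens=32, logprobs=5)

    baseline = LLM(model="JackFram/llama-160m").generate(prompts, params)
    multi = LLM(model="JackFram/llama-160m",
                num_scheduler_steps=8).generate(prompts, params)

    for ref, out in zip(baseline, multi):
        assert ref.outputs[0].token_ids == out.outputs[0].token_ids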