When running vLLM with the OpenAI chat APIs, the benchmarking script fails because the backend request function asserts that the API URL ends with `v1/completions`:

```python
assert api_url.endswith("v1/completions")
```

Command to reproduce:

```bash
python benchmark_serving.py --backend openai --model mistralai/Mistral-7B-v0.1 --dataset ShareGPT_V3_unfiltered_cleaned_split.json --save-result
```
The logs are as follows:
```
Namespace(backend='openai', version='N/A', base_url=None, host='localhost', port=8000, endpoint='/generate', dataset='ShareGPT_V3_unfiltered_cleaned_split.json', model='mistralai/Mistral-7B-v0.1', tokenizer=None, best_of=1, use_beam_search=False, num_prompts=1000, request_rate=inf, seed=0, trust_remote_code=False, disable_tqdm=False, save_result=True)
Traffic request rate: inf
  0%|          | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/chenw/vllm/benchmarks/benchmark_serving.py", line 387, in <module>
    main(args)
  File "/home/chenw/vllm/benchmarks/benchmark_serving.py", line 259, in main
    benchmark_result = asyncio.run(
  File "/home/chenw/miniconda3/envs/myenv/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/chenw/miniconda3/envs/myenv/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/chenw/vllm/benchmarks/benchmark_serving.py", line 195, in benchmark
    outputs = await asyncio.gather(*tasks)
  File "/home/chenw/vllm/benchmarks/backend_request_func.py", line 223, in async_request_openai_completions
    assert api_url.endswith("v1/completions")
AssertionError
  0%|          | 0/1000 [00:00<?, ?it/s]
```
`backend_request_func.py` should not be limited to the completions endpoint; it should also accept the chat API, i.e. URLs matching:

```python
assert api_url.endswith("v1/chat/completions")
```
fix issue vllm-project#2940 (commit `3925602`): keep the `openai` backend URL as `/v1/completions` and add an `openai-chat` backend whose URL is `/v1/chat/completions` (plus yapf formatting and a trailing newline).
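With that fix, the benchmark can presumably be pointed at a chat-serving deployment along these lines; the `openai-chat` backend value comes from the commit message above, and `--endpoint` already exists in the script per the `Namespace` dump in the log:

```bash
python benchmark_serving.py --backend openai-chat \
    --endpoint /v1/chat/completions \
    --model mistralai/Mistral-7B-v0.1 \
    --dataset ShareGPT_V3_unfiltered_cleaned_split.json \
    --save-result
```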