This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

Commit de61ba7
[Frontend] Separate OpenAI Batch Runner usage from API Server (vllm-p…
wuisawesome authored and robertgshaw2-redhat committed May 19, 2024
1 parent 3426d29 commit de61ba7
Showing 2 changed files with 2 additions and 1 deletion.
vllm/entrypoints/openai/run_batch.py (2 changes: 1 addition & 1 deletion)
@@ -101,7 +101,7 @@ async def main(args):
 
     engine_args = AsyncEngineArgs.from_cli_args(args)
     engine = AsyncLLMEngine.from_engine_args(
-        engine_args, usage_context=UsageContext.OPENAI_API_SERVER)
+        engine_args, usage_context=UsageContext.OPENAI_BATCH_RUNNER)
 
     # When using single vLLM without engine_use_ray
     model_config = await engine.get_model_config()
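
For context, here is roughly how the batch runner builds its engine after this change. This is a minimal standalone sketch, not the actual run_batch.py: the import paths reflect vLLM's layout around this commit (vllm.engine.arg_utils, vllm.engine.async_llm_engine, vllm.usage.usage_lib), engine args are built directly instead of from CLI flags, and the model name is a placeholder.

    import asyncio

    from vllm.engine.arg_utils import AsyncEngineArgs
    from vllm.engine.async_llm_engine import AsyncLLMEngine
    from vllm.usage.usage_lib import UsageContext

    async def main():
        # Placeholder model; run_batch.py builds these args from CLI flags.
        engine_args = AsyncEngineArgs(model="facebook/opt-125m")
        # Tag the engine with the new batch-runner context so usage telemetry
        # distinguishes batch jobs from the OpenAI-compatible API server.
        engine = AsyncLLMEngine.from_engine_args(
            engine_args, usage_context=UsageContext.OPENAI_BATCH_RUNNER)
        model_config = await engine.get_model_config()
        print(model_config.model)

    if __name__ == "__main__":
        asyncio.run(main())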
vllm/usage/usage_lib.py (1 change: 1 addition)
@@ -90,6 +90,7 @@ class UsageContext(str, Enum):
     LLM_CLASS = "LLM_CLASS"
     API_SERVER = "API_SERVER"
     OPENAI_API_SERVER = "OPENAI_API_SERVER"
+    OPENAI_BATCH_RUNNER = "OPENAI_BATCH_RUNNER"
     ENGINE_CONTEXT = "ENGINE_CONTEXT"
 
 
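One detail worth noting: UsageContext mixes in str, so each member is itself a string and JSON-encodes directly, which keeps the usage-telemetry payload simple. A minimal standalone illustration (members copied from the hunk above; the real class may define additional members not shown in this diff):

    import json
    from enum import Enum

    class UsageContext(str, Enum):
        LLM_CLASS = "LLM_CLASS"
        API_SERVER = "API_SERVER"
        OPENAI_API_SERVER = "OPENAI_API_SERVER"
        OPENAI_BATCH_RUNNER = "OPENAI_BATCH_RUNNER"
        ENGINE_CONTEXT = "ENGINE_CONTEXT"

    # Because of the str mixin, a member serializes as its string value.
    print(json.dumps({"usage_context": UsageContext.OPENAI_BATCH_RUNNER}))
    # {"usage_context": "OPENAI_BATCH_RUNNER"}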
