[V1]Enable APC by default only for text models (vllm-project#10148)
Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: Loc Huynh <[email protected]>
ywang96 authored and JC1DA committed Nov 11, 2024
1 parent 456b16e commit 81aaaf9
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion vllm/v1/engine/llm_engine.py
@@ -65,7 +65,10 @@ def __init__(
         elif usage_context == UsageContext.OPENAI_API_SERVER:
             scheduler_config.max_num_seqs = 1024
             scheduler_config.max_num_batched_tokens = 2048
-        cache_config.enable_prefix_caching = True
+
+        # TODO (ywang96): Enable APC by default when VLM supports it.
+        if not model_config.is_multimodal_model:
+            cache_config.enable_prefix_caching = True
 
         logger.info(
             "Initializing an LLM engine (v%s) with config: "
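
For readers skimming the change, the gating it introduces can be sketched in isolation. This is a minimal illustration, not the engine code: `ModelConfig`, `CacheConfig`, and `apply_apc_default` below are hypothetical stand-ins that model only the two attributes the diff touches (`is_multimodal_model` and `enable_prefix_caching`), and the sketch assumes prefix caching starts out disabled.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Hypothetical stand-in: only the flag read by the diff is modeled.
    is_multimodal_model: bool = False


@dataclass
class CacheConfig:
    # Hypothetical stand-in: assumes prefix caching starts out disabled.
    enable_prefix_caching: bool = False


def apply_apc_default(model_config: ModelConfig,
                      cache_config: CacheConfig) -> CacheConfig:
    """Enable automatic prefix caching (APC) only for text-only models."""
    if not model_config.is_multimodal_model:
        cache_config.enable_prefix_caching = True
    # For multimodal models the flag is left exactly as it was passed in.
    return cache_config


text_cfg = apply_apc_default(ModelConfig(is_multimodal_model=False), CacheConfig())
vlm_cfg = apply_apc_default(ModelConfig(is_multimodal_model=True), CacheConfig())
```

Note that, as in the diff, the multimodal branch does not force the flag off; it only skips the default, so whatever value the cache config already carried is preserved.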