
Commit 6726b19
Update readme & doc for the vllm upgrade to v0.6.2 (#12399)
Co-authored-by: ATMxsp01 <[email protected]>
Authored by ATMxsp01 on Nov 14, 2024
1 parent 59b01fa commit 6726b19
Showing 2 changed files with 4 additions and 0 deletions.
docker/llm/serving/xpu/docker/README.md (2 additions, 0 deletions)

@@ -123,6 +123,8 @@ To set up model serving using `IPEX-LLM` as backend using FastChat, you can refe
   --model-path /llm/models/Yi-1.5-34B \
   --device xpu \
   --enforce-eager \
+  --disable-async-output-proc \
+  --distributed-executor-backend ray \
   --dtype float16 \
   --load-in-low-bit fp8 \
   --tensor-parallel-size 4 \
docs/mddocs/DockerGuides/vllm_docker_quickstart.md (2 additions, 0 deletions)

@@ -852,6 +852,8 @@ We can set up model serving using `IPEX-LLM` as backend using FastChat, the foll
   --model-path /llm/models/Yi-1.5-34B \
   --device xpu \
   --enforce-eager \
+  --disable-async-output-proc \
+  --distributed-executor-backend ray \
   --dtype float16 \
   --load-in-low-bit fp8 \
   --tensor-parallel-size 4 \
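
Both hunks make the same change: the FastChat vLLM worker command gains two vLLM engine flags. `--disable-async-output-proc` turns off vLLM's asynchronous output processing, and `--distributed-executor-backend ray` runs the tensor-parallel workers through Ray instead of the default executor. As a minimal sketch, the full launch command these hunks sit inside might look like the following; the `python -m ipex_llm.serving.fastchat.vllm_worker` entrypoint and anything not shown in the hunks are assumptions, since the diff only shows the middle of the command:

  # Sketch of the surrounding worker launch; only the flags shown in the
  # diff hunks are confirmed, the entrypoint and omitted flags are assumed.
  python -m ipex_llm.serving.fastchat.vllm_worker \
    --model-path /llm/models/Yi-1.5-34B \
    --device xpu \
    --enforce-eager \
    --disable-async-output-proc \
    --distributed-executor-backend ray \
    --dtype float16 \
    --load-in-low-bit fp8 \
    --tensor-parallel-size 4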
