Update Readme for FastChat docker demo #12354

Merged: 4 commits merged into intel-analytics:main on Nov 7, 2024

Conversation

@ATMxsp01 (Contributor) commented on Nov 7, 2024

Update Readme for FastChat docker demo

@@ -63,6 +63,70 @@ For convenience, we have included a file `/llm/start-pp_serving-service.sh` in t

To run model serving using `IPEX-LLM` as the backend with FastChat, you can refer to this [quickstart](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/fastchat_quickstart.html#).

In short, you need to start a Docker container with `--device=/dev/dri`; a recommended command is:
Review comment from a Contributor:

To set up model serving using IPEX-LLM as the backend with FastChat, you can refer to this Quickstart guide or follow these quick steps to deploy a demo.

Quick Setup for FastChat with IPEX-LLM

  1. Start the Docker Container

    Run the following command to launch a Docker container with device access:

    #!/bin/bash
    export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-xpu:latest

    # The -v option maps a host model directory into the container (example path);
    # the http_proxy/https_proxy variables are optional and only needed behind a proxy.
    # (Comments are kept off the continuation lines: a trailing "#" after "\" breaks the command.)
    sudo docker run -itd \
        --net=host \
        --device=/dev/dri \
        --name=demo-container \
        -v /LLM_MODELS/:/llm/models/ \
        --shm-size="16g" \
        -e http_proxy=... \
        -e https_proxy=... \
        -e no_proxy="127.0.0.1,localhost" \
        $DOCKER_IMAGE
  2. Start the FastChat Service

    Enter the container and start the FastChat service (a sample command for entering the container is shown after these steps):

    #!/bin/bash
    
    # Stop any existing FastChat processes (grep -v grep keeps this pipeline from matching itself)
    ps -ef | grep "fastchat" | grep -v grep | awk '{print $2}' | xargs -r kill -9
    
    # Install the required Gradio version
    pip install -U gradio==4.43.0
    
    # Launch the FastChat controller
    python -m fastchat.serve.controller &
    
    # Set environment variables for CCL
    export TORCH_LLM_ALLREDUCE=0
    export CCL_DG2_ALLREDUCE=1
    export CCL_WORKER_COUNT=2
    # Optional: Pin CCL workers to specific cores
    # export CCL_WORKER_AFFINITY=32,33,34,35
    export FI_PROVIDER=shm
    export CCL_ATL_TRANSPORT=ofi
    export CCL_ZE_IPC_EXCHANGE=sockets
    export CCL_ATL_SHM=1
    
    # Load Intel CCL settings
    source /opt/intel/1ccl-wks/setvars.sh
    
    # Start the model worker (replace "Yi-1.5-34B" with your model name)
    python -m ipex_llm.serving.fastchat.vllm_worker \
        --model-path /llm/models/Yi-1.5-34B \
        --device xpu \
        --enforce-eager \
        --dtype float16 \
        --load-in-low-bit fp8 \
        --tensor-parallel-size 4 \
        --gpu-memory-utilization 0.9 \
        --max-model-len 4096 \
        --max-num-batched-tokens 8000 &
    
    # Wait for initialization
    sleep 120
    
    # Start the Gradio web server for FastChat
    python -m fastchat.serve.gradio_web_server &

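As a sample of how to enter the running container before executing the step-2 commands (an illustration, not text from the README; the container name `demo-container` comes from the `--name` flag in step 1):

    #!/bin/bash
    # Confirm the container from step 1 is running
    sudo docker ps --filter "name=demo-container"

    # Open an interactive shell inside it; the step-2 commands are then run from this shell
    sudo docker exec -it demo-container /bin/bash
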
This quick setup allows you to deploy FastChat with IPEX-LLM efficiently.
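
As a rough sanity check of the deployment (again an illustration rather than part of the suggested README text): the controller, worker, and Gradio server should all appear as running processes inside the container, and since the container uses `--net=host`, the Gradio web UI is reachable from the host on Gradio's default port (7860) unless a different port was configured:

    #!/bin/bash
    # Inside the container: controller, vllm_worker, and gradio_web_server should all be listed
    ps -ef | grep -E "controller|vllm_worker|gradio_web_server" | grep -v grep

    # From the host, open the web UI in a browser: http://localhost:7860

    # Optional (an assumption, not part of the original steps): FastChat also provides an
    # OpenAI-compatible API server that can be started alongside the web UI
    python -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8000 &

    # Query it; the model name assumes the Yi-1.5-34B worker started in step 2
    curl http://localhost:8000/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{"model": "Yi-1.5-34B", "messages": [{"role": "user", "content": "Hello"}]}'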

@liu-shaojun requested a review from glorysdj on November 7, 2024 06:33
@glorysdj (Contributor) left a comment

LGTM

@liu-shaojun (Contributor) left a comment

LGTM

@liu-shaojun merged commit ce0c6ae into intel-analytics:main on Nov 7, 2024
@ATMxsp01 deleted the doc-update branch on November 14, 2024 02:28