Update install & doc by using latest repo from ipex (#111)
* Update install & doc

Signed-off-by: Wu, Xiaochang <[email protected]>

* update

Signed-off-by: Wu, Xiaochang <[email protected]>

* update

Signed-off-by: Wu, Xiaochang <[email protected]>

* nit

Signed-off-by: Wu, Xiaochang <[email protected]>

* test

Signed-off-by: Wu, Xiaochang <[email protected]>

* update

Signed-off-by: Wu, Xiaochang <[email protected]>

---------

Signed-off-by: Wu, Xiaochang <[email protected]>
xwu99 authored Feb 26, 2024
1 parent 9833288 commit 57ade22
Showing 10 changed files with 26 additions and 23 deletions.
17 changes: 10 additions & 7 deletions README.md
@@ -5,7 +5,7 @@ LLM-on-Ray is a comprehensive solution designed to empower users in building, cu

LLM-on-Ray harnesses the power of Ray, an industry-leading framework for distributed computing, to scale your AI workloads efficiently. This integration ensures robust fault tolerance and cluster resource management, making your LLM projects more resilient and scalable.

LLM-on-Ray is built to operate across various hardware setups, including Intel CPU, Intel GPU and Intel Gaudi2. It incorporates several industry and Intel optimizations to maximize performance, including [vLLM](https://github.com/vllm-project/vllm), [llama.cpp](https://github.com/ggerganov/llama.cpp), [Intel Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch)/[Deepspeed](https://github.com/intel/intel-extension-for-deepspeed), [BigDL-LLM](https://github.com/intel-analytics/BigDL), [RecDP-LLM](https://github.com/intel/e2eAIOK/tree/main/RecDP/pyrecdp/LLM), [NeuralChat](https://huggingface.co/Intel/neural-chat-7b-v3-1) and more.

## Solution Technical Overview
LLM-on-Ray's modular workflow structure is designed to comprehensively cater to the various stages of LLM development, from pretraining and finetuning to serving. These workflows are intuitive, highly configurable, and tailored to meet the specific needs of each phase in the LLM lifecycle:
@@ -44,12 +44,15 @@ git clone https://github.com/intel/llm-on-ray.git
cd llm-on-ray
conda create -n llm-on-ray python=3.9
conda activate llm-on-ray
-pip install .[cpu] -f https://developer.intel.com/ipex-whl-stable-cpu -f https://download.pytorch.org/whl/torch_stable.html
-# Dynamic link oneCCL and Intel MPI libraries
-source $(python -c "import oneccl_bindings_for_pytorch as torch_ccl;print(torch_ccl.cwd)")/env/setvars.sh
+pip install .[cpu] --extra-index-url https://download.pytorch.org/whl/cpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/
```
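As a quick, optional sanity check (an editorial suggestion, not part of the repository's instructions), you can confirm that the CPU builds of PyTorch and Intel Extension for PyTorch resolved from the indexes above:
```bash
# Both imports should succeed; torch should report a CPU build (e.g. a version ending in +cpu)
python -c "import torch; print(torch.__version__)"
python -c "import intel_extension_for_pytorch as ipex; print(ipex.__version__)"
```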

#### 2. Start Ray
+__[Optional]__ If DeepSpeed is enabled or you are doing distributed finetuning, the oneCCL and Intel MPI libraries should be dynamically linked on every node before Ray starts:
+```bash
+source $(python -c "import oneccl_bindings_for_pytorch as torch_ccl; print(torch_ccl.cwd)")/env/setvars.sh
+```

Start Ray locally using the following command. To launch a Ray cluster, please follow the [setup](docs/setup.md) document.
```bash
ray start --head
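# Optional follow-up (not part of the original snippet): `ray status` is Ray's standard
# CLI for listing the cluster's nodes and resources once the head node is up.
ray status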
@@ -117,7 +120,7 @@ The following are detailed guidelines for pretraining, finetuning and serving LL
### Web UI
* [Finetune and Deploy LLMs through Web UI](docs/web_ui.md)

## Disclaimer
To the extent that any public datasets are referenced by Intel or accessed using tools or code on this site, those datasets are provided by the third party indicated as the data source. Intel does not create the data, or datasets, and does not warrant their accuracy or quality. By accessing the public dataset(s), or using a model trained on those datasets, you agree to the terms associated with those datasets and that your use complies with the applicable license.

Intel expressly disclaims the accuracy, adequacy, or completeness of any public datasets, and is not liable for any errors, omissions, or defects in the data, or for any reliance on the data. Intel is not liable for any liability or damages relating to your use of public datasets.
4 changes: 2 additions & 2 deletions dev/docker/Dockerfile.bigdl-cpu
@@ -29,8 +29,8 @@ COPY ./MANIFEST.in .

RUN mkdir ./finetune && mkdir ./inference

-RUN --mount=type=cache,target=/root/.cache/pip pip install -e .[bigdl-cpu] -f https://developer.intel.com/ipex-whl-stable-cpu \
-    -f https://download.pytorch.org/whl/torch_stable.html
+RUN --mount=type=cache,target=/root/.cache/pip pip install -e .[bigdl-cpu] --extra-index-url https://download.pytorch.org/whl/cpu \
+    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/

# Used to invalidate docker build cache with --build-arg CACHEBUST=$(date +%s)
ARG CACHEBUST=1
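For reference, a build command using the cache-busting argument noted above might look like the following sketch; the image tag is an illustrative assumption, not a value taken from this repository.
```bash
# Hypothetical invocation from the repository root; adjust the tag to your own convention
docker build \
  -f dev/docker/Dockerfile.bigdl-cpu \
  --build-arg CACHEBUST=$(date +%s) \
  -t llm-on-ray:bigdl-cpu .
```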
4 changes: 2 additions & 2 deletions dev/docker/Dockerfile.cpu_and_deepspeed
@@ -29,8 +29,8 @@ COPY ./MANIFEST.in .

RUN mkdir ./finetune && mkdir ./inference

-RUN --mount=type=cache,target=/root/.cache/pip pip install -e .[cpu,deepspeed] -f https://developer.intel.com/ipex-whl-stable-cpu \
-    -f https://download.pytorch.org/whl/torch_stable.html
+RUN --mount=type=cache,target=/root/.cache/pip pip install -e .[cpu,deepspeed] --extra-index-url https://download.pytorch.org/whl/cpu \
+    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/

RUN ds_report

4 changes: 2 additions & 2 deletions dev/docker/Dockerfile.cpu_and_deepspeed.pip_non_editable
@@ -27,8 +27,8 @@ RUN --mount=type=cache,target=/opt/conda/pkgs conda init bash && \
# copy all checked-out files for the later non-editable pip install
COPY . .

-RUN --mount=type=cache,target=/root/.cache/pip pip install .[cpu,deepspeed] -f https://developer.intel.com/ipex-whl-stable-cpu \
-    -f https://download.pytorch.org/whl/torch_stable.html
+RUN --mount=type=cache,target=/root/.cache/pip pip install .[cpu,deepspeed] --extra-index-url https://download.pytorch.org/whl/cpu \
+    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/

RUN ds_report

4 changes: 2 additions & 2 deletions dev/docker/Dockerfile.vllm
@@ -30,8 +30,8 @@ COPY ./dev/scripts/install-vllm-cpu.sh .

RUN mkdir ./finetune && mkdir ./inference

-RUN --mount=type=cache,target=/root/.cache/pip pip install -e .[cpu] -f https://developer.intel.com/ipex-whl-stable-cpu \
-    -f https://download.pytorch.org/whl/torch_stable.html
+RUN --mount=type=cache,target=/root/.cache/pip pip install -e .[cpu] --extra-index-url https://download.pytorch.org/whl/cpu \
+    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/

# Install vllm-cpu
# Activate base first for loading g++ envs ($CONDA_PREFIX/etc/conda/activate.d/*)
2 changes: 1 addition & 1 deletion dev/k8s/Dockerfile
@@ -23,7 +23,7 @@ RUN "$HOME/anaconda3/bin/pip install accelerate==0.19.0" \
"$HOME/anaconda3/bin/pip install gymnasium" \
"$HOME/anaconda3/bin/pip install dm-tree" \
"$HOME/anaconda3/bin/pip install scikit-image" \
"$HOME/anaconda3/bin/pip install oneccl_bind_pt==1.13 -f https://developer.intel.com/ipex-whl-stable-cpu"
"$HOME/anaconda3/bin/pip install oneccl_bind_pt==1.13 --extra-index-url https://developer.intel.com/ipex-whl-stable-cpu"

# set http_proxy & https_proxy
ENV http_proxy=${http_proxy}
2 changes: 1 addition & 1 deletion dev/scripts/install-vllm-cpu.sh
@@ -17,4 +17,4 @@ version_greater_equal "${gcc_version}" 12.3.0 || { echo "GNU C++ Compiler 12.3.0

# Install from source
MAX_JOBS=8 pip install -v git+https://github.com/bigPYJ1151/vllm@PR_Branch \
-    -f https://download.pytorch.org/whl/torch_stable.html
+    --extra-index-url https://download.pytorch.org/whl/cpu
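As a usage sketch (not part of the diff), the script can be run directly from the repository root once the compiler prerequisite it checks for is met:
```bash
# Assumes gcc/g++ >= 12.3.0 are on PATH, as required by the version check above
bash dev/scripts/install-vllm-cpu.sh
```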
2 changes: 1 addition & 1 deletion docs/serve_bigdl.md
@@ -6,7 +6,7 @@ The integration with BigDL-LLM currently only supports running on Intel CPU.
## Setup
Please follow [setup.md](setup.md) to set up the environment first. Additionally, you will need to install the BigDL dependencies as below.
```bash
-pip install .[bigdl-cpu] -f https://developer.intel.com/ipex-whl-stable-cpu -f https://download.pytorch.org/whl/torch_stable.html
+pip install .[bigdl-cpu] --extra-index-url https://download.pytorch.org/whl/cpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/
```
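To confirm the BigDL dependency resolved, a minimal check is to query pip for the package; the package name bigdl-llm is an assumption here and may differ between releases:
```bash
# Optional check; succeeds only if the bigdl-cpu extra pulled in the BigDL-LLM wheel
pip show bigdl-llm
```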

## Configure Serving Parameters
8 changes: 4 additions & 4 deletions docs/setup.md
@@ -40,15 +40,15 @@ conda activate llm-on-ray
```
For CPU:
```bash
-pip install .[cpu] -f https://developer.intel.com/ipex-whl-stable-cpu -f https://download.pytorch.org/whl/torch_stable.html
+pip install .[cpu] --extra-index-url https://download.pytorch.org/whl/cpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/
```
For GPU:
```bash
-pip install .[gpu] --extra-index-url https://developer.intel.com/ipex-whl-stable-xpu
+pip install .[gpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
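As an optional sanity check for the GPU install (an editorial sketch; it assumes the Intel GPU driver and oneAPI runtime are already set up), you can ask PyTorch whether an XPU device is visible once IPEX is imported:
```bash
# Prints True when an Intel GPU (XPU) device is usable
python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.xpu.is_available())"
```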
If DeepSpeed is enabled or you are doing distributed finetuning, the oneCCL and Intel MPI libraries should be dynamically linked on every node before Ray starts:
```bash
-source $(python -c "import oneccl_bindings_for_pytorch as torch_ccl;print(torch_ccl.cwd)")/env/setvars.sh
+source $(python -c "import oneccl_bindings_for_pytorch as torch_ccl; print(torch_ccl.cwd)")/env/setvars.sh
```
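A quick way to confirm the environment was picked up is to check the variables and tools that setvars.sh typically exports; the exact variable names can differ between oneCCL releases, so treat this as a sketch:
```bash
# CCL_ROOT and Intel MPI's mpirun are typically available after sourcing setvars.sh
echo "$CCL_ROOT"
mpirun --version
```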

For Gaudi:
@@ -68,7 +68,7 @@ docker build \
After the image is built successfully, start a container:

```bash
docker run -it --runtime=habana -v ./llm-on-ray:/root/llm-ray --name="llm-ray-habana-demo" llm-ray-habana:latest
```

#### 3. Launch Ray cluster
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -43,7 +43,7 @@ cpu = [
"transformers>=4.35.0",
"intel_extension_for_pytorch==2.1.0+cpu",
"torch==2.1.0+cpu",
"oneccl_bind_pt==2.1.0"
"oneccl_bind_pt==2.1.0+cpu"
]

gpu = [
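One note on the pyproject.toml change above: local-version pins such as oneccl_bind_pt==2.1.0+cpu only resolve when an index hosting those builds is supplied, which is why the install commands in this commit pass --extra-index-url. A standalone equivalent, as a sketch, would be:
```bash
# Assumes the +cpu wheel is hosted on the Intel extension index used elsewhere in this commit
pip install "oneccl_bind_pt==2.1.0+cpu" --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/
```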
