
[Bug]: Installation with XPU fails with Dockerfile and while building from source #8563

Closed
1 task done
adi-lb-phoenix opened this issue Sep 18, 2024 · 0 comments · Fixed by #8652
Labels
bug Something isn't working

Comments

@adi-lb-phoenix

Your current environment

The output of `python collect_env.py`
The output below is from a podman distrobox:
uname -a
Linux ubuntu22vllm.JOHNAIC 6.8.0-40-generic #40~22.04.3-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 30 17:30:19 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

The output below is from the host:

Versions of relevant libraries:
[pip3] intel_extension_for_pytorch==2.3.110+xpu
[pip3] numpy==1.26.4
[pip3] nvidia-cublas-cu12==12.1.3.1
[pip3] nvidia-cuda-cupti-cu12==12.1.105
[pip3] nvidia-cuda-nvrtc-cu12==12.1.105
[pip3] nvidia-cuda-runtime-cu12==12.1.105
[pip3] nvidia-cudnn-cu12==8.9.2.26
[pip3] nvidia-cufft-cu12==11.0.2.54
[pip3] nvidia-curand-cu12==10.3.2.106
[pip3] nvidia-cusolver-cu12==11.4.5.107
[pip3] nvidia-cusparse-cu12==12.1.0.106
[pip3] nvidia-nccl-cu12==2.19.3
[pip3] nvidia-nvjitlink-cu12==12.6.68
[pip3] nvidia-nvtx-cu12==12.1.105
[pip3] pyzmq==26.2.0
[pip3] torch==2.3.1+cxx11.abi
[pip3] transformers==4.44.2
[pip3] triton==2.2.0
[pip3] triton-xpu==3.0.0b2
[conda] Could not collect
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: N/A
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect

Model Input Dumps

No response

🐛 Describe the bug

To install vLLM on Linux for Intel GPUs, we followed the instructions at https://docs.vllm.ai/en/latest/getting_started/xpu-installation.html#installation-with-xpu.

Case 1:
We created an Ubuntu image using distrobox on podman and tried to build from source. The commands were:

pip install --upgrade pip
pip install -v -r requirements-xpu.txt

Log file of the above command:
output.txt

 VLLM_TARGET_DEVICE=xpu python setup.py install
error in vllm setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; Expected package name at the start of dependency specifier
    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
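The error reads "Expected package name at the start of dependency specifier" because setuptools' `install_requires` only accepts PEP 508 requirement specifiers, while `--extra-index-url` is a pip-only option that is valid in a requirements file but not in a specifier. A minimal stdlib sketch (the regex is a simplified approximation of the PEP 508 name rule, not setuptools' actual parser) illustrates why the line is rejected:

```python
import re

# PEP 508 requires a dependency specifier to begin with a package name:
# letters/digits, optionally separated by ".", "_", or "-". A pip option
# such as "--extra-index-url https://..." starts with "-", so it fails
# this check, which is what setuptools is complaining about.
NAME_RE = re.compile(r"^([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9._-]*[A-Za-z0-9])")

def starts_with_package_name(spec: str) -> bool:
    return NAME_RE.match(spec.strip()) is not None

print(starts_with_package_name("torch==2.3.1+cxx11.abi"))        # True
print(starts_with_package_name("--extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/"))  # False
```

So any line of `requirements-xpu.txt` that is a pip option rather than a package requirement will break `setup.py` if it is passed through to `install_requires` unfiltered.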

Case 2:
The error below occurred when we tried to build with the Dockerfile on the host system, outside podman, using the instructions at https://docs.vllm.ai/en/latest/getting_started/xpu-installation.html#quick-start-using-dockerfile. Instead of Docker, we used podman to build, and the build failed:

podman  build -f Dockerfile.xpu -t vllm-xpu-env --shm-size=4g .
STEP 1/9: FROM intel/oneapi-basekit:2024.2.1-0-devel-ubuntu22.04
STEP 2/9: RUN wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | tee /usr/share/keyrings/intel-oneapi-archive-keyring.gpg > /dev/null &&     echo "deb [signed-by=/usr/share/keyrings/intel-oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main " | tee /etc/apt/sources.list.d/oneAPI.list &&     chmod 644 /usr/share/keyrings/intel-oneapi-archive-keyring.gpg &&     wget -O- https://repositories.intel.com/graphics/intel-graphics.key | gpg --dearmor | tee /usr/share/keyrings/intel-graphics.gpg > /dev/null &&     echo "deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/graphics/ubuntu jammy arc" | tee /etc/apt/sources.list.d/intel.gpu.jammy.list &&     chmod 644 /usr/share/keyrings/intel-graphics.gpg
--> Using cache 098758915996e49e32aed0d13aeb3b8df2c91e8fdc5da8fbaa14ff225b08e617
--> 09875891599
STEP 3/9: RUN apt-get update  -y && apt-get install -y curl libicu70 lsb-release git wget vim numactl python3 python3-pip ffmpeg libsm6 libxext6 libgl1 
--> Using cache 829501e9861021f829aea63da9ed910954aed2512b659241f1a1c1498aa22b51
--> 829501e9861
STEP 4/9: RUN git clone https://github.com/intel/pti-gpu &&     cd pti-gpu/sdk &&     mkdir build &&     cd build &&     cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=../cmake/toolchains/icpx_toolchain.cmake -DBUILD_TESTING=OFF .. &&     make -j &&     cmake --install . --config Release --prefix "/usr/local"
--> Using cache be5f38cbc94153cbcec3f5a26186898029a06f485e42460bfcbcd485109336e9
--> be5f38cbc94
STEP 5/9: COPY ./ /workspace/vllm
--> Using cache 5a66d6e91f3297586deab5dcda71c39ee7167bae205fde72fe23918b92c71fd0
--> 5a66d6e91f3
STEP 6/9: WORKDIR /workspace/vllm
--> Using cache 284936c2e4cb04c65775b41cab7aa8d3df76f59fd25eb3d3ecc70af9b947d4d0
--> 284936c2e4c
STEP 7/9: RUN pip install -v -r requirements-xpu.txt
--> Using cache b5c12deac57e71d03e8ed05f892be4ebace345ee2a146b2dec0feb3444831edd
--> b5c12deac57
STEP 8/9: RUN VLLM_TARGET_DEVICE=xpu python3 setup.py install
error in vllm setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; Parse error at "'--extra-'": Expected W:(abcd...)
Error: error building at STEP "RUN VLLM_TARGET_DEVICE=xpu python3 setup.py install": error while running runtime: exit status 1
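Both failures share the same root cause: `setup.py` feeds the raw contents of `requirements-xpu.txt` into `install_requires`, including the `--extra-index-url` option line. A hypothetical workaround (a sketch, not vLLM's actual fix from the linked PR) is to skip pip-only option lines when reading the requirements file:

```python
import os
import tempfile

# Hypothetical helper: read a requirements file and keep only PEP 508
# specifiers, dropping blanks, comments, and pip options such as
# --extra-index-url, so the result is safe for install_requires.
def read_requirements(path):
    requirements = []
    with open(path) as f:
        for raw in f:
            line = raw.split("#", 1)[0].strip()  # strip inline comments
            if not line or line.startswith("-"):
                continue  # skip blank lines and pip option lines
            requirements.append(line)
    return requirements

# Demo with a throwaway file mimicking requirements-xpu.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("--extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/\n")
    f.write("torch==2.3.1+cxx11.abi\n")
    f.write("intel_extension_for_pytorch==2.3.110+xpu\n")

reqs = read_requirements(f.name)
print(reqs)  # only the two package specifiers remain
os.unlink(f.name)
```

With the option line filtered out, `VLLM_TARGET_DEVICE=xpu python3 setup.py install` no longer hands an invalid specifier to setuptools; the extra index URL still has to be supplied to pip separately (as `requirements-xpu.txt` already does for the `pip install -r` step).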

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@adi-lb-phoenix adi-lb-phoenix added the bug Something isn't working label Sep 18, 2024