
[Bug]: Installation with XPU fails with Dockerfile and while building from source #8563

Closed
1 task done
adi-lb-phoenix opened this issue Sep 18, 2024 · 0 comments · Fixed by #8652
Labels
bug Something isn't working

Comments

@adi-lb-phoenix

Your current environment

The output of `python collect_env.py`
The output below is from a podman distrobox:
uname -a
Linux ubuntu22vllm.JOHNAIC 6.8.0-40-generic #40~22.04.3-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 30 17:30:19 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

The output below is from the host:

Versions of relevant libraries:
[pip3] intel_extension_for_pytorch==2.3.110+xpu
[pip3] numpy==1.26.4
[pip3] nvidia-cublas-cu12==12.1.3.1
[pip3] nvidia-cuda-cupti-cu12==12.1.105
[pip3] nvidia-cuda-nvrtc-cu12==12.1.105
[pip3] nvidia-cuda-runtime-cu12==12.1.105
[pip3] nvidia-cudnn-cu12==8.9.2.26
[pip3] nvidia-cufft-cu12==11.0.2.54
[pip3] nvidia-curand-cu12==10.3.2.106
[pip3] nvidia-cusolver-cu12==11.4.5.107
[pip3] nvidia-cusparse-cu12==12.1.0.106
[pip3] nvidia-nccl-cu12==2.19.3
[pip3] nvidia-nvjitlink-cu12==12.6.68
[pip3] nvidia-nvtx-cu12==12.1.105
[pip3] pyzmq==26.2.0
[pip3] torch==2.3.1+cxx11.abi
[pip3] transformers==4.44.2
[pip3] triton==2.2.0
[pip3] triton-xpu==3.0.0b2
[conda] Could not collect
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: N/A
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect

Model Input Dumps

No response

🐛 Describe the bug

To install vLLM on Linux for Intel GPUs, we followed the instructions at https://docs.vllm.ai/en/latest/getting_started/xpu-installation.html#installation-with-xpu.

Case 1:
We created an Ubuntu image using distrobox on podman and tried to build from source. The commands were:

pip install --upgrade pip
pip install -v -r requirements-xpu.txt

Log file of the above command:
output.txt

 VLLM_TARGET_DEVICE=xpu python setup.py install
error in vllm setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; Expected package name at the start of dependency specifier
    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
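The error reads "Expected package name at the start of dependency specifier" because setuptools' `install_requires` only accepts PEP 508 requirement specifiers, while `--extra-index-url` is a pip-only option that is valid in a requirements file but not in a specifier. A minimal stdlib sketch (the regex is a simplified approximation of the PEP 508 name rule, not setuptools' actual parser) illustrates why the line is rejected:

```python
import re

# PEP 508 requires a dependency specifier to begin with a package name:
# letters/digits, optionally separated by ".", "_", or "-". A pip option
# such as "--extra-index-url https://..." starts with "-", so it fails
# this check, which is what setuptools is complaining about.
NAME_RE = re.compile(r"^([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9._-]*[A-Za-z0-9])")

def starts_with_package_name(spec: str) -> bool:
    return NAME_RE.match(spec.strip()) is not None

print(starts_with_package_name("torch==2.3.1+cxx11.abi"))        # True
print(starts_with_package_name("--extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/"))  # False
```

So any line of `requirements-xpu.txt` that is a pip option rather than a package requirement will break `setup.py` if it is passed through to `install_requires` unfiltered.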

Case 2:
The error below occurred when we tried to build with the Dockerfile on the host system, outside podman, using the instructions at https://docs.vllm.ai/en/latest/getting_started/xpu-installation.html#quick-start-using-dockerfile. Instead of Docker, we used podman to build, and the build failed:

podman  build -f Dockerfile.xpu -t vllm-xpu-env --shm-size=4g .
STEP 1/9: FROM intel/oneapi-basekit:2024.2.1-0-devel-ubuntu22.04
STEP 2/9: RUN wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | tee /usr/share/keyrings/intel-oneapi-archive-keyring.gpg > /dev/null &&     echo "deb [signed-by=/usr/share/keyrings/intel-oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main " | tee /etc/apt/sources.list.d/oneAPI.list &&     chmod 644 /usr/share/keyrings/intel-oneapi-archive-keyring.gpg &&     wget -O- https://repositories.intel.com/graphics/intel-graphics.key | gpg --dearmor | tee /usr/share/keyrings/intel-graphics.gpg > /dev/null &&     echo "deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/graphics/ubuntu jammy arc" | tee /etc/apt/sources.list.d/intel.gpu.jammy.list &&     chmod 644 /usr/share/keyrings/intel-graphics.gpg
--> Using cache 098758915996e49e32aed0d13aeb3b8df2c91e8fdc5da8fbaa14ff225b08e617
--> 09875891599
STEP 3/9: RUN apt-get update  -y && apt-get install -y curl libicu70 lsb-release git wget vim numactl python3 python3-pip ffmpeg libsm6 libxext6 libgl1 
--> Using cache 829501e9861021f829aea63da9ed910954aed2512b659241f1a1c1498aa22b51
--> 829501e9861
STEP 4/9: RUN git clone https://github.com/intel/pti-gpu &&     cd pti-gpu/sdk &&     mkdir build &&     cd build &&     cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=../cmake/toolchains/icpx_toolchain.cmake -DBUILD_TESTING=OFF .. &&     make -j &&     cmake --install . --config Release --prefix "/usr/local"
--> Using cache be5f38cbc94153cbcec3f5a26186898029a06f485e42460bfcbcd485109336e9
--> be5f38cbc94
STEP 5/9: COPY ./ /workspace/vllm
--> Using cache 5a66d6e91f3297586deab5dcda71c39ee7167bae205fde72fe23918b92c71fd0
--> 5a66d6e91f3
STEP 6/9: WORKDIR /workspace/vllm
--> Using cache 284936c2e4cb04c65775b41cab7aa8d3df76f59fd25eb3d3ecc70af9b947d4d0
--> 284936c2e4c
STEP 7/9: RUN pip install -v -r requirements-xpu.txt
--> Using cache b5c12deac57e71d03e8ed05f892be4ebace345ee2a146b2dec0feb3444831edd
--> b5c12deac57
STEP 8/9: RUN VLLM_TARGET_DEVICE=xpu python3 setup.py install
error in vllm setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; Parse error at "'--extra-'": Expected W:(abcd...)
Error: error building at STEP "RUN VLLM_TARGET_DEVICE=xpu python3 setup.py install": error while running runtime: exit status 1
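Both failures share the same root cause: `setup.py` feeds the raw contents of `requirements-xpu.txt` into `install_requires`, including the `--extra-index-url` option line. A hypothetical workaround (a sketch, not vLLM's actual fix from the linked PR) is to skip pip-only option lines when reading the requirements file:

```python
import os
import tempfile

# Hypothetical helper: read a requirements file and keep only PEP 508
# specifiers, dropping blanks, comments, and pip options such as
# --extra-index-url, so the result is safe for install_requires.
def read_requirements(path):
    requirements = []
    with open(path) as f:
        for raw in f:
            line = raw.split("#", 1)[0].strip()  # strip inline comments
            if not line or line.startswith("-"):
                continue  # skip blank lines and pip option lines
            requirements.append(line)
    return requirements

# Demo with a throwaway file mimicking requirements-xpu.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("--extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/\n")
    f.write("torch==2.3.1+cxx11.abi\n")
    f.write("intel_extension_for_pytorch==2.3.110+xpu\n")

reqs = read_requirements(f.name)
print(reqs)  # only the two package specifiers remain
os.unlink(f.name)
```

With the option line filtered out, `VLLM_TARGET_DEVICE=xpu python3 setup.py install` no longer hands an invalid specifier to setuptools; the extra index URL still has to be supplied to pip separately (as `requirements-xpu.txt` already does for the `pip install -r` step).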

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@adi-lb-phoenix adi-lb-phoenix added the bug Something isn't working label Sep 18, 2024