[ROCm] Fix build problem resulted from previous commit related to FP8 kv-cache support #2790

hongxiayang · 2024-02-06T19:32:10Z

Fixes: #2725
The current head fails to build on ROCm with linker errors like the following:

g++ -pthread -B /opt/conda/envs/py_3.8/compiler_compat -Wl,--sysroot=/ -pthread -shared -B /opt/conda/envs/py_3.8/compiler_compat -L/opt/conda/envs/py_3.8/lib -Wl,-rpath=/opt/conda/envs/py_3.8/lib -Wl,--no-as-needed -Wl,--sysroot=/ /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/activation_kernels.o /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/attention/attention_kernels.o /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/cache_kernels.o /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/hip_utils_kernels.o /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/layernorm_kernels.o /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/moe_align_block_size_kernels.o /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/pos_encoding_kernels.o /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/pybind.o /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/quantization/gptq/q_gemm.o /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/quantization/squeezellm/quant_hip_kernel.o -L/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/lib -L/opt/rocm/lib -L/opt/rocm/hip/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -lamdhip64 -lc10_hip -ltorch_hip -o build/lib.linux-x86_64-cpython-38/vllm/_C.cpython-38-x86_64-linux-gnu.so
/opt/conda/envs/py_3.8/compiler_compat/ld: /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/cache_kernels.o: in function `__float2bfloat16(float)':
cache_kernels.hip:(.text+0x0): multiple definition of `__float2bfloat16(float)'; /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x0): first defined here
/opt/conda/envs/py_3.8/compiler_compat/ld: /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/cache_kernels.o: in function `__bfloat1622float2(__hip_bfloat162)':
cache_kernels.hip:(.text+0x40): multiple definition of `__bfloat1622float2(__hip_bfloat162)'; /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x40): first defined here
/opt/conda/envs/py_3.8/compiler_compat/ld: /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/cache_kernels.o: in function `__double2bfloat16(double)':
cache_kernels.hip:(.text+0x60): multiple definition of `__double2bfloat16(double)'; /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x60): first defined here
/opt/conda/envs/py_3.8/compiler_compat/ld: /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/cache_kernels.o: in function `__float22bfloat162_rn(HIP_vector_type<float, 2u>)':
cache_kernels.hip:(.text+0xa0): multiple definition of `__float22bfloat162_rn(HIP_vector_type<float, 2u>)'; /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0xa0): first defined here
/opt/conda/envs/py_3.8/compiler_compat/ld: /app/vllm/build/temp.linux-x86_64-cpython-

A patch is needed to fix the compilation issue until the next ROCm release is available.
This pull request fixes the build on ROCm.
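
For anyone triaging a similar failure, the linker output above can be condensed into a list of colliding symbols. The following is a minimal sketch, assuming the diagnostics keep the shape shown in this log (`<source>:(<section>): multiple definition of ... first defined here`). The names `DUPLICATE_RE`, `find_duplicate_symbols`, and `SAMPLE` are illustrative and not part of vLLM or ROCm, and the regular expression is inferred from these log lines rather than from any documented `ld` format.

```python
import re
from typing import Dict, Tuple

# Pattern inferred from the log in this PR (an assumption, not a documented format):
#   <source>:(<section>+<offset>): multiple definition of `<symbol>'; <object>:...
DUPLICATE_RE = re.compile(
    r"^(?P<source>[^:\s]+):\([^)]*\): multiple definition of "
    r"`(?P<symbol>.+?)'; (?P<first_object>[^:\s]+):",
    re.MULTILINE,
)


def find_duplicate_symbols(ld_output: str) -> Dict[str, Tuple[str, str]]:
    """Map each duplicated symbol to (source reporting it, object that defined it first)."""
    duplicates = {}
    for match in DUPLICATE_RE.finditer(ld_output):
        duplicates[match.group("symbol")] = (
            match.group("source"),
            match.group("first_object"),
        )
    return duplicates


# Two diagnostics copied from the log above.
SAMPLE = """\
cache_kernels.hip:(.text+0x0): multiple definition of `__float2bfloat16(float)'; /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x0): first defined here
cache_kernels.hip:(.text+0x60): multiple definition of `__double2bfloat16(double)'; /app/vllm/build/temp.linux-x86_64-cpython-38/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x60): first defined here
"""

for symbol, (source, first_object) in find_duplicate_symbols(SAMPLE).items():
    print(f"{symbol}: redefined in {source}, first defined in {first_object}")
```

Every collision in this log is a bfloat16 conversion helper emitted by both `attention_kernels.o` and `cache_kernels.o`. That is consistent with the linked issue #2725, which points at `#include <hip/hip_bf16.h>`, although the log alone does not prove where the definitions come from.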

jamestwhedbee · 2024-02-06T20:33:20Z

I am using ROCm 5.7, and before this PR v0.3.0 of vLLM could not be built. Applying this patch fixes it for me! Just thought it was worth mentioning that this issue is not limited to ROCm 6.0.

hongxiayang · 2024-02-06T20:51:56Z

Thanks for your comment. I will update the PR to apply the patch regardless of the ROCm version.

zhuohan123

LGTM! Thanks for the fix!

fxmarty · 2024-04-18T14:07:15Z

Thank you @hongxiayang

[ROCm] Fix build problem resulted from FP8 kv-cache commit

0e7eebd

hongxiayang marked this pull request as ready for review February 6, 2024 19:38

hongxiayang mentioned this pull request Feb 6, 2024

[BUG] Compile source code error for ROCM platform when using #include <hip/hip_bf16.h> #2725

Closed

hongxiayang mentioned this pull request Feb 6, 2024

Fix compile error when using rocm #2648

Merged

remove the version check to apply the patch for both ROCm 6.0 or ROCm 5.7

ca871a6

zhuohan123 approved these changes Feb 7, 2024

zhuohan123 merged commit c81dddb into vllm-project:main Feb 7, 2024
17 checks passed

andy-neuma mentioned this pull request Feb 23, 2024

andy/bump main to v0.3.2 neuralmagic/nm-vllm#49

Closed

fxmarty mentioned this pull request Apr 18, 2024

Failed to build from source on ROCm (with pytorch and xformers working correctly) #3067

Closed
