fix(llama-cpp-server): fix rocm build by setting GGML_HIPBLAS #2835

Merged

1 commit merged into TabbyML:main on Aug 12, 2024

Conversation

@richard-jfc (Contributor) commented on Aug 12, 2024

Various LLAMA_* build flags have been renamed to GGML_* upstream in llama.cpp (see ggerganov/llama.cpp@f3f6542), and they need to be renamed in Tabby's llama-cpp-server as well to correctly target Metal/ROCm/Vulkan.
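For context, the relevant subset of the rename looks roughly like this (option names are from the linked commit; the snippet itself is an illustrative sketch, not Tabby's actual build code):

```cmake
# Old option          New option (since ggerganov/llama.cpp@f3f6542)
# LLAMA_CUDA     ->   GGML_CUDA
# LLAMA_METAL    ->   GGML_METAL
# LLAMA_VULKAN   ->   GGML_VULKAN
# LLAMA_HIPBLAS  ->   GGML_HIPBLAS   (ROCm)

# A ROCm build therefore has to enable the new name, e.g.:
set(GGML_HIPBLAS ON CACHE BOOL "llama.cpp: enable the hipBLAS/ROCm backend")
```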

It appears this was already done for CUDA.

May be related: #2811

@wsxiaoys (Member) commented:

Have you verified that the PR fixes your issue?

CMake handles the migration with macros defined in https://demo.tabbyml.com/files/git/llama.cpp/-/blob/7a221b672e49dfae459b1af27210ba3f2b5419b6/CMakeLists.txt?plain=1#L101

So my guess is it won't fix any build issues you are currently encountering.
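For reference, the deprecation shim looks roughly like this (reconstructed from the linked CMakeLists.txt, so treat it as a sketch rather than a verbatim quote):

```cmake
# Forward a deprecated option to its replacement, warning (or erroring) on use.
macro(llama_option_depr TYPE OLD NEW)
    if (${OLD})
        message(${TYPE} "${OLD} is deprecated and will be removed in the future.\nUse ${NEW} instead\n")
        set(${NEW} ON)
    endif()
endmacro()

# Old LLAMA_* options are then mapped to their GGML_* replacements, e.g.:
llama_option_depr(WARNING LLAMA_METAL GGML_METAL)
```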

@richard-jfc (Contributor, Author) commented:

I built the ROCm image from the v0.15.0 tag using Docker, and I have been running that ROCm build via Docker.

Without my change, I saw a CPU spike for every chat request and no VRAM or GPU usage.

With my change, I saw roughly 6 GB of VRAM in use and a GPU usage spike.

@wsxiaoys (Member) commented:

It does seem that GGML_HIPBLAS is the one flag not handled by the llama_option_depr migration. In that case, would you consider updating the PR to contain only the GGML_HIPBLAS change, since that is the only one you have verified?
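In other words, the list of llama_option_depr calls at that commit presumably lacks an entry like the following, so a LLAMA_HIPBLAS define is silently ignored instead of being forwarded:

```cmake
# Presumably absent at that commit (hence the need to set GGML_HIPBLAS directly):
llama_option_depr(WARNING LLAMA_HIPBLAS GGML_HIPBLAS)
```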

@richard-jfc (Contributor, Author) commented:

Thanks, done.

wsxiaoys changed the title from "Fixing renamed llama-server build flags" to "fix(llama-cpp-server): fix rocm build by setting GGML_HIPBLAS" on Aug 12, 2024
wsxiaoys enabled auto-merge (squash) on August 12, 2024 at 01:18
wsxiaoys disabled auto-merge on August 12, 2024 at 01:32
wsxiaoys merged commit 48dba77 into TabbyML:main on Aug 12, 2024
3 of 5 checks passed