ENH: multiple GPU support for llama.cpp engine #1202
Comments
Thanks. Since llama.cpp already supports multiple GPUs, we will implement this as soon as possible.
`n_gpu` accepts an integer parameter; if you pass a string, an error occurs. You can use it like this:
Then you can launch the model on 2 GPUs.
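The snippet that originally accompanied this comment is missing here. As a minimal sketch assuming the Xinference Python client's `launch_model` API, the key point is passing `n_gpu` as an int rather than a string; the model name and quantization below are illustrative, and the `validate_n_gpu` helper is hypothetical, included only to show the int-vs-string distinction.

```python
# Sketch: launching a GGUF model across 2 GPUs with the Xinference client.
# (Requires a running Xinference server, so the call is shown commented out.)
#
# from xinference.client import Client
# client = Client("http://localhost:9997")
# model_uid = client.launch_model(
#     model_name="qwen1.5-chat",   # example name, not from this thread
#     model_format="ggufv2",
#     quantization="q8_0",
#     n_gpu=2,                     # int, NOT the string "2"
# )

def validate_n_gpu(n_gpu):
    """Hypothetical check mirroring why a string value fails:
    n_gpu must be an int (or the literal "auto")."""
    if n_gpu == "auto" or isinstance(n_gpu, int):
        return n_gpu
    raise TypeError(
        f"n_gpu must be an int or 'auto', got {type(n_gpu).__name__}: {n_gpu!r}"
    )

validate_n_gpu(2)        # accepted
validate_n_gpu("auto")   # accepted
# validate_n_gpu("2")    # would raise TypeError
```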
Xinference can support this now.
I guess it was because your GPU didn't have enough memory.
You can try to set
I tried another model (qwen1.5-32b q8_0). It works.
Is your feature request related to a problem? Please describe
When launching a GGUF model, only one GPU is ever used.
I tried modifying the parameters,
but got an error saying that setting `n_gpu` is not supported.
Describe the solution you'd like