[Feature]: Qwen2.5 bitsandbytes support #8941
Comments
cc @chenqianfzh, could you please look at this issue? Thanks.
😃 Hi, I created a similar PR to support Qwen2.5 and got correct results with the following scripts. Hope this PR helps.
And the corresponding outputs:
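The scripts themselves were not captured in this extraction. As a rough illustration only, loading a pre-quantized bnb checkpoint in vLLM generally looks like the sketch below; the model id and the `quantization`/`load_format` arguments follow vLLM's documented bitsandbytes usage at the time, but treat them as assumptions and check the current docs. Running it requires vLLM, bitsandbytes, and a CUDA GPU.

```python
# Hedged sketch, not the exact script from the comment above: loading a
# pre-quantized bitsandbytes checkpoint with vLLM's offline LLM API.
prompts = ["Give me a short introduction to large language models."]


def build_engine(model_id: str):
    # Imported lazily: requires vllm, bitsandbytes, and a CUDA GPU.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model=model_id,
        quantization="bitsandbytes",   # assumed flag per vLLM bnb docs
        load_format="bitsandbytes",    # assumed flag per vLLM bnb docs
    )
    return llm, SamplingParams(temperature=0.0, max_tokens=64)


if __name__ == "__main__":
    llm, params = build_engine("unsloth/Qwen2.5-7B-bnb-4bit")
    for out in llm.generate(prompts, params):
        print(out.outputs[0].text)
```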
May I ask which version of the model you are using in this example? Is it Unsloth/Qwen2.5-0.5B-Instruct-bnb-4bit? I am trying to run inference on the Unsloth/Qwen2.5-7B-bnb-4bit model using your example, but I'm getting the error: [rank0]: AttributeError: Model Qwen2ForCausalLM does not support BitsAndBytes quantization yet. By the way, I've tried running inference on models--unsloth--Llama-3.2-1B-Instruct and it works without any issues, so I'm assuming the problem may be specific to Unsloth's Qwen2.5 models.
Hi, updating the vLLM version may work: Llama 3 bnb has been supported since before August, but Qwen2.5 bnb support was added only recently.
@blueyo0 is right. The bnb support for Qwen2 was added recently.
Hi, could you also add support for BnB quantization for the Phi series, such as Phi-3.5-mini?
Sure, I will take a look at Phi 3.5 mini.
🚀 The feature, motivation and pitch
Description:
Qwen2.5 (32B) is a state-of-the-art model, especially interesting in 4-bit precision (bitsandbytes).
In this notebook I show that the model works correctly using Hugging Face Transformers, and how, after I added bitsandbytes support to vLLM myself, the output is gibberish.
I tried to add these lines under the Qwen2ForCausalLM class (bad output example):
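For context on what such support involves: at the time, vLLM model classes opted into BitsAndBytes loading by declaring a mapping from per-shard checkpoint weight names to vLLM's fused parameters. The sketch below is illustrative only; the attribute name and the projection names are assumptions drawn from vLLM's conventions, not the actual patch.

```python
# Illustrative sketch (assumed names, not the actual vLLM patch): the kind
# of mapping a model class like Qwen2ForCausalLM declares so the bnb
# loader can route per-shard checkpoint weights into fused parameters.
bitsandbytes_stacked_params_mapping = {
    # checkpoint weight name -> (fused vLLM param name, shard index)
    "q_proj": ("qkv_proj", 0),
    "k_proj": ("qkv_proj", 1),
    "v_proj": ("qkv_proj", 2),
    "gate_proj": ("gate_up_proj", 0),
    "up_proj": ("gate_up_proj", 1),
}


def remap(shard_name: str):
    """Return (fused param name, shard index) for a checkpoint weight,
    or None if the weight is not part of a fused parameter."""
    return bitsandbytes_stacked_params_mapping.get(shard_name)
```

For example, remap("q_proj") yields ("qkv_proj", 0), while an unfused weight such as "o_proj" maps to None. Getting a mapping like this wrong, or omitting it, is exactly the kind of mistake that lets the model load but produce gibberish output.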
Alternatives
No response
Additional context
No response