Could you help to provide a solution for Qwen-14B-Chat-Int4 (GPTQ) using vLLM? Many thanks! #1881

Closed
micronetboy opened this issue Dec 1, 2023 · 1 comment

Comments

@micronetboy

Could you help provide a solution for running Qwen-14B-Chat-Int4 (GPTQ) with vLLM? Many thanks!

@chenxu2048
Contributor

You can try PR #1580, which adds basic GPTQ support.
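
If you build vLLM from that branch, loading the quantized checkpoint would look roughly like the sketch below. This is a minimal sketch, assuming the PR exposes GPTQ through a `quantization="gptq"` option (the exact flag in that branch may differ); the model name and prompt are just examples.

```python
from vllm import LLM, SamplingParams

# Assumes a vLLM build that includes the GPTQ support from PR #1580;
# the quantization flag name is an assumption based on vLLM's usual API.
llm = LLM(
    model="Qwen/Qwen-14B-Chat-Int4",  # GPTQ checkpoint
    quantization="gptq",
    trust_remote_code=True,           # Qwen repos ship custom model code
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["What is the capital of France?"], params)
print(outputs[0].outputs[0].text)
```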

Alternatively, quantize Qwen-14B-Chat to INT4 with AutoAWQ, using the script in this PR.
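
The AutoAWQ route would look roughly like the following. This is a sketch, not the script from the referenced PR: the output directory name is hypothetical and the quantization settings are common 4-bit AWQ defaults, not values taken from that script.

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen-14B-Chat"       # full-precision base model
quant_path = "Qwen-14B-Chat-awq-int4"   # output directory (hypothetical name)

# Typical 4-bit AWQ settings; adjust as needed.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Run the AWQ calibration/quantization pass and save the INT4 checkpoint.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

The resulting AWQ checkpoint could then be served with vLLM's existing AWQ support, e.g. `LLM(model=quant_path, quantization="awq", trust_remote_code=True)`.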

@hmellor hmellor closed this as completed Mar 25, 2024