Add Ascend NPU support for nf4 quant #1422
What does this PR do?
This PR adds Ascend NPU support for nf4 quant/dequant and enables QLoRA fine-tuning of LLMs with transformers, peft, and trl.
Note that the nf4 quantization method is currently implemented in pure PyTorch. This is an interim measure: the high-performance AscendC implementation is still in progress 😞. Meanwhile, many members of the Ascend NPU community have expressed keen interest in using QLoRA to fine-tune LLMs as soon as possible, hence this PR. A minimal usage sketch is shown below.
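For reference, a minimal QLoRA sketch on an Ascend NPU might look like the following. The model name, target modules, and device string are illustrative assumptions, not part of this PR:

```python
import torch
import torch_npu  # registers the Ascend NPU backend with PyTorch (assumed to be installed)
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "Qwen/Qwen2-7B"  # placeholder model; any causal LM should work

# nf4 quantization config; weights are quantized at load time via bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map={"": "npu:0"},  # place the quantized model on the first NPU (assumed device string)
)

# Prepare the 4-bit model and attach LoRA adapters for fine-tuning
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # illustrative target modules
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

From here, training can proceed with a standard trl/transformers training loop, as in CUDA-based QLoRA setups.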
Related PR: huggingface/transformers#31512
Collaborators
@SlightwindSec @Ginray @MatrixPlayer
cc @Titus-von-Koeller @matthewdouglas