Update compression config for openlm-research/open_llama_3b_v2 #860

nikita-savelyevv · 2024-08-07T09:29:31Z

Reduce perplexity (PPL) of the compressed openlm-research/open_llama_3b_v2 model by removing all_layers=True and adding AWQ.

Precision	all_layers	AWQ	PPL
FP16			12.40
INT4_SYM	False	False	13.45
INT4_SYM	True	False	13.73
INT4_ASYM	False	True	12.77
INT4_ASYM	False	False	13.15
INT4_ASYM	True	False	13.36
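
To make the comparison easier to read, the following sketch computes each configuration's PPL increase relative to the FP16 baseline and ranks the rows. The numbers are copied from the table above; the names `RESULTS` and `relative_increase` are illustrative helpers and not part of the PR.

```python
# PPL values copied from the table above; FP16 is the uncompressed baseline.
FP16_PPL = 12.40

# Each row: (precision, all_layers, awq, ppl)
RESULTS = [
    ("INT4_SYM", False, False, 13.45),
    ("INT4_SYM", True, False, 13.73),
    ("INT4_ASYM", False, True, 12.77),
    ("INT4_ASYM", False, False, 13.15),
    ("INT4_ASYM", True, False, 13.36),
]


def relative_increase(ppl, baseline=FP16_PPL):
    """Return the PPL increase over the baseline, in percent."""
    return (ppl - baseline) / baseline * 100.0


# Rank configurations from smallest to largest PPL (lower is better).
ranked = sorted(RESULTS, key=lambda row: row[3])
best = ranked[0]

for precision, all_layers, awq, ppl in ranked:
    print(
        f"{precision:<9} all_layers={all_layers!s:<5} awq={awq!s:<5} "
        f"PPL={ppl:.2f} (+{relative_increase(ppl):.1f}%)"
    )
```

With these numbers, INT4_ASYM with AWQ and without all_layers ranks first at roughly 3% above FP16, while INT4_SYM with all_layers=True ranks last at roughly 10.7%.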

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

nikita-savelyevv · 2024-08-07T09:31:18Z

@KodiaqQ please take a look

KodiaqQ · 2024-08-09T13:26:39Z

@nikita-savelyevv, the numbers in the description are for the ASYM mode, while the configuration contains "sym": True. Why is that?

nikita-savelyevv · 2024-08-09T18:44:53Z

@KodiaqQ Thanks for noticing. There were some errors in the collected metrics. I've updated the metrics in the PR description and changed the mode to asymmetric.
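
As a rough illustration of the three changes listed in this PR's commit message (remove all_layers=True, fix the sym parameter, add AWQ), the sketch below models a per-model compression entry as a plain dictionary. Only `sym` and `all_layers` are named in this thread. The `bits` and `awq` keys, the `apply_pr_changes` helper, and the starting entry are assumptions made for illustration and may not match the actual structure used in optimum-intel.

```python
# Hypothetical sketch: key names other than "sym" and "all_layers" are
# illustrative and are not taken from the optimum-intel source.


def apply_pr_changes(entry):
    """Return a copy of a compression entry with the three changes applied."""
    updated = dict(entry)            # leave the caller's dict untouched
    updated.pop("all_layers", None)  # 1. drop all_layers=True
    updated["sym"] = False           # 2. switch to asymmetric quantization
    updated["awq"] = True            # 3. enable AWQ (hypothetical key name)
    return updated


# Assumed starting point, for illustration only.
before = {"bits": 4, "sym": True, "all_layers": True}
after = apply_pr_changes(before)
print(after)
```

The helper copies the entry before modifying it, so the original settings remain available for comparison.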

HuggingFaceDocBuilderDev · 2024-08-13T11:30:40Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Remove compression with all_layers=True for openlm-research/open_llama_3b_v2

dcb7088

Fix sym parameter

30c5496

KodiaqQ approved these changes Aug 9, 2024

AlexKoff88 approved these changes Aug 13, 2024

Add AWQ

f423b30

AlexKoff88 merged commit 1b7bd9f into huggingface:main Aug 16, 2024
13 of 17 checks passed

nikita-savelyevv mentioned this pull request Aug 16, 2024

Update config for open-llama-3b-v2 openvinotoolkit/openvino.genai#778

Merged

IlyasMoutawwakil pushed a commit that referenced this pull request Aug 16, 2024

Update compression config for openlm-research/open_llama_3b_v2 (#860)

4e1d405

* Remove compression with all_layers=True for openlm-research/open_llama_3b_v2
* Fix sym parameter
* Add AWQ

github-merge-queue bot pushed a commit to openvinotoolkit/openvino.genai that referenced this pull request Aug 19, 2024

Update config for open-llama-3b-v2 (#778)

c662828

Copy changes from huggingface/optimum-intel#860

eaidova pushed a commit to openvinotoolkit/openvino.genai that referenced this pull request Aug 19, 2024

Update config for open-llama-3b-v2 (#778)

09363b5

Copy changes from huggingface/optimum-intel#860
