[DOC]: Added INT4 weight compression description #20812

AlexKoff88 · 2023-11-01T10:35:27Z

Added docs for INT4

docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md

yury-gorbachev · 2023-11-01T12:13:00Z

LGTM

docs/articles_en/openvino_workflow/gen_ai.md

docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md

ngaloppo · 2023-11-01T16:20:47Z

docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md

+
+   OpenVINO also supports models from Hugging Face `Transformers <https://github.com/huggingface/transformers>`__ library optimized 
+   with `GPTQ <https://github.com/PanQiWei/AutoGPTQ>`__. There is no need to do an extra step of model optimization in this case because 
+   model conversion will ensure that int4 optimization results are preserved and model inference will benefit from it.


How does this work? Does loading the model with OVModelForCausalLM make everything else happen automagically? Is that documented somewhere? If so, perhaps add a link to it from here.

AlexKoff88 · 2023-11-02

This works as described, i.e. with OVModelForCausalLM. Happy to see a proposal if it is not clear here.

docs/optimization_guide/nncf/code/weight_compression_openvino.py

Co-authored-by: Nico Galoppo <[email protected]>

docs/articles_en/openvino_workflow/gen_ai.md

docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md

ngaloppo · 2023-11-02T21:55:26Z

docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md

+   OpenVINO also supports models from Hugging Face `Transformers <https://github.com/huggingface/transformers>`__ library optimized 
+   with `GPTQ <https://github.com/PanQiWei/AutoGPTQ>`__. There is no need to do an extra step of model optimization in this case because 
+   model conversion will ensure that int4 optimization results are preserved and model inference will benefit from it.


Suggested change

-   OpenVINO also supports models from Hugging Face `Transformers <https://github.com/huggingface/transformers>`__ library optimized
-   with `GPTQ <https://github.com/PanQiWei/AutoGPTQ>`__. There is no need to do an extra step of model optimization in this case because
-   model conversion will ensure that int4 optimization results are preserved and model inference will benefit from it.
+   OpenVINO also supports models from Hugging Face `Transformers <https://github.com/huggingface/transformers>`__ library optimized
+   with `GPTQ <https://github.com/PanQiWei/AutoGPTQ>`__. Those models can be loaded and converted directly with the `from_pretrained()` methods of the `Optimum Intel <https://huggingface.co/docs/optimum/main/en/intel/inference>`__ wrappers for Hugging Face models. Model conversion will ensure that int4 optimization results are preserved and model inference will benefit from it.
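
For illustration, the loading path named in the suggestion could look roughly like the sketch below. It is not taken from the PR itself. It assumes `optimum-intel` with OpenVINO support and `transformers` are installed (GPTQ checkpoints may need additional packages), and that `export=True` is the right flag for converting the checkpoint on load in the installed version. The helper names `load_gptq_model` and `generate_reply` and the model ID in the usage note are hypothetical.

```python
def load_gptq_model(model_id):
    """Load a GPTQ-optimized Hugging Face checkpoint through Optimum Intel.

    `model_id` is a placeholder for any GPTQ-quantized causal LM on the Hub.
    """
    # Imported lazily so this sketch can be imported without the libraries installed.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    # Assumption: export=True asks Optimum Intel to convert the checkpoint
    # to OpenVINO format while loading it.
    model = OVModelForCausalLM.from_pretrained(model_id, export=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return model, tokenizer


def generate_reply(model, tokenizer, prompt, max_new_tokens=32):
    """Run text generation with an already loaded model and tokenizer."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # generate() returns a batch; decode the first (and only) sequence.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Usage would be along the lines of `model, tokenizer = load_gptq_model("your-org/your-gptq-model")` followed by `generate_reply(model, tokenizer, "What is OpenVINO?")`, with the placeholder model ID swapped for a real GPTQ checkpoint.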

AlexKoff88 · 2023-11-06T12:53:39Z

I think we should proceed with the merge. @yury-gorbachev, please vote if you agree.

AlexKoff88 · 2023-11-08T06:18:39Z

@tsavina, we need to have this in the release branch as well.

* Added INT4 information into weight compression doc
* Added GPTQ info. Fixed comments
* Fixed list
* Fixed issues. Updated Gen.AI doc
* Applied comments
* Added additional infor about GPTQ support
* Fixed typos
* Update docs/articles_en/openvino_workflow/gen_ai.md
  Co-authored-by: Nico Galoppo <[email protected]>
* Update docs/articles_en/openvino_workflow/gen_ai.md
  Co-authored-by: Nico Galoppo <[email protected]>
* Update docs/optimization_guide/nncf/code/weight_compression_openvino.py
  Co-authored-by: Nico Galoppo <[email protected]>
* Applied changes
* Update docs/articles_en/openvino_workflow/gen_ai.md
  Co-authored-by: Tatiana Savina <[email protected]>
* Update docs/articles_en/openvino_workflow/gen_ai.md
  Co-authored-by: Tatiana Savina <[email protected]>
* Update docs/articles_en/openvino_workflow/gen_ai.md
  Co-authored-by: Tatiana Savina <[email protected]>
* Update docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md
  Co-authored-by: Tatiana Savina <[email protected]>
* Update docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md
  Co-authored-by: Tatiana Savina <[email protected]>
* Update docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md
  Co-authored-by: Tatiana Savina <[email protected]>
* Update docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md
  Co-authored-by: Tatiana Savina <[email protected]>
* Update docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md
  Co-authored-by: Tatiana Savina <[email protected]>
* Added table with results
* One more comment

---------

Co-authored-by: Nico Galoppo <[email protected]>
Co-authored-by: Tatiana Savina <[email protected]>

AlexKoff88 added 2 commits October 31, 2023 19:57

Added INT4 information into weight compression doc

6b6de88

Added GPTQ info. Fixed comments

1b59ce4

AlexKoff88 requested a review from a team as a code owner November 1, 2023 10:35

AlexKoff88 requested review from bstankix and removed request for a team November 1, 2023 10:35

github-actions bot added the category: docs OpenVINO documentation label Nov 1, 2023

Fixed list

67c7781

yury-gorbachev reviewed Nov 1, 2023

docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md

Merge remote-tracking branch 'upstream/master' into ak/int4_docs

64698e0

yury-gorbachev reviewed Nov 1, 2023

docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md

yury-gorbachev reviewed Nov 1, 2023

docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md

AlexKoff88 added 2 commits November 1, 2023 16:24

Fixed issues. Updated Gen.AI doc

7baf691

Applied comments

8394eb1

ngaloppo reviewed Nov 1, 2023

AlexKoff88 and others added 6 commits November 2, 2023 10:34

Added additional infor about GPTQ support

408e6c2

Fixed typos

bcaf449

Update docs/articles_en/openvino_workflow/gen_ai.md

f652c3b

Co-authored-by: Nico Galoppo <[email protected]>

Update docs/articles_en/openvino_workflow/gen_ai.md

2d4e1a4

Co-authored-by: Nico Galoppo <[email protected]>

Update docs/optimization_guide/nncf/code/weight_compression_openvino.py

86c1f3b

Co-authored-by: Nico Galoppo <[email protected]>

Applied changes

6a6d763

AlexKoff88 requested a review from tsavina November 2, 2023 09:07

tsavina reviewed Nov 2, 2023

ngaloppo reviewed Nov 2, 2023

AlexKoff88 and others added 6 commits November 3, 2023 10:49

Update docs/articles_en/openvino_workflow/gen_ai.md

5107f0f

Co-authored-by: Tatiana Savina <[email protected]>

Update docs/articles_en/openvino_workflow/gen_ai.md

c03b1c2

Co-authored-by: Tatiana Savina <[email protected]>

Update docs/articles_en/openvino_workflow/gen_ai.md

e0558ae

Co-authored-by: Tatiana Savina <[email protected]>

Update docs/articles_en/openvino_workflow/model_optimization_guide/we…

689ed1e

…ight_compression.md Co-authored-by: Tatiana Savina <[email protected]>

Update docs/articles_en/openvino_workflow/model_optimization_guide/we…

ee92c32

…ight_compression.md Co-authored-by: Tatiana Savina <[email protected]>

Update docs/articles_en/openvino_workflow/model_optimization_guide/we…

6779364

…ight_compression.md Co-authored-by: Tatiana Savina <[email protected]>

AlexKoff88 and others added 5 commits November 3, 2023 10:51

Update docs/articles_en/openvino_workflow/model_optimization_guide/we…

35e8e22

…ight_compression.md Co-authored-by: Tatiana Savina <[email protected]>

Update docs/articles_en/openvino_workflow/model_optimization_guide/we…

044fbf5

…ight_compression.md Co-authored-by: Tatiana Savina <[email protected]>

Added table with results

ce5c409

Merged with upstream

55a506b

One more comment

704d210

tsavina approved these changes Nov 7, 2023

AlexKoff88 merged commit 0f260c2 into openvinotoolkit:master Nov 8, 2023
11 checks passed

msmykx-intel mentioned this pull request Nov 8, 2023

[DOCS] Added INT4 weight compression description for 23.2 #20959

Merged
