
Commit: Minor
daniil-lyakhov committed Dec 17, 2024
1 parent 664f1bb commit cd0c6d1
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions docs/articles_en/openvino-workflow/torch-compile.rst
@@ -329,12 +329,12 @@ Stable Diffusion WebUI is supported on Intel CPUs, Intel integrated GPUs, and In


Model Quantization and Weights Compression
- ######################################
+ #############################################

Model quantization and weights compression are effective methods for accelerating model inference and reducing memory consumption, with minimal impact on model accuracy.
The ``torch.compile`` OpenVINO backend supports two key model optimization APIs:

- 1. Neural Network Compression Framework(`NNCF <https://docs.openvino.ai/2024/openvino-workflow/model-optimization.html>`__). NNCF offers advanced algorithms for post-training quantization and weights compression in the OpenVINO toolkit.
+ 1. Neural Network Compression Framework (`NNCF <https://docs.openvino.ai/2024/openvino-workflow/model-optimization.html>`__). NNCF offers advanced algorithms for post-training quantization and weights compression in the OpenVINO toolkit.

2. PyTorch 2 export quantization. A general-purpose API designed for quantizing models captured by ``torch.export``.

@@ -344,7 +344,7 @@ NNCF is the recommended approach for model quantization and weights compression.
NNCF Model Optimization Support (Preview)
+++++++++++++++++++++++++++++++++++++++++++++

- The Neural Network Compression Framework(`NNCF <https://docs.openvino.ai/2024/openvino-workflow/model-optimization.html>`__) implements advanced quantization and weights compression algorithms, which can be applied to ``torch.fx.GraphModule`` to speed up inference
+ The Neural Network Compression Framework (`NNCF <https://docs.openvino.ai/2024/openvino-workflow/model-optimization.html>`__) implements advanced quantization and weights compression algorithms, which can be applied to ``torch.fx.GraphModule`` to speed up inference
and decrease memory consumption.

Model quantization example:

