From cd0c6d1f9bf3c90e0bc669490e752ef7079ccf7f Mon Sep 17 00:00:00 2001
From: dlyakhov
Date: Tue, 17 Dec 2024 14:22:01 +0100
Subject: [PATCH] Minor

---
 docs/articles_en/openvino-workflow/torch-compile.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/articles_en/openvino-workflow/torch-compile.rst b/docs/articles_en/openvino-workflow/torch-compile.rst
index e86fc4ac8c51d7..c625dffe25bcb0 100644
--- a/docs/articles_en/openvino-workflow/torch-compile.rst
+++ b/docs/articles_en/openvino-workflow/torch-compile.rst
@@ -329,12 +329,12 @@ Stable Diffusion WebUI is supported on Intel CPUs, Intel integrated GPUs, and In
 
 Model Quantization and Weights Compression
-######################################
+#############################################
 
 Model quantization and weights compression are effective methods for accelerating model inference and reducing memory consumption, with minimal impact on model accuracy. The ``torch.compile`` OpenVINO backend supports two key model optimization APIs:
 
-1. Neural Network Compression Framework(`NNCF <https://github.com/openvinotoolkit/nncf>`__). NNCF offers advanced algorithms for post-training quantization and weights compression in the OpenVINO toolkit.
+1. Neural Network Compression Framework (`NNCF <https://github.com/openvinotoolkit/nncf>`__). NNCF offers advanced algorithms for post-training quantization and weights compression in the OpenVINO toolkit.
 
 2. PyTorch 2 export quantization. A general-purpose API designed for quantizing models captured by ``torch.export``.
 
@@ -344,7 +344,7 @@
 NNCF is the recommended approach for model quantization and weights compression.
 
 NNCF Model Optimization Support (Preview)
 +++++++++++++++++++++++++++++++++++++++++++++
 
-The Neural Network Compression Framework(`NNCF <https://github.com/openvinotoolkit/nncf>`__) implements advanced quantization and weights compression algorithms, which can be applied to ``torch.fx.GraphModule`` to speed up inference
+The Neural Network Compression Framework (`NNCF <https://github.com/openvinotoolkit/nncf>`__) implements advanced quantization and weights compression algorithms, which can be applied to ``torch.fx.GraphModule`` to speed up inference
 and decrease memory consumption. Model quantization example:
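
The hunk ends at ``Model quantization example:``; the example code itself lies outside the patch context shown here. As a minimal sketch of what such an NNCF quantization flow for the ``torch.compile`` OpenVINO backend could look like (the model, calibration data, and input shapes below are illustrative assumptions, not lines from the patched file):

.. code-block:: python

    import nncf
    import torch
    import torchvision.models as models
    import openvino.torch  # registers the "openvino" backend for torch.compile

    model = models.resnet18().eval()

    # Illustrative calibration data: random tensors standing in for real images.
    calibration_data = [torch.randn(1, 3, 224, 224) for _ in range(10)]
    calibration_dataset = nncf.Dataset(calibration_data)

    # Capture the model as a torch.fx.GraphModule via torch.export,
    # then apply NNCF post-training quantization to it.
    exported_model = torch.export.export(model, args=(calibration_data[0],)).module()
    quantized_model = nncf.quantize(exported_model, calibration_dataset)

    # Compile the quantized model with the OpenVINO backend and run inference.
    compiled_model = torch.compile(quantized_model, backend="openvino")
    output = compiled_model(calibration_data[0])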