diff --git a/docs/ModelZoo.md b/docs/ModelZoo.md
index 381784d2cbe..cf5194c4830 100644
--- a/docs/ModelZoo.md
+++ b/docs/ModelZoo.md
@@ -2,7 +2,7 @@
 
 Here we present the results achieved using our sample scripts, example patches to third-party repositories and NNCF configuration files.
 
-The applied quantization compression algorithms are divided into two broad categories: Quantization-Aware Training ([QAT](../README.md#training-time-compression)) and Post-Training Quantization ([PTQ](../README.md#post-training-quantization)). Here we mainly report the QAT results and the PTQ results may be found on an OpenVino Performance Benchmarks [page](https://docs.openvino.ai/latest/openvino_docs_performance_benchmarks.html).
+The applied quantization compression algorithms are divided into two broad categories: Quantization-Aware Training ([QAT](../README.md#training-time-compression)) and Post-Training Quantization ([PTQ](../README.md#post-training-quantization)). Here we mainly report the QAT results; the PTQ results may be found on the OpenVINO Performance Benchmarks [page](https://docs.openvino.ai/2024/about-openvino/performance-benchmarks.html).
 
 - [PyTorch](#pytorch)
   - [Classification](#pytorch-classification)
diff --git a/nncf/experimental/torch/sparsity/movement/MovementSparsity.md b/nncf/experimental/torch/sparsity/movement/MovementSparsity.md
index bd0eac7a53f..a53bec622c6 100644
--- a/nncf/experimental/torch/sparsity/movement/MovementSparsity.md
+++ b/nncf/experimental/torch/sparsity/movement/MovementSparsity.md
@@ -41,7 +41,7 @@ This diagram is the sparsity level of BERT-base model over the optimization life
 
 ## Inference Acceleration via [OpenVINO](https://docs.openvino.ai/latest/index.html)
 
-Optimized models are compatible with OpenVINO toolchain. Use `compression_controller.export_model("movement_sparsified_model.onnx")` to export model in onnx format. Sparsified parameters in the onnx are in value of zero. Structured sparse structures can be discarded during ONNX translation to OpenVINO IR using [Model Optimizer](https://docs.openvino.ai/latest/openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html) with additional option `--transform=Pruning`. Corresponding IR is compressed and deployable with [OpenVINO Runtime](https://docs.openvino.ai/latest/openvino_docs_OV_UG_OV_Runtime_User_Guide.html). To quantify inference performance improvement, both ONNX and IR can be profiled using [Benchmark Tool](https://docs.openvino.ai/latest/openvino_inference_engine_tools_benchmark_tool_README.html).
+Optimized models are compatible with the OpenVINO toolchain. Use `compression_controller.export_model("movement_sparsified_model.onnx")` to export the model in ONNX format. Sparsified parameters in the ONNX model have a value of zero. Structured sparse structures can be discarded during ONNX translation to OpenVINO IR using [Model Conversion](https://docs.openvino.ai/2024/openvino-workflow/model-preparation/convert-model-to-ir.html) with the [pruning transformation](https://docs.openvino.ai/2024/documentation/legacy-features/transition-legacy-conversion-api.html#transform). The resulting IR is compressed and deployable with [OpenVINO Runtime](https://docs.openvino.ai/latest/openvino_docs_OV_UG_OV_Runtime_User_Guide.html). To quantify the inference performance improvement, both the ONNX model and the IR can be profiled using the [Benchmark Tool](https://docs.openvino.ai/latest/openvino_inference_engine_tools_benchmark_tool_README.html).
 
 ## Getting Started
 
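A minimal sketch of the export → convert → profile flow the MovementSparsity.md paragraph describes, assuming the legacy `mo` CLI (with the `--transform=Pruning` option from the previous revision of the text) and OpenVINO's `benchmark_app`; file names are illustrative:

```bash
# In the NNCF training script (Python), export the sparsified model to ONNX:
#   compression_controller.export_model("movement_sparsified_model.onnx")

# Convert the ONNX model to OpenVINO IR; the pruning transformation discards
# the structured sparse structures so the resulting IR is genuinely smaller.
mo --input_model movement_sparsified_model.onnx --transform=Pruning

# Profile both the ONNX model and the compressed IR to quantify the speedup.
benchmark_app -m movement_sparsified_model.onnx
benchmark_app -m movement_sparsified_model.xml
```

The conversion step produces `movement_sparsified_model.xml` and `.bin`; the `.xml` is the IR profiled in the last command.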