openvinotoolkit · KodiaqQ · Nov 19, 2024 · Nov 11, 2024
@@ -1,5 +1,64 @@
 # Release Notes
 
+## New in Release 2.14.0
+
+Post-training Quantization:
+
+- Breaking changes:
+  - ...
+- General:
+  - ...
+- Features:
+  - (OpenVINO) Extended the data-free and data-aware weight compression methods ([nncf.compress_weights()](docs/usage/post_training_compression/weights_compression/Usage.md#user-guide) API) with NF4 per-channel quantization, which makes compressed LLMs more accurate and faster on NPU.
+  - Introduced the optional `backup_mode` parameter in `nncf.compress_weights()` to specify the data type for embeddings, convolutions, and last linear layers during 4-bit weight compression. Available options are `INT8_ASYM` (default), `INT8_SYM`, and `NONE`, which retains the original floating-point precision of the model weights.
+  - Added preview support for optimizing models in [Torch FX](https://pytorch.org/docs/stable/fx.html) format with the `nncf.quantize()` and `nncf.compress_weights()` methods. After optimization, such models can be executed directly via [torch.compile()](https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html). See the [INT8 quantization example](https://github.com/openvinotoolkit/nncf/tree/develop/examples/post_training_quantization/torch_fx/resnet18) for more details.
+  - ...
+- Fixes:
+  - (OpenVINO) Fixed GPTQ weight compression method for Stable Diffusion models.
+  - (Torch, ONNX) Aligned the quantization setup for the scaled dot product attention pattern with OpenVINO.
+  - ...
+- Improvements:
+  - The `ultralytics` version has been updated to 8.3.22.
+  - Reduced peak memory by 30-50% for data-aware weight compression with the AWQ, SE, LoRA, and mixed-precision algorithms.
+  - Reduced compression time by 10-20% for weight compression with the AWQ algorithm.
+  - ...
+- Tutorials:
+  - [Post-Training Optimization of Llama-3.2-11B-Vision Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/mllama-3.2/mllama-3.2.ipynb)
+  - [Post-Training Optimization of YOLOv11 Model](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/yolov11-optimization/yolov11-object-detection.ipynb)
+  - [Post-Training Optimization of Whisper Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/whisper-asr-genai/whisper-asr-genai.ipynb)
+  - [Post-Training Optimization of Pixtral Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/pixtral/pixtral.ipynb)
+  - [Post-Training Optimization of LLM ReAct Agent Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-agent-react/llm-agent-react.ipynb)
+- Known issues:
+  - ...
+
+Compression-aware training:
+
+- Breaking changes:
+  - ...
+- General:
+  - ...
+- Features:
+  - ...
+- Fixes:
+  - ...
+- Improvements:
+  - ...
+- Tutorials:
+  - ...
+- Known issues:
+  - ...
+
+Deprecations/Removals:
+
+- The `nncf.torch.create_compressed_model()` function has been deprecated for the PyTorch backend.
+- Removed support for Python 3.8.
+- The `tensorflow_addons` package has been removed from the dependencies.
+- ...
+
+Requirements:
+
+- ...
+
 ## New in Release 2.13.0
 
 Post-training Quantization: