Merge branch 'releases/2024/3' into cmakelist-update-24-3
msmykx-intel authored Jul 30, 2024
2 parents d1c7002 + dca5ddc commit 4d42518
Showing 3 changed files with 30 additions and 27 deletions.
36 changes: 17 additions & 19 deletions docs/articles_en/about-openvino/release-notes-openvino.rst
@@ -1,9 +1,10 @@
-OpenVINO Release Notes
-=============================

.. meta::
:description: See what has changed in OpenVINO with the latest release, as well as all
previous releases in this year's cycle.

+OpenVINO Release Notes
+=============================

.. toctree::
:maxdepth: 1
@@ -14,7 +15,7 @@ OpenVINO Release Notes



-2024.3 - 30 July 2024
+2024.3 - 31 July 2024
#############################

:doc:`System Requirements <./release-notes-openvino/system-requirements>` | :doc:`Release policy <./release-notes-openvino/release-policy>` | :doc:`Installation Guides <./../get-started/install-openvino>`
@@ -23,21 +24,21 @@ OpenVINO Release Notes
What's new
+++++++++++++++++++++++++++++

-More Gen AI coverage and framework integrations to minimize code changes.
+* More Gen AI coverage and framework integrations to minimize code changes.

-* OpenVINO pre-optimized models are now available in Hugging Face making it easier for developers
-  to get started with these models.
+  * OpenVINO pre-optimized models are now available in Hugging Face making it easier for developers
+    to get started with these models.

-Broader Large Language Model (LLM) support and more model compression techniques.
+* Broader Large Language Model (LLM) support and more model compression techniques.

-* Significant improvement in LLM performance on Intel built-in and discrete GPUs with the addition
-  of dynamic quantization, Multi-Head Attention (MHA), and OneDNN enhancements.
+  * Significant improvement in LLM performance on Intel discrete GPUs with the addition of
+    Multi-Head Attention (MHA) and OneDNN enhancements.

-More portability and performance to run AI at the edge, in the cloud, or locally.
+* More portability and performance to run AI at the edge, in the cloud, or locally.

-* Improved CPU performance when serving LLMs with the inclusion of vLLM and continuous batching
-  in the OpenVINO Model Server (OVMS). vLLM is an easy-to-use open-source library that supports
-  efficient LLM inferencing and model serving.
+  * Improved CPU performance when serving LLMs with the inclusion of vLLM and continuous batching
+    in the OpenVINO Model Server (OVMS). vLLM is an easy-to-use open-source library that supports
+    efficient LLM inferencing and model serving.
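
As a rough sketch of the Hugging Face item above: the pre-optimized models can be pulled straight into Python via optimum-intel. The model ID below is a placeholder, not one named in these notes; browse https://huggingface.co/OpenVINO for real ones.

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

# Placeholder model ID; substitute an actual model from the OpenVINO org.
model_id = "OpenVINO/<model-name>-int4-ov"
model = OVModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=50)[0]))
```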

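And for the OVMS item, a minimal sketch of querying a served LLM, assuming a locally running Model Server with its OpenAI-compatible chat endpoint enabled; the host, port, endpoint path, and model name are all assumptions, not values from this commit.

```python
import requests

# Assumed local OVMS deployment; adjust host, port, and served-model name.
resp = requests.post(
    "http://localhost:8000/v3/chat/completions",
    json={
        "model": "my-llm",  # placeholder served-model name
        "messages": [{"role": "user", "content": "What is OpenVINO?"}],
        "max_tokens": 64,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```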


@@ -59,7 +60,7 @@ Common

* Increasing support for models like YoloV10 or PixArt-XL-2, thanks to enabling Squeeze and
Concat layers.
-* Performance of precision conversion fp16/bf16 -> fp32.
+* Performance of precision conversion FP16/BF16 -> FP32.



@@ -97,9 +98,6 @@ GPU Device Plugin

* LLMs and Stable Diffusion on discrete GPUs, due to latency decrease, through optimizations
such as Multi-Head Attention (MHA) and oneDNN improvements.
-* First token latency of LLMs for large input cases on Core Ultra integrated GPU. It can be
-  further improved with dynamic quantization enabled with an application
-  `interface <https://docs.openvino.ai/2024/api/c_cpp_api/group__ov__dev__exec__model.html#_CPPv4N2ov4hint31dynamic_quantization_group_sizeE>`__.
* Whisper models on discrete GPU.


@@ -191,7 +189,7 @@ Neural Network Compression Framework
Act->MatMul and Act->Multiply->MatMul to cover the Phi family models.
* The representation of symmetrically quantized weights has been updated to a signed data type
with no zero point. This allows NPU to support compressed LLMs with the symmetric mode.
-* bf16 models in Post-Training Quantization are now supported; nncf.quantize().
+* BF16 models in Post-Training Quantization are now supported; nncf.quantize().
* `Activation Sparsity <https://arxiv.org/abs/2310.17157>`__ (Contextual Sparsity) algorithm in
the Weight Compression method is now supported (preview), speeding up LLM inference.
The algorithm is enabled by setting the ``target_sparsity_by_scope`` option in
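
A minimal sketch of the BF16 post-training quantization path mentioned above, assuming an OpenVINO IR with BF16 weights; the model path, input shape, and random calibration data are placeholders.

```python
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("model_bf16.xml")  # placeholder path to a BF16 IR

# Any iterable can back the calibration set; random tensors stand in here.
samples = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(300)]
calibration_dataset = nncf.Dataset(samples)

quantized = nncf.quantize(model, calibration_dataset)
ov.save_model(quantized, "model_int8.xml")
```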
@@ -431,7 +429,7 @@ Previous 2024 releases
compression of LLMs. Enabled by ``gptq=True`` in nncf.compress_weights().
* Scale Estimation algorithm for more accurate 4-bit compressed LLMs. Enabled by
``scale_estimation=True`` in nncf.compress_weights().
-* Added support for models with bf16 weights in nncf.compress_weights().
+* Added support for models with BF16 weights in nncf.compress_weights().
* nncf.quantize() method is now the recommended path for quantization initialization of
PyTorch models in Quantization-Aware Training. See example for more details.
* compressed_model.nncf.get_config() and nncf.torch.load_from_config() API have been added to
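
For the 2024.2 items above, a hedged sketch of the nncf.compress_weights() switches they describe; the compression mode, calibration sample, and file paths are illustrative assumptions.

```python
import numpy as np
import nncf
import openvino as ov

model = ov.Core().read_model("llm.xml")  # placeholder path to an LLM IR

# GPTQ (and Scale Estimation) are data-aware, so a calibration set is needed;
# the input name and token values below are placeholders.
samples = [{"input_ids": np.ones((1, 16), dtype=np.int64)}]

compressed = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_SYM,
    gptq=True,  # or scale_estimation=True for the Scale Estimation algorithm
    dataset=nncf.Dataset(samples),
)
ov.save_model(compressed, "llm_int4.xml")
```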
2 changes: 1 addition & 1 deletion docs/sphinx_setup/_static/js/graphs.js
@@ -65,7 +65,7 @@ class Filter {

// param: GraphData[], ieType
static FilterByIeType(graphDataArr, value) {
-return graphDataArr.filter((data) => data.ieType.includes(value));
+return graphDataArr.filter((data) => data.ieType && data.ieType.includes(value));
}

// param: GraphData[], clientPlatforms[]
19 changes: 12 additions & 7 deletions docs/sphinx_setup/index.rst
@@ -5,11 +5,11 @@ OpenVINO 2024.3
.. meta::
:google-site-verification: _YqumYQ98cmXUTwtzM_0WIIadtDc6r_TMYGbmGgNvrk

-**OpenVINO is an open-source toolkit** for optimizing and deploying deep learning models from cloud
-to edge. It accelerates deep learning inference across various use cases, such as generative AI, video,
-audio, and language with models from popular frameworks like PyTorch, TensorFlow, ONNX, and more.
-Convert and optimize models, and deploy across a mix of Intel® hardware and environments, on-premises
-and on-device, in the browser or in the cloud.
+**OpenVINO is an open-source toolkit** for optimizing and deploying deep learning models from
+cloud to edge. It accelerates deep learning inference across various use cases, such as
+generative AI, video, audio, and language with models from popular frameworks like PyTorch,
+TensorFlow, ONNX, and more. Convert and optimize models, and deploy across a mix of Intel®
+hardware and environments, on-premises and on-device, in the browser or in the cloud.
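
A minimal sketch of that convert-and-deploy flow, with placeholder model path, device, and input shape.

```python
import numpy as np
import openvino as ov

core = ov.Core()
model = core.read_model("model.onnx")        # placeholder; IR, ONNX, and more are accepted
compiled = core.compile_model(model, "CPU")  # or "GPU", "NPU", "AUTO", ...

# Placeholder input; match your model's actual input shape and dtype.
result = compiled(np.random.rand(1, 3, 224, 224).astype(np.float32))
```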

Check out the `OpenVINO Cheat Sheet. <https://docs.openvino.ai/2024/_static/download/OpenVINO_Quick_Start_Guide.pdf>`__

@@ -26,16 +26,21 @@ Check out the `OpenVINO Cheat Sheet. <https://docs.openvino.ai/2024/_static/download/OpenVINO_Quick_Start_Guide.pdf>`__
<div class="splide__track">
<ul class="splide__list">
<li id="ov-homepage-slide1" class="splide__slide">
+<p class="ov-homepage-slide-title">OpenVINO models on Hugging Face!</p>
+<p class="ov-homepage-slide-subtitle">Get pre-optimized OpenVINO models, no need to convert!</p>
+<a class="ov-homepage-banner-btn" href="https://huggingface.co/OpenVINO">Visit Hugging Face</a>
+</li>
+<li id="ov-homepage-slide2" class="splide__slide">
<p class="ov-homepage-slide-title">New Generative AI API</p>
<p class="ov-homepage-slide-subtitle">Generate text with LLMs in only a few lines of code!</p>
<a class="ov-homepage-banner-btn" href="https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/genai-guide.html">Check out our guide</a>
</li>
-<li id="ov-homepage-slide2" class="splide__slide">
+<li id="ov-homepage-slide3" class="splide__slide">
<p class="ov-homepage-slide-title">Improved model serving</p>
<p class="ov-homepage-slide-subtitle">OpenVINO Model Server has improved parallel inferencing!</p>
<a class="ov-homepage-banner-btn" href="https://docs.openvino.ai/2024/ovms_what_is_openvino_model_server.html">Learn more</a>
</li>
-<li id="ov-homepage-slide3" class="splide__slide">
+<li id="ov-homepage-slide4" class="splide__slide">
<p class="ov-homepage-slide-title">OpenVINO via PyTorch 2.0 torch.compile()</p>
<p class="ov-homepage-slide-subtitle">Use OpenVINO directly in PyTorch-native applications!</p>
<a class="ov-homepage-banner-btn" href="https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html">Learn more</a>
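To ground the "New Generative AI API" slide above, a hedged sketch of generating text with the GenAI API; the model directory is a placeholder for any OpenVINO-converted LLM.

```python
import openvino_genai

# Placeholder path to a directory containing an OpenVINO-converted LLM.
pipe = openvino_genai.LLMPipeline("./llm-model-dir", "CPU")
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
```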

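And for the torch.compile() slide, a sketch of routing a PyTorch model through the OpenVINO backend; the torchvision model is an illustrative stand-in.

```python
import torch
import torchvision.models as models

model = models.resnet50(weights=None).eval()

# Select OpenVINO as the torch.compile backend; needs the openvino package installed.
compiled = torch.compile(model, backend="openvino")

with torch.no_grad():
    print(compiled(torch.randn(1, 3, 224, 224)).shape)
```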