Merge branch 'itikhono/bug_fix/keep_const_precision_attr' of https://github.com/itikhono/openvino into itikhono/bug_fix/keep_const_precision_attr
itikhono committed Dec 16, 2024
2 parents b4b3513 + ad10166 commit 1fc379b
Showing 46 changed files with 1,055 additions and 573 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/job_python_api_tests.yml
@@ -101,10 +101,10 @@ jobs:
--junitxml=${INSTALL_TEST_DIR}/TEST-Pyngraph.xml \
--ignore=${INSTALL_TEST_DIR}/tests/pyopenvino/tests/test_utils/test_utils.py
-      - name: Python API Tests -- numpy>=2.0.0
+      - name: Python API Tests -- numpy<2.0.0
run: |
python3 -m pip uninstall -y numpy
-          python3 -m pip install "numpy~=2.0.0"
+          python3 -m pip install "numpy~=1.26.0"
python3 -m pip install -r ${INSTALL_TEST_DIR}/tests/bindings/python/requirements_test.txt
# for 'template' extension
export LD_LIBRARY_PATH=${INSTALL_TEST_DIR}/tests/:$LD_LIBRARY_PATH
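The `~=` pins swapped in this hunk follow PEP 440's "compatible release" rule: `numpy~=1.26.0` admits any 1.26.x patch release but rejects 1.27 and 2.x. A minimal pure-Python sketch of that rule (ignoring pre-releases, epochs, and wildcard forms; `compatible` is an illustrative helper, not a pip API):

```python
def compatible(spec: str, candidate: str) -> bool:
    """Return True if `candidate` satisfies "~=spec" (PEP 440 compatible release)."""
    base = [int(p) for p in spec.split(".")]
    cand = [int(p) for p in candidate.split(".")]
    # candidate must compare >= the spec version ...
    if cand < base:
        return False
    # ... and must share every component of spec except the last one
    return cand[: len(base) - 1] == base[: len(base) - 1]

print(compatible("1.26.0", "1.26.4"))  # True  -> allowed by numpy~=1.26.0
print(compatible("1.26.0", "2.0.0"))   # False -> a 2.x wheel is rejected
```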
@@ -83,7 +83,7 @@ For setting up a relevant configuration, refer to the
:doc:`Integrate with Customer Application <../../openvino-workflow/running-inference/integrate-openvino-with-your-application>`
topic (step 3 "Configure input and output").

-.. dropdown:: Device support across OpenVINO 2024.5 distributions
+.. dropdown:: Device support across OpenVINO 2024.6 distributions

=============== ========== ====== =============== ======== ============ ========== ========== ==========
Device Archives PyPI APT/YUM/ZYPPER Conda Homebrew vcpkg Conan npm
626 changes: 336 additions & 290 deletions docs/articles_en/about-openvino/release-notes-openvino.rst

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/articles_en/documentation/openvino-extensibility.rst
@@ -45,7 +45,7 @@ The first part is required for inference. The second part is required for succes
Definition of Operation Semantics
#################################

-If the custom operation can be mathematically represented as a combination of exiting OpenVINO operations and such decomposition gives desired performance, then low-level operation implementation is not required. Refer to the latest OpenVINO operation set, when deciding feasibility of such decomposition. You can use any valid combination of exiting operations. The next section of this document describes the way to map a custom operation.
+If the custom operation can be mathematically represented as a combination of existing OpenVINO operations and such decomposition gives desired performance, then low-level operation implementation is not required. Refer to the latest OpenVINO operation set, when deciding feasibility of such decomposition. You can use any valid combination of existing operations. The next section of this document describes the way to map a custom operation.
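As a toy illustration of such a decomposition (plain Python with `math`, not OpenVINO graph code): a hypothetical custom Gelu operation needs no new kernel, because it can be expressed entirely through operations an opset already provides, such as Erf, Multiply, and Add:

```python
import math

def gelu_via_existing_ops(x: float) -> float:
    # Gelu(x) = 0.5 * x * (1 + Erf(x / sqrt(2))) -- only "existing" ops are used
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

print(round(gelu_via_existing_ops(1.0), 4))  # 0.8413
```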

If such decomposition is not possible or appears too bulky with a large number of constituent operations that do not perform well, then a new class for the custom operation should be implemented, as described in the :doc:`Custom Operation Guide <openvino-extensibility/custom-openvino-operations>`.

@@ -4,12 +4,12 @@ OpenVINO™ GenAI Dependencies
OpenVINO™ GenAI depends on both `OpenVINO <https://github.com/openvinotoolkit/openvino>`__ and
`OpenVINO Tokenizers <https://github.com/openvinotoolkit/openvino_tokenizers>`__. During OpenVINO™
GenAI installation from PyPi, the same versions of OpenVINO and OpenVINO Tokenizers
-are used (e.g. ``openvino==2024.5.0`` and ``openvino-tokenizers==2024.5.0.0`` are installed for
-``openvino-genai==2024.5.0``).
+are used (e.g. ``openvino==2024.6.0`` and ``openvino-tokenizers==2024.6.0.0`` are installed for
+``openvino-genai==2024.6.0``).

-Trying to update any of the dependency packages might result in a version incompatiblibty
+Trying to update any of the dependency packages might result in a version incompatibility
due to different Application Binary Interfaces (ABIs), which will result in errors while running
-OpenVINO GenAI. Having package version in the ``<MAJOR>.<MINOR>.<PATCH>.<REVISION>`` format, allows
+OpenVINO GenAI. Having package version in the ``<MAJOR>.<MINOR>.<PATCH>.<REVISION>`` format, enables
changing the ``<REVISION>`` portion of the full version to ensure ABI compatibility. Changing
``<MAJOR>``, ``<MINOR>`` or ``<PATCH>`` part of the version may break ABI.
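The compatibility rule above amounts to comparing only the first three version components; a short sketch (an illustrative helper, not part of the OpenVINO API):

```python
def abi_compatible(a: str, b: str) -> bool:
    """Treat versions as <MAJOR>.<MINOR>.<PATCH>.<REVISION>; only the first
    three components must match for ABI compatibility."""
    return a.split(".")[:3] == b.split(".")[:3]

print(abi_compatible("2024.6.0.0", "2024.6.0.1"))  # True: only REVISION differs
print(abi_compatible("2024.6.0.0", "2024.5.0.0"))  # False: MINOR differs
```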

7 changes: 4 additions & 3 deletions docs/articles_en/get-started/install-openvino.rst
@@ -1,4 +1,4 @@
-Install OpenVINO™ 2024.5
+Install OpenVINO™ 2024.6
==========================


@@ -23,10 +23,11 @@ Install OpenVINO™ 2024.5
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<iframe id="selector" src="../_static/selector-tool/selector-2a63478.html" style="width: 100%; border: none" title="Download Intel® Distribution of OpenVINO™ Toolkit"></iframe>

-OpenVINO 2024.5, described here, is not a Long-Term-Support version!
+OpenVINO 2024.6, described here, is a Long-Term-Support version!
All currently supported versions are:

-* 2024.5 (development)
+* 2025.0 (in development)
+* 2024.6 (LTS)
* 2023.3 (LTS)


63 changes: 25 additions & 38 deletions docs/articles_en/learn-openvino/llm_inference_guide.rst
@@ -20,12 +20,12 @@ Generative AI workflow
Generative AI is a specific area of Deep Learning models used for producing new and “original”
data, based on input in the form of image, sound, or natural language text. Due to their
complexity and size, generative AI pipelines are more difficult to deploy and run efficiently.
OpenVINO simplifies the process and ensures high-performance integrations, with the following
OpenVINO simplifies the process and ensures high-performance integrations, with the following
options:

.. tab-set::

.. tab-item:: OpenVINO GenAI
.. tab-item:: OpenVINO GenAI

| - Suggested for production deployment for the supported use cases.
| - Smaller footprint and fewer dependencies.
@@ -39,6 +39,8 @@ options:
text generation loop, tokenization, and scheduling, offering ease of use and high
performance.

`Check out the OpenVINO GenAI Quick-start Guide [PDF] <https://docs.openvino.ai/nightly/_static/download/GenAI_Quick_Start_Guide.pdf>`__

.. tab-item:: Hugging Face integration

| - Suggested for prototyping and, if the use case is not covered by OpenVINO GenAI, production.
@@ -54,49 +56,34 @@ options:
as well as conversion on the fly. For integration with the final product it may offer
lower performance, though.

`Check out the GenAI Quick-start Guide [PDF] <https://docs.openvino.ai/2024/_static/download/GenAI_Quick_Start_Guide.pdf>`__

The advantages of using OpenVINO for LLM deployment:

.. dropdown:: Fewer dependencies and smaller footprint
:animate: fade-in-slide-down
:color: secondary

Less bloated than frameworks such as Hugging Face and PyTorch, with a smaller binary size and reduced
memory footprint, makes deployments easier and updates more manageable.

.. dropdown:: Compression and precision management
:animate: fade-in-slide-down
:color: secondary

Techniques such as 8-bit and 4-bit weight compression, including embedding layers, and storage
format reduction. This includes fp16 precision for non-compressed models and int8/int4 for
compressed models, like GPTQ models from `Hugging Face <https://huggingface.co/models>`__.

.. dropdown:: Enhanced inference capabilities
:animate: fade-in-slide-down
:color: secondary
The advantages of using OpenVINO for generative model deployment:

Advanced features like in-place KV-cache, dynamic quantization, KV-cache quantization and
encapsulation, dynamic beam size configuration, and speculative sampling, and more are
available.
| **Fewer dependencies and smaller footprint**
| Less bloated than frameworks such as Hugging Face and PyTorch, with a smaller binary size and reduced
memory footprint, makes deployments easier and updates more manageable.
.. dropdown:: Stateful model optimization
:animate: fade-in-slide-down
:color: secondary
| **Compression and precision management**
| Techniques such as 8-bit and 4-bit weight compression, including embedding layers, and storage
format reduction. This includes fp16 precision for non-compressed models and int8/int4 for
compressed models, like GPTQ models from `Hugging Face <https://huggingface.co/models>`__.
Models from the Hugging Face Transformers are converted into a stateful form, optimizing
inference performance and memory usage in long-running text generation tasks by managing past
KV-cache tensors more efficiently internally. This feature is automatically activated for
many supported models, while unsupported ones remain stateless. Learn more about the
:doc:`Stateful models and State API <../openvino-workflow/running-inference/stateful-models>`.
| **Enhanced inference capabilities**
| Advanced features like in-place KV-cache, dynamic quantization, KV-cache quantization and
encapsulation, dynamic beam size configuration, and speculative sampling, and more are
available.
.. dropdown:: Optimized LLM inference
:animate: fade-in-slide-down
:color: secondary
| **Stateful model optimization**
| Models from the Hugging Face Transformers are converted into a stateful form, optimizing
inference performance and memory usage in long-running text generation tasks by managing past
KV-cache tensors more efficiently internally. This feature is automatically activated for
many supported models, while unsupported ones remain stateless. Learn more about the
:doc:`Stateful models and State API <../openvino-workflow/running-inference/stateful-models>`.
Includes a Python API for rapid development and C++ for further optimization, offering
better performance than Python-based runtimes.
| **Optimized LLM inference**
| Includes a Python API for rapid development and C++ for further optimization, offering
better performance than Python-based runtimes.

Proceed to guides on:
Expand Down
@@ -28,6 +28,10 @@ make sure to :doc:`install OpenVINO with GenAI <../../get-started/install-openvi
.. dropdown:: Text-to-Image Generation

OpenVINO GenAI introduces the openvino_genai.Text2ImagePipeline for inference of text-to-image
models such as Stable Diffusion 1.5, 2.1, XL, LCM, Flex, and more.
See the following usage example for reference.

.. tab-set::

.. tab-item:: Python
@@ -579,8 +583,9 @@ compression is done by NNCF at the model export stage. The exported model contai
information necessary for execution, including the tokenizer/detokenizer and the generation
config, ensuring that its results match those generated by Hugging Face.

-The `LLMPipeline` is the main object used for decoding and handles all the necessary steps.
-You can construct it directly from the folder with the converted model.
+The `LLMPipeline` is the main object used to set up the model for text generation. You can provide the
+converted model to this object, specify the device for inference, and provide additional
+parameters.


.. tab-set::
@@ -911,7 +916,7 @@ running the following code:
GenAI API
#######################################

-The use case described here uses the following OpenVINO GenAI API methods:
+The use case described here uses the following OpenVINO GenAI API classes:

* generation_config - defines a configuration class for text generation,
enabling customization of the generation process such as the maximum length of
@@ -921,7 +926,6 @@ The use case described here uses the following OpenVINO GenAI API methods:
text generation, and managing outputs with configurable options.
* streamer_base - an abstract base class for creating streamers.
* tokenizer - the tokenizer class for text encoding and decoding.
* visibility - controls the visibility of the GenAI library.

Learn more from the `GenAI API reference <https://docs.openvino.ai/2024/api/genai_api/api.html>`__.

@@ -7,8 +7,8 @@ Generative Model Preparation



-Since generative AI models tend to be big and resource-heavy, it is advisable to store them
-locally and optimize for efficient inference. This article will show how to prepare
+Since generative AI models tend to be big and resource-heavy, it is advisable to
+optimize them for efficient inference. This article will show how to prepare
LLM models for inference with OpenVINO by:

* `Downloading Models from Hugging Face <#download-generative-models-from-hugging-face-hub>`__
2 changes: 1 addition & 1 deletion docs/dev/ov_dependencies.txt
@@ -1,6 +1,6 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
-#This file provides a comprehensive list of all dependencies of OpenVINO 2024.5
+#This file provides a comprehensive list of all dependencies of OpenVINO 2024.6
#The file is part of the automation pipeline for posting OpenVINO IR models on the HuggingFace Hub, including OneBOM dependency checks.


Binary file modified docs/sphinx_setup/_static/download/GenAI_Quick_Start_Guide.pdf
Binary file not shown.
10 changes: 5 additions & 5 deletions docs/sphinx_setup/index.rst
@@ -25,16 +25,16 @@ hardware and environments, on-premises and on-device, in the browser or in the c
<section class="splide" aria-label="Splide Banner Carousel">
<div class="splide__track">
<ul class="splide__list">
<li id="ov-homepage-slide2" class="splide__slide">
<p class="ov-homepage-slide-title">New GenAI API</p>
<p class="ov-homepage-slide-subtitle">Generative AI in only a few lines of code!</p>
<a class="ov-homepage-banner-btn" href="https://docs.openvino.ai/nightly/learn-openvino/llm_inference_guide/genai-guide.html">Check out our guide</a>
</li>
<li id="ov-homepage-slide1" class="splide__slide">
<p class="ov-homepage-slide-title">OpenVINO models on Hugging Face!</p>
<p class="ov-homepage-slide-subtitle">Get pre-optimized OpenVINO models, no need to convert!</p>
<a class="ov-homepage-banner-btn" href="https://huggingface.co/OpenVINO">Visit Hugging Face</a>
</li>
<li id="ov-homepage-slide2" class="splide__slide">
<p class="ov-homepage-slide-title">New Generative AI API</p>
<p class="ov-homepage-slide-subtitle">Generate text with LLMs in only a few lines of code!</p>
<a class="ov-homepage-banner-btn" href="https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/genai-guide.html">Check out our guide</a>
</li>
<li id="ov-homepage-slide3" class="splide__slide">
<p class="ov-homepage-slide-title">Improved model serving</p>
<p class="ov-homepage-slide-subtitle">OpenVINO Model Server has improved parallel inferencing!</p>
13 changes: 12 additions & 1 deletion src/bindings/js/node/package.json
@@ -51,6 +51,17 @@
"host": "https://storage.openvinotoolkit.org"
},
"keywords": [
-    "OpenVINO"
+    "OpenVINO",
+    "openvino",
+    "openvino-node",
+    "openvino npm",
+    "openvino binding",
+    "openvino node.js",
+    "openvino library",
+    "intel openvino",
+    "openvino toolkit",
+    "openvino API",
+    "openvino SDK",
+    "openvino integration"
]
}
2 changes: 1 addition & 1 deletion src/bindings/python/constraints.txt
@@ -1,5 +1,5 @@
# used in multiple components
-numpy>=1.16.6,<2.2.0 # Python bindings, frontends
+numpy>=1.16.6,<2.3.0 # Python bindings, frontends

# pytest
pytest>=5.0,<8.4
2 changes: 1 addition & 1 deletion src/bindings/python/requirements.txt
@@ -1,3 +1,3 @@
-numpy>=1.16.6,<2.2.0
+numpy>=1.16.6,<2.3.0
openvino-telemetry>=2023.2.1
packaging
15 changes: 8 additions & 7 deletions src/bindings/python/src/openvino/__init__.py
@@ -27,11 +27,11 @@
from openvino import properties as properties

# Import most important classes and functions from openvino.runtime
-from openvino.runtime import Model
-from openvino.runtime import Core
-from openvino.runtime import CompiledModel
-from openvino.runtime import InferRequest
-from openvino.runtime import AsyncInferQueue
+from openvino._ov_api import Model
+from openvino._ov_api import Core
+from openvino._ov_api import CompiledModel
+from openvino._ov_api import InferRequest
+from openvino._ov_api import AsyncInferQueue

from openvino.runtime import Symbol
from openvino.runtime import Dimension
@@ -43,12 +43,13 @@
from openvino.runtime import Tensor
from openvino.runtime import OVAny

-from openvino.runtime import compile_model
+# Helper functions for openvino module
+from openvino.runtime.utils.data_helpers import tensor_from_file
+from openvino._ov_api import compile_model
from openvino.runtime import get_batch
from openvino.runtime import set_batch
from openvino.runtime import serialize
from openvino.runtime import shutdown
-from openvino.runtime import tensor_from_file
from openvino.runtime import save_model
from openvino.runtime import layout_helpers

File renamed without changes.
2 changes: 1 addition & 1 deletion src/bindings/python/src/openvino/opset8/ops.py
@@ -7,7 +7,7 @@
from typing import List, Optional, Tuple

import numpy as np
-from openvino.runtime.exceptions import UserInputError
+from openvino.exceptions import UserInputError
from openvino.op import Constant, Parameter, if_op
from openvino.runtime import Node
from openvino.runtime.opset_utils import _get_node_factory
@@ -0,0 +1,7 @@
# -*- coding: utf-8 -*-
# Copyright (C) 2018-2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from openvino.exceptions import OVError
from openvino.exceptions import UserInputError
from openvino.exceptions import OVTypeError
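The new module above only re-exports names. As a hedged sketch of how such an exception hierarchy might be laid out (the class bodies and base classes here are illustrative assumptions, not the actual OpenVINO definitions):

```python
class OVError(Exception):
    """Base error for the bindings (illustrative assumption)."""

class UserInputError(OVError):
    """Invalid user-provided arguments (illustrative assumption)."""

class OVTypeError(OVError, TypeError):
    """Type mismatch; catchable as either OVError or TypeError (illustrative)."""

# Callers can catch the whole family through the base class:
try:
    raise UserInputError("bad shape")
except OVError as err:
    print(type(err).__name__, err)  # UserInputError bad shape
```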
12 changes: 12 additions & 0 deletions src/bindings/python/src/openvino/runtime/ie_api/__init__.py
@@ -0,0 +1,12 @@
# -*- coding: utf-8 -*-
# Copyright (C) 2018-2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from openvino._ov_api import Core
from openvino._ov_api import CompiledModel
from openvino._ov_api import InferRequest
from openvino._ov_api import Model
from openvino._ov_api import AsyncInferQueue

from openvino._ov_api import tensor_from_file
from openvino._ov_api import compile_model
@@ -191,7 +191,7 @@ static void regclass_graph_PreProcessSteps(py::module m) {
:param pads_end: Number of elements matches the number of indices in data attribute. Specifies the number of padding elements at the ending of each axis.
:type pads_end: 1D tensor of type T_INT.
:param value: All new elements are populated with this value or with 0 if input not provided. Shouldn’t be set for other pad_mode values.
:type value: scalar tensor of type T.
:type value: scalar tensor of type T.
:param mode: pad_mode specifies the method used to generate new element values.
:type mode: string
:return: Reference to itself, allows chaining of calls in client's code in a builder-like manner.
@@ -219,7 +219,7 @@ static void regclass_graph_PreProcessSteps(py::module m) {
:param pads_end: Number of elements matches the number of indices in data attribute. Specifies the number of padding elements at the ending of each axis.
:type pads_end: 1D tensor of type T_INT.
:param value: All new elements are populated with this value or with 0 if input not provided. Shouldn’t be set for other pad_mode values.
:type value: scalar tensor of type T.
:type value: scalar tensor of type T.
:param mode: pad_mode specifies the method used to generate new element values.
:type mode: string
:return: Reference to itself, allows chaining of calls in client's code in a builder-like manner.
@@ -308,7 +308,8 @@ static void regclass_graph_InputTensorInfo(py::module m) {
},
py::arg("layout"),
R"(
Set layout for input tensor info
Set layout for input tensor info
:param layout: layout to be set
:type layout: Union[str, openvino.runtime.Layout]
)");
@@ -422,7 +423,8 @@ static void regclass_graph_OutputTensorInfo(py::module m) {
},
py::arg("layout"),
R"(
Set layout for output tensor info
Set layout for output tensor info
:param layout: layout to be set
:type layout: Union[str, openvino.runtime.Layout]
)");
@@ -475,7 +477,8 @@ static void regclass_graph_OutputModelInfo(py::module m) {
},
py::arg("layout"),
R"(
Set layout for output model info
Set layout for output model info
:param layout: layout to be set
:type layout: Union[str, openvino.runtime.Layout]
)");
3 changes: 2 additions & 1 deletion src/bindings/python/tests/test_runtime/test_input_node.py
@@ -75,7 +75,8 @@ def test_input_get_source_output(device):
net_input = compiled_model.output(0)
input_node = net_input.get_node().inputs()[0]
name = input_node.get_source_output().get_node().get_friendly_name()
-assert name == "relu"
+# The expected ReLU node name can change if conversion precision is applied (a new Convert node is added)
+assert name in ("relu", "relu.0")


def test_input_get_tensor(device):
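The relaxed assertion in this hunk tolerates an index suffix on the friendly name. An equivalent, slightly more general pattern (a sketch; `base_name` is a hypothetical helper, not part of the test suite) strips a trailing `.<N>` before comparing:

```python
import re

def base_name(friendly_name: str) -> str:
    """Strip a trailing '.<index>' that graph transformations may append
    (e.g. 'relu.0' -> 'relu')."""
    return re.sub(r"\.\d+$", "", friendly_name)

print(base_name("relu"))    # relu
print(base_name("relu.0"))  # relu
```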

0 comments on commit 1fc379b
