diff --git a/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-convert-models-as-python-objects.rst b/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-convert-models-as-python-objects.rst
index 7749ef4f5fe10d..212aea1cf5790f 100644
--- a/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-convert-models-as-python-objects.rst
+++ b/docs/articles_en/documentation/legacy-features/transition-legacy-conversion-api/legacy-conversion-api/[legacy]-convert-models-as-python-objects.rst
@@ -7,7 +7,7 @@
    The code described here has been **deprecated!** Do not use it to avoid working with a legacy solution. It will be kept for some time to ensure backwards compatibility, but **you should not use** it in contemporary applications.
-   This guide describes a deprecated conversion method. The guide on the new and recommended method can be found in the :doc:`Model Preparation <../../../../openvino-workflow/model-preparation>` article.
+   This guide describes a deprecated conversion method. The guide on the new and recommended method can be found in the :doc:`Model Preparation <../../../../openvino-workflow/model-preparation>` article.
 Model conversion API is represented by ``convert_model()`` method in openvino.tools.mo namespace.
 ``convert_model()`` is compatible with types from openvino.runtime, like PartialShape, Layout, Type, etc.
@@ -32,8 +32,8 @@ Example of converting a PyTorch model directly from memory:
 The following types are supported as an input model for ``convert_model()``:
-* PyTorch - ``torch.nn.Module``, ``torch.jit.ScriptModule``, ``torch.jit.ScriptFunction``. Refer to the :doc:`Converting a PyTorch Model<[legacy]-supported-model-formats/[legacy]-convert-pytorch>` article for more details.
-* TensorFlow / TensorFlow 2 / Keras - ``tf.keras.Model``, ``tf.keras.layers.Layer``, ``tf.compat.v1.Graph``, ``tf.compat.v1.GraphDef``, ``tf.Module``, ``tf.function``, ``tf.compat.v1.session``, ``tf.train.checkpoint``. Refer to the :doc:`Converting a TensorFlow Model<[legacy]-supported-model-formats/[legacy]-convert-tensorflow>` article for more details.
+* PyTorch - ``torch.nn.Module``, ``torch.jit.ScriptModule``, ``torch.jit.ScriptFunction``. Refer to the :doc:`Converting a PyTorch Model <[legacy]-supported-model-formats/[legacy]-convert-pytorch>` article for more details.
+* TensorFlow / TensorFlow 2 / Keras - ``tf.keras.Model``, ``tf.keras.layers.Layer``, ``tf.compat.v1.Graph``, ``tf.compat.v1.GraphDef``, ``tf.Module``, ``tf.function``, ``tf.compat.v1.session``, ``tf.train.checkpoint``. Refer to the :doc:`Converting a TensorFlow Model <[legacy]-supported-model-formats/[legacy]-convert-tensorflow>` article for more details.
 ``convert_model()`` accepts all parameters available in the MO command-line tool. Parameters can be specified by Python classes or string analogs, similar to the command-line tool.
@@ -64,7 +64,7 @@ Example of using a tuple in the ``input`` parameter to cut a model:
     ov_model = convert_model(model, input=("input_name", [3], np.float32))
-For complex cases, when a value needs to be set in the ``input`` parameter, the ``InputCutInfo`` class can be used. ``InputCutInfo`` accepts four parameters: ``name``, ``shape``, ``type``, and ``value``.
+For complex cases, when a value needs to be set in the ``input`` parameter, the ``InputCutInfo`` class can be used. ``InputCutInfo`` accepts four parameters: ``name``, ``shape``, ``type``, and ``value``.
 ``InputCutInfo("input_name", [3], np.float32, [0.5, 2.1, 3.4])`` is equivalent of ``InputCutInfo(name="input_name", shape=[3], type=np.float32, value=[0.5, 2.1, 3.4])``.
@@ -85,11 +85,11 @@ Example of using ``InputCutInfo`` to freeze an input with value:
     ov_model = convert_model(model, input=InputCutInfo("input_name", [3], np.float32, [0.5, 2.1, 3.4]))
 To set parameters for models with multiple inputs, use ``list`` of parameters.
-Parameters supporting ``list``: 
+Parameters supporting ``list``:
 * input
 * input_shape
-* layout 
+* layout
 * source_layout
 * dest_layout
 * mean_values
diff --git a/docs/articles_en/documentation/openvino-extensibility/openvino-plugin-library/advanced-guides/low-precision-transformations/step3-main.rst b/docs/articles_en/documentation/openvino-extensibility/openvino-plugin-library/advanced-guides/low-precision-transformations/step3-main.rst
index cf4961502f10e8..66c46124e1c1a2 100644
--- a/docs/articles_en/documentation/openvino-extensibility/openvino-plugin-library/advanced-guides/low-precision-transformations/step3-main.rst
+++ b/docs/articles_en/documentation/openvino-extensibility/openvino-plugin-library/advanced-guides/low-precision-transformations/step3-main.rst
@@ -69,7 +69,7 @@ Main transformations are the majority of low precision transformations. Transfor
 * :doc:`MultiplyPartialTransformation `
 * :doc:`MVNTransformation `
 * :doc:`NormalizeL2Transformation `
-* :doc:`PadTransformation`
+* :doc:`PadTransformation `
 * :doc:`PReluTransformation `
 * :doc:`ReduceMaxTransformation `
 * :doc:`ReduceMeanTransformation `
diff --git a/docs/articles_en/openvino-workflow/running-inference/inference-devices-and-modes/gpu-device.rst b/docs/articles_en/openvino-workflow/running-inference/inference-devices-and-modes/gpu-device.rst
index 126051473c79b6..023b2d1f189b4e 100644
--- a/docs/articles_en/openvino-workflow/running-inference/inference-devices-and-modes/gpu-device.rst
+++ b/docs/articles_en/openvino-workflow/running-inference/inference-devices-and-modes/gpu-device.rst
@@ -213,7 +213,7 @@ Alternatively, it can be enabled explicitly via the device notion, for example `
    :fragment: compile_model_auto_batch
-For more details, see the :doc:`Automatic batching`.
+For more details, see the :doc:`Automatic batching `.
 Multi-stream Execution
 +++++++++++++++++++++++++++++++++++++++
@@ -230,7 +230,7 @@ which means that the incoming infer requests can be processed simultaneously.
 When multiple inferences of the same model need to be executed in parallel, the multi-stream feature is preferred to multiple instances of the model or application. The reason for this is that the implementation of streams in the GPU plugin supports weight memory sharing across streams, thus, memory consumption may be lower, compared to the other approaches.
-For more details, see the :doc:`optimization guide<../optimize-inference>`.
+For more details, see the :doc:`optimization guide <../optimize-inference>`.
 Dynamic Shapes
 +++++++++++++++++++++++++++++++++++++++
@@ -365,9 +365,9 @@ The GPU plugin has the following additional preprocessing options:
 With such preprocessing, GPU plugin will expect ``ov::intel_gpu::ocl::ClImage2DTensor`` (or derived) to be passed for each NV12 plane via ``ov::InferRequest::set_tensor()`` or ``ov::InferRequest::set_tensors()`` methods.
-For usage examples, refer to the :doc:`RemoteTensor API`.
+For usage examples, refer to the :doc:`RemoteTensor API `.
-For more details, see the :doc:`preprocessing API<../optimize-inference/optimize-preprocessing>`. +For more details, see the :doc:`preprocessing API <../optimize-inference/optimize-preprocessing>`. Model Caching +++++++++++++++++++++++++++++++++++++++ @@ -465,7 +465,7 @@ GPU Performance Checklist: Summary Since OpenVINO relies on the OpenCL kernels for the GPU implementation, many general OpenCL tips apply: -- Prefer ``FP16`` inference precision over ``FP32``, as Model Conversion API can generate both variants, and the ``FP32`` is the default. To learn about optimization options, see :doc:`Optimization Guide<../../model-optimization>`. +- Prefer ``FP16`` inference precision over ``FP32``, as Model Conversion API can generate both variants, and the ``FP32`` is the default. To learn about optimization options, see :doc:`Optimization Guide <../../model-optimization>`. - Try to group individual infer jobs by using :doc:`automatic batching `. - Consider :doc:`caching <../optimize-inference/optimizing-latency/model-caching-overview>` to minimize model load time. - If your application performs inference on the CPU alongside the GPU, or otherwise loads the host heavily, make sure that the OpenCL driver threads do not starve. :doc:`CPU configuration options ` can be used to limit the number of inference threads for the CPU plugin. diff --git a/docs/articles_en/openvino-workflow/running-inference/stateful-models.rst b/docs/articles_en/openvino-workflow/running-inference/stateful-models.rst index 11e7cbabbd9431..acaa118c06c77f 100644 --- a/docs/articles_en/openvino-workflow/running-inference/stateful-models.rst +++ b/docs/articles_en/openvino-workflow/running-inference/stateful-models.rst @@ -68,15 +68,15 @@ from the application code to OpenVINO and all related internal work is hidden fr There are three methods of turning an OpenVINO model into a stateful one: -* :doc:`Optimum-Intel<../generative-ai-models-guide>` - the most user-friendly option. All necessary optimizations +* :doc:`Optimum-Intel <../generative-ai-models-guide>` - the most user-friendly option. All necessary optimizations are recognized and applied automatically. The drawback is, the tool does not work with all models. -* :ref:`MakeStateful transformation.` - enables the user to choose which +* :ref:`MakeStateful transformation ` - enables the user to choose which pairs of Parameter and Result to replace, as long as the paired operations are of the same shape and element type. -* :ref:`LowLatency2 transformation.` - automatically detects and replaces +* :ref:`LowLatency2 transformation ` - automatically detects and replaces Parameter and Result pairs connected to hidden and cell state inputs of LSTM/RNN/GRU operations or Loop/TensorIterator operations.
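For reference, below is a minimal sketch (not part of the patch above) of the deprecated conversion flow that the first patched article describes, assuming the legacy ``openvino.tools.mo`` tooling and PyTorch are installed; the ``AddOne`` module, the ``example_input`` value, and the CPU device choice are illustrative assumptions only:

.. code-block:: py

   import numpy as np
   import torch
   from openvino.runtime import Core
   from openvino.tools.mo import convert_model

   # Toy in-memory PyTorch module, used only for illustration.
   class AddOne(torch.nn.Module):
       def forward(self, x):
           return x + 1.0

   # Convert the module directly from memory with the deprecated API;
   # example_input gives the tracer a concrete input shape and type.
   ov_model = convert_model(AddOne(), example_input=torch.zeros(3, dtype=torch.float32))

   # The resulting openvino.runtime.Model compiles and runs as usual.
   compiled = Core().compile_model(ov_model, "CPU")
   result = compiled([np.array([1.0, 2.0, 3.0], dtype=np.float32)])

The tuple and ``InputCutInfo`` forms of the ``input`` parameter shown in the patched article are passed to the same ``convert_model()`` call.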
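Likewise, a hedged sketch of the multi-stream/throughput behavior discussed in the ``gpu-device.rst`` hunks; the ``model.xml`` path and the availability of a GPU device are assumptions made only for this example:

.. code-block:: py

   from openvino.runtime import Core

   core = Core()
   model = core.read_model("model.xml")  # placeholder path, not taken from the patch

   # The THROUGHPUT hint lets the GPU plugin create several streams, so that
   # incoming infer requests can be processed simultaneously.
   compiled = core.compile_model(model, "GPU", {"PERFORMANCE_HINT": "THROUGHPUT"})

   # Query how many parallel requests the plugin suggests keeping in flight.
   n_requests = compiled.get_property("OPTIMAL_NUMBER_OF_INFER_REQUESTS")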