eshoguli committed Jul 16, 2024
1 parent 8eafd23 commit 051f374
Showing 1 changed file with 2 additions and 2 deletions.
@@ -1,5 +1,3 @@
.. {#openvino_docs_OV_UG_quantization_scheme}
Quantization Scheme
==============================

@@ -12,11 +10,13 @@ Quantization Scheme
:caption: Low Precision Transformations

Key steps in the quantization scheme:

* Low Precision Transformations: ``FakeQuantize`` decomposition to Quantize with a low precision output and Dequantize. For more details, refer to the :doc:`Quantize decomposition <../low-precision-transformations>` section.
* Low Precision Transformations: move Dequantize through operations. For more details, refer to the :doc:`Main transformations <../step3-main>` section.
* Plugin: fuse operations with Quantize and run inference in low precision.
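
The decomposition in the first step can be sketched in plain Python. This is a minimal scalar illustration, not OpenVINO code: the helper names ``fake_quantize``, ``quantize`` and ``dequantize`` are hypothetical, the formula is assumed to follow the FakeQuantize-1 specification, and Python's built-in ``round`` stands in for the rounding used there (tie-breaking may differ).

```python
import math


def fake_quantize(x, in_low, in_high, out_low, out_high, levels):
    """Scalar reference of FakeQuantize (assumed to match the FakeQuantize-1 spec)."""
    if x <= min(in_low, in_high):
        return out_low
    if x > max(in_low, in_high):
        return out_high
    # Map the input interval onto `levels` discrete steps, then onto the output interval.
    q = round((x - in_low) / (in_high - in_low) * (levels - 1))
    return q / (levels - 1) * (out_high - out_low) + out_low


def quantize(x, in_low, in_high, levels):
    """Quantize: FakeQuantize whose output interval is [0, levels - 1], stored as an integer."""
    return int(round(fake_quantize(x, in_low, in_high, 0.0, float(levels - 1), levels)))


def dequantize(q, out_low, out_high, levels):
    """Dequantize: Convert -> Subtract (zero point) -> Multiply (scale)."""
    scale = (out_high - out_low) / (levels - 1)
    zero_point = -out_low / scale
    return (float(q) - zero_point) * scale


# Hypothetical intervals chosen for illustration only.
IN_LOW, IN_HIGH, OUT_LOW, OUT_HIGH, LEVELS = -1.0, 1.0, -1.0, 1.0, 256

for value in (-5.0, -0.42, 0.3, 5.0):
    original = fake_quantize(value, IN_LOW, IN_HIGH, OUT_LOW, OUT_HIGH, LEVELS)
    decomposed = dequantize(quantize(value, IN_LOW, IN_HIGH, LEVELS), OUT_LOW, OUT_HIGH, LEVELS)
    # Both paths should agree up to floating-point error.
    print(value, original, decomposed, math.isclose(original, decomposed, abs_tol=1e-9))
```

The integer produced by ``quantize`` is the low precision value; everything after it is ordinary element-wise arithmetic, which is what allows the later steps to move or fuse it.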

Quantization scheme features:

* Quantization operation is expressed through the ``FakeQuantize`` operation, which involves more than scale and shift. For more details, see: :doc:`FakeQuantize-1 <../../../../openvino-ir-format/operation-sets/operation-specs/operation-specs/quantization/fake-quantize-1>`. If the ``FakeQuantize`` input and output intervals are the same, ``FakeQuantize`` degenerates to ``Multiply``, ``Subtract`` and ``Convert`` (scale & shift).
* Dequantization operation is expressed through element-wise ``Convert``, ``Subtract`` and ``Multiply`` operations. ``Convert`` and ``Subtract`` are optional. These operations can be handled as typical element-wise operations, for example, fused or transformed into other operations.
* OpenVINO plugins fuse ``Dequantize`` and ``Quantize`` operations after a low precision operation and do not fuse ``Quantize`` before it.
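
Because dequantization is plain element-wise arithmetic, it can be moved past a linear operation. The sketch below illustrates this with a dot product over Python lists. All names and values are hypothetical, per-tensor scales are assumed, and no zero point is used; it is not how any OpenVINO plugin is implemented.

```python
import math


def dequantize(values, scale, zero_point=None, convert=True):
    """Dequantization chain: optional Convert, optional Subtract, then Multiply."""
    out = [float(v) for v in values] if convert else list(values)  # Convert (optional)
    if zero_point is not None:
        out = [v - zero_point for v in out]  # Subtract (optional)
    return [v * scale for v in out]  # Multiply


def dot(a, b):
    """Dot product, standing in for a linear operation such as a convolution."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical quantized data and per-tensor scales.
activations_u8 = [12, 200, 45, 0]
weights_i8 = [3, -7, 1, 5]
scale_a, scale_w = 0.02, 0.5

# Dequantize first, then compute in floating point.
reference = dot(dequantize(activations_u8, scale_a), dequantize(weights_i8, scale_w))

# Compute on integers, then apply one Multiply after the operation.
moved = dot(activations_u8, weights_i8) * (scale_a * scale_w)

print(reference, moved, math.isclose(reference, moved, rel_tol=1e-9))
```

In the second form the operation itself runs on integer inputs, and only a single ``Multiply`` remains after it for a plugin to fuse.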