[Opset15][Spec] Col2Im-15 specification (#23947)
### Details:
- Similar in functionality to https://pytorch.org/docs/stable/generated/torch.nn.Fold.html, `Col2Im` is `torch.nn.Fold` restricted to two output spatial dimensions
- Some `Col2Im` related discussions: #20549

### Related PRs
- #24197

### Tickets:
- CVS-133358
...tation/openvino-ir-format/operation-sets/operation-specs/movement/col2im-15.rst
@@ -0,0 +1,225 @@
.. {#openvino_docs_ops_type_Col2Im_15}

Col2Im
===================


.. meta::
   :description: Learn about Col2Im-15 - a data movement operation which combines sliding blocks into an image tensor.

**Versioned name**: *Col2Im-15*

**Category**: *Data movement*

**Short description**: The *Col2Im* operation constructs an image from an ``input`` tensor containing sliding data blocks (blocks of the image) and the desired ``output_size``.

**Detailed description**

Consider an ``input`` tensor containing batches of image blocks of shape ``(N, C * Product(kernel_size), L)``, where:

* ``N`` is the batch dimension,
* ``C * Product(kernel_size)`` is the number of elements within a block (each block contains ``Product(kernel_size)`` vectors containing values from each channel ``C``),
* ``L`` is the total number of blocks calculated as follows:

  .. math::

     L = \prod_{d=1}^{2} \left\lfloor \frac{output\_size[d] + pads\_begin[d] + pads\_end[d] - dilations[d] \times (kernel\_size[d] - 1) - 1}{strides[d]} + 1 \right\rfloor

  where ``d`` ranges over the two spatial dimensions.

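The block count can be checked with a few lines of Python. This is only a sketch of the formula above; the function name ``col2im_num_blocks`` and its keyword defaults are illustrative and are not part of any OpenVINO API.

```python
from math import prod


def col2im_num_blocks(output_size, kernel_size, strides=(1, 1),
                      dilations=(1, 1), pads_begin=(0, 0), pads_end=(0, 0)):
    """Return L, the total number of sliding blocks over two spatial dimensions."""
    per_dim = []
    for d in range(2):
        padded = output_size[d] + pads_begin[d] + pads_end[d]
        # Extent covered by one dilated block along this dimension.
        extent = dilations[d] * (kernel_size[d] - 1) + 1
        # For integers, floor(x / s + 1) equals x // s + 1.
        per_dim.append((padded - extent) // strides[d] + 1)
    return prod(per_dim)


print(col2im_num_blocks((16, 16), (2, 2)))  # 225
```

With ``output_size`` = [16, 16], ``kernel_size`` = [2, 2] and default attributes, each spatial dimension contributes 15 block positions, so ``L`` is 225.
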
The ``input`` blocks are moved into the ``output`` tensor of shape ``(N, C, output_size[0], output_size[1])`` by combining the values contained in the blocks.

Non-batched inputs are also supported, in which case the ``input`` has shape ``(C * Product(kernel_size), L)`` and the output has shape ``(C, output_size[0], output_size[1])``.

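The following pure-Python sketch illustrates one way the blocks could be combined for a non-batched input. It is not the OpenVINO reference implementation. It assumes ``torch.nn.Fold`` semantics: overlapping values are summed, rows of ``data`` are ordered channel-major, blocks are ordered row-major, and values that land in the padding area are discarded. The name ``col2im_reference`` is hypothetical.

```python
def col2im_reference(data, output_size, kernel_size, strides=(1, 1),
                     dilations=(1, 1), pads_begin=(0, 0), pads_end=(0, 0)):
    """Unbatched sketch: data is a list of C*kh*kw rows, each holding L values."""
    (out_h, out_w), (kh, kw) = output_size, kernel_size
    # Number of block positions along each spatial dimension.
    blocks_h = (out_h + pads_begin[0] + pads_end[0]
                - dilations[0] * (kh - 1) - 1) // strides[0] + 1
    blocks_w = (out_w + pads_begin[1] + pads_end[1]
                - dilations[1] * (kw - 1) - 1) // strides[1] + 1
    channels = len(data) // (kh * kw)
    image = [[[0] * out_w for _ in range(out_h)] for _ in range(channels)]
    for row, values in enumerate(data):
        if len(values) != blocks_h * blocks_w:
            raise ValueError("each row must hold exactly L values")
        c, rem = divmod(row, kh * kw)     # assumed channel-major row layout
        i, j = divmod(rem, kw)            # position inside the kernel window
        for l, value in enumerate(values):
            by, bx = divmod(l, blocks_w)  # assumed row-major block order
            y = by * strides[0] + i * dilations[0] - pads_begin[0]
            x = bx * strides[1] + j * dilations[1] - pads_begin[1]
            if 0 <= y < out_h and 0 <= x < out_w:  # skip the padding area
                image[c][y][x] += value            # assumed: overlaps are summed
    return image


# One channel, 2x2 kernel, 3x3 output: four blocks of ones overlap in the middle.
ones = [[1] * 4 for _ in range(4)]
print(col2im_reference(ones, (3, 3), (2, 2)))
# [[[1, 2, 1], [2, 4, 2], [1, 2, 1]]]
```

Under these assumptions, each output value counts how many blocks cover that pixel, which is why the center of the 3x3 image receives 4.
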
**Attributes**:

* *strides*

  * **Description**: stride of the sliding blocks along the input spatial dimensions.
  * **Range of values**: 1D tensor of positive integer numbers
  * **Type**: *T_IDX*
  * **Default value**: [1, 1]
  * **Required**: *no*

* *dilations*

  * **Description**: controls the local stride of the elements, that is, the spacing between neighboring elements within a block.
  * **Range of values**: 1D tensor of non-negative integer numbers
  * **Type**: *T_IDX*
  * **Default value**: [1, 1]
  * **Required**: *no*

* *pads_begin*

  * **Description**: *pads_begin* is the number of zero-value pixels to add to the beginning along each axis. For example, *pads_begin* equal to "1,2" means adding 1 pixel to the top of the input and 2 to the left of the input.
  * **Range of values**: 1D tensor of non-negative integer numbers
  * **Type**: *T_IDX*
  * **Default value**: [0, 0]
  * **Required**: *no*

* *pads_end*

  * **Description**: *pads_end* is the number of zero-value pixels to add to the end along each axis. For example, *pads_end* equal to "1,2" means adding 1 pixel to the bottom of the input and 2 to the right of the input.
  * **Range of values**: 1D tensor of non-negative integer numbers
  * **Type**: *T_IDX*
  * **Default value**: [0, 0]
  * **Required**: *no*

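To see how the attributes interact, the sketch below lists the output coordinates that a single block touches along one axis. The helper ``block_coordinates`` is hypothetical, and treating out-of-range coordinates as padding follows ``torch.nn.Fold`` behavior, which is an assumption here.

```python
def block_coordinates(block_index, kernel_size, stride=1, dilation=1, pad_begin=0):
    """Output coordinates (along one axis) touched by one block.

    Coordinates below 0 or at/after the output size fall into the padding area.
    """
    start = block_index * stride - pad_begin
    return [start + k * dilation for k in range(kernel_size)]


print(block_coordinates(0, 3))                            # [0, 1, 2]
print(block_coordinates(1, 3, stride=2, dilation=2))      # [2, 4, 6]
print(block_coordinates(0, 2, dilation=2, pad_begin=3))   # [-3, -1]
```

The stride moves the start of each block, the dilation spreads the elements inside it, and ``pad_begin`` shifts everything toward negative coordinates that lie outside the output image.
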
**Inputs**

* **1**: *data*

  * **Description**: A batched 3D tensor of type *T* and shape ``(N, C * Product(kernel_size), L)`` or an unbatched 2D tensor of type *T* and shape ``(C * Product(kernel_size), L)``. **Required.**
  * **Range of values**: 2D or 3D tensor of type *T*
  * **Type**: *T*

* **2**: *output_size*

  * **Description**: controls the shape of the spatial dimensions of the output image. **Required.**
  * **Range of values**: 1D tensor of two positive integer numbers (height and width)
  * **Type**: *T_IDX*

* **3**: *kernel_size*

  * **Description**: size of the sliding blocks. **Required.**
  * **Range of values**: 1D tensor of two non-negative integer numbers (height and width)
  * **Type**: *T_IDX*

**Outputs**

* **1**: The output tensor containing the output image, of type *T* and shape:

  * ``(N, C, output_size[0], output_size[1])`` in case of batched input,
  * ``(C, output_size[0], output_size[1])`` in case of non-batched input.

**Types**

* *T*: any supported data type.
* *T_IDX*: ``int64`` or ``int32``.

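The relationship between the input and output shapes can be sketched as a small shape-inference helper. The function ``col2im_output_shape`` is illustrative only and does not correspond to an actual OpenVINO function.

```python
from math import prod


def col2im_output_shape(data_shape, output_size, kernel_size):
    """Infer the output shape from a 2D (unbatched) or 3D (batched) data shape."""
    if len(data_shape) not in (2, 3):
        raise ValueError("data must be a 2D or 3D tensor")
    block_elems = data_shape[-2]    # C * Product(kernel_size)
    channels, rem = divmod(block_elems, prod(kernel_size))
    if rem:
        raise ValueError("block size is not divisible by Product(kernel_size)")
    batch = tuple(data_shape[:-2])  # () for unbatched input, (N,) otherwise
    return batch + (channels, output_size[0], output_size[1])


print(col2im_output_shape((3, 12, 225), (16, 16), (2, 2)))  # (3, 3, 16, 16)
print(col2im_output_shape((12, 225), (16, 16), (2, 2)))     # (3, 16, 16)
```

The channel count ``C`` is recovered by dividing the block dimension by ``Product(kernel_size)``; the batch dimension, if present, is passed through unchanged.
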
**Examples**

All examples assume ``C = 3``.

*Example 1: default optional parameters*

For inputs ``output_size`` = [16, 16] and ``kernel_size`` = [2, 2]

.. code-block:: xml
   :force:

   <layer ... type="Col2Im" ... >
       <input>
           <port id="0" precision="I32">
               <dim>3</dim>     <!-- batch_axis -->
               <dim>12</dim>    <!-- C * Product(kernel_size) -->
               <dim>225</dim>   <!-- L -->
           </port>
           <port id="1" precision="I32">
               <dim>2</dim>     <!-- output_size -->
           </port>
           <port id="2" precision="I32">
               <dim>2</dim>     <!-- kernel_size -->
           </port>
       </input>
       <output>
           <port id="1" precision="I32">
               <dim>3</dim>     <!-- batch_axis -->
               <dim>3</dim>     <!-- C -->
               <dim>16</dim>    <!-- output_size[0] -->
               <dim>16</dim>    <!-- output_size[1] -->
           </port>
       </output>
   </layer>

*Example 2: non-default dilations, padding and strides*

For inputs ``output_size`` = [16, 16] and ``kernel_size`` = [3, 3]

.. code-block:: xml
   :force:

   <layer ... type="Col2Im" ... >
       <data dilations="2,2" pads_begin="1,1" pads_end="1,1" strides="2,2"/>
       <input>
           <port id="0" precision="I32">
               <dim>1</dim>     <!-- batch_axis -->
               <dim>27</dim>    <!-- C * Product(kernel_size) -->
               <dim>49</dim>    <!-- L = floor((16 + 1 + 1 - 2 * (3 - 1) - 1) / 2 + 1) ^ 2 = 7 * 7 -->
           </port>
           <port id="1" precision="I32">
               <dim>2</dim>     <!-- output_size -->
           </port>
           <port id="2" precision="I32">
               <dim>2</dim>     <!-- kernel_size -->
           </port>
       </input>
       <output>
           <port id="1" precision="I32">
               <dim>1</dim>     <!-- batch_axis -->
               <dim>3</dim>     <!-- C -->
               <dim>16</dim>    <!-- output_size[0] -->
               <dim>16</dim>    <!-- output_size[1] -->
           </port>
       </output>
   </layer>

*Example 3: non-default dilations and padding*

For inputs ``output_size`` = [32, 32] and ``kernel_size`` = [2, 2]

.. code-block:: xml
   :force:

   <layer ... type="Col2Im" ... >
       <data dilations="2,2" pads_begin="3,3" pads_end="3,3"/>
       <input>
           <port id="0" precision="I32">
               <dim>12</dim>    <!-- batch_axis -->
               <dim>12</dim>    <!-- C * Product(kernel_size) -->
               <dim>1296</dim>  <!-- L = floor((32 + 3 + 3 - 2 * (2 - 1) - 1) / 1 + 1) ^ 2 = 36 * 36 -->
           </port>
           <port id="1" precision="I32">
               <dim>2</dim>     <!-- output_size -->
           </port>
           <port id="2" precision="I32">
               <dim>2</dim>     <!-- kernel_size -->
           </port>
       </input>
       <output>
           <port id="1" precision="I32">
               <dim>12</dim>    <!-- batch_axis -->
               <dim>3</dim>     <!-- C -->
               <dim>32</dim>    <!-- output_size[0] -->
               <dim>32</dim>    <!-- output_size[1] -->
           </port>
       </output>
   </layer>

*Example 4: default optional parameters, unbatched*

For inputs ``output_size`` = [16, 16] and ``kernel_size`` = [2, 2]

.. code-block:: xml
   :force:

   <layer ... type="Col2Im" ... >
       <input>
           <port id="0" precision="I32">
               <dim>12</dim>    <!-- C * Product(kernel_size) -->
               <dim>225</dim>   <!-- L -->
           </port>
           <port id="1" precision="I32">
               <dim>2</dim>     <!-- output_size -->
           </port>
           <port id="2" precision="I32">
               <dim>2</dim>     <!-- kernel_size -->
           </port>
       </input>
       <output>
           <port id="1" precision="I32">
               <dim>3</dim>     <!-- C -->
               <dim>16</dim>    <!-- output_size[0] -->
               <dim>16</dim>    <!-- output_size[1] -->
           </port>
       </output>
   </layer>