[IE][VPU]: Enables Extract Dynamic Batch Transformation #3715

ggladilov · 2020-12-23T12:57:34Z

Description

The general approach is as follows:

1. An extracted sub-graph should have exactly one input operation and exactly one output operation. Otherwise the model's memory consumption may increase, because the loop implementation on Myriad-X keeps all inputs and outputs of a loop alive alongside the memory used by the loop body. In the layout consolidation scenario this also reflects the intention to minimize the number of permutations.
2. An extracted sub-graph should not have external connections: the only nodes allowed to have a predecessor or successor outside the sub-graph are the input and the output. Otherwise memory consumption may increase for the same reason as in the previous point.

To make sure these restrictions are met, the transformation looks for leaves in both directions, finds the corresponding LCA (Lowest Common Ancestor), and checks whether the resulting sub-graph has external connections. If it does, the transformation repeats the leaf search, stopping when it reaches the leaves from the previous iteration, and finds the LCA again. This is repeated until a sub-graph without external connections is found (one always exists, since the source node alone forms such a sub-graph).
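The external-connection check can be sketched on a toy graph. This is only an illustration under stated assumptions, not the code from this PR: `Graph`, `hasExternalConnections`, and the integer node ids are hypothetical names introduced for the example.

```cpp
#include <set>
#include <vector>

// Toy directed graph (hypothetical): node ids are indices into the vectors.
struct Graph {
    std::vector<std::vector<int>> successors;
    std::vector<std::vector<int>> predecessors;

    explicit Graph(int nodeCount) : successors(nodeCount), predecessors(nodeCount) {}

    void addEdge(int from, int to) {
        successors[from].push_back(to);
        predecessors[to].push_back(from);
    }
};

// Returns true if some node of `body` other than `input` has a predecessor
// outside `body`, or some node other than `output` has a successor outside it.
inline bool hasExternalConnections(const Graph& graph, const std::set<int>& body, int input, int output) {
    for (int node : body) {
        if (node != input) {
            for (int predecessor : graph.predecessors[node]) {
                if (body.count(predecessor) == 0) return true;
            }
        }
        if (node != output) {
            for (int successor : graph.successors[node]) {
                if (body.count(successor) == 0) return true;
            }
        }
    }
    return false;
}
```

For a chain 0 -> 1 -> 2 -> 3 with an extra edge 1 -> 4, the candidate body {1, 2} (input 1, output 2) is rejected because node 1 feeds node 4 outside the body, while the single-node body {2} is accepted.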

In this context, a leaf is a node that satisfies at least one of the following conditions (depending on direction):

Top:
1. All of its predecessors are Parameter or Constant nodes
2. It is unknown how to slice this operation
3. It cannot be sliced (inputs and outputs have different batch)
Bottom:
1. All of its successors are Result nodes
2. It is unknown how to slice this operation
3. It cannot be sliced (inputs and outputs have different batch)
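The leaf conditions can be written as two small predicates. The sketch below uses a hypothetical `Node` structure with precomputed flags; the real transformation works on nGraph nodes and decides sliceability per operation type, so treat the names here as assumptions.

```cpp
#include <algorithm>
#include <vector>

enum class Kind { Parameter, Constant, Result, Operation };

// Hypothetical stand-in for a graph node.
struct Node {
    Kind kind = Kind::Operation;
    bool sliceRuleKnown = true;       // it is known how to slice this operation
    bool sameBatchOnAllPorts = true;  // inputs and outputs share the same batch
    std::vector<const Node*> predecessors;
    std::vector<const Node*> successors;
};

// Conditions 2 and 3 are the same for both directions.
inline bool cannotBeSliced(const Node& node) {
    return !node.sliceRuleKnown || !node.sameBatchOnAllPorts;
}

// Top leaf: every predecessor is a Parameter or a Constant, or the node cannot be sliced.
inline bool isTopLeaf(const Node& node) {
    const bool onlySourcesAbove = std::all_of(
        node.predecessors.begin(), node.predecessors.end(),
        [](const Node* p) { return p->kind == Kind::Parameter || p->kind == Kind::Constant; });
    return onlySourcesAbove || cannotBeSliced(node);
}

// Bottom leaf: every successor is a Result, or the node cannot be sliced.
inline bool isBottomLeaf(const Node& node) {
    const bool onlyResultsBelow = std::all_of(
        node.successors.begin(), node.successors.end(),
        [](const Node* s) { return s->kind == Kind::Result; });
    return onlyResultsBelow || cannotBeSliced(node);
}
```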

Tests will be added later, once ngraph::opset5::Loop is supported in the Myriad-X plugin.

Task

#-43586

ggladilov · 2020-12-23T12:59:09Z

inference-engine/src/vpu/common/include/vpu/ngraph/utilities.hpp

-void printTo(std::ostream& stream, const ngraph::NodeTypeInfo& object);
+template<>
+inline void printTo(std::ostream& stream, const ngraph::NodeTypeInfo& object) {
+    stream << object.name << " ver. " << object.version;
+}


I'm still not sure why I had to make this change, but otherwise the compiler picks up the wrong printTo definition (the default printTo implementation) and prints an empty string for ngraph::NodeTypeInfo.
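One plausible explanation, which is an assumption and is not confirmed anywhere in this thread, is two-phase name lookup. An unqualified call inside a template sees only the overloads declared before the template, plus those found by argument-dependent lookup in the namespaces of the argument types. An ordinary overload declared later in a different namespace is found by neither, whereas an explicit specialization of the primary template is always used once the primary template has been selected. The sketch below reproduces that behaviour with hypothetical namespaces `lib` (standing in for `ngraph`) and `util` (standing in for `vpu`).

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace lib {  // stands in for ngraph (hypothetical)
struct A { int value; };
struct B { int value; };
}  // namespace lib

namespace util {  // stands in for vpu (hypothetical)

// Default implementation: prints nothing.
template <typename T>
void printTo(std::ostream&, const T&) {}

// Generic helper that reaches printTo through an unqualified, dependent call.
template <typename T>
std::string format(const T& object) {
    std::ostringstream stream;
    printTo(stream, object);
    return stream.str();
}

// Ordinary overload declared after format(). It is not visible at the template
// definition, and ADL searches only `std` and `lib`, so format() never calls it.
inline void printTo(std::ostream& stream, const lib::A& object) {
    stream << "A " << object.value;
}

// Explicit specialization of the primary template: format() selects the
// primary template and therefore ends up in this body.
template <>
inline void printTo(std::ostream& stream, const lib::B& object) {
    stream << "B " << object.value;
}

}  // namespace util
```

With a conforming compiler, `util::format(lib::A{1})` yields an empty string and `util::format(lib::B{2})` yields `"B 2"`, which matches the symptom described in the comment.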

ggladilov · 2020-12-23T13:00:07Z

...ne/src/vpu/common/src/ngraph/transformations/extract_dynamic_batch/extract_dynamic_batch.cpp

+    // constant's shape has to be scalar (not empty) since if this constant has empty shape, so Gather will
+    // have empty shape as well (Gather produces scalar). When this Gather will become ScatterElementsUpdate
+    // argument ScatterElementsUpdate shape inference function will fail, since it requires indices and updates
+    // to have exactly the same shape (indices rank must be the same as rank of data input which is 1D vector,
+    // so its rank = 1 != 0)
+    const auto constant = std::make_shared<ngraph::opset5::Constant>(ngraph::element::i64, ngraph::Shape{1}, 0);


@lazarevevgeny

ggladilov · 2020-12-23T17:11:19Z

ngraph/core/include/ngraph/partial_shape.hpp

@@ -44,7 +44,13 @@ namespace ngraph
    ///     (Informal notation examples: `{1,2,3,4}`, `{6}`, `{}`)
    class NGRAPH_API PartialShape
    {
+        using Dimensions = std::vector<Dimension>;


@ilyachur please take a look

ilyachur

nGraph part LGTM

ilyachur · 2020-12-24T11:08:30Z

ngraph/core/include/ngraph/partial_shape.hpp

@@ -223,6 +230,18 @@ namespace ngraph
                                         const PartialShape& src,
                                         const op::AutoBroadcastSpec& autob);

+        iterator begin() noexcept { return m_dimensions.begin(); }


Just one comment: Please add doxygen documentation for new methods.

@ilyachur, should we change m_shape_type as is done in operator[]?
IMHO begin, end and operator[] should return some wrapper which on each write should update m_shape_type.
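The wrapper idea from the comment above can be sketched as a write-through proxy. This is a hypothetical illustration only: `ToyShape`, `DimRef`, and the cached flag are invented for the example and do not mirror the actual `PartialShape` internals or `m_shape_type`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical shape with a lazily cached "is static" property.
class ToyShape {
public:
    static constexpr std::int64_t Dynamic = -1;

    explicit ToyShape(std::vector<std::int64_t> dims) : m_dims(std::move(dims)) {}

    // Proxy returned by operator[]: reads pass through, writes invalidate the cache.
    class DimRef {
    public:
        DimRef(ToyShape& owner, std::size_t index) : m_owner(owner), m_index(index) {}

        DimRef& operator=(std::int64_t value) {
            m_owner.m_dims[m_index] = value;
            m_owner.m_cacheValid = false;  // cached property is stale after a write
            return *this;
        }

        operator std::int64_t() const { return m_owner.m_dims[m_index]; }

    private:
        ToyShape& m_owner;
        std::size_t m_index;
    };

    DimRef operator[](std::size_t index) { return DimRef(*this, index); }

    bool isStatic() const {
        if (!m_cacheValid) {
            m_isStatic = std::all_of(m_dims.begin(), m_dims.end(),
                                     [](std::int64_t d) { return d != Dynamic; });
            m_cacheValid = true;
        }
        return m_isStatic;
    }

private:
    std::vector<std::int64_t> m_dims;
    mutable bool m_cacheValid = false;
    mutable bool m_isStatic = false;
};
```

The same proxy could be returned when dereferencing a custom iterator, at the cost of no longer exposing the underlying `std::vector` iterators directly.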


ghost

LGTM. Good job 👍

It's convenient to be able to use STL algorithms on PartialShape since semantically PartialShape is a sequence of Dimensions. Signed-off-by: Gladilov, Gleb <[email protected]>
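As an illustration of what the iterators enable, the sketch below applies `std::any_of` and `std::count_if` to a toy shape class. `ToyDimension` and `ToyPartialShape` are hypothetical stand-ins written for this example; only the `Dimensions` alias and the `begin()` signature follow the diff shown in this PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical dimension: a negative length marks a dynamic dimension.
struct ToyDimension {
    std::int64_t length;
    bool is_dynamic() const { return length < 0; }
};

class ToyPartialShape {
public:
    using Dimensions = std::vector<ToyDimension>;
    using iterator = Dimensions::iterator;
    using const_iterator = Dimensions::const_iterator;

    explicit ToyPartialShape(Dimensions dimensions) : m_dimensions(std::move(dimensions)) {}

    iterator begin() noexcept { return m_dimensions.begin(); }
    iterator end() noexcept { return m_dimensions.end(); }
    const_iterator begin() const noexcept { return m_dimensions.begin(); }
    const_iterator end() const noexcept { return m_dimensions.end(); }

private:
    Dimensions m_dimensions;
};

// True if at least one dimension is dynamic.
inline bool hasDynamicDimension(const ToyPartialShape& shape) {
    return std::any_of(shape.begin(), shape.end(),
                       [](const ToyDimension& d) { return d.is_dynamic(); });
}

// Number of dynamic dimensions in the shape.
inline std::size_t countDynamicDimensions(const ToyPartialShape& shape) {
    return static_cast<std::size_t>(std::count_if(
        shape.begin(), shape.end(), [](const ToyDimension& d) { return d.is_dynamic(); }));
}
```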

Introduces Depth-First-Search and Breadth-First-Search utilities for tree traversal. Templated arguments make them extensible for different use cases. BFS is designed so that a node can be guaranteed to be visited only after all of its predecessors have been visited:

    a
   / \
  b   c
  |   |
  d   |
   \ /
    e

Here, with appropriately provided functors (NumEntries), node "e" is guaranteed to be visited after "d" and "c". This property is important for evaluating node depth.

Signed-off-by: Gladilov, Gleb <[email protected]>
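A minimal sketch of such a BFS, assuming the NumEntries functor returns the number of incoming edges of a node. The function `bfs`, its parameter names, and the `char` node type are hypothetical; the utilities added in this PR may have a different interface.

```cpp
#include <cstddef>
#include <queue>
#include <unordered_map>
#include <vector>

// Breadth-first traversal that enqueues a node only after it has been reached
// through all of its incoming edges. `numEntries(node)` must return the number
// of incoming edges of `node` (at least 1 for any node reached via an edge).
template <typename Node, typename NextFn, typename NumEntriesFn, typename VisitFn>
void bfs(const Node& start, NextFn next, NumEntriesFn numEntries, VisitFn visit) {
    std::unordered_map<Node, std::size_t> remaining;  // entries still to be seen per node
    std::queue<Node> queue;
    queue.push(start);
    while (!queue.empty()) {
        const Node current = queue.front();
        queue.pop();
        visit(current);
        for (const Node& successor : next(current)) {
            auto it = remaining.find(successor);
            if (it == remaining.end()) {
                it = remaining.emplace(successor, numEntries(successor)).first;
            }
            if (--it->second == 0) {
                queue.push(successor);  // all predecessors have been visited
            }
        }
    }
}

// Runs the traversal on the graph a -> {b, c}, b -> d, d -> e, c -> e.
inline std::vector<char> visitOrderForExampleGraph() {
    const std::unordered_map<char, std::vector<char>> successors = {
        {'a', {'b', 'c'}}, {'b', {'d'}}, {'c', {'e'}}, {'d', {'e'}}, {'e', {}}};
    const std::unordered_map<char, std::size_t> entries = {
        {'a', 0}, {'b', 1}, {'c', 1}, {'d', 1}, {'e', 2}};
    std::vector<char> order;
    bfs('a',
        [&](char node) { return successors.at(node); },
        [&](char node) { return entries.at(node); },
        [&](char node) { order.push_back(node); });
    return order;
}
```

On this graph the visit order is a, b, c, d, e: node "e" is first reached from "c" but is enqueued only after "d" has been visited as well.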

For some reason, if printTo for an nGraph type is an ordinary function, it is not picked up by VPU_THROW_UNLESS when triggered inside DynamicToStaticShape transformations. Making it a template specialization fixes this. Signed-off-by: Gladilov, Gleb <[email protected]>

SliceConfiguration is a class intended to express the result of slicing an operation by batch. The result is a configuration that specifies what to do with each data object associated with the operation. Two options are defined: Slice and Unchanged. The typical scenario is Slice: the operation has the same batch for all inputs and outputs, so every corresponding data object is "sliced" (replaced with a copy whose batch equals 1). In some cases a data object should not be sliced, e.g. when the operation has a constant input that is the same for all input data batches and therefore has no batch (Add of 2 tensors with shapes [10, 1000] and [1000]). The "Unchanged" option represents such cases. When the operation should not be sliced at all (e.g. it has no batch, has different batch for inputs and outputs, has a static batch, and so on), the SliceConfiguration object returns false from the "hasSlice" method. In these cases, calls to the inputs and outputs methods throw an exception. Signed-off-by: Gladilov, Gleb <[email protected]>
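The contract described above could look roughly like the following sketch. Only the names `SliceConfiguration`, `hasSlice`, `inputs`, `outputs`, `Slice`, and `Unchanged` come from the commit message; the `SliceMode` enum, the constructors, and the exception type are assumptions made for illustration.

```cpp
#include <stdexcept>
#include <utility>
#include <vector>

enum class SliceMode { Slice, Unchanged };

class SliceConfiguration {
public:
    // Default state: the operation is not sliced at all.
    SliceConfiguration() = default;

    // One mode per input and per output of the operation.
    SliceConfiguration(std::vector<SliceMode> inputs, std::vector<SliceMode> outputs)
        : m_hasSlice(true), m_inputs(std::move(inputs)), m_outputs(std::move(outputs)) {}

    bool hasSlice() const { return m_hasSlice; }

    const std::vector<SliceMode>& inputs() const {
        requireSlice();
        return m_inputs;
    }

    const std::vector<SliceMode>& outputs() const {
        requireSlice();
        return m_outputs;
    }

private:
    // Accessing per-port modes is an error when there is no slice.
    void requireSlice() const {
        if (!m_hasSlice) throw std::logic_error("operation is not sliced");
    }

    bool m_hasSlice = false;
    std::vector<SliceMode> m_inputs;
    std::vector<SliceMode> m_outputs;
};
```

For the Add example with shapes [10, 1000] and [1000], the configuration would be `{Slice, Unchanged}` for the inputs and `{Slice}` for the output.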

In case of a static batch, the operation is not sliced, since a different transformation handles such cases. This approach allows both passes to co-exist while one is being replaced with the other. If the data input has a dynamic dimension other than batch, an error is thrown, since the Myriad-X plugin does not support convolutions (HW accelerated operations) with dynamism in spatial dimensions. Signed-off-by: Gladilov, Gleb <[email protected]>

Since the extract dynamic batch transformation handles dynamism only in the batch dimension (and therefore requires the loop body to be static), operations with dynamism in any other dimension should not be covered by the loop. If an eltwise operation has dynamism in a dimension other than batch, it is considered unsupported for sub-graph extraction. Signed-off-by: Gladilov, Gleb <[email protected]>

Since the extract dynamic batch transformation handles dynamism only in the batch dimension (and therefore requires the loop body to be static), operations with dynamism in any other dimension should not be covered by the loop. If an eltwise operation has dynamism in a dimension other than batch, it is considered unsupported for sub-graph extraction. It is a template function, since different binary eltwise operations have the same broadcasting rules. Signed-off-by: Gladilov, Gleb <[email protected]>
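The decision for a binary eltwise operation can be sketched as follows, assuming numpy-style broadcasting with the batch as the leading output dimension. `Shape`, `Dynamic`, `EltwiseSlice`, and `sliceBinaryEltwise` are hypothetical names for this example, and the rules are a simplified reading of the commit message rather than the implementation in this PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class SliceMode { Slice, Unchanged };

using Shape = std::vector<std::int64_t>;
constexpr std::int64_t Dynamic = -1;  // marks a dynamic dimension

struct EltwiseSlice {
    bool hasSlice;
    SliceMode lhs;
    SliceMode rhs;
};

// Returns "no slice" when the batch is static or when any non-batch dimension
// is dynamic. An input carries the batch only if it has the full output rank.
inline EltwiseSlice sliceBinaryEltwise(const Shape& lhs, const Shape& rhs) {
    const EltwiseSlice noSlice{false, SliceMode::Unchanged, SliceMode::Unchanged};
    const std::size_t outRank = std::max(lhs.size(), rhs.size());
    const Shape* shapes[2] = {&lhs, &rhs};
    SliceMode modes[2] = {SliceMode::Unchanged, SliceMode::Unchanged};
    bool anySliced = false;
    for (int i = 0; i < 2; ++i) {
        const Shape& shape = *shapes[i];
        const bool carriesBatch = !shape.empty() && shape.size() == outRank;
        for (std::size_t d = 0; d < shape.size(); ++d) {
            const bool isBatchDim = carriesBatch && d == 0;
            if (shape[d] == Dynamic && !isBatchDim) return noSlice;  // unsupported dynamism
        }
        if (carriesBatch && shape[0] == Dynamic) {
            modes[i] = SliceMode::Slice;
            anySliced = true;
        }
    }
    if (!anySliced) return noSlice;  // static batch is handled by another pass
    return {true, modes[0], modes[1]};
}
```

With shapes [?, 1000] and [1000] the first input is sliced and the second is left unchanged; with the static shapes [10, 1000] and [1000] nothing is sliced.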


…kit#3715)

* [IE][nGraph]: Enables begin/end iterators for PartialShape
* [IE][VPU][nGraph]: Introduces tree utilities
* [IE][VPU][nGraph]: Fixes printTo for nGraph type
* [IE][VPU]: Introduces SliceConfiguration class
* [IE][VPU][nGraph]: Enables MatMul operation slice
* [IE][VPU][nGraph]: Enables Convolution operations slice
* [IE][VPU][nGraph]: Enables unary eltwise slice
* [IE][VPU][nGraph]: Enables binary eltwise slice
* [IE][VPU][nGraph]: Enables extract dynamic batch transformation

Signed-off-by: Gladilov, Gleb <[email protected]>

ggladilov added the category: VPU label Dec 23, 2020

ggladilov added this to the 2021.3 milestone Dec 23, 2020

ggladilov requested review from Maxim-Doronin, itikhono, andrejsokolov and a user December 23, 2020 12:57

ggladilov assigned ghost Dec 23, 2020

ggladilov requested a review from a team as a code owner December 23, 2020 12:57

ggladilov commented Dec 23, 2020


ggladilov requested a review from a team December 23, 2020 17:09

ggladilov commented Dec 23, 2020


ilyachur approved these changes Dec 24, 2020


ghost reviewed Dec 25, 2020


ggladilov requested a review from a user January 11, 2021 13:11

ghost approved these changes Jan 11, 2021


ggladilo added 9 commits January 12, 2021 10:23

[IE][nGraph]: Enables begin/end iterators for PartialShape

111bc75


ggladilov assigned andrejsokolov and unassigned ghost Jan 12, 2021

andrejsokolov reviewed Jan 13, 2021


...ne/src/vpu/common/src/ngraph/transformations/extract_dynamic_batch/extract_dynamic_batch.cpp (resolved)

andrejsokolov reviewed Jan 13, 2021


...nce-engine/src/vpu/common/src/ngraph/transformations/extract_dynamic_batch/slice_mat_mul.cpp (resolved)

andrejsokolov self-requested a review January 13, 2021 09:59

andrejsokolov approved these changes Jan 13, 2021


ggladilov merged commit 1601c7f into openvinotoolkit:master Jan 13, 2021

ggladilov deleted the vpu/gg/extract-dynamic-batch branch January 13, 2021 10:43


ggladilov Dec 25, 2020

pelszkow Jul 26, 2021
