[Snippets] Added Softmax support #57

Merged: 12 commits into feature/snippets/mha on Dec 21, 2022

Conversation

a-sidorova (Owner) commented on Nov 24, 2022:

Details:

  • Added partial Softmax support to Snippets: a new config parameter disables Softmax in the Snippets pipeline to avoid performance regressions, while tests enable it for validation
  • Added support for Reshape around Softmax via the SoftmaxReshapeElimination pass, which removes the surrounding Reshape ops (see the sketch after this list)
  • More details here
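
For illustration, here is a minimal sketch of the pattern that SoftmaxReshapeElimination targets, built with the same ops and shapes as the tests quoted later in this conversation. The pass registration line is an assumption about the namespace, not a verbatim API reference.

// Before the pass: Reshape -> Softmax(axis = -1) -> Reshape
auto data = std::make_shared<ov::op::v0::Parameter>(ov::element::f32, ov::Shape{1, 2, 340, 240});
auto shape0 = std::make_shared<ov::op::v0::Constant>(ov::element::i32, ov::Shape{2}, std::vector<int32_t>{2, 81600});
auto reshape0 = std::make_shared<ov::op::v1::Reshape>(data, shape0, false);
auto softmax = std::make_shared<ov::op::v8::Softmax>(reshape0, -1);
auto shape1 = std::make_shared<ov::op::v0::Constant>(ov::element::i32, ov::Shape{4}, std::vector<int32_t>{1, 2, 340, 240});
auto reshape1 = std::make_shared<ov::op::v1::Reshape>(softmax, shape1, false);

// After the pass both Reshapes are removed and Softmax reads `data` directly:
// the Reshapes here only fuse/split the leading dimensions, so a last-axis
// Softmax computes the same values on either shape.

// Assumed registration (namespace per the snippets sources):
// manager.register_pass<ngraph::snippets::pass::SoftmaxReshapeElimination>();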

Tickets:

  • 92363
  • 95636

Blockers:

a-sidorova force-pushed the feature/snippets/softmax branch 5 times, most recently from a025b91 to 026360c, on December 5, 2022 09:23
a-sidorova force-pushed the feature/snippets/softmax branch from 026360c to f916fe2 on December 5, 2022 10:58
a-sidorova force-pushed the feature/snippets/softmax branch from f916fe2 to 70a7e56 on December 6, 2022 10:34
a-sidorova force-pushed the feature/snippets/softmax branch from 70a7e56 to 53dc219 on December 13, 2022 12:47
Commits:

  • [Snippets] Added support for Reshape around Softmax
  • applied comment part
  • Added config parameter to disable MHA ops tokenization
  • Buffer 2D
  • Loops
a-sidorova force-pushed the feature/snippets/softmax branch from 53dc219 to 9d2d721 on December 13, 2022 12:50
dmitry-gorokhov (Collaborator) left a comment:

first part

  • src/common/snippets/include/snippets/generator.hpp (resolved)
  • src/common/snippets/include/snippets/op/fill.hpp (resolved)
  • src/common/snippets/include/snippets/op/fill.hpp (resolved)
  • src/common/snippets/src/op/buffer.cpp (resolved)
  • src/common/snippets/src/op/horizon_max.cpp (resolved)
  • src/common/snippets/include/snippets/op/subgraph.hpp (resolved)
dmitry-gorokhov (Collaborator):

@v-Golubev, could you please take a look at the transformations? Maybe you have some general comments.

dmitry-gorokhov (Collaborator):

@IvanNovoselov, do you have any additional comments on the latest version?

  • src/common/snippets/src/pass/insert_loops.cpp (resolved)
  • src/common/snippets/src/pass/insert_loops.cpp (resolved)
  • src/common/snippets/src/pass/reset_buffer.cpp (resolved)
  • src/common/snippets/src/pass/reset_buffer.cpp (resolved)
  • src/common/snippets/src/pass/reset_buffer.cpp (resolved)
  • src/common/snippets/src/pass/softmax_decomposition.cpp (resolved)
  • src/common/snippets/src/pass/softmax_decomposition.cpp (resolved)
  • src/common/snippets/src/pass/softmax_decomposition.cpp (resolved)
/* ====== ReduceMax decomposition ====== */

const auto vector_buffer_max = std::make_shared<ngraph::snippets::op::VectorBuffer>();
const auto loop_max_begin = ngraph::snippets::op::insertLoopBegin(ngraph::OutputVector{data, data});
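// Note (per the discussion below): `data` is most likely passed twice so that one copy
// feeds the ReduceMax loop body while the other is carried through the LoopBegin/LoopEnd
// pair to become the input of the following Sub + Exp + ReduceSum loop.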
Collaborator:

I believe `data` was added to the OutputVector intentionally, but could you please leave a comment explaining why we do that?

a-sidorova (Owner, Author):

Honestly, I don't understand your comment. Could you clarify what you mean? Why LoopBegin's input is `data`? Why we add `data` twice? Why we use ngraph::OutputVector? Or something else?

Collaborator:

Sorry, it looks like I lost a couple of words while translating my thoughts into a comment. I meant that it isn't clear why we add `data` twice.

a-sidorova (Owner, Author):

Oh, it's a long story... But I tried to explain it in a code comment. Thank you!
4a1f00b

a-sidorova force-pushed the feature/snippets/softmax branch from 339a1b8 to 1cac5d5 on December 15, 2022 06:06
a-sidorova requested a review from v-Golubev on December 15, 2022 07:09
auto reshape1 = std::make_shared<ov::op::v1::Reshape>(softmax_v1, shape1, false);
function = std::make_shared<Function>(NodeVector{reshape1}, ParameterVector{data});

manager.register_pass<pass::InitNodeInfo>();
v-Golubev (Collaborator) commented on Dec 15, 2022:

Minor: the InitNodeInfo pass is added by default in the TransformationTestsF::SetUp() method, so we can remove it here.

a-sidorova (Owner, Author) replied on Dec 15, 2022:

Removed, thanks!
4a1f00b

a-sidorova requested a review from v-Golubev on December 15, 2022 10:50
IvanNovoselov (Collaborator) left a comment:

First part

  • src/common/snippets/include/snippets/op/fill.hpp (resolved)
  • src/common/snippets/include/snippets/pass/insert_loops.hpp (resolved)
  • src/common/snippets/src/pass/insert_loops.cpp (resolved)
  • src/common/snippets/src/pass/insert_loops.cpp (resolved)
  • src/common/snippets/src/pass/insert_loops.cpp (resolved)
  • src/common/snippets/src/pass/insert_loops.cpp (resolved)
IvanNovoselov (Collaborator) left a comment:

Second part

// Propagate down to the Loads: a Buffer can have several Loads and Loops after it, so we should go through all target inputs
{
std::function<void(const Input<Node>&)> propagate_down;
propagate_down = [&](const Input<Node>& target_input) {
Collaborator:

I'm a bit concerned that we intend to change a node below the matched one. If we have another transformation (after the present one; let's call it Next) that matches on Loads, the following situation could arise:

  1. The present transformation matches on node n.
  2. Next matches on n's child.
  3. The present transformation updates n's child.
  4. Next overwrites n's child, ignoring the update from the present transformation, since the update was made after the match.

As far as I understand, the matcher-pass mechanics guarantee that it's safe to modify parent nodes and the present node (as long as you return true), but modification of child nodes could cause a transformation conflict as described.
Could you please make sure that we won't have such problems here?

a-sidorova (Owner, Author):

Thank you for the detailed example! But honestly, I don't see the possible problem. The pass matches on a Buffer node and updates the corresponding operations such as Store, Load, Brgemm; to find them we have to go over all LoopBase ops. What I mean is that each Buffer node has its own MemoryAccess nodes, and they're different for each Buffer node. Yes, Buffer nodes can share the same LoopBase nodes, but we don't touch those, only the MemoryAccess ones.

Collaborator:

OK, then a subsequent matcher pass that modifies MemoryAccess ops can overwrite your changes (it can be a completely different transformation, not connected to the Buffer).

a-sidorova (Owner, Author):

As we discussed offline, I left a comment.
8967b35
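
For readers following along, here is a minimal sketch of the pass shape being discussed: a MatcherPass that matches on a Buffer and then updates its consumers. The class name, include paths, and callback body are illustrative assumptions; only the "modify the children of the matched node" structure, the subject of this thread, mirrors the actual pass.

#include <ngraph/pass/graph_rewrite.hpp>
#include <ngraph/pattern/op/wrap_type.hpp>
#include <snippets/op/buffer.hpp>

// Hypothetical sketch, not the actual pass from this PR.
class BufferConsumersSketch : public ngraph::pass::MatcherPass {
public:
    BufferConsumersSketch() {
        const auto buffer_m = ngraph::pattern::wrap_type<ngraph::snippets::op::Buffer>();
        ngraph::matcher_pass_callback callback = [](ngraph::pattern::Matcher& m) {
            const auto buffer = m.get_match_root();
            // The hazard under discussion lives here: we update nodes *below* the match.
            // A subsequent matcher pass that already matched one of these consumers could
            // overwrite the update, so the pass relies on the invariant that each Buffer
            // owns its MemoryAccess consumers (see the code comment referenced above).
            for (const auto& target_input : buffer->output(0).get_target_inputs()) {
                // ... adjust the Load/Store-like consumer via target_input.get_node() ...
                (void)target_input;
            }
            return true;  // modifying the matched node itself (or its parents) is safe
        };
        register_matcher(std::make_shared<ngraph::pattern::Matcher>(buffer_m, "BufferConsumersSketch"), callback);
    }
};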

  • src/common/snippets/src/pass/insert_movebroadcast.cpp (resolved)
//  Maximum           |   /
//     /           LoopEnd
//  HorizonMax        |
//     \           LoopBegin[Sub + Exp + ReduceSum]
Collaborator:

This is a good example of why we need to move control-flow operations to a different IR.
The same applies to the add_control_dependency block on L157-164.

// Eliminate Reshape before Softmax
reshape0->output(0).replace(reshape0->input_value(0));
copy_runtime_info({reshape0->input_value(0).get_node_shared_ptr(), reshape0->output(0).get_node_shared_ptr()},
                  reshape0->input_value(0).get_node_shared_ptr());
Collaborator:

It's interesting that reshape0->input_value(0).get_node_shared_ptr() appears in both the from and to arguments. Why do we need to copy rt_info from a node to itself?

a-sidorova (Owner, Author):

For the second Reshape I call replace_output_update_name because we need to preserve the output names. When we remove the first Reshape we don't need to preserve the name, so I just call reshape0->output(0).replace(reshape0->input_value(0)). But to stay aligned with replace_output_update_name I still call copy_runtime_info, because the same line is present inside replace_output_update_name.
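
Condensed, the asymmetry described above looks like this (reshape0/reshape1 as in the quoted diffs; ngraph::replace_output_update_name is the standard helper, and the comments summarize this thread):

// First Reshape (before Softmax): no output names to preserve, a plain replace suffices;
// copy_runtime_info is called only to mirror what replace_output_update_name does internally.
reshape0->output(0).replace(reshape0->input_value(0));

// Second Reshape (after Softmax): the helper also transfers the output names to the
// producer, so the subgraph's output names survive the Reshape removal.
ngraph::replace_output_update_name(reshape1->output(0), reshape1->input_value(0));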

a-sidorova requested reviews from IvanNovoselov, v-Golubev and dmitry-gorokhov and removed the request for v-Golubev on December 20, 2022 15:06
dmitry-gorokhov (Collaborator) left a comment:

LGTM

dmitry-gorokhov (Collaborator):

@IvanNovoselov, @v-Golubev, any remaining comments?

v-Golubev (Collaborator) left a comment:

No comments from my side that could block the merge.

Comment on lines 71 to 77
auto data = std::make_shared<opset1::Parameter>(element::f32, Shape{1, 2, 340, 240});
auto shape0 = std::make_shared<ov::op::v0::Constant>(ov::element::i32, ov::Shape{2}, std::vector<int32_t>{2, 81600});
auto reshape0 = std::make_shared<ov::op::v1::Reshape>(data, shape0, false);
auto softmax_v1 = std::make_shared<ov::op::v8::Softmax>(reshape0, -1);
auto shape1 = std::make_shared<ov::op::v0::Constant>(ov::element::i32, ov::Shape{4}, std::vector<int32_t>{1, 2, 340, 240});
auto reshape1 = std::make_shared<ov::op::v1::Reshape>(softmax_v1, shape1, false);
function_ref = std::make_shared<Function>(NodeVector{reshape1}, ParameterVector{data});
Collaborator:

I completely forgot to say that for negative test cases (when the transformation changes nothing) we don't need to define the function_ref field of the TransformationTestsF class: this is handled automatically. Sorry about that.

a-sidorova (Owner, Author):

Don't worry! Thank you very much for the small introduction to transformation tests! 😄
Removed.
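
A sketch of the simplified negative test under that convention (test and pass names are assumed, and the model is trimmed from the quoted one):

TEST_F(TransformationTestsF, SoftmaxReshapeEliminationNegativeSketch) {
    auto data = std::make_shared<opset1::Parameter>(element::f32, Shape{2, 81600});
    auto softmax = std::make_shared<ov::op::v8::Softmax>(data, -1);
    function = std::make_shared<Function>(NodeVector{softmax}, ParameterVector{data});
    manager.register_pass<ngraph::snippets::pass::SoftmaxReshapeElimination>();
    // No function_ref: when the transformation changes nothing, TransformationTestsF
    // compares the transformed function against an unchanged copy automatically.
}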


void SoftmaxTests::SetUp() {
const size_t count = 10;
manager.register_pass<ngraph::pass::InitNodeInfo>();
Collaborator:

We don't need to register InitNodeInfo here or on L62, because it is registered in the base class.

a-sidorova (Owner, Author):

You're right that it's registered in the base class, but here we override that method (it doesn't work like a constructor, the base version isn't called automatically). Still, I take your point: I added a LoweringTests::SetUp() call instead of the explicit pass registration.
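
The resulting fix then presumably looks like this sketch:

void SoftmaxTests::SetUp() {
    LoweringTests::SetUp();  // the base class registers the common passes, e.g. InitNodeInfo
    const size_t count = 10;
    // ... build the test subgraph and register the remaining passes as before,
    // without re-registering InitNodeInfo ...
}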

IvanNovoselov (Collaborator) left a comment:

We just discussed a tiny comment offline, but it shouldn't block the merge. Everything is as good as it could be.

dmitry-gorokhov merged commit f86fd91 into feature/snippets/mha on Dec 21, 2022