[Snippets] Added Softmax support #57
Conversation
Force-pushed from a025b91 to 026360c
Force-pushed from 026360c to f916fe2
Force-pushed from f916fe2 to 70a7e56
Force-pushed from 70a7e56 to 53dc219
[Snippets] Added support for Reshape around Softmax
applied comment part
Added config parameter to disable MHA ops tokenization
Buffer 2D Loops
Force-pushed from 53dc219 to 9d2d721
first part
src/common/snippets/src/pass/load_movebroadcast_to_broadcastload.cpp (outdated review thread, resolved)
@v-Golubev could you please take a look at the transformations? Maybe you have some generic comments.
@IvanNovoselov do you have any additional comments on the latest version?
/* ====== ReduceMax decomposition ====== */
const auto vector_buffer_max = std::make_shared<ngraph::snippets::op::VectorBuffer>();
const auto loop_max_begin = ngraph::snippets::op::insertLoopBegin(ngraph::OutputVector{data, data});
I believe that the data was added to OutputVector intentionally, but could you please leave a comment explaining why we do that?
Honestly, I don't understand your comment :) Could you clarify what you mean? Why the LoopBegin input is data? Or why we add data twice? Or why we use ngraph::OutputVector? Or something else?
Sorry, it looks like I lost a couple of words while translating them from thoughts into a comment. I meant that it isn't clear why we add data twice.
Oh, it's a long story... But I tried to explain it in the comment. Thank you!
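Since that explanation lives only in an in-code comment, here is a hedged sketch of the relevant lines with the reasoning spelled out; the explanatory comment is my reading of this discussion (an assumption), not the author's exact wording:

// Sketch based on the snippet above; ngraph::snippets APIs as used in this PR.
const auto vector_buffer_max = std::make_shared<ngraph::snippets::op::VectorBuffer>();
// Assumption: data is passed twice so that one LoopBegin output feeds the Load inside the
// max-reduction loop body, while the second output is carried through the loop and is
// consumed by the following loop (Sub + Exp + ReduceSum) once the maximum has been computed.
const auto loop_max_begin = ngraph::snippets::op::insertLoopBegin(ngraph::OutputVector{data, data});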
4a1f00b
src/common/snippets/tests/src/pass/softmax_reshape_elimination.cpp (outdated review thread, resolved)
Force-pushed from 339a1b8 to 1cac5d5
auto reshape1 = std::make_shared<ov::op::v1::Reshape>(softmax_v1, shape1, false);
function = std::make_shared<Function>(NodeVector{reshape1}, ParameterVector{data});

manager.register_pass<pass::InitNodeInfo>();
Minor: the InitNodeInfo pass is added by default in the TransformationTestsF::SetUp() method, so we can remove it.
Removed, thanks!
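For context, a hedged sketch of how the test tail reads after this removal; the variable names follow the snippet above, and the registered pass namespace (ngraph::snippets::pass) is an assumption:

auto reshape1 = std::make_shared<ov::op::v1::Reshape>(softmax_v1, shape1, false);
function = std::make_shared<Function>(NodeVector{reshape1}, ParameterVector{data});
// InitNodeInfo is already registered by TransformationTestsF::SetUp(), so only the tested pass remains.
manager.register_pass<ngraph::snippets::pass::SoftmaxReshapeElimination>();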
4a1f00b
First part
Second part
// Propagate down to Loads: a Buffer can have several Loads and Loops after it, so we should go through all target inputs
{
    std::function<void(const Input<Node>&)> propagate_down;
    propagate_down = [&](const Input<Node>& target_input) {
I'm a bit concerned that we intend to change a node below the matched one. If we have another transformation (after the present one - let's call it Next) that matches on Loads, there could be the following situation:
1. The Present transformation matches on node n.
2. Next matches on n's child.
3. Present updates n's child.
4. Next overwrites n's child, ignoring the update from Present, since it was made after the match.
As far as I understand, the MatcherPass mechanics guarantee that it's safe to modify parent nodes and the present node (as long as you return true), but modification of child nodes could cause a transformation conflict as described.
Could you make sure that we won't have such problems here, please?
Thank you for the detailed example! But honestly I don't understand the possible problems. The pass matches on a Buffer node and updates the corresponding operations like Store, Load, Brgemm. To find them we should go over all LoopBase ops. I mean that a Buffer node has its own MemoryAccess nodes and they're different for each Buffer node. Yeah, Buffer nodes can have the same LoopBase nodes, but we don't touch them, only MemoryAccess.
Ok, then a subsequent matcher pass that modifies MemoryAccess can overwrite your changes (it can be a completely different transformation, not connected to the Buffer).
As we discussed offline, I left the comment
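To make the ordering concern concrete, here is a hedged, generic sketch of the scenario described above. Present and Next are the placeholder names from this discussion; the pattern types and callback bodies are purely illustrative and are not code from this PR:

// Two matcher passes registered in the same pipeline: if Present mutates a child of its
// matched node and Next matches on that child, Next can later rewrite the same child and
// overwrite Present's update, since Next's match was taken before that update.
class Present : public ngraph::pass::MatcherPass {
public:
    OPENVINO_RTTI("Present", "0");
    Present() {
        auto buffer = ngraph::pattern::wrap_type<ngraph::snippets::op::Buffer>();
        register_matcher(std::make_shared<ngraph::pattern::Matcher>(buffer, "Present"),
                         [](ngraph::pattern::Matcher& m) {
                             const auto node = m.get_match_root();
                             // ... update node's Load/Store consumers (child nodes) here ...
                             return true;
                         });
    }
};

class Next : public ngraph::pass::MatcherPass {
public:
    OPENVINO_RTTI("Next", "0");
    Next() {
        auto load = ngraph::pattern::wrap_type<ngraph::snippets::op::Load>();
        register_matcher(std::make_shared<ngraph::pattern::Matcher>(load, "Next"),
                         [](ngraph::pattern::Matcher& m) {
                             // ... may rewrite the matched Load, unaware of Present's change ...
                             return true;
                         });
    }
};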
8967b35
// Maximum | /
// / LoopEnd
// HorizonMax |
// \ LoopBegin[Sub + Exp + ReduceSum]
This is a good example of why we need to move control-flow operations to a different IR. The same goes for the add_control_dependency block on L157-164.
// Eliminate Reshape before Softmax
reshape0->output(0).replace(reshape0->input_value(0));
copy_runtime_info({reshape0->input_value(0).get_node_shared_ptr(), reshape0->output(0).get_node_shared_ptr()},
                  reshape0->input_value(0).get_node_shared_ptr());
It's interesting why reshape0->input_value(0).get_node_shared_ptr() is used in both the from and the to arguments. Why do we need to copy rt_info from a node to itself?
For the second Reshape I call replace_output_update_name because we should save the output names. When we remove the first Reshape we don't need to save the name, so I just call reshape0->output(0).replace(reshape0->input_value(0)). But to align with replace_output_update_name I call copy_runtime_info here, because the same code line is in replace_output_update_name.
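A hedged side-by-side sketch of the two elimination styles described above; the variable names follow the PR snippets, and the second call is my illustration of the replace_output_update_name path rather than the exact PR line:

// First Reshape (before Softmax): output names don't have to be preserved, so the output
// is re-pointed directly and rt_info is copied the same way replace_output_update_name
// does it internally.
reshape0->output(0).replace(reshape0->input_value(0));
ngraph::copy_runtime_info({reshape0->input_value(0).get_node_shared_ptr(), reshape0->output(0).get_node_shared_ptr()},
                          reshape0->input_value(0).get_node_shared_ptr());

// Second Reshape (after Softmax): its output names must survive, so the helper that also
// transfers names is used instead.
ngraph::replace_output_update_name(reshape1->output(0), reshape1->input_value(0));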
LGTM
@IvanNovoselov, @v-Golubev any remaining comments?
No comments from my side that could block the merge
auto data = std::make_shared<opset1::Parameter>(element::f32, Shape{1, 2, 340, 240});
auto shape0 = std::make_shared<ov::op::v0::Constant>(ov::element::i32, ov::Shape{2}, std::vector<int32_t>{2, 81600});
auto reshape0 = std::make_shared<ov::op::v1::Reshape>(data, shape0, false);
auto softmax_v1 = std::make_shared<ov::op::v8::Softmax>(reshape0, -1);
auto shape1 = std::make_shared<ov::op::v0::Constant>(ov::element::i32, ov::Shape{4}, std::vector<int32_t>{1, 2, 340, 240});
auto reshape1 = std::make_shared<ov::op::v1::Reshape>(softmax_v1, shape1, false);
function_ref = std::make_shared<Function>(NodeVector{reshape1}, ParameterVector{data});
I completely forgot to say that in the case of negative test cases (when the transformation changes nothing) we don't need to define the function_ref field of the TransformationTestsF class: this is handled automatically. Sorry for that.
Don't worry! Thank you very much for the short introduction to transformation tests! 😄
Removed
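A hedged sketch of how such a negative case then reads; the graph mirrors the snippet above, the fixture behavior (comparing against the original graph when function_ref is unset) comes from the comment above, and the test name and pass namespace are assumptions:

TEST_F(TransformationTestsF, SoftmaxV8ReshapeEliminationNegative) {
    auto data = std::make_shared<opset1::Parameter>(element::f32, Shape{1, 2, 340, 240});
    auto shape0 = std::make_shared<ov::op::v0::Constant>(ov::element::i32, ov::Shape{2}, std::vector<int32_t>{2, 81600});
    auto reshape0 = std::make_shared<ov::op::v1::Reshape>(data, shape0, false);
    auto softmax_v1 = std::make_shared<ov::op::v8::Softmax>(reshape0, -1);
    auto shape1 = std::make_shared<ov::op::v0::Constant>(ov::element::i32, ov::Shape{4}, std::vector<int32_t>{1, 2, 340, 240});
    auto reshape1 = std::make_shared<ov::op::v1::Reshape>(softmax_v1, shape1, false);
    function = std::make_shared<Function>(NodeVector{reshape1}, ParameterVector{data});
    manager.register_pass<ngraph::snippets::pass::SoftmaxReshapeElimination>();
    // No function_ref: when it is left unset, the fixture compares against the original graph.
}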
void SoftmaxTests::SetUp() {
    const size_t count = 10;
    manager.register_pass<ngraph::pass::InitNodeInfo>();
We don't need to register InitNodeInfo here and on L62 because it is registered in the base class.
You're right that it's registered in the base class, but we just override this method (it doesn't work like a ctor). I understand your point though, and I added a LoweringTests::SetUp() call instead of the explicit pass registration.
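A hedged sketch of the change just described; the method and class names come from this discussion, the rest of the body is illustrative:

void SoftmaxTests::SetUp() {
    LoweringTests::SetUp();   // base-class setup registers the common passes (e.g. InitNodeInfo)
    const size_t count = 10;  // test-local constant kept from the original snippet
    // ... register the Softmax lowering passes under test here ...
}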
Just discussed a tiny comment offline, but it shouldn't block the merge
Everything is as good as it could be
Details:
- Added Softmax support to Snippets partially: added the corresponding config parameter to disable Softmax in the Snippets pipeline to avoid performance regressions, and enabled it in tests for validation
- Added support for Reshape around Softmax via the SoftmaxReshapeElimination pass that removes the Reshape ops

Tickets:
Blockers:
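Finally, a minimal hedged sketch of running the SoftmaxReshapeElimination pass mentioned above on its own; the header path, namespace, and helper function are assumptions, only the pass name comes from this PR:

#include <ngraph/pass/manager.hpp>
#include "snippets/pass/softmax_reshape_elimination.hpp"   // assumed header location

void eliminate_softmax_reshapes(const std::shared_ptr<ngraph::Function>& f) {
    ngraph::pass::Manager manager;
    // Removes the Reshape ops around Softmax so the Softmax pattern can be tokenized by Snippets.
    manager.register_pass<ngraph::snippets::pass::SoftmaxReshapeElimination>();
    manager.run_passes(f);
}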