-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Snippets] Added support of BF16/I8/U8 for MatMul #15063
[Snippets] Added support of BF16/I8/U8 for MatMul #15063
Conversation
Wow, tons of the code:) |
The main part of the code is here #14327 😋 |
61c6b42
to
fe72da7
Compare
f3b50fd
to
40b569d
Compare
fa1bb56
to
80e945e
Compare
80e945e
to
255aa95
Compare
5c39a4e
to
b4364ee
Compare
b4364ee
to
5deb3c3
Compare
const PortDescriptor& get_output_port_descriptor(const size_t i) const; | ||
|
||
void set_input_count(size_t count, size_t idx); | ||
void set_output_count(size_t count, size_t idx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It makes sense to have a default
idx = 0
value - We used to have only one count for both inputs and outputs, so I'm a little confused here now. If I want to set count for
Load
should I useset_input_count
orset_output_count
? And if onlyset_input_count
is legal for such operations, what happens if Iset_output_count
? I've seen that you createdset_count
forLoad
andStore
, and I think this is a right direction, but stillset_input_count
could be called on load, which is a bit confusing. I don't insist, but should we make a separate class for single-port operations maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Nice idea. let's do it, thanks
Load
works with data on input and with vector register on output. So you should work withset_input_count
. And opposite forStore
. I thought that it seems like logical things. We can discuss it
const bool is_f32 = intype_0 == element::f32 && intype_1 == element::f32; | ||
const bool is_int8 = (intype_0 == element::i8 || intype_0 == element::u8) && (intype_1 == element::i8); | ||
const bool is_bf16 = intype_0 == element::bf16 && intype_1 == element::bf16; | ||
return is_f32 || is_bf16 || is_int8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, it's not in plugin callback now, but in collapse subgraph
@@ -24,7 +24,8 @@ ngraph::snippets::pass::SetScalarCountForLoad::SetScalarCountForLoad() { | |||
if (!load) | |||
return false; | |||
|
|||
load->set_count(1lu); | |||
auto& desc = load->get_input_port_descriptor(0); | |||
desc.m_count = 1lu; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so now we have a choice set_input_count
/ set_output_count
or simply set_count
🙃 . Which one should I choose?)
586263e
to
c76f658
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve with minor leftovers
6b041f7
to
5c31c6c
Compare
if (m_type == Type::NewMemory) { | ||
OPENVINO_ASSERT(get_input_size() == 0, "Buffer with new allocated memory must to not have arguments!"); | ||
output_shape = m_shape; | ||
output_type = ov::element::u8; // 1Byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't it break precision propogation in case child operation expects different precision on input?
const bool is_f32 = intype_0 == element::f32 && intype_1 == element::f32; | ||
const bool is_int8 = (intype_0 == element::i8 || intype_0 == element::u8) && (intype_1 == element::i8); | ||
const bool is_bf16 = intype_0 == element::bf16 && intype_1 == element::bf16; | ||
return is_f32 || is_bf16 || is_int8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to avoid code duplication. It might static method of Brgemm op that returns undef element type if combination of iniputs is invalid.
Details:
Tickets:
Blockers:
TODO:
get_supported_precisions
forBrgemm