-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NPUW] New compute patterns #27618
[NPUW] New compute patterns #27618
Conversation
…into as/npuw_new_compute_patterns
@@ -723,6 +729,15 @@ std::shared_ptr<Repeated> Snapshot::tryMergeTriangles(const std::vector<Group::G | |||
return {}; | |||
} | |||
|
|||
if (prods.size() < m_ctx.keep_blocks) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can add a property to enable those 2 checks so it doesn't break the default config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmatveev thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've convinced myself that this check is harmless
} else if (isolate.pattern == "RMSNorm2") { | ||
rewr.add_matcher<ov::npuw::patterns::compute::RMSNorm2>(shared_from_this(), isolate.tag); | ||
handle_patterns = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this esle if
plague continues growing here - please file a follow-up item to clean this up & lets add it directly to the next sprint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a macros
if (prods.size() < m_ctx.keep_blocks) { | ||
// In some cases (specifically mixed precision) we could have merged | ||
// a small number of huge groups which consumed other legit repeated blocks | ||
// due to having a different repeated tag due to unique weights precision combination | ||
// from the rest of the model. | ||
// This check was added to prevent that and shouldn't affect other models. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get how the comment matches the check itself. You refer to some different repeated tag but check size against threshold. How is this supposed to work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets say we have 10 blocks with tag "aaa" and 3 blocks of "bbb". When we are trying to merge blocks further those 10 "aaa" blocks cannot be merged due to having only producers with the same "aaa" (we have this check which compares repeated tag with it's producer's tag somewhere in TryGrowRepeated() pass or somewhere near). But for 3 "bbb" blocks their producers have "aaa" block -> thus the merge is possible. We merge and assign a new tag for those "ccc". Then we can merge again and again due to the tag being different from the beginning. This would consume a large amount of blocks into 3 huge blocks.
This check prevent such case in the least invasive way - in the CleanupUniques() pass we would still drop those 3 blocks due to the same check. But at that point they are already huge and consumed legit blocks. I suggest we prevent such merges here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put this explanation in the comment
src/plugins/intel_npu/src/plugin/npuw/partitioning/online/snapshot.cpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_npu/src/plugin/npuw/partitioning/patterns/compute.cpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_npu/src/plugin/npuw/partitioning/patterns/compute.cpp
Outdated
Show resolved
Hide resolved
src/plugins/intel_npu/src/plugin/npuw/partitioning/patterns/compute.cpp
Outdated
Show resolved
Hide resolved
auto hadd = opp::wrap_type<ov::op::v1::Add>({opp::any_input(), opp::any_input()}); | ||
auto power = opp::wrap_type<ov::op::v1::Power>({hadd, opp::any_input()}); | ||
auto reduce = opp::wrap_type<ov::op::v1::ReduceSum>({power, opp::any_input()}); | ||
auto sqrt = opp::wrap_type<ov::op::v0::Sqrt>({reduce}); | ||
auto div = opp::wrap_type<ov::op::v1::Divide>({hadd, sqrt}); | ||
auto multiply = opp::wrap_type<ov::op::v1::Multiply>({opp::any_input(), div}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How different it is compared to the normal RMSNorm? Can those be combined via optional or choice over add vs reduce, etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not possible, the first Add connects to different nodes in the pattern
LOG_WARN("OPENVINO_NPUW_ISOLATE: unsupported pattern " << isolate.pattern << " is skipped!"); | ||
} | ||
#define HNDL(p) \ | ||
if (isolate.pattern == #p) { \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we have 1:1 mapping here between the type names (p
) and their written mnemonics (.pattern
) in the config?
// In some cases (specifically mixed precision) during MergeUniques() pass we could be left with | ||
// E.g. 10 repeated blocks with tag AAA and 2 repeated blocks with tag BBB | ||
// TryMergeRepeating() pass checks that producer and consumer have a different tag to be merged further. | ||
// Let's say in our example 10 AAA blocks are finalized and cannot be merged further due to above check. | ||
// However we will proceed to merge 3 BBB blocks with 3 AAA blocks since the tags are different. | ||
// This will create a new tag CCC for the merged blocks and the merge will continue until those 3 blocks | ||
// consume a large amount of legit AAA blocks. | ||
// Later in CleanUpUniques() pass those repeated blocks will be stripped off repeated tag due to the same check | ||
// in this "if". To prevent such cases where we would end up with small number of huge blocks this check was | ||
// introduced. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is the full copy of the text above, I'd just refer to one - like "see comment in...".
@@ -723,6 +715,20 @@ std::shared_ptr<Repeated> Snapshot::tryMergeTriangles(const std::vector<Group::G | |||
return {}; | |||
} | |||
|
|||
if (prods.size() < m_ctx.keep_blocks) { | |||
// In some cases (specifically mixed precision) during MergeUniques() pass we could be left with | |||
// E.g. 10 repeated blocks with tag AAA and 2 repeated blocks with tag BBB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The missing part for this great comment is to indicate that the structure may look like
AAA -> AAA -> AAA -> BBB -> AAA -> AAA -> AAA -> BBB
So our algorithm won't merge AAA + AAA
but will keep merging AAA + (AAA)BBB
so the RHS will grow enormously
No description provided.