[WIP] Refactored weight compression for further unification. #2181
Conversation
const_shape = nncf_node.layer_attributes.constant_attributes[weight_port_id]["shape"]
channel_axes = get_weight_channel_axes(nncf_node, weight_port_id)
axes = get_channel_agnostic_reduction_axes(channel_axes, const_shape)
@l-bat note that the reduction axes are now found differently, using the node and its layer_attributes.
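A minimal sketch of what get_channel_agnostic_reduction_axes computes, assuming it returns every axis of the constant shape except the channel axes (the actual NNCF utility may handle negative axes and other corner cases differently):

from typing import Tuple


def get_channel_agnostic_reduction_axes(channel_axes: Tuple[int, ...], shape: Tuple[int, ...]) -> Tuple[int, ...]:
    # Keep the channel dimensions, reduce over everything else.
    return tuple(axis for axis in range(len(shape)) if axis not in channel_axes)


# Example: a [out_channels, in_channels] MatMul weight with channel axis 0
# is reduced over axis 1 only.
assert get_channel_agnostic_reduction_axes((0,), (256, 128)) == (1,)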
weight_port_ids = nncf_node.layer_attributes.get_const_port_ids()
for weight_port_id in weight_port_ids:
    weight_op_friendly_name = nncf_node.layer_attributes.constant_attributes[weight_port_id]["name"]
    weight_node = friendly_name_to_op_map[weight_op_friendly_name]
    if weight_node is None:
@l-bat please pay attention to the way weight_node is obtained and how port_id is handled. Previously it was different:

allowed_metatypes_to_const_port = {OVEmbeddingMetatype: [0], OVMatMulMetatype: [0, 1]}
for node in model.get_ordered_ops():
    for const_port_id in allowed_metatypes_to_const_port[metatype]:
        weight_node = get_operation_const_op(node, const_port_id)
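The new flow resolves weight constants by friendly name through friendly_name_to_op_map. A hedged sketch of how such a map could be built in one pass over the model (the helper name is an assumption, not necessarily what this PR does):

import openvino.runtime as ov


def build_friendly_name_to_op_map(model: ov.Model) -> dict:
    # Hypothetical helper: one pass over the model's ops instead of probing
    # every allowed const port of every node as in the old implementation.
    return {op.get_friendly_name(): op for op in model.get_ops()}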
@staticmethod
def is_node_with_weights(node: NNCFNode) -> bool:
    return node.layer_attributes and node.layer_attributes.constant_attributes
@l-bat this is similar to the SmoothQuant logic, but it was not present in the original implementation of weight compression.
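For context, a sketch of how such a check can be used to select candidate nodes from the NNCF graph; the function and parameter names here are illustrative, not taken from the PR:

def find_weighted_nodes(graph, supported_metatypes):
    # Keep only nodes of supported metatypes that actually carry constant weights.
    return [
        node
        for node in graph.get_all_nodes()
        if node.metatype in supported_metatypes
        and node.layer_attributes is not None
        and node.layer_attributes.constant_attributes
    ]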
compression_algorithm = WeightCompression(mode, ratio, group_size)
graph = NNCFGraphFactory.create(model)
return compression_algorithm.apply(model, graph)
The main difference is that the current implementation requires the nncf_graph, but it shouldn't introduce much overhead. I will provide time and memory measurements later.
cc: @alexsu52
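From the user's perspective the call is unchanged, since graph construction happens inside the backend-specific implementation shown above. A minimal usage sketch:

import nncf

# The NNCF graph is built internally, so callers only pass the model.
compressed_model = nncf.compress_weights(model)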
Codecov Report

Additional details and impacted files:

@@            Coverage Diff             @@
##           develop    #2181     +/-   ##
===========================================
+ Coverage    36.18%   36.36%   +0.18%
===========================================
  Files          479      478       -1
  Lines        42960    42978      +18
===========================================
+ Hits         15546    15630      +84
+ Misses       27414    27348      -66
Changes
Refactored weight compression for OpenVINO and Torch to have a single entry point: the WeightCompression algorithm. It is now based on the NNCF graph and has a common method that finds nodes for compression.
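A rough sketch of the unified flow described above; the class internals, backend attributes, and method names are placeholders rather than the exact API introduced by this PR:

class WeightCompression:
    def __init__(self, mode, ratio, group_size):
        self._mode, self._ratio, self._group_size = mode, ratio, group_size
        self._backend = None  # set per framework (OpenVINO, Torch)

    def apply(self, model, graph):
        # Backend-agnostic node selection followed by backend-specific compression.
        nodes_to_compress = self._get_nodes_to_compress(graph)
        return self._backend.do_compression(
            model, nodes_to_compress, self._mode, self._ratio, self._group_size
        )

    def _get_nodes_to_compress(self, graph):
        return [
            node
            for node in graph.topological_sort()
            if node.metatype in self._backend.weighted_metatypes
            and self._backend.is_node_with_weights(node)
        ]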
Reason for changes
The refactoring makes it possible to support the ignored scope without code duplication.
Related tickets
122223
Tests
weight compression tests