
[WIP] Refactored weight compression for further unification. #2181

Conversation

@ljaljushkin (Contributor) commented Oct 9, 2023

Changes

Refactored weight compression for OpenVINO and Torch to have a single entry point: the WeightCompression algorithm.
It now operates on the NNCF graph and shares a common method for finding nodes to compress.
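The unified flow described above can be sketched roughly as follows. This is a minimal stand-in, not the real NNCF code: `NNCFNode` and `NNCFGraph` here are simplified dataclasses, and only the node-selection step is shown.

```python
from dataclasses import dataclass, field
from typing import List

# Simplified stand-ins for NNCF types; the real classes live in the nncf package.
@dataclass
class NNCFNode:
    node_name: str
    layer_attributes: dict = field(default_factory=dict)

@dataclass
class NNCFGraph:
    nodes: List[NNCFNode]

class WeightCompression:
    """Single entry point: works on the backend-agnostic NNCF graph."""

    def _get_nodes_to_compress(self, graph: NNCFGraph) -> List[NNCFNode]:
        # Common method that finds nodes for compression: only nodes
        # that actually carry constant weights qualify.
        return [n for n in graph.nodes if self.is_node_with_weights(n)]

    @staticmethod
    def is_node_with_weights(node: NNCFNode) -> bool:
        return bool(node.layer_attributes
                    and node.layer_attributes.get("constant_attributes"))

graph = NNCFGraph([
    NNCFNode("matmul", {"constant_attributes": {1: {"shape": (16, 32)}}}),
    NNCFNode("relu"),
])
targets = WeightCompression()._get_nodes_to_compress(graph)
# targets contains only the "matmul" node
```

Because the selection works purely on graph-level attributes, the same logic can serve both the OpenVINO and Torch backends.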

Reason for changes

The refactoring makes it easy to support ignored scope without code duplication.

Related tickets

122223

Tests

  • weight compression tests
  • evaluate time and memory consumption
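A minimal sketch of how time and memory consumption could be evaluated with the standard library (`time` plus `tracemalloc`); the workload below is a placeholder, not a real compression run:

```python
import time
import tracemalloc

def measure(fn, *args):
    """Return (result, elapsed_seconds, peak_bytes) for one call to fn."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak

# Placeholder workload; a real measurement would call the compression entry point.
_, elapsed, peak = measure(lambda: [i * i for i in range(100_000)])
print(f"time: {elapsed:.3f}s, peak memory: {peak / 2**20:.1f} MiB")
```

`tracemalloc` only tracks Python-level allocations, so native tensor memory would need a separate tool (e.g. an OS-level RSS probe).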

@github-actions bot added the NNCF PT, NNCF OpenVINO, NNCF ONNX, and NNCF PTQ labels and removed the NNCF ONNX label Oct 9, 2023
@openvino-nncf-ci openvino-nncf-ci added the API Public API-impacting changes label Oct 9, 2023
@ljaljushkin ljaljushkin force-pushed the nl/compress_ingored_scope branch from 089c389 to 3719612 Compare October 9, 2023 13:52
@ljaljushkin ljaljushkin changed the title Nl/compress ingored scope Refactored weight compression for further unification. Oct 9, 2023
@ljaljushkin ljaljushkin marked this pull request as ready for review October 9, 2023 13:54
@ljaljushkin ljaljushkin requested a review from a team as a code owner October 9, 2023 13:54
@openvino-nncf-ci openvino-nncf-ci removed the API Public API-impacting changes label Oct 9, 2023
@ljaljushkin ljaljushkin requested review from l-bat, alexsu52, a team, andreyanufr and AlexKoff88 and removed request for a team October 9, 2023 13:54
Comment on lines +95 to +97
const_shape = nncf_node.layer_attributes.constant_attributes[weight_port_id]["shape"]
channel_axes = get_weight_channel_axes(nncf_node, weight_port_id)
axes = get_channel_agnostic_reduction_axes(channel_axes, const_shape)
ljaljushkin (Contributor, Author):

@l-bat reduction axes are found differently using node and layer_attributes
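For illustration, `get_channel_agnostic_reduction_axes` in the excerpt above presumably returns every tensor axis except the channel axes. A plausible pure-function sketch (an assumption, not the actual NNCF implementation):

```python
from typing import Sequence, Tuple

def get_channel_agnostic_reduction_axes(channel_axes: Sequence[int],
                                        shape: Sequence[int]) -> Tuple[int, ...]:
    # Reduce over every axis that is not a channel axis; negative
    # channel indices are normalized against the tensor rank first.
    rank = len(shape)
    channels = {a % rank for a in channel_axes}
    return tuple(a for a in range(rank) if a not in channels)

# For a (16, 32) MatMul weight with output channels on axis 0,
# statistics would be reduced over axis 1 only.
axes = get_channel_agnostic_reduction_axes([0], (16, 32))
```

Deriving the channel axes from the node and its `layer_attributes`, as the new code does, avoids hard-coding per-metatype axis conventions.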

Comment on lines +82 to +86
weight_port_ids = nncf_node.layer_attributes.get_const_port_ids()
for weight_port_id in weight_port_ids:
    weight_op_friendly_name = nncf_node.layer_attributes.constant_attributes[weight_port_id]["name"]
    weight_node = friendly_name_to_op_map[weight_op_friendly_name]
    if weight_node is None:
ljaljushkin (Contributor, Author):

@l-bat please pay attention to the way weight_node is obtained and how port_id is handled.
Previously it was done differently:

allowed_metatypes_to_const_port = {OVEmbeddingMetatype: [0], OVMatMulMetatype: [0, 1]}

for node in model.get_ordered_ops():
    for const_port_id in allowed_metatypes_to_const_port[metatype]:
        weight_node = get_operation_const_op(node, const_port_id)
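The new approach instead builds a name-to-op lookup once and resolves each weight constant by the friendly name stored in the node's `constant_attributes`. A simplified sketch with stand-in op objects (real OpenVINO ops do expose `get_friendly_name()`, but the `Op` class here is a placeholder):

```python
# Stand-in op objects; real OpenVINO ops expose get_friendly_name().
class Op:
    def __init__(self, name: str):
        self._name = name

    def get_friendly_name(self) -> str:
        return self._name

ops = [Op("matmul/weights"), Op("matmul")]

# Build the lookup once, then resolve each weight node in O(1)
# by the name stored in the NNCF node's constant_attributes.
friendly_name_to_op_map = {op.get_friendly_name(): op for op in ops}
weight_node = friendly_name_to_op_map.get("matmul/weights")
assert weight_node is not None
```

This replaces the per-metatype `allowed_metatypes_to_const_port` table with port ids recorded directly in the graph's layer attributes.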


@staticmethod
def is_node_with_weights(node: NNCFNode) -> bool:
    return node.layer_attributes and node.layer_attributes.constant_attributes
ljaljushkin (Contributor, Author):

@l-bat this is similar to the SmoothQuant logic, but it was not present in the original weight compression implementation.

Comment on lines +269 to +271
compression_algorithm = WeightCompression(mode, ratio, group_size)
graph = NNCFGraphFactory.create(model)
return compression_algorithm.apply(model, graph)
ljaljushkin (Contributor, Author):

The main difference is that the current implementation requires the nncf_graph.
It shouldn't introduce much overhead, though; time and memory measurements will follow.
cc: @alexsu52

codecov bot commented Oct 9, 2023

Codecov Report

Merging #2181 (3719612) into develop (881cca6) will increase coverage by 0.18%.
Report is 1 commit behind head on develop.
The diff coverage is 42.30%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2181      +/-   ##
===========================================
+ Coverage    36.18%   36.36%   +0.18%     
===========================================
  Files          479      478       -1     
  Lines        42960    42978      +18     
===========================================
+ Hits         15546    15630      +84     
+ Misses       27414    27348      -66     
Files Coverage Δ
nncf/openvino/quantization/quantize_model.py 0.00% <ø> (ø)
nncf/torch/quantization/quantize_model.py 0.00% <ø> (ø)
nncf/quantization/quantize_model.py 49.25% <40.00%> (-3.38%) ⬇️
...ization/algorithms/weight_compression/algorithm.py 42.55% <42.55%> (ø)

... and 10 files with indirect coverage changes

@ljaljushkin ljaljushkin changed the title Refactored weight compression for further unification. [WIP] Refactored weight compression for further unification. Oct 9, 2023
@ljaljushkin ljaljushkin closed this Oct 9, 2023
Labels
NNCF OpenVINO · NNCF PT · NNCF PTQ
2 participants