Represent symmetrically quantized weights in signed data type #2434
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #2434       +/-   ##
============================================
+ Coverage    47.70%   91.18%   +43.47%
============================================
  Files          483      483
  Lines        46305    46363       +58
============================================
+ Hits         22090    42274    +20184
+ Misses       24215     4089    -20126
... and 296 files with indirect coverage changes
e892fec to 3b891ea
Thank you for this feature. Will GPTQ models also be saved as i4 automatically?
No, this feature will only be enabled for weight compression via NNCF.
Are you planning to extend this to GPTQ models? It would be very helpful. Current GPTQ models have a per-tensor zero point and u4 weights, so it would make sense to save them as symmetric i4.
I created ticket 131500 to support symmetrically quantized weights in signed data type for GPTQ.
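To illustrate why u4 weights with a constant zero point could be stored as i4 without one, here is a minimal NumPy sketch. It assumes the zero point is exactly 8 (the midpoint of the u4 range), which the thread does not state; `u4_to_i4` and `U4_ZERO_POINT` are hypothetical names, not NNCF or GPTQ APIs.

```python
import numpy as np

# Assumption: the per-tensor zero point equals 8, the midpoint of the u4 range [0, 15].
U4_ZERO_POINT = 8


def u4_to_i4(q_unsigned, zero_point):
    """Fold a constant zero point of 8 into the stored values (signed levels in [-8, 7])."""
    if not np.all(zero_point == U4_ZERO_POINT):
        raise ValueError("only a constant zero point of 8 maps u4 exactly onto i4")
    return q_unsigned.astype(np.int8) - np.int8(U4_ZERO_POINT)


q_u4 = np.array([0, 3, 8, 12, 15], dtype=np.uint8)
scale = np.float32(0.05)

# Asymmetric form: (q - zp) * scale
asym = (q_u4.astype(np.float32) - U4_ZERO_POINT) * scale

# Signed form: q * scale, with no zero point to store or subtract at runtime
q_i4 = u4_to_i4(q_u4, np.uint8(8))
sym = q_i4.astype(np.float32) * scale
```

Both forms dequantize to the same values; the signed form just drops the subtraction. If the zero point were anything other than 8, the shifted values would not fit the i4 range and this shortcut would not apply.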
Would be great to see the performance numbers similar to the ones from comment:
#2537 (comment)
Develop
l-bat:lt/wc_sym_signed
nncf/quantization/algorithms/weight_compression/openvino_backend.py
nncf/quantization/algorithms/weight_compression/torch_backend.py
nncf/quantization/algorithms/weight_compression/weight_lowering.py
ci job: 23
    return target, zero_mask


def get_near_to_ideal_scale(weight, target, zero_mask, importance):
Can we change this funny name to something more down-to-earth? For example, estimate_scales, tune_scales, etc.
@@ -165,8 +165,6 @@ def apply(
original_weight = fns.zeros_like(weight) + weight

compressed_weights, scale, zp = do_integer_quantization(original_weight, reduction_axis, config)
zp = zp.astype(scale.dtype)
if zp is not None:
    zp = zp.astype(scale.dtype)
This conversion is important for performance.
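The guard above can be sketched in isolation as follows. This uses NumPy arrays in place of NNCF tensors, and `cast_zero_point` is a hypothetical helper introduced only for illustration; it is not part of NNCF.

```python
from typing import Optional

import numpy as np


def cast_zero_point(zp: Optional[np.ndarray], scale: np.ndarray) -> Optional[np.ndarray]:
    """Cast the zero point to the scale's dtype, or pass None through unchanged."""
    if zp is None:
        # Symmetric case: there is no zero point to convert.
        return None
    return zp.astype(scale.dtype)


scale = np.array([0.1, 0.2], dtype=np.float16)

# Asymmetric case: an integer zero point is cast to float16 to match the scale.
zp_asym = cast_zero_point(np.array([8, 7], dtype=np.int32), scale)

# Symmetric case: calling .astype on None would raise AttributeError without the guard.
zp_sym = cast_zero_point(None, scale)
```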
LGTM
Changes
Represent symmetrically quantized weights in signed data type with no zero point
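As a rough illustration of the representation described above, the following NumPy sketch quantizes weights symmetrically to signed 4-bit levels in [-8, 7] and dequantizes them with a scale only. The function names are hypothetical, the scale convention (largest magnitude mapped to level 7) is an assumption, and the 4-bit values are held in `int8` for simplicity; the actual NNCF implementation may differ on all of these points.

```python
import numpy as np


def quantize_symmetric_int4(weight, axis=-1):
    """Symmetric quantization to signed levels in [-8, 7]; returns (levels, scale)."""
    max_abs = np.max(np.abs(weight), axis=axis, keepdims=True)
    # Assumed convention: the largest magnitude along `axis` maps to level 7.
    # All-zero slices get scale 1 to avoid division by zero.
    scale = np.where(max_abs == 0, 1.0, max_abs / 7.0).astype(weight.dtype)
    levels = np.clip(np.round(weight / scale), -8, 7).astype(np.int8)
    return levels, scale


def dequantize(levels, scale):
    """Signed levels need only a multiply; there is no zero point to subtract."""
    return levels.astype(scale.dtype) * scale


weight = np.array(
    [[0.7, -0.35, 0.1, 0.0], [-1.4, 0.2, 0.6, 1.4]], dtype=np.float32
)
levels, scale = quantize_symmetric_int4(weight)
restored = dequantize(levels, scale)
```

Because zero in floating point maps to level 0 exactly, the dequantization step reduces to `levels * scale`.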
Reason for changes
Related tickets
130625
Tests
Updated: tests/torch/ptq/test_weights_compression.py and tests/openvino/native/quantization/test_weights_compression.py
Merge after: openvinotoolkit/openvino#24457