New metrics for weight compression with dynamic quantization #2829

Closed

@ljaljushkin ljaljushkin commented Jul 19, 2024

Changes

Use default runtime options for weight compression in conformance tests.
Dynamic quantization of activations has been enabled by default since at least 2024.4: openvinotoolkit/openvino#25054
KV cache compression to u8 has been the default since 2025: openvinotoolkit/openvino#27454
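
As a rough illustration of the difference between pinned and default runtime options, here is a minimal sketch. The helper `build_ov_config` is hypothetical and not part of this PR. The property names `DYNAMIC_QUANTIZATION_GROUP_SIZE` and `KV_CACHE_PRECISION`, and the pinned values, are assumptions about the OpenVINO runtime configuration; neither is spelled out in this PR.

```python
from typing import Dict

# Assumed OpenVINO runtime property names (not taken from this PR).
DQ_GROUP_SIZE_KEY = "DYNAMIC_QUANTIZATION_GROUP_SIZE"
KV_CACHE_PRECISION_KEY = "KV_CACHE_PRECISION"


def build_ov_config(use_runtime_defaults: bool) -> Dict[str, str]:
    """Hypothetical helper: runtime options for a weight-compressed model."""
    if use_runtime_defaults:
        # Pass no overrides, so the runtime applies its own defaults
        # (dynamic quantization of activations, u8 KV cache).
        return {}
    # Pin the options explicitly to switch both features off.
    return {
        DQ_GROUP_SIZE_KEY: "0",  # assumption: group size 0 disables dynamic quantization
        KV_CACHE_PRECISION_KEY: "f16",  # assumption: keep the KV cache in f16
    }
```

The returned dictionary could then be passed as the `config` argument when compiling the model, for example `openvino.Core().compile_model(model, "CPU", config)`. An empty dictionary corresponds to the default runtime options described above.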

Reason for changes

Dynamic quantization of activations and KV cache compression to u8 should be the default options for users since OpenVINO 2025.
Changed the tests to validate weight compression with these options enabled.

Related tickets

144846, 146143, 145701

Tests

Build 51 with openvino-nightly (2025.0.0-17353-acccb227fba)
image

Build 252 with current openvino (2024.4.0-16579-c3152d32c9c-releases/2024/4)
image

@ljaljushkin ljaljushkin requested a review from a team as a code owner July 19, 2024 15:55
@github-actions github-actions bot added the NNCF PTQ label Jul 19, 2024
@ljaljushkin ljaljushkin force-pushed the nl/dyn_quant_conform branch 2 times, most recently from 65db012 to eeb21a0 Compare August 6, 2024 17:23
ljaljushkin commented Aug 7, 2024

@alexsu52, please review, but the results for GPTQ are still unstable (148819)
results on develop build 140
image
results on PR build 139
image

@ljaljushkin ljaljushkin marked this pull request as draft September 9, 2024 12:41
ljaljushkin commented

Closing this PR: dynamic quantization of activations may make accuracy results less stable, and enabling dynamic quantization and KV cache compression brings no performance gain:
build 20 on the PR
image
vs build 21 on the develop
image
