[CPU] Enable u8 kv cache by default #27454


@luo-cheng2021 luo-cheng2021 commented Nov 7, 2024

Details:

  • Enable u8 kv cache by default
  • ...

Tickets:

@github-actions github-actions bot added the category: CPU OpenVINO CPU plugin label Nov 7, 2024
@luo-cheng2021 luo-cheng2021 marked this pull request as ready for review November 8, 2024 04:39
@luo-cheng2021 luo-cheng2021 requested review from a team as code owners November 8, 2024 04:39
@yuxu42 yuxu42 requested a review from zhangYiIntel November 8, 2024 07:58
@@ -411,6 +412,9 @@ void Config::readProperties(const ov::AnyMap& prop, const ModelType modelType) {
    if (!fcDynamicQuantizationGroupSizeSetExplicitly) {
        fcDynamicQuantizationGroupSize = 0;
    }
    if (!kvCachePrecisionSetExplicitly) {
Contributor


Why not set kvCachePrecision to u8 here? Is it to keep kvCachePrecision compatible on the ACL platform? If so, it would be better to leave a comment here explaining this.

Contributor Author


Setting kvCachePrecision in the constructor should be clearer and more consistent with other properties such as DynamicQuantizationGroup, so my suggestion is to keep the current style.
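
For readers following the thread, the pattern under discussion can be sketched roughly as below: the default lives with the other member defaults, and a "set explicitly" flag lets a later step adjust the value only when the user did not provide one. This is a minimal, self-contained illustration, not the plugin's code. `ConfigSketch`, `Precision`, `parsePrecision`, and the `u8KvCacheSupported` switch are hypothetical names, and the f16 fallback is an assumption, since the body of the `if (!kvCachePrecisionSetExplicitly)` branch is not visible in the hunk above.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration; not the CPU plugin's real types.
enum class Precision { f32, f16, u8 };

struct ConfigSketch {
    // Default chosen at construction, next to the other property defaults.
    Precision kvCachePrecision = Precision::u8;
    bool kvCachePrecisionSetExplicitly = false;

    // Assumed platform switch; the real condition is not shown in the hunk.
    bool u8KvCacheSupported = true;

    void readProperties(const std::map<std::string, std::string>& prop) {
        const auto it = prop.find("KV_CACHE_PRECISION");
        if (it != prop.end()) {
            kvCachePrecision = parsePrecision(it->second);
            kvCachePrecisionSetExplicitly = true;
        }
        // Runs after all properties are parsed, so a user-provided value
        // is never overwritten by the fallback.
        if (!kvCachePrecisionSetExplicitly && !u8KvCacheSupported) {
            kvCachePrecision = Precision::f16;  // illustrative fallback only
        }
    }

    static Precision parsePrecision(const std::string& s) {
        if (s == "u8") return Precision::u8;
        if (s == "f16") return Precision::f16;
        if (s == "f32") return Precision::f32;
        throw std::invalid_argument("unsupported precision: " + s);
    }
};
```

With this layout, constructing a `ConfigSketch` and calling `readProperties` with an empty map leaves the u8 default in place, while passing `KV_CACHE_PRECISION` overrides it and records that the choice was explicit.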

@luo-cheng2021 luo-cheng2021 force-pushed the luocheng/enable_int8_kvcache branch 2 times, most recently from 47823da to 16578fe Compare November 11, 2024 02:12
Contributor

@zhangYiIntel zhangYiIntel left a comment


LGTM, thanks!

@dmitry-gorokhov dmitry-gorokhov added this pull request to the merge queue Nov 13, 2024
Merged via the queue into openvinotoolkit:master with commit 2d148ec Nov 13, 2024
161 checks passed
alexsu52 pushed a commit to openvinotoolkit/nncf that referenced this pull request Nov 22, 2024
### Changes

Explicitly disable KV cache compression to u8; f16 precision is used
instead.

### Reason for changes

The PTWC nightly reports different metrics (ticket 157594).
This happens because, since
openvinotoolkit/openvino#27454, the KV cache is
compressed to u8 by default, which affects the accuracy of fp32 models
(ticket 157571).

We propose keeping the KV cache in f16 so that the issue is handled in nncf
rather than in ov (there is still an open issue with KV cache
compression, and the default may change in the near future).
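
As a rough illustration of what "explicitly disable" can look like on the caller side, the sketch below pins the KV cache precision in a plain string-to-string property map. `KV_CACHE_PRECISION` is the property key behind `ov::hint::kv_cache_precision`; the `ConfigMap` alias and the `withF16KvCache` helper are hypothetical names for this example, and the snippet is not taken from the nncf change itself.

```cpp
#include <map>
#include <string>

// Hypothetical alias; OpenVINO itself takes an ov::AnyMap of properties.
using ConfigMap = std::map<std::string, std::string>;

// Returns a copy of `base` with the KV cache precision pinned to f16,
// overriding any value that was already present under the same key.
ConfigMap withF16KvCache(ConfigMap base) {
    base["KV_CACHE_PRECISION"] = "f16";
    return base;
}
```

In a real application the equivalent entry would be passed as a property when compiling the model for the CPU device, so that the u8 default no longer applies to that compiled model.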

### Related tickets

157571
157594

### Tests

- [x] openvino-nightly/job/post_training_weight_compression/56

![image](https://github.com/user-attachments/assets/0772a8e5-0f92-4f53-8ac0-e16841bd8193)
- [x] https://github.com/openvinotoolkit/nncf/actions/runs/11934079602
- [x] job/weekly/job/openvino-nightly/job/test_examples/77
NishantPrabhuFujitsu pushed a commit to NishantPrabhuFujitsu/openvino that referenced this pull request Nov 26, 2024
### Details:
 - *Enable u8 kv cache by default*
 - *...*

### Tickets:
 - *[152621](https://jira.devtools.intel.com/browse/CVS-152621)*
github-merge-queue bot pushed a commit that referenced this pull request Dec 4, 2024
### Details:
 - *Port for enabling u8 kv cache by default in #27454*

### Tickets:
 - *[152621](https://jira.devtools.intel.com/browse/CVS-152621)*

---------

Co-authored-by: Vladislav Golubev <[email protected]>
Labels
category: CPU OpenVINO CPU plugin
3 participants