[PTQ][OV] BF16 support #2307

Merged: 38 commits, Jul 12, 2024


KodiaqQ
Collaborator

@KodiaqQ KodiaqQ commented Dec 7, 2023

Changes

  • Added BF16 type support
  • Added FQ parameter generation based on the data type
  • Extended the list of supported OpenVINO input data types with ov.Tensor
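
For context on the BF16 type itself: a BF16 value keeps the sign bit and the 8 exponent bits of an FP32 value and shortens the mantissa to 7 bits, so it is the upper 16 bits of the FP32 bit pattern after rounding. The sketch below illustrates that conversion with round-to-nearest-even using only the Python standard library. It is not code from this PR and makes no claim about how NNCF or OpenVINO perform the cast; the helper names `float32_to_bf16_bits` and `bf16_bits_to_float32` are hypothetical.

```python
import struct


def float32_to_bf16_bits(value: float) -> int:
    """Return the 16-bit BF16 pattern for `value` (hypothetical helper).

    The value is first stored as FP32, then rounded to nearest-even
    while dropping the low 16 mantissa bits.
    """
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # NaN: force a mantissa bit so the result stays NaN after truncation.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Round to nearest, ties to even: the bias depends on the lowest kept bit.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float32(bits16: int) -> float:
    """Expand a 16-bit BF16 pattern back to a Python float (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))[0]


if __name__ == "__main__":
    for x in (1.0, -2.5, 3.14159):
        b = float32_to_bf16_bits(x)
        print(f"{x!r} -> 0x{b:04X} -> {bf16_bits_to_float32(b)!r}")
```

Because only 7 mantissa bits survive, 3.14159 comes back as 3.140625, which is the kind of precision loss that type-dependent parameter generation has to account for.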

Reason for changes

  • BF16 support

Related tickets

  • 126782

Tests

  • Updated existing tests with BF16
  • manual/post_training_weight_compression/99 - no regressions (failure due to CI issue)
  • manual/post_training_quantization/421 - no regressions (failure due to CI issue)

@github-actions bot added the "NNCF OpenVINO" label (Pull requests that updates NNCF OpenVINO) Dec 7, 2023

codecov bot commented Dec 7, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.20%. Comparing base (f4bd077) to head (9471ac8).


@@           Coverage Diff            @@
##           develop    #2307   +/-   ##
========================================
  Coverage    91.19%   91.20%           
========================================
  Files          483      483           
  Lines        46443    46435    -8     
========================================
- Hits         42355    42351    -4     
+ Misses        4088     4084    -4     
| Files | Coverage Δ |
|---|---|
| nncf/openvino/engine.py | 96.55% <ø> (ø) |
| nncf/openvino/graph/model_transformer.py | 94.84% <100.00%> (+0.82%) ⬆️ |
| nncf/openvino/graph/node_utils.py | 98.80% <100.00%> (-0.02%) ⬇️ |
| nncf/openvino/graph/transformations/commands.py | 97.67% <100.00%> (+0.11%) ⬆️ |
| nncf/openvino/quantization/quantize_ifmodel.py | 100.00% <100.00%> (ø) |
| .../algorithms/weight_compression/openvino_backend.py | 98.84% <100.00%> (-0.01%) ⬇️ |

| Flag | Coverage Δ |
|---|---|
| COMMON | 41.93% <0.00%> (+<0.01%) ⬆️ |
| ONNX | 34.19% <0.00%> (+<0.01%) ⬆️ |
| OPENVINO | 40.98% <100.00%> (-0.01%) ⬇️ |
| TENSORFLOW | 29.39% <0.00%> (+<0.01%) ⬆️ |
| TORCH | 65.11% <7.54%> (+0.01%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Components | Coverage Δ |
|---|---|
| common | 93.54% <ø> (ø) |
| torch | 93.65% <ø> (ø) |
| tensorflow | 93.26% <ø> (ø) |
| onnx | 93.06% <ø> (ø) |
| openvino | 94.62% <100.00%> (+0.10%) ⬆️ |
| ptq | 90.50% <100.00%> (-0.01%) ⬇️ |

@KodiaqQ
Collaborator Author

KodiaqQ commented Dec 12, 2023

@alexsu52, @l-bat, please review.

@KodiaqQ KodiaqQ marked this pull request as draft December 13, 2023 11:04
@github-actions bot added the "NNCF PTQ" label (Pull requests that updates NNCF PTQ) Jan 23, 2024
@KodiaqQ KodiaqQ marked this pull request as ready for review June 25, 2024 07:17
@alexsu52
Contributor

I would suggest checking the accuracy and performance of the weight compression algorithms for FP32 and FP16 precision.

Here are the results of weight compression validation on a local machine (i9-10980XE):

| Model | Backend | Metric name | Metric value (develop) | Metric value (bf16 branch) | Compr. time (develop) | Compr. time (bf16 branch) |
|---|---|---|---|---|---|---|
| tinyllama_data_aware | OV | Similarity | 0,83853 | 0,83853 | 00:01:26 | 00:01:26 |
| tinyllama_data_free | OV | Similarity | 0,72057 | 0,72057 | 00:00:44 | 00:00:44 |

Also, here are the numbers from examples/llm_compression/openvino/tiny_llama. Develop: develop_weight_compression. BF16 branch: bf16_weight_compression.

No degradations were observed. @alexsu52, how did you reproduce the issue?

I used a model with FP16 weights.

@KodiaqQ KodiaqQ marked this pull request as draft June 25, 2024 14:51
@KodiaqQ KodiaqQ marked this pull request as ready for review July 10, 2024 09:24
@KodiaqQ
Collaborator Author

KodiaqQ commented Jul 10, 2024

@andrey-churkin, @l-bat, @kshpv, @alexsu52, @daniil-lyakhov, @andreyanufr, please review.

@KodiaqQ KodiaqQ requested a review from kshpv July 11, 2024 09:36
Review thread on nncf/openvino/engine.py (outdated, resolved)
@MaximProshin MaximProshin dismissed alexsu52’s stale review July 12, 2024 12:11

AS accepted the merge of this PR.

@KodiaqQ
Collaborator Author

KodiaqQ commented Jul 12, 2024

FP16 model as input (bloomz-560m).
Develop: [image]
Branch: [image]

@KodiaqQ KodiaqQ merged commit 6926cf1 into openvinotoolkit:develop Jul 12, 2024
12 checks passed
alexsu52 pushed a commit that referenced this pull request Jul 22, 2024
### Changes

- Fixed shared memory warnings issue.
- Reverted cast for data type while creating constant.

### Reason for changes

- Leftover from #2307.
- Bugfix.

### Related tickets

- 147255

### Tests

- N/A
KodiaqQ added a commit to KodiaqQ/nncf that referenced this pull request Jul 22, 2024
### Changes

- Fixed shared memory warnings issue.
- Reverted cast for data type while creating constant.

### Reason for changes

- Leftover from openvinotoolkit#2307.
- Bugfix.

### Related tickets

- 147255

### Tests

- N/A

(cherry picked from commit b328e4d)
alexsu52 pushed a commit that referenced this pull request Jul 23, 2024
### Changes

- Fixed `model_extraction_command` issue introduced in #2307.

### Reason for changes

- Bugfix.

### Related tickets

- 147065

### Tests

- Updated
Labels: NNCF OpenVINO, NNCF PTQ
7 participants