[NNCF]: Add INT8 weight compression conformance test for Tinyllama-1.1b PyTorch model #2636

Merged on May 2, 2024 (41 commits)
Showing changes from 11 commits

Commits
2d6fbe5
feat: Added to the test scope
AdiKsOnDev Apr 9, 2024
52e8180
feat: Added torch backend support
AdiKsOnDev Apr 12, 2024
fd48363
Merge branch 'openvinotoolkit:develop' into develop
AdiKsOnDev Apr 16, 2024
c024803
fix: Moved int8 conversion in _validate()
AdiKsOnDev Apr 18, 2024
6405c4e
git: Merge branch 'develop' of github.com:AdiKsOnDev/nncf into develop
AdiKsOnDev Apr 18, 2024
f48c148
fix: Returned initial implementation of _validate()
AdiKsOnDev Apr 22, 2024
f9505e4
chore: Temporary dummy data
AdiKsOnDev Apr 22, 2024
2bc73ec
fix: Model Preparation for TORCH backend
AdiKsOnDev Apr 22, 2024
927c38f
fix: Removed unsupported parameters for INT8
AdiKsOnDev Apr 22, 2024
f008103
chore: Comment on important addition
AdiKsOnDev Apr 22, 2024
eeade47
feat: Added correct metric value according to @aleksu52
AdiKsOnDev Apr 22, 2024
fc05eed
fix: Mode accurate check for the INT8 compression mode
AdiKsOnDev Apr 23, 2024
4aefa0d
feat: Problematic code for @aleksu52 to reproduce
AdiKsOnDev Apr 23, 2024
737c1a7
feat: Use AutoModelForCausalLM for TORCH models
AdiKsOnDev Apr 24, 2024
8066b76
fix: Added model specific parameters during preparation
AdiKsOnDev Apr 24, 2024
512aa63
Merge branch 'openvinotoolkit:develop' into develop
AdiKsOnDev Apr 24, 2024
0041998
refactor: Make a tokenizer during model preparation
AdiKsOnDev Apr 24, 2024
3a61ccf
feat: Tokenize an input string (Temporary) to feed in torch model
AdiKsOnDev Apr 24, 2024
ea0c4c4
fix: Added torch_dtype parameter to the model
AdiKsOnDev Apr 24, 2024
c346100
chore: Removed unnecessary compression parameters
AdiKsOnDev Apr 24, 2024
1cfccf9
refactor: Line spacing, preprocessor usage
AdiKsOnDev Apr 25, 2024
88dc901
Merge branch 'openvinotoolkit:develop' into develop
AdiKsOnDev Apr 26, 2024
5deba30
fix: Removing convert_model()
AdiKsOnDev Apr 27, 2024
40c5686
fix: The pipeline now runs for TORCH models
AdiKsOnDev Apr 27, 2024
d3989be
fix: Using model_hf for validation
AdiKsOnDev Apr 28, 2024
43aec31
fix: Changed the reference metric value
AdiKsOnDev Apr 28, 2024
a85ded2
refactor: Pre-Commit changes
AdiKsOnDev Apr 28, 2024
28af569
fix: Returned the original checks for int4/int8 values
AdiKsOnDev Apr 30, 2024
c3d5e2d
Merge branch 'openvinotoolkit:develop' into develop
AdiKsOnDev Apr 30, 2024
a72ae7e
chore: Pre-Commit changes
AdiKsOnDev Apr 30, 2024
2f6f69c
git: Merge main branch
AdiKsOnDev Apr 30, 2024
d90b356
Merge branch 'develop' into develop
AdiKsOnDev Apr 30, 2024
7d328c3
refactor: Pre-Commit Changes
AdiKsOnDev Apr 30, 2024
7e50cfa
fix: Removed the debugging line
AdiKsOnDev May 1, 2024
7c31d3d
fix: Corrected reference data for TORCH backend
AdiKsOnDev May 2, 2024
6899097
refactor: Code made cleaner
AdiKsOnDev May 2, 2024
86e91f9
fix: Utilized wikitext for TORCH models as well
AdiKsOnDev May 2, 2024
7f32430
feat: Implemented get_num_compressed
AdiKsOnDev May 2, 2024
7729867
fix: Dumping the fp32 model correctly
AdiKsOnDev May 2, 2024
70cd912
chore: Removed unneccesary model wrapping
AdiKsOnDev May 2, 2024
e5db8cc
fix: Changed _validate to match the modified pipeline
AdiKsOnDev May 2, 2024
6 changes: 5 additions & 1 deletion tests/post_training/data/wc_reference_data.yaml
@@ -13,4 +13,8 @@ tinyllama_data_aware_awq_backend_OV:
 tinyllama_data_aware_awq_stateful_backend_OV:
   metric_value: 0.81237
   num_int4: 184
-  num_int8: 128
+  num_int8: 128
+tinyllama_int8_data_free_backend_TORCH:
+  metric_value: 0.96283
+  num_int4: 228
+  num_int8: 84
12 changes: 12 additions & 0 deletions tests/post_training/model_scope.py
@@ -326,6 +326,18 @@
         "backends": [BackendType.OV],
         "is_batch_size_supported": False,
     },
+    {
+        "reported_name": "tinyllama_int8_data_free",
+        "model_id": "tinyllama/tinyllama-1.1b-step-50k-105b",
+        "pipeline_cls": LMWeightCompression,
+        "compression_params": {
+            "mode": CompressWeightsMode.INT8_ASYM,
+            "all_layers": None,
+            "awq": None,
+            "sensitivity_metric": None,
+        },
+        "backends": [BackendType.TORCH],
+    },
 ]
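
The new scope entry and the new reference key in wc_reference_data.yaml line up as `<reported_name>_backend_<backend>`. A minimal sketch of that expansion follows, assuming each scope entry yields one test case per listed backend. `BackendType` here is a local stand-in enum and `expand_test_cases` is a hypothetical helper, not the test suite's actual code.

```python
from enum import Enum


class BackendType(Enum):
    """Local stand-in for the BackendType used in model_scope.py."""
    OV = "OV"
    TORCH = "TORCH"


# Trimmed copy of the entry added in this PR (pipeline class and params omitted).
SCOPE = [
    {
        "reported_name": "tinyllama_int8_data_free",
        "model_id": "tinyllama/tinyllama-1.1b-step-50k-105b",
        "backends": [BackendType.TORCH],
    },
]


def expand_test_cases(scope):
    """Map '<reported_name>_backend_<backend>' to a per-backend copy of each entry."""
    cases = {}
    for entry in scope:
        for backend in entry["backends"]:
            name = f'{entry["reported_name"]}_backend_{backend.value}'
            if name in cases:
                # Duplicate names would make reference-data lookups ambiguous.
                raise ValueError(f"Duplicate test case name: {name}")
            cases[name] = {**entry, "backend": backend}
    return cases
```

Under this assumption, the single TORCH entry produces exactly one case, whose name matches the key added to the reference YAML.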


42 changes: 11 additions & 31 deletions tests/post_training/pipelines/lm_weight_compression.py
@@ -161,21 +161,7 @@ def save_compressed_model(self) -> None:
         self.model_hf._save_config(self.output_model_dir)

     def get_num_compressed(self) -> None:
-        """
-        Get number of the i8, u8, i4, u4 ops in the compressed IR.
-        """
-        num_int8 = 0
-        num_int4 = 0
-
-        for node in self.model.get_ops():
-            for i in range(node.get_output_size()):
-                if node.get_output_element_type(i).get_type_name() in ["i8", "u8"]:
-                    num_int8 += 1
-                if node.get_output_element_type(i).get_type_name() in ["i4", "u4"]:
-                    num_int4 += 1
-
-        self.run_info.num_compress_nodes.num_int8 = num_int8
-        self.run_info.num_compress_nodes.num_int4 = num_int4
+        pass

     def run_bench(self) -> None:
         pass
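
The counting logic in this hunk depends only on the element-type name of each node output, so it can be separated from the model object. The sketch below assumes the type names have already been collected into an iterable of strings (for an OpenVINO model, via `node.get_output_element_type(i).get_type_name()` as in the hunk). `count_compressed` is a hypothetical helper, not part of the pipeline.

```python
INT8_TYPES = frozenset({"i8", "u8"})
INT4_TYPES = frozenset({"i4", "u4"})


def count_compressed(output_type_names):
    """Count outputs whose element type is 8-bit or 4-bit integer.

    Returns a (num_int8, num_int4) tuple; any other type name is ignored.
    """
    num_int8 = 0
    num_int4 = 0
    for name in output_type_names:
        if name in INT8_TYPES:
            num_int8 += 1
        elif name in INT4_TYPES:
            num_int4 += 1
    return num_int8, num_int4
```

Keeping the counter free of backend objects would let a TORCH implementation feed it type names gathered in its own way, though how those names are obtained for a PyTorch model is outside this sketch.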
@@ -192,6 +178,16 @@ def _compress(self):
         """
         Actual call of weight compression
         """
+        if self.backend == BackendType.TORCH:
+            """If Backend is TORCH (Assuming that it's INT8 compression), don't use a dataset as it's Unsupported"""
+            self.compressed_model = nncf.compress_weights(
+                self.model,
+                dataset=None,
+                **self.compression_params,
+            )
+
+            return
+
         self.compressed_model = nncf.compress_weights(
             self.model,
             dataset=self.calibration_dataset,
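
The two `compress_weights` calls in this hunk differ only in the `dataset` argument. One possible way to avoid the duplicated call and early return is to choose the dataset first and build the keyword arguments once. The sketch below is an illustration under that assumption: `BackendType` is a local stand-in enum and `build_compress_kwargs` is a hypothetical helper, not what the PR implements.

```python
from enum import Enum


class BackendType(Enum):
    """Local stand-in for the BackendType used by the test pipelines."""
    OV = "OV"
    TORCH = "TORCH"


def build_compress_kwargs(backend, compression_params, calibration_dataset):
    """Assemble keyword arguments for a single compress_weights call.

    As in the hunk, the TORCH path passes dataset=None (data-free INT8),
    while every other backend passes the calibration dataset.
    """
    dataset = None if backend is BackendType.TORCH else calibration_dataset
    return {"dataset": dataset, **compression_params}
```

With such a helper, `_compress` could make one call of the form `nncf.compress_weights(self.model, **kwargs)`, keeping the backend-specific decision in one place.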
@@ -233,19 +229,3 @@ def _validate(self):
         similarity = all_metrics["similarity"][0]
         self.run_info.metric_name = "Similarity"
         self.run_info.metric_value = round(similarity, 5)
-
-        num_int4_reference = self.reference_data.get("num_int4")
-        num_int8_reference = self.reference_data.get("num_int8")
-
-        num_int4_value = self.run_info.num_compress_nodes.num_int4
-        num_int8_value = self.run_info.num_compress_nodes.num_int8
-
-        if num_int4_reference != num_int4_value:
-            status_msg = f"Regression: The number of int4 ops is different \
-                than reference {num_int4_reference} != {num_int4_value}"
-            raise ValueError(status_msg)
-
-        if num_int8_reference != num_int8_value:
-            status_msg = f"Regression: The number of int8 ops is different \
-                than reference {num_int8_reference} != {num_int8_value}"
-            raise ValueError(status_msg)
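
One detail of the checks in this hunk: a backslash continuation inside a string literal keeps the next line's leading whitespace in the resulting message. A minimal sketch of the same comparison using adjacent string literals instead is shown below. It takes plain dicts in place of `self.reference_data` and `self.run_info.num_compress_nodes`, and `check_op_counts` is a hypothetical name.

```python
def check_op_counts(reference, actual):
    """Raise ValueError if the int4 or int8 op count differs from the reference.

    Both arguments are dicts with "num_int4" and "num_int8" keys, a
    simplification of the pipeline's reference data and run info.
    """
    for key, label in (("num_int4", "int4"), ("num_int8", "int8")):
        expected = reference.get(key)
        found = actual.get(key)
        if expected != found:
            # Adjacent literals are joined at compile time, so no source
            # indentation leaks into the error message.
            raise ValueError(
                f"Regression: The number of {label} ops is different "
                f"than reference {expected} != {found}"
            )
```

The int4 check runs first, matching the order in the hunk, so a run that differs in both counts reports the int4 mismatch.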