Intel® Neural Compressor provides validated examples covering multiple compression techniques, including quantization, pruning, knowledge distillation, and orchestration. Some of the validated cases are listed in the example tables, and the release data is available here.
Note: 2.0 API migration is a work in progress; an example marked with * still uses the 1.x API.
- tf_example1: quantize with built-in dataloader and metric.
- tf_example2: quantize keras model with customized metric and dataloader.
- tf_example3: convert model with mixed precision.
- tf_example4: quantize checkpoint with dummy dataloader.
- tf_example5: configure performance and accuracy measurement.
- tf_example6: use default user-facing APIs to quantize a pb model.
- tf_example7: quantize and benchmark with pure Python API.
- *BERT Mini SST2 performance boost with INC: train a BERT-Mini model on SST-2 dataset through distillation, and leverage quantization to accelerate the inference while maintaining the accuracy using Intel® Neural Compressor.
- Performance of FP32 Vs. INT8 ResNet50 Model: compare existing FP32 and INT8 ResNet50 models directly.
- *Intel® Neural Compressor Sample for PyTorch*: an end-to-end pipeline that builds a CNN model with PyTorch to recognize fashion images and speeds up the model with Intel® Neural Compressor.
- *Intel® Neural Compressor Sample for TensorFlow*: an end-to-end pipeline that builds a CNN model with TensorFlow to recognize handwritten digits and speeds up the model with Intel® Neural Compressor.
- *Accelerate VGG19 Inference on Intel® Gen4 Xeon® Sapphire Rapids: an end-to-end pipeline that trains a VGG19 model by transfer learning from a pre-trained TensorFlow Hub model, then quantizes it with Intel® Neural Compressor on Intel® Gen4 Xeon® Sapphire Rapids.
Model | Domain | Approach | Examples |
---|---|---|---|
*ResNet50 V1.0 | Image Recognition | Post-Training Static Quantization | pb |
*ResNet50 V1.5 | Image Recognition | Post-Training Static Quantization | pb |
*ResNet101 | Image Recognition | Post-Training Static Quantization | pb |
*MobileNet V1 | Image Recognition | Post-Training Static Quantization | pb |
*MobileNet V2 | Image Recognition | Post-Training Static Quantization | pb / keras |
*MobileNet V3 | Image Recognition | Post-Training Static Quantization | pb |
*Inception V1 | Image Recognition | Post-Training Static Quantization | pb |
*Inception V2 | Image Recognition | Post-Training Static Quantization | pb |
*Inception V3 | Image Recognition | Post-Training Static Quantization | pb |
*Inception V4 | Image Recognition | Post-Training Static Quantization | pb |
*Inception ResNet V2 | Image Recognition | Post-Training Static Quantization | pb |
*VGG16 | Image Recognition | Post-Training Static Quantization | pb / keras |
*VGG19 | Image Recognition | Post-Training Static Quantization | pb / keras |
*ResNet V2 50 | Image Recognition | Post-Training Static Quantization | pb / keras |
*ResNet V2 101 | Image Recognition | Post-Training Static Quantization | pb / keras |
*ResNet V2 152 | Image Recognition | Post-Training Static Quantization | pb |
*DenseNet121 | Image Recognition | Post-Training Static Quantization | pb |
*DenseNet161 | Image Recognition | Post-Training Static Quantization | pb |
*DenseNet169 | Image Recognition | Post-Training Static Quantization | pb |
*EfficientNet B0 | Image Recognition | Post-Training Static Quantization | ckpt |
*MNIST | Image Recognition | Quantization-Aware Training | keras |
*ResNet50 | Image Recognition | Post-Training Static Quantization | keras |
*ResNet50 Fashion | Image Recognition | Post-Training Static Quantization | keras |
*ResNet101 | Image Recognition | Post-Training Static Quantization | keras |
*Inception V3 | Image Recognition | Post-Training Static Quantization | keras |
*Inception ResNet V2 | Image Recognition | Post-Training Static Quantization | keras |
*Xception | Image Recognition | Post-Training Static Quantization | keras |
*ResNet V2 | Image Recognition | Quantization-Aware Training | keras |
*EfficientNet V2 B0 | Image Recognition | Post-Training Static Quantization | SavedModel |
BERT base MRPC | Natural Language Processing | Post-Training Static Quantization | ckpt |
*BERT large SQuAD (Model Zoo) | Natural Language Processing | Post-Training Static Quantization | pb |
*BERT large SQuAD | Natural Language Processing | Post-Training Static Quantization | pb |
DistilBERT base | Natural Language Processing | Post-Training Static Quantization | pb |
Transformer LT | Natural Language Processing | Post-Training Static Quantization | pb |
Transformer LT MLPerf | Natural Language Processing | Post-Training Static Quantization | pb |
*SSD ResNet50 V1 | Object Detection | Post-Training Static Quantization | pb / ckpt |
*SSD MobileNet V1 | Object Detection | Post-Training Static Quantization | pb / ckpt |
*Faster R-CNN Inception ResNet V2 | Object Detection | Post-Training Static Quantization | pb / SavedModel |
*Faster R-CNN ResNet101 | Object Detection | Post-Training Static Quantization | pb / SavedModel |
*Faster R-CNN ResNet50 | Object Detection | Post-Training Static Quantization | pb |
*Mask R-CNN Inception V2 | Object Detection | Post-Training Static Quantization | pb / ckpt |
*SSD ResNet34 | Object Detection | Post-Training Static Quantization | pb |
*YOLOv3 | Object Detection | Post-Training Static Quantization | pb |
Wide & Deep | Recommendation | Post-Training Static Quantization | pb |
*Arbitrary Style Transfer | Style Transfer | Post-Training Static Quantization | ckpt |
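
Most rows in the table above use post-training static quantization. As a rough illustration of the idea only (not the Intel® Neural Compressor API), the sketch below derives a scale and zero point from a small calibration set and maps FP32 values to unsigned 8-bit integers. The function names `calibrate`, `quantize`, and `dequantize` and the sample values are hypothetical and chosen for this example.

```python
def calibrate(samples):
    """Derive (scale, zero_point) for asymmetric uint8 quantization.

    The range is widened to include 0.0 so that zero is exactly representable.
    """
    lo = min(min(samples), 0.0)
    hi = max(max(samples), 0.0)
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    zero_point = int(round(-lo / scale))
    return scale, zero_point


def quantize(x, scale, zero_point):
    """Map a float to an integer in [0, 255], clamping out-of-range values."""
    q = int(round(x / scale)) + zero_point
    return max(0, min(255, q))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to an approximate float."""
    return (q - zero_point) * scale


# Hypothetical calibration data standing in for activations seen offline.
calibration = [-1.0, -0.25, 0.0, 0.5, 3.0]
scale, zero_point = calibrate(calibration)

for value in (0.0, 0.5, 3.0, 10.0):
    q = quantize(value, scale, zero_point)
    print(value, "->", q, "->", round(dequantize(q, scale, zero_point), 4))
```

Because the scale and zero point are fixed after calibration, values outside the calibrated range (such as `10.0` here) are clamped, which is one reason the choice of calibration data matters for accuracy.
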
Model | Domain | Pruning Type | Approach | Examples |
---|---|---|---|---|
*Inception V3 | Image Recognition | Unstructured | Magnitude | pb |
*ResNet V2 | Image Recognition | Unstructured | Magnitude | pb |
*ViT | Image Recognition | Unstructured | Magnitude | ckpt |
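
The rows above pair unstructured pruning with a magnitude criterion. A minimal sketch of that idea, assuming a flat list of weights and a target sparsity ratio, is shown below; `magnitude_prune` is a hypothetical helper for illustration and is not part of Intel® Neural Compressor.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


# Hypothetical weights; 50% sparsity removes the four smallest magnitudes.
weights = [0.8, -0.05, 0.3, -0.9, 0.01, 0.4, -0.2, 0.6]
pruned = magnitude_prune(weights, 0.5)
print(pruned)
```

The zeros can land anywhere in the tensor, which is what makes this pattern unstructured.
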
Student Model | Teacher Model | Domain | Approach | Examples |
---|---|---|---|---|
*MobileNet | DenseNet201 | Image Recognition | Knowledge Distillation | pb |
Model | Domain | Approach | Examples |
---|---|---|---|
*ResNet18 | Image Recognition | Post-Training Static Quantization | eager / fx / ipex |
*ResNet18 | Image Recognition | Quantization-Aware Training | eager / fx / distributed |
*ResNet50 | Image Recognition | Post-Training Static Quantization | eager / fx / ipex |
*ResNet50 | Image Recognition | Quantization-Aware Training | eager / fx / distributed |
ResNeXt101_32x16d_wsl | Image Recognition | Post-Training Static Quantization | ipex |
*ResNeXt101_32x8d | Image Recognition | Post-Training Static Quantization | eager / fx |
*Se_ResNeXt50_32x4d | Image Recognition | Post-Training Static Quantization | eager |
*Inception V3 | Image Recognition | Post-Training Static Quantization | eager / fx |
*MobileNet V2 | Image Recognition | Post-Training Static Quantization | eager / fx |
*PeleeNet | Image Recognition | Post-Training Static Quantization | eager |
*ResNeSt50 | Image Recognition | Post-Training Static Quantization | eager |
*3D-UNet | Image Recognition | Post-Training Static Quantization | eager |
SSD ResNet34 | Object Detection | Post-Training Static Quantization | fx / ipex |
*Mask R-CNN | Object Detection | Post-Training Static Quantization | fx |
*YOLOv3 | Object Detection | Post-Training Static Quantization | eager |
*DLRM | Recommendation | Post-Training Static Quantization | eager / ipex / fx |
*RNN-T | Speech Recognition | Post-Training Dynamic Quantization | eager |
*Wav2Vec2 | Speech Recognition | Post-Training Dynamic Quantization | eager |
*HuBERT | Speech Recognition | Post-Training Dynamic / Static Quantization | eager / fx |
*BlendCNN | Natural Language Processing | Post-Training Static Quantization | eager |
bert-large-uncased-whole-word-masking-finetuned-squad | Natural Language Processing | Post-Training Static Quantization | fx / ipex |
distilbert-base-uncased-distilled-squad | Natural Language Processing | Post-Training Static Quantization | ipex |
*t5-small | Natural Language Processing | Post-Training Dynamic Quantization | eager |
*Helsinki-NLP/opus-mt-en-ro | Natural Language Processing | Post-Training Dynamic Quantization | eager |
*lvwerra/pegasus-samsum | Natural Language Processing | Post-Training Dynamic Quantization | eager |
GPTJ | Natural Language Processing | Post-Training Static Quantization | fx |
SD Diffusion | Text to Image | Post-Training Static Quantization | fx |
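
Several rows above use post-training dynamic quantization rather than static quantization. In the dynamic case the activation scale is computed from each input at run time instead of from a calibration set. The sketch below illustrates this with symmetric int8 quantization; `dynamic_quantize` and the sample inputs are hypothetical and do not reflect any library API.

```python
def dynamic_quantize(activations):
    """Symmetric int8 quantization with a scale derived from this input alone."""
    max_abs = max(abs(a) for a in activations)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(a / scale))) for a in activations]
    return q, scale


# Two hypothetical inputs with the same shape but very different magnitudes.
small, scale_small = dynamic_quantize([0.005, -0.02, 0.02])
large, scale_large = dynamic_quantize([5.0, -20.0, 20.0])

print(small, scale_small)
print(large, scale_large)
```

Both inputs use the full int8 range because the scale adapts to each one, at the cost of computing the range during inference.
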
Model | Domain | Pruning Type | Approach | Examples |
---|---|---|---|---|
Distilbert-base-uncased | Natural Language Processing (text classification) | Structured (4x1, 2in4), Unstructured | Snip-momentum | eager |
Bert-mini | Natural Language Processing (text classification) | Structured (4x1, 2in4, per channel), Unstructured | Snip-momentum | eager |
Distilbert-base-uncased | Natural Language Processing (question answering) | Structured (4x1, 2in4), Unstructured | Snip-momentum | eager |
Bert-mini | Natural Language Processing (question answering) | Structured (4x1, 2in4), Unstructured | Snip-momentum | eager |
Bert-base-uncased | Natural Language Processing (question answering) | Structured (4x1, 2in4), Unstructured | Snip-momentum | eager |
Bert-large | Natural Language Processing (question answering) | Structured (4x1, 2in4), Unstructured | Snip-momentum | eager |
YOLOv5s6 | Object Detection | Structured (4x1, 2in4), Unstructured | Snip-momentum | eager |
ResNet18 | Image Recognition | Unstructured | Magnitude | eager |
ResNet34 | Image Recognition | Unstructured | Magnitude | eager |
ResNet50 | Image Recognition | Unstructured | Magnitude | eager |
ResNet101 | Image Recognition | Unstructured | Magnitude | eager |
*BERT large | Natural Language Processing | Structured (2x1) | Group Lasso | eager |
*Intel/bert-base-uncased-sparse-70-unstructured | Natural Language Processing (question-answering) | Unstructured | Prune once for all | eager |
*bert-base-uncased | Natural Language Processing | Structured (Filter/Channel-wise) | Gradient Sensitivity | eager |
*DistilBERT | Natural Language Processing | Unstructured | Magnitude | eager |
*Intel/bert-base-uncased-sparse-70-unstructured | Natural Language Processing (text-classification) | Unstructured | Prune once for all | eager |
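
Several rows above list a "2in4" structured pattern. Assuming this denotes the common 2:4 layout, where two of every four consecutive weights are zero, the sketch below builds such a pattern. It ranks weights by magnitude as a simple stand-in for the Snip-momentum score named in the table, and `prune_2in4` is a hypothetical helper.

```python
def prune_2in4(weights):
    """Keep the 2 largest-magnitude values in each group of 4; zero the rest."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    out = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Positions of the two largest magnitudes inside this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        out.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return out


# Hypothetical weight row with two groups of four.
row = [0.1, -0.7, 0.3, 0.05, 0.9, 0.2, -0.4, 0.0]
sparse_row = prune_2in4(row)
print(sparse_row)
```

Unlike unstructured pruning, every group of four ends up with exactly two zeros, which gives the regular layout that sparse hardware kernels can exploit.
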
Student Model | Teacher Model | Domain | Approach | Examples |
---|---|---|---|---|
CNN-2 | CNN-10 | Image Recognition | Knowledge Distillation | eager |
MobileNet V2-0.35 | WideResNet40-2 | Image Recognition | Knowledge Distillation | eager |
ResNet18 / ResNet34 / ResNet50 / ResNet101 | ResNet18 / ResNet34 / ResNet50 / ResNet101 | Image Recognition | Knowledge Distillation | eager |
ResNet18 / ResNet34 / ResNet50 / ResNet101 | ResNet18 / ResNet34 / ResNet50 / ResNet101 | Image Recognition | Self Distillation | eager |
VGG-8 | VGG-13 | Image Recognition | Knowledge Distillation | eager |
BlendCNN | BERT-Base | Natural Language Processing | Knowledge Distillation | eager |
DistilBERT | BERT-Base | Natural Language Processing | Knowledge Distillation | eager |
BiLSTM | RoBERTa-Base | Natural Language Processing | Knowledge Distillation | eager |
TinyBERT | BERT-Base | Natural Language Processing | Knowledge Distillation | eager |
BERT-3 | BERT-Base | Natural Language Processing | Knowledge Distillation | eager |
DistilRoBERTa | RoBERTa-Large | Natural Language Processing | Knowledge Distillation | eager |
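
The student/teacher pairs above are trained with knowledge distillation. One common formulation, shown here as an assumption rather than as the loss these examples necessarily use, compares temperature-softened teacher and student distributions with a KL divergence. The names `softmax` and `distillation_loss`, the temperature, and the logits are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
loss_same = distillation_loss(teacher, teacher)          # student matches teacher
loss_diff = distillation_loss([0.5, 1.0, 4.0], teacher)  # student disagrees
print(loss_same, loss_diff)
```

In practice this term is usually combined with the ordinary task loss on the ground-truth labels; the weighting between the two is a tunable choice.
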
Model | Domain | Approach | Examples |
---|---|---|---|
ResNet50 | Image Recognition | Multi-shot: Pruning and PTQ | link |
ResNet50 | Image Recognition | One-shot: QAT during Pruning | link |
Intel/bert-base-uncased-sparse-90-unstructured-pruneofa | Natural Language Processing (question-answering) | One-shot: Pruning, Distillation and QAT | link |
Intel/bert-base-uncased-sparse-90-unstructured-pruneofa | Natural Language Processing (text-classification) | One-shot: Pruning, Distillation and QAT | link |
BERT-mini | Natural Language Processing (text-classification) | One-shot: Pruning, Distillation | link |
Model | Domain | Approach | Examples |
---|---|---|---|
*ResNet50 V1.5 | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*ResNet50 V1.5 MLPerf | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*VGG16 | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*MobileNet V2 | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*MobileNet V3 MLPerf | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*AlexNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*CaffeNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*DenseNet | Image Recognition | Post-Training Static Quantization | qlinearops |
*EfficientNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
FCN | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*GoogleNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*Inception V1 | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*MNIST | Image Recognition | Post-Training Static Quantization | qlinearops |
*MobileNet V2 (ONNX Model Zoo) | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*ResNet50 V1.5 (ONNX Model Zoo) | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*ShuffleNet V2 | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
SqueezeNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*VGG16 (ONNX Model Zoo) | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
*ZFNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
ArcFace | Image Recognition | Post-Training Static Quantization | qlinearops |
*BERT base MRPC | Natural Language Processing | Post-Training Static Quantization | integerops / qdq |
*BERT base MRPC | Natural Language Processing | Post-Training Dynamic Quantization | integerops |
*DistilBERT base MRPC | Natural Language Processing | Post-Training Dynamic / Static Quantization | integerops / qdq |
*Mobile bert MRPC | Natural Language Processing | Post-Training Dynamic / Static Quantization | integerops / qdq |
*Roberta base MRPC | Natural Language Processing | Post-Training Dynamic / Static Quantization | integerops / qdq |
BERT SQuAD | Natural Language Processing | Post-Training Dynamic / Static Quantization | integerops / qdq |
GPT2 lm head WikiText | Natural Language Processing | Post-Training Dynamic Quantization | integerops |
MobileBERT SQuAD MLPerf | Natural Language Processing | Post-Training Dynamic / Static Quantization | integerops / qdq |
BiDAF | Natural Language Processing | Post-Training Dynamic Quantization | integerops |
BERT base uncased MRPC (HuggingFace) | Natural Language Processing | Post-Training Static Quantization | qdq |
Roberta base MRPC (HuggingFace) | Natural Language Processing | Post-Training Static Quantization | qdq |
XLM Roberta base MRPC (HuggingFace) | Natural Language Processing | Post-Training Static Quantization | qdq |
Camembert base MRPC (HuggingFace) | Natural Language Processing | Post-Training Static Quantization | qdq |
MiniLM L12 H384 uncased MRPC (HuggingFace) | Natural Language Processing | Post-Training Static Quantization | qdq |
DistilBERT base uncased SST-2 (HuggingFace) | Natural Language Processing | Post-Training Static Quantization | qdq |
Albert base v2 SST-2 (HuggingFace) | Natural Language Processing | Post-Training Static Quantization | qdq |
MiniLM L6 H384 uncased SST-2 (HuggingFace) | Natural Language Processing | Post-Training Static Quantization | qdq |
Spanbert SQuAD (HuggingFace) | Natural Language Processing | Post-Training Static Quantization | qdq |
Bert base multilingual cased SQuAD (HuggingFace) | Natural Language Processing | Post-Training Static Quantization | qdq |
*SSD MobileNet V1 | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
*SSD MobileNet V2 | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
*SSD MobileNet V1 (ONNX Model Zoo) | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
DUC | Object Detection | Post-Training Static Quantization | qlinearops |
*Faster R-CNN | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
*Mask R-CNN | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
*SSD | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
*Tiny YOLOv3 | Object Detection | Post-Training Static Quantization | qlinearops |
*YOLOv3 | Object Detection | Post-Training Static Quantization | qlinearops |
*YOLOv4 | Object Detection | Post-Training Static Quantization | qlinearops |
Emotion FERPlus | Body Analysis | Post-Training Static Quantization | qlinearops |
Ultra Face | Body Analysis | Post-Training Static Quantization | qlinearops |