Here we present the results achieved using our sample scripts, example patches to third-party repositories and NNCF configuration files.
The applied quantization compression algorithms fall into two broad categories: Quantization-Aware Training (QAT) and Post-Training Quantization (PTQ). This page mainly reports QAT results; PTQ results can be found on the OpenVINO Performance Benchmarks page.
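
In the tables that follow, the value in parentheses is consistent with the baseline metric minus the compressed model's metric, so a negative drop means the compressed model scored higher than its baseline. A minimal sketch of that arithmetic is shown here; the helper names `accuracy_drop` and `format_cell` are hypothetical and are not part of NNCF.

```python
def accuracy_drop(baseline: float, compressed: float) -> float:
    """Return the baseline metric minus the compressed metric, to two decimals."""
    # Adding 0.0 turns a possible -0.0 into 0.0 so it prints without a sign.
    return round(baseline - compressed, 2) + 0.0


def format_cell(baseline: float, compressed: float) -> str:
    """Format a table cell as 'compressed (drop)', matching the tables' layout."""
    drop = accuracy_drop(baseline, compressed)
    return f"{compressed:.2f} ({drop:.2f})"


# GoogLeNet with 40% filter pruning: baseline 69.77, compressed 69.47.
print(format_cell(69.77, 69.47))
# Inception V3 with INT8 QAT: baseline 77.33, compressed 77.45 (a gain).
print(format_cell(77.33, 77.45))
```

The two calls reproduce the GoogLeNet and Inception V3 cells from the first table.
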
Model | Compression algorithm | Dataset | Accuracy (drop) % | Configuration | Checkpoint |
---|---|---|---|---|---|
GoogLeNet | - | ImageNet | 69.77 | Config | - |
GoogLeNet | • Filter pruning: 40%, geometric median criterion | ImageNet | 69.47 (0.30) | Config | Download |
Inception V3 | - | ImageNet | 77.33 | Config | - |
Inception V3 | • QAT: INT8 | ImageNet | 77.45 (-0.12) | Config | Download |
Inception V3 | • QAT: INT8 • Sparsity: 61% (RB) | ImageNet | 76.36 (0.97) | Config | Download |
MobileNet V2 | - | ImageNet | 71.87 | Config | - |
MobileNet V2 | • QAT: INT8 | ImageNet | 71.07 (0.80) | Config | Download |
MobileNet V2 | • QAT: INT8 (per-tensor only) | ImageNet | 71.24 (0.63) | Config | Download |
MobileNet V2 | • QAT: Mixed, 58.88% INT8 / 41.12% INT4 | ImageNet | 70.95 (0.92) | Config | Download |
MobileNet V2 | • QAT: INT8 • Sparsity: 52% (RB) | ImageNet | 71.09 (0.78) | Config | Download |
MobileNet V3 (Small) | - | ImageNet | 67.66 | Config | - |
MobileNet V3 (Small) | • QAT: INT8 | ImageNet | 66.98 (0.68) | Config | Download |
ResNet-18 | • Filter pruning: 40%, magnitude criterion | ImageNet | 69.27 (0.49) | Config | Download |
ResNet-18 | • Filter pruning: 40%, geometric median criterion | ImageNet | 69.31 (0.45) | Config | Download |
ResNet-18 | • Accuracy-aware compressed training • Filter pruning: 60%, geometric median criterion | ImageNet | 69.2 (-0.6) | Config | - |
ResNet-34 | - | ImageNet | 73.30 | Config | - |
ResNet-34 | • Filter pruning: 50%, geometric median criterion • Knowledge distillation | ImageNet | 73.11 (0.19) | Config | Download |
ResNet-50 | - | ImageNet | 76.15 | Config | - |
ResNet-50 | • QAT: INT8 | ImageNet | 76.46 (-0.31) | Config | Download |
ResNet-50 | • QAT: INT8 (per-tensor only) | ImageNet | 76.39 (-0.24) | Config | Download |
ResNet-50 | • QAT: Mixed, 43.12% INT8 / 56.88% INT4 | ImageNet | 76.05 (0.10) | Config | Download |
ResNet-50 | • QAT: INT8 • Sparsity: 61% (RB) | ImageNet | 75.42 (0.73) | Config | Download |
ResNet-50 | • QAT: INT8 • Sparsity: 50% (RB) | ImageNet | 75.50 (0.65) | Config | Download |
ResNet-50 | • Filter pruning: 40%, geometric median criterion | ImageNet | 75.57 (0.58) | Config | Download |
ResNet-50 | • Accuracy-aware compressed training • Filter pruning: 52.5%, geometric median criterion | ImageNet | 75.23 (0.93) | Config | - |
SqueezeNet V1.1 | - | ImageNet | 58.19 | Config | - |
SqueezeNet V1.1 | • QAT: INT8 | ImageNet | 58.22 (-0.03) | Config | Download |
SqueezeNet V1.1 | • QAT: INT8 (per-tensor only) | ImageNet | 58.11 (0.08) | Config | Download |
SqueezeNet V1.1 | • QAT: Mixed, 52.83% INT8 / 47.17% INT4 | ImageNet | 57.57 (0.62) | Config | Download |
Model | Compression algorithm | Dataset | mAP (drop) % | Configuration | Checkpoint |
---|---|---|---|---|---|
SSD300‑MobileNet | - | VOC12+07 train, VOC07 eval | 62.23 | Config | Download |
SSD300‑MobileNet | • QAT: INT8 • Sparsity: 70% (Magnitude) | VOC12+07 train, VOC07 eval | 62.95 (-0.72) | Config | Download |
SSD300‑VGG‑BN | - | VOC12+07 train, VOC07 eval | 78.28 | Config | Download |
SSD300‑VGG‑BN | • QAT: INT8 | VOC12+07 train, VOC07 eval | 77.81 (0.47) | Config | Download |
SSD300‑VGG‑BN | • QAT: INT8 • Sparsity: 70% (Magnitude) | VOC12+07 train, VOC07 eval | 77.66 (0.62) | Config | Download |
SSD300‑VGG‑BN | • Filter pruning: 40%, geometric median criterion | VOC12+07 train, VOC07 eval | 78.35 (-0.07) | Config | Download |
SSD512-VGG‑BN | - | VOC12+07 train, VOC07 eval | 80.26 | Config | Download |
SSD512-VGG‑BN | • QAT: INT8 | VOC12+07 train, VOC07 eval | 80.04 (0.22) | Config | Download |
SSD512-VGG‑BN | • QAT: INT8 • Sparsity: 70% (Magnitude) | VOC12+07 train, VOC07 eval | 79.68 (0.58) | Config | Download |
Model | Compression algorithm | Dataset | mIoU (drop) % | Configuration | Checkpoint |
---|---|---|---|---|---|
ICNet | - | CamVid | 67.89 | Config | Download |
ICNet | • QAT: INT8 | CamVid | 67.89 (0.00) | Config | Download |
ICNet | • QAT: INT8 • Sparsity: 60% (Magnitude) | CamVid | 67.16 (0.73) | Config | Download |
UNet | - | CamVid | 71.95 | Config | Download |
UNet | • QAT: INT8 | CamVid | 71.89 (0.06) | Config | Download |
UNet | • QAT: INT8 • Sparsity: 60% (Magnitude) | CamVid | 72.46 (-0.51) | Config | Download |
UNet | - | Mapillary | 56.24 | Config | Download |
UNet | • QAT: INT8 | Mapillary | 56.09 (0.15) | Config | Download |
UNet | • QAT: INT8 • Sparsity: 60% (Magnitude) | Mapillary | 55.69 (0.55) | Config | Download |
UNet | • Filter pruning: 25%, geometric median criterion | Mapillary | 55.64 (0.60) | Config | Download |
Model | Compression algorithm | Dataset | Accuracy (drop) % | Configuration | Checkpoint |
---|---|---|---|---|---|
Inception V3 | - | ImageNet | 77.91 | Config | - |
Inception V3 | • QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) | ImageNet | 78.39 (-0.48) | Config | Download |
Inception V3 | • QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) • Sparsity: 61% (RB) | ImageNet | 77.52 (0.39) | Config | Download |
Inception V3 | • Sparsity: 54% (Magnitude) | ImageNet | 77.86 (0.05) | Config | Download |
MobileNet V2 | - | ImageNet | 71.85 | Config | - |
MobileNet V2 | • QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) | ImageNet | 71.63 (0.22) | Config | Download |
MobileNet V2 | • QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) • Sparsity: 52% (RB) | ImageNet | 70.94 (0.91) | Config | Download |
MobileNet V2 | • Sparsity: 50% (RB) | ImageNet | 71.34 (0.51) | Config | Download |
MobileNet V2 (TensorFlow Hub MobileNet V2) | • Sparsity: 35% (Magnitude) | ImageNet | 71.87 (-0.02) | Config | Download |
MobileNet V3 (Large) | - | ImageNet | 75.80 | Config | - |
MobileNet V3 (Large) | • QAT: INT8 (per-channel symmetric for weights, per-tensor asymmetric half-range for activations) | ImageNet | 75.04 (0.76) | Config | Download |
MobileNet V3 (Large) | • QAT: INT8 (per-channel symmetric for weights, per-tensor asymmetric half-range for activations) • Sparsity: 42% (RB) | ImageNet | 75.24 (0.56) | Config | Download |
MobileNet V3 (Small) | - | ImageNet | 68.38 | Config | - |
MobileNet V3 (Small) | • QAT: INT8 (per-channel symmetric for weights, per-tensor asymmetric half-range for activations) | ImageNet | 67.79 (0.59) | Config | Download |
MobileNet V3 (Small) | • QAT: INT8 (per-channel symmetric for weights, per-tensor asymmetric half-range for activations) • Sparsity: 42% (Magnitude) | ImageNet | 67.44 (0.94) | Config | Download |
ResNet-50 | - | ImageNet | 75.05 | Config | - |
ResNet-50 | • QAT: INT8 | ImageNet | 74.99 (0.06) | Config | Download |
ResNet-50 | • QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) • Sparsity: 65% (RB) | ImageNet | 74.36 (0.69) | Config | Download |
ResNet-50 | • Sparsity: 80% (RB) | ImageNet | 74.38 (0.67) | Config | Download |
ResNet-50 | • Filter pruning: 40%, geometric median criterion | ImageNet | 74.96 (0.09) | Config | Download |
ResNet-50 | • QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) • Filter pruning: 40%, geometric median criterion | ImageNet | 75.09 (-0.04) | Config | Download |
ResNet-50 | • Accuracy-aware compressed training • Sparsity: 65% (Magnitude) | ImageNet | 74.37 (0.67) | Config | - |
Model | Compression algorithm | Dataset | mAP (drop) % | Configuration | Checkpoint |
---|---|---|---|---|---|
RetinaNet | - | COCO 2017 | 33.43 | Config | Download |
RetinaNet | • QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) | COCO 2017 | 33.12 (0.31) | Config | Download |
RetinaNet | • Sparsity: 50% (Magnitude) | COCO 2017 | 33.10 (0.33) | Config | Download |
RetinaNet | • Filter pruning: 40% | COCO 2017 | 32.72 (0.71) | Config | Download |
RetinaNet | • QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) • Filter pruning: 40% | COCO 2017 | 32.67 (0.76) | Config | Download |
YOLO v4 | - | COCO 2017 | 47.07 | Config | Download |
YOLO v4 | • QAT: INT8 (per-channel symmetric for weights, per-tensor asymmetric half-range for activations) | COCO 2017 | 46.20 (0.87) | Config | Download |
YOLO v4 | • Sparsity: 50% (Magnitude) | COCO 2017 | 46.49 (0.58) | Config | Download |
Model | Compression algorithm | Dataset | mAP (drop) % | Configuration | Checkpoint |
---|---|---|---|---|---|
Mask‑R‑CNN | - | COCO 2017 | bbox: 37.33 segm: 33.56 | Config | Download |
Mask‑R‑CNN | • QAT: INT8 (per-tensor symmetric for weights, per-tensor asymmetric half-range for activations) | COCO 2017 | bbox: 37.19 (0.14) segm: 33.54 (0.02) | Config | Download |
Mask‑R‑CNN | • Sparsity: 50% (Magnitude) | COCO 2017 | bbox: 36.94 (0.39) segm: 33.23 (0.33) | Config | Download |
ONNX Model | Compression algorithm | Dataset | Accuracy (drop) % |
---|---|---|---|
DenseNet-121 | PTQ | ImageNet | 60.16 (0.8) |
GoogLeNet | PTQ | ImageNet | 66.36 (0.3) |
MobileNet V2 | PTQ | ImageNet | 71.38 (0.49) |
ResNet-50 | PTQ | ImageNet | 74.63 (0.21) |
ShuffleNet | PTQ | ImageNet | 47.25 (0.18) |
SqueezeNet V1.0 | PTQ | ImageNet | 54.3 (0.54) |
VGG‑16 | PTQ | ImageNet | 72.02 (0.0) |
ONNX Model | Compression algorithm | Dataset | mAP (drop) % |
---|---|---|---|
SSD1200 | PTQ | COCO 2017 | 20.17 (0.17) |
Tiny-YOLOv2 | PTQ | VOC12 | 29.03 (0.23) |