This tutorial demonstrates how to apply INT8 quantization to speech recognition models
using the Post-Training Optimization Tool API (POT API),
which is part of the OpenVINO Toolkit.
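
For readers new to the idea, the sketch below illustrates what INT8 quantization means numerically: floating-point values are mapped to 8-bit integers through a scale factor and then mapped back with a small rounding error. It is a simplified symmetric, per-tensor scheme written in plain Python and is not how POT implements quantization internally. The function names `quantize_int8` and `dequantize` and the sample values are hypothetical and exist only for this illustration.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127].

    Illustrative only: this is not the scheme POT uses internally.
    """
    max_abs = max(abs(v) for v in values)
    # The scale maps the largest magnitude onto 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 values back to approximate floating-point values."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only to make the rounding visible.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

for original, approx in zip(weights, restored):
    print(f"{original:+.4f} -> {approx:+.4f} (error {abs(original - approx):.4f})")
```

Each restored value differs from the original by at most half the scale. That rounding error is why the tutorials compare the accuracy of the original and quantized models.
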
Supported models:
- 107-speech-recognition-wav2vec2.ipynb demonstrates how to apply post-training INT8 quantization to a fine-tuned Wav2Vec2-Base-960h PyTorch model trained on the LibriSpeech ASR corpus.
- 107-speech-recognition-data2vec.ipynb demonstrates how to apply post-training INT8 quantization to a fine-tuned Data2Vec-Audio-Base-960h PyTorch model trained on the LibriSpeech ASR corpus.
The tutorial code is designed to be extensible to custom models and datasets.
The tutorial consists of the following steps:
- Downloading and preparing the model and dataset.
- Defining data loading and accuracy validation functionality.
- Preparing the model for quantization.
- Running the optimization pipeline.
- Comparing the performance of the original and quantized models.
- Comparing the accuracy of the original and quantized models.
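
The accuracy comparison in the last step can be sketched as follows, assuming word error rate (WER) is the metric of interest, as is common for speech recognition. This text does not name the metric the notebooks use, so treat that choice as an assumption. The function `word_error_rate` and the transcripts are hypothetical placeholders, not outputs of the actual models.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        # No reference words: report 0.0 for an empty hypothesis, 1.0 otherwise.
        return 0.0 if not hyp else 1.0

    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder transcripts for illustration only; these are not real model outputs.
reference = "the cat sat on the mat"
fp32_transcript = "the cat sat on the mat"
int8_transcript = "the cat sit on mat"

fp32_wer = word_error_rate(reference, fp32_transcript)
int8_wer = word_error_rate(reference, int8_transcript)
print(f"FP32 WER: {fp32_wer:.3f}, INT8 WER: {int8_wer:.3f}")
```

In practice the score would be averaged over a validation set. A small gap between the two values suggests that quantization preserved recognition quality.
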