[Compression] Add bias correction feature for PTQ quantizer #5603
Conversation
Could you briefly explain what the bias correction feature is in this PR's description?
Bias correction in post-training quantization refers to the process of adjusting the quantized model's weights and biases to reduce the discrepancies between the quantized model's predictions and the original full-precision model's predictions. Post-training quantization is a technique used to reduce the memory and computation requirements of deep learning models by converting the model's parameters, such as weights and biases, into lower-precision representations (e.g., from 32-bit floating-point numbers to 8-bit integers). This can lead to some loss of accuracy due to the reduced numerical precision. Bias correction aims to mitigate this accuracy loss by correcting the systematic errors introduced during the quantization process, ultimately improving the overall performance of the quantized model.
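To make the idea concrete, here is a minimal NumPy sketch of per-channel bias correction for a single linear layer. This is an illustration of the general technique, not NNI's actual implementation; the function names (`fake_quantize`, `correct_bias`) are hypothetical. The key identity is that quantizing the weights shifts the mean pre-activation by `(W - Q(W)) E[x]`, so adding that term back into the bias cancels the systematic error on the calibration data.

```python
import numpy as np

def fake_quantize(w, num_bits=8):
    """Symmetric uniform quantize-dequantize of a weight tensor."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.round(w / scale) * scale

def correct_bias(weight, bias, calib_inputs, num_bits=8):
    """Shift the bias to cancel the mean output error from quantization.

    E[W x + b] - E[Q(W) x + b] = (W - Q(W)) E[x], so adding that term
    to the bias restores the pre-activation mean on the calibration set.
    """
    q_weight = fake_quantize(weight, num_bits)
    mean_input = calib_inputs.mean(axis=0)      # E[x], shape (in_features,)
    error = (weight - q_weight) @ mean_input    # per-output-channel mean error
    return bias + error

# Usage: after correction, the quantized layer's mean output matches FP32.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 16))
b = rng.normal(size=4)
x = rng.normal(size=(256, 16))                  # calibration batch
b_corr = correct_bias(W, b, x)
fp_mean = (x @ W.T + b).mean(axis=0)
q_mean = (x @ fake_quantize(W).T + b_corr).mean(axis=0)
assert np.allclose(fp_mean, q_mean, atol=1e-6)
```

Note that this only corrects the first moment of the output distribution; the remaining per-sample quantization noise is unaffected, which is why bias correction is typically combined with good range calibration rather than used alone.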