
[Compression] Add bias correction feature for PTQ quantizer #5603

Merged
merged 17 commits into microsoft:master on Jun 29, 2023

Conversation

Bonytu (Contributor) commented on Jun 9, 2023

Description

Post-training quantization (PTQ) reduces the memory and compute requirements of a deep learning model by converting its parameters, such as weights and biases, to lower-precision representations (for example, from 32-bit floating point to 8-bit integers). The reduced numerical precision can cause some loss of accuracy. Bias correction mitigates that loss: it adjusts the quantized model's bias terms to compensate for the systematic errors introduced by quantization, reducing the discrepancy between the quantized model's predictions and those of the original full-precision model and improving the quantized model's overall accuracy.
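For intuition, here is a minimal sketch of the usual bias correction recipe: estimate, on calibration data, the expected output error that weight quantization introduces in a layer, and fold that error into the layer's bias. It only illustrates the idea and is not the implementation added in this PR; the function `correct_bias` and its arguments are hypothetical names for the example.

```python
import torch

def correct_bias(fp_layer: torch.nn.Linear,
                 q_weight: torch.Tensor,
                 calib_inputs: torch.Tensor) -> torch.Tensor:
    """Return a corrected bias for a quantized linear layer.

    fp_layer     -- the original full-precision layer
    q_weight     -- the de-quantized weight used at inference, same shape as fp_layer.weight
    calib_inputs -- calibration activations feeding this layer, shape (N, in_features)
    """
    # Expected input activation, estimated from calibration data.
    mean_input = calib_inputs.mean(dim=0)                        # (in_features,)
    # Systematic output error caused by weight quantization:
    # E[W x] - E[W_q x] = (W - W_q) E[x]
    expected_err = (fp_layer.weight - q_weight) @ mean_input     # (out_features,)
    # Shift the bias so the quantized layer matches the FP layer on average.
    bias = fp_layer.bias if fp_layer.bias is not None else torch.zeros(fp_layer.out_features)
    return bias + expected_err
```

The same idea is commonly extended to convolution layers by averaging the calibration activations per input channel before computing the expected error.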

Test Options

  • fast test
  • full test - HPO
  • full test - NAS
  • full test - compression

Checklist

  • test case
  • doc

How to test

QuanluZhang (Contributor) commented:

Could you briefly explain what the bias correction feature is in this PR's description?

Bonytu (Contributor, Author) commented on Jun 26, 2023

Could you briefly explain what the bias correction feature is in this PR's description?

Post-training quantization (PTQ) reduces the memory and compute requirements of a deep learning model by converting its parameters, such as weights and biases, to lower-precision representations (for example, from 32-bit floating point to 8-bit integers). The reduced numerical precision can cause some loss of accuracy. Bias correction mitigates that loss: it adjusts the quantized model's bias terms to compensate for the systematic errors introduced by quantization, reducing the discrepancy between the quantized model's predictions and those of the original full-precision model and improving the quantized model's overall accuracy.

Bonytu merged commit 5e22f49 into microsoft:master on Jun 29, 2023