Update README
laugh12321 committed Apr 23, 2024
1 parent d8e8662 commit 8610838
Showing 2 changed files with 44 additions and 0 deletions.
22 changes: 22 additions & 0 deletions tools/README.en.md
@@ -0,0 +1,22 @@
English | [简体中文](README.md)

# PTQ INT8 Quantization

This script performs fast INT8 post-training quantization (PTQ) with TensorRT and supports both dynamic and static batching.

## Usage

First, configure the model you want to quantize in `calibration.yaml`.

`calibrator.data` is the path to the calibration data, and `calibrator.cache` is where the generated calibration files are saved.

> If you choose **dynamic batching**, ensure that the dimensions of **`batch_shape`** match **`shapes.opt`**. If you choose **static batching**, set **`dynamic`** to **`False`** and ignore **`shapes`**.

After configuring `calibration.yaml`, run the following command to perform quantization:

```bash
cd tools
python ptq_calibration.py
```
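
The batching rule in the note above can be checked before launching the script. The following is a minimal sketch, not part of `ptq_calibration.py`: it assumes the settings have already been loaded into a plain dictionary with top-level `dynamic`, `batch_shape`, and `shapes` keys, and it reads "match" as element-wise equality. The helper name `check_batching`, the `shapes.min` and `shapes.max` entries, and all example values are hypothetical; the real layout of `calibration.yaml` may differ.

```python
from typing import Any, Mapping


def check_batching(config: Mapping[str, Any]) -> None:
    """Raise ValueError if the batching settings contradict each other."""
    if not config.get("dynamic", False):
        # Static batching: `shapes` is ignored, so there is nothing to compare.
        return
    batch_shape = list(config["batch_shape"])
    opt_shape = list(config["shapes"]["opt"])
    if batch_shape != opt_shape:
        raise ValueError(
            f"batch_shape {batch_shape} must match shapes.opt {opt_shape}"
        )


# Hypothetical values for illustration only; the key nesting is an assumption.
example = {
    "dynamic": True,
    "batch_shape": [8, 3, 640, 640],
    "shapes": {
        "min": [1, 3, 640, 640],
        "opt": [8, 3, 640, 640],
        "max": [16, 3, 640, 640],
    },
}
check_batching(example)  # returns without raising because the shapes agree
```

With static batching (`dynamic` set to `False`), the check returns immediately, which mirrors the instruction to ignore `shapes`.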

Accuracy and latency after PTQ vary by model. For the highest accuracy, QAT (Quantization-Aware Training) is recommended.
22 changes: 22 additions & 0 deletions tools/README.md
@@ -0,0 +1,22 @@
[English](README.en.md) | 简体中文

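# PTQ INT8 Quantization

This script performs fast INT8 post-training quantization (PTQ) with TensorRT and supports both dynamic and static batching.

## Usage

First, configure the model you want to quantize in `calibration.yaml`.

`calibrator.data` is the path to the calibration data, and `calibrator.cache` is where the generated calibration files are saved.

> If you choose **dynamic batching**, make sure the dimensions of **`batch_shape`** match **`shapes.opt`**. If you choose **static batching**, set **`dynamic`** to **`False`** and **ignore `shapes`**.

After configuring `calibration.yaml`, run the following command to perform quantization: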

```bash
cd tools
python ptq_calibration.py
```

Accuracy and latency after PTQ vary by model. If you need the highest accuracy, QAT (Quantization-Aware Training) is recommended.
