docs(project): sync en and zh docs #842

Merged (21 commits) on Aug 15, 2022

21 changes: 15 additions & 6 deletions README.md
@@ -63,9 +63,9 @@ Models can be exported and run in the following backends, and more will be compa

All kinds of modules in the SDK can be extended, such as `Transform` for image processing, `Net` for Neural Network inference, `Module` for postprocessing and so on

## Get Started
## [Documentation](https://mmdeploy.readthedocs.io/en/latest/)

Please read [getting_started.md](docs/en/get_started.md) for the basic usage of MMDeploy. We also provide tutorials about:
Please read [getting_started](docs/en/get_started.md) for the basic usage of MMDeploy. We also provide tutorials about:

- [Build](docs/en/01-how-to-build/build_from_source.md)
- [Build from Docker](docs/en/01-how-to-build/build_from_docker.md)
@@ -77,11 +77,20 @@ Please read [getting_started.md](docs/en/get_started.md) for the basic usage of
- User Guide
- [How to convert model](docs/en/02-how-to-run/convert_model.md)
- [How to write config](docs/en/02-how-to-run/write_config.md)
- [How to evaluate deployed models](docs/en/02-how-to-run/how_to_evaluate_a_model.md)
- [How to measure performance of deployed models](docs/en/02-how-to-run/how_to_measure_performance_of_models.md)
- [How to profile model](docs/en/02-how-to-run/profile_model.md)
- [How to quantize model](docs/en/02-how-to-run/quantize_model.md)
- [Useful tools](docs/en/02-how-to-run/useful_tools.md)
- Developer Guide
- [How to support new models](docs/en/06-developer-guide/support_new_model.md)
- [How to support new backends](docs/en/06-developer-guide/support_new_backend.md)
- [How to support new models](docs/en/07-developer-guide/support_new_model.md)
- [How to support new backends](docs/en/07-developer-guide/support_new_backend.md)
- [How to partition model](docs/en/07-developer-guide/partition_model.md)
- [How to test rewritten model](docs/en/07-developer-guide/test_rewritten_models.md)
- [How to test backend ops](docs/en/07-developer-guide/add_backend_ops_unittest.md)
- [How to do regression test](docs/en/07-developer-guide/regression_test.md)
- Custom Backend Ops
- [ncnn](docs/en/06-custom-ops/ncnn.md)
- [onnxruntime](docs/en/06-custom-ops/onnxruntime.md)
- [tensorrt](docs/en/06-custom-ops/tensorrt.md)
- [FAQ](docs/en/faq.md)
- [Contributing](.github/CONTRIBUTING.md)

24 changes: 18 additions & 6 deletions README_zh-CN.md
@@ -63,8 +63,9 @@ MMDeploy 是 [OpenMMLab](https://openmmlab.com/) 模型部署工具箱,**为
- Net 推理
- Module 后处理

## [快速上手](docs/zh_cn/get_started.md)
## [中文文档](https://mmdeploy.readthedocs.io/zh_CN/latest/)

- [快速上手](docs/zh_cn/get_started.md)
- [编译](docs/zh_cn/01-how-to-build/build_from_source.md)
- [Build from Docker](docs/zh_cn/01-how-to-build/build_from_docker.md)
- [Build for Linux](docs/zh_cn/01-how-to-build/linux-x86_64.md)
@@ -77,17 +78,28 @@ MMDeploy 是 [OpenMMLab](https://openmmlab.com/) 模型部署工具箱,**为
- [配置转换参数](docs/zh_cn/02-how-to-run/write_config.md)
- [量化](docs/zh_cn/02-how-to-run/quantize_model.md)
- [测试转换完成的模型](docs/zh_cn/02-how-to-run/profile_model.md)
- [工具集介绍](docs/zh_cn/02-how-to-run/useful_tools.md)
- 开发指南
- [支持新模型](docs/zh_cn/04-developer-guide/support_new_model.md)
- [增加推理 Backend](docs/zh_cn/04-developer-guide/support_new_backend.md)
- [回归测试](docs/zh_cn/04-developer-guide/do_regression_test.md)
- [支持新模型](docs/zh_cn/07-developer-guide/support_new_model.md)
- [增加推理 backend](docs/zh_cn/07-developer-guide/support_new_backend.md)
- [模型分块](docs/zh_cn/07-developer-guide/partition_model.md)
- [测试重写模型](docs/zh_cn/07-developer-guide/test_rewritten_models.md)
- [backend 算子测试](docs/zh_cn/07-developer-guide/add_backend_ops_unittest.md)
- [回归测试](docs/zh_cn/07-developer-guide/regression_test.md)
- 各 backend 自定义算子列表
- [ncnn](docs/zh_cn/06-custom-ops/ncnn.md)
- [onnxruntime](docs/zh_cn/06-custom-ops/onnxruntime.md)
- [tensorrt](docs/zh_cn/06-custom-ops/tensorrt.md)
- [FAQ](docs/zh_cn/faq.md)
- [贡献者手册](.github/CONTRIBUTING.md)

## 新人解说

- [01 术语解释、加载第一个模型](docs/zh_cn/05-tutorial/01_introduction_to_model_deployment.md)
- [02 转成 onnx](docs/zh_cn/05-tutorial/02_challenges.md)
- [01 术语解释、加载第一个模型](docs/zh_cn/tutorial/01_introduction_to_model_deployment.md)
- [02 部署常见问题](docs/zh_cn/tutorial/02_challenges.md)
- [03 torch转onnx](docs/zh_cn/tutorial/03_pytorch2onnx.md)
- [04 让torch支持更多onnx算子](docs/zh_cn/tutorial/04_onnx_custom_op.md)
- [05 调试onnx模型](docs/zh_cn/tutorial/05_onnx_model_editing.md)

## 基准与模型库

2 changes: 1 addition & 1 deletion docs/en/01-how-to-build/jetsons.md
@@ -229,7 +229,7 @@ export MMDEPLOY_DIR=$(pwd)
### Install Model Converter

Since some operators adopted by OpenMMLab codebases are not supported by TensorRT, we build custom TensorRT plugins to make up for them, such as `roi_align`, `scatternd`, etc.
You can find a full list of custom plugins from [here](../ops/tensorrt.md).
You can find a full list of custom plugins from [here](../06-custom-ops/tensorrt.md).

```shell
# build TensorRT custom operators
```
2 changes: 1 addition & 1 deletion docs/en/02-how-to-run/convert_model.md
@@ -65,7 +65,7 @@ python ./tools/deploy.py \

## How to evaluate the exported models

You can evaluate the model by referring to [how_to_evaluate_a_model](./how_to_evaluate_a_model.md).
You can evaluate the model by referring to [how_to_evaluate_a_model](./profile_model.md).

## List of supported models exportable to other backends

44 changes: 0 additions & 44 deletions docs/en/02-how-to-run/how_to_measure_performance_of_models.md

This file was deleted.

@@ -25,6 +25,9 @@ ${MODEL_CFG} \
[--metric-options ${METRIC_OPTIONS}]
[--log2file work_dirs/output.txt]
[--batch-size ${BATCH_SIZE}]
[--speed-test]
[--warmup ${WARM_UP}]
[--log-interval ${LOG_INTERVAL}]
```

## Description of all arguments
@@ -44,6 +47,9 @@ ${MODEL_CFG} \
format will be kwargs for dataset.evaluate() function.
- `--log2file`: log evaluation results (and speed) to file.
- `--batch-size`: the batch size for inference, which would override `samples_per_gpu` in data config. Default is `1`. Note that not all models support `batch_size>1`.
- `--speed-test`: whether to activate the speed test.
- `--warmup`: warm up before counting the inference time; requires `--speed-test` to be set.
- `--log-interval`: the interval between logs; requires `--speed-test` to be set.

\* Other arguments in `tools/test.py` are used for the speed test. They are not related to evaluation.

@@ -55,7 +61,8 @@ python tools/test.py \
{MMCLS_DIR}/configs/resnet/resnet50_b32x8_imagenet.py \
--model model.onnx \
--out out.pkl \
--device cuda:0
--device cpu \
--speed-test
```
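
For reference, here is a minimal sketch of the same command with the new speed-test options enabled. The warm-up count and log interval are illustrative values only, and `${DEPLOY_CFG}` stands in for the deployment config argument that is not shown in this excerpt.

```shell
# Illustrative values: 10 warm-up iterations before timing, log every 50 iterations.
python tools/test.py \
    ${DEPLOY_CFG} \
    {MMCLS_DIR}/configs/resnet/resnet50_b32x8_imagenet.py \
    --model model.onnx \
    --device cpu \
    --speed-test \
    --warmup 10 \
    --log-interval 50
```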

## Note
67 changes: 67 additions & 0 deletions docs/en/02-how-to-run/quantize_model.md
@@ -0,0 +1,67 @@
# Quantize model

## Why quantization?

The fixed-point model has many advantages over the fp32 model:

- Smaller size: an 8-bit model reduces the file size by 75%
- Thanks to the smaller model, the cache hit rate improves and inference is faster
- Chips tend to provide dedicated fixed-point acceleration instructions, which are faster and consume less energy (int8 on a common CPU requires only about 10% of the energy of fp32)

Package size and heat generation are key indicators when evaluating a mobile app; on the server side, quantization means you can keep the same QPS while using a larger model in exchange for improved accuracy.
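
As a quick sanity check of the size figure: fp32 stores 4 bytes per weight while int8 stores 1, so weight storage shrinks by 75%. A back-of-the-envelope sketch (the ~11.7M parameter count of ResNet-18 is an assumption used only for illustration):

```bash
# 4 bytes/weight (fp32) vs 1 byte/weight (int8): 1 - 1/4 = 75% smaller.
python3 -c "p = 11.7e6; print(f'fp32 ~= {p*4/2**20:.0f} MiB, int8 ~= {p/2**20:.0f} MiB')"
```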

## Post training quantization scheme

Taking the ncnn backend as an example, the complete workflow is as follows:

<div align="center">
<img src="../_static/image/quant_model.png"/>
</div>

MMDeploy generates a quantization table from the static graph (ONNX) and uses backend tools to convert the fp32 model to fixed point.

Currently, MMDeploy supports PTQ with the ncnn backend.
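
For intuition only, the sketch below shows roughly what the two stages look like when driven by hand with ncnn's own tooling. The file names are placeholders and the exact table format is an assumption; `tools/deploy.py --quant` described below performs the whole flow automatically.

```bash
# Rough sketch of the two-stage PTQ flow (placeholder file names):
# 1. A quantization (calibration) table is produced from the ONNX graph using
#    calibration images (MMDeploy drives ppq for this step).
# 2. ncnn's converter applies the table to turn the fp32 model into int8.
ncnn2int8 end2end.param end2end.bin end2end-int8.param end2end-int8.bin quant_table.txt
```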

## How to convert model

After [installing MMDeploy](../01-how-to-build/build_from_source.md), install ppq:

```bash
git clone https://github.com/openppl-public/ppq.git
cd ppq
git checkout edbecf4 # pin to a revision that includes the required features
pip install -r requirements.txt
python3 setup.py install
```

Back in MMDeploy, enable quantization with the `--quant` option of `tools/deploy.py`.

```bash
cd /path/to/mmdeploy
export MODEL_CONFIG=/path/to/mmclassification/configs/resnet/resnet18_8xb16_cifar10.py
export MODEL_PATH=https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_b16x8_cifar10_20210528-bd6371c8.pth

python3 tools/deploy.py configs/mmcls/classification_ncnn-int8_static.py ${MODEL_CONFIG} ${MODEL_PATH} /path/to/self-test.png --work-dir work_dir --device cpu --quant --quant-image-dir /path/to/images
...
```

Description of the arguments:

| Parameter | Meaning |
| :---------------: | :--------------------------------------------------------------: |
| --quant | Enable quantization, the default value is False |
| --quant-image-dir | Calibrate dataset, use Validation Set in MODEL_CONFIG by default |

## Custom calibration dataset

The calibration set is used to calculate the quantization parameters of each layer. Some DFQ (Data-Free Quantization) methods do not even require a dataset.

- Create a new folder and just put the images in it (no directory structure, no negative examples, and no filename format required); a minimal sketch follows the table below
- The images must come from a real scenario, otherwise accuracy will drop
- Do not calibrate the model with the test dataset

| Type  | Train dataset | Validation dataset | Test dataset  | Calibration dataset |
| ----- | ------------- | ------------------ | ------------- | ------------------- |
| Usage | QAT           | PTQ                | Test accuracy | PTQ                 |
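
As referenced in the list above, a minimal sketch of preparing a calibration folder and pointing `--quant-image-dir` at it (all paths are placeholders):

```bash
# Any flat folder of real-scenario images works as the calibration set.
mkdir -p /path/to/calib-images
cp /path/to/real_scenario_images/*.jpg /path/to/calib-images/

python3 tools/deploy.py configs/mmcls/classification_ncnn-int8_static.py \
    ${MODEL_CONFIG} ${MODEL_PATH} /path/to/self-test.png \
    --work-dir work_dir --device cpu \
    --quant --quant-image-dir /path/to/calib-images
```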

It is highly recommended to [verify model precision](./profile_model.md) after quantization. [Here](../03-benchmark/quantization.md) are some quantization test results.
@@ -1,3 +1,5 @@
# Useful Tools

Apart from `deploy.py`, there are other useful tools under the `tools/` directory.

## torch2onnx
@@ -96,7 +98,8 @@ python tools/onnx2tensorrt.py \
${ONNX_PATH} \
${OUTPUT} \
--device-id 0 \
--log-level INFO
--log-level INFO \
--calib-file /path/to/file
```

### Description of all arguments
4 changes: 2 additions & 2 deletions docs/en/03-benchmark/benchmark.md
@@ -26,7 +26,7 @@ GPU: ncnn, TensorRT, PPLNN
- Warm up. For ncnn, we warm up 30 iters for all codebases. As for other backends: for classification, we warm up 1010 iters; for other codebases, we warm up 10 iters.
- Input resolution varies for different datasets of different codebases. All inputs are real images except for `mmediting` because the dataset is not large enough.

Users can directly test the speed through [model profiling](../02-how-to-run/how_to_measure_performance_of_models.md). And here is the benchmark in our environment.
Users can directly test the speed through [model profiling](../02-how-to-run/profile_model.md). And here is the benchmark in our environment.

<div style="margin-left: 25px;">
<table class="docutils">
@@ -407,7 +407,7 @@ Users can directly test the speed through [model profiling](../02-how-to-run/how

## Performance benchmark

Users can directly test the performance through [how_to_evaluate_a_model.md](../02-how-to-run/how_to_evaluate_a_model.md). And here is the benchmark in our environment.
Users can directly test the performance through [how_to_evaluate_a_model.md](../02-how-to-run/profile_model.md). And here is the benchmark in our environment.

<div style="margin-left: 25px;">
<table class="docutils">
2 changes: 1 addition & 1 deletion docs/en/03-benchmark/benchmark_edge.md
@@ -1,6 +1,6 @@
# Test on embedded device

Here are the test conclusions of our edge devices. You can directly obtain the results of your own environment with [model profiling](../02-how-to-run/how_to_evaluate_a_model.md).
Here are the test conclusions of our edge devices. You can directly obtain the results of your own environment with [model profiling](../02-how-to-run/profile_model.md).

## Software and hardware environment

27 changes: 27 additions & 0 deletions docs/en/03-benchmark/quantization.md
@@ -0,0 +1,27 @@
# Quantization test result

Currently, MMDeploy supports ncnn quantization.

## Quantize with ncnn

### mmcls

| model | dataset | fp32 top-1 (%) | int8 top-1 (%) |
| :--------------------------------------------------------------------------------------------------------------------------: | :---------: | :------------: | :------------: |
| [ResNet-18](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnet18_8xb16_cifar10.py) | Cifar10 | 94.82 | 94.83 |
| [ResNeXt-32x4d-50](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnext/resnext50-32x4d_8xb32_in1k.py) | ImageNet-1k | 77.90 | 78.20\* |
| [MobileNet V2](https://github.com/open-mmlab/mmclassification/blob/master/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py) | ImageNet-1k | 71.86 | 71.43\* |
| [HRNet-W18\*](https://github.com/open-mmlab/mmclassification/blob/master/configs/hrnet/hrnet-w18_4xb32_in1k.py) | ImageNet-1k | 76.75 | 76.25\* |

Note:

- Because ImageNet-1k is large and ncnn has not yet released a Vulkan int8 version, only part of the test set (4000/50000) is used.
- Accuracy will fluctuate after quantization; it is normal for a classification model to gain less than 1%.

### OCR detection

| model | dataset | fp32 hmean | int8 hmean |
| :---------------------------------------------------------------------------------------------------------------: | :-------: | :--------: | :------------: |
| [PANet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet/panet_r18_fpem_ffm_600e_icdar2015.py) | ICDAR2015 | 0.795 | 0.792 @thr=0.9 |

Note: [mmocr](https://github.com/open-mmlab/mmocr) uses `shapely` to compute IoU, which results in a slight difference in accuracy.
2 changes: 1 addition & 1 deletion docs/en/04-supported-codebases/mmocr.md
@@ -21,7 +21,7 @@ Please refer to [install.md](https://mmocr.readthedocs.io/en/latest/install.html

Note that ncnn, pplnn, and OpenVINO only support the configs of DBNet18 for DBNet.

For the PANet with the [checkpoint](https://download.openmmlab.com/mmocr/textdet/panet/panet_r18_fpem_ffm_sbn_600e_icdar2015_20210219-42dbe46a.pth) pretrained on ICDAR dateset, if you want to convert the model to TensorRT with 16 bits float point, please try the following script.
For the PANet with the [checkpoint](https://download.openmmlab.com/mmocr/textdet/panet/panet_r18_fpem_ffm_sbn_600e_icdar2015_20210219-42dbe46a.pth) pretrained on ICDAR dataset, if you want to convert the model to TensorRT with 16 bits float point, please try the following script.

```python
# Copyright (c) OpenMMLab. All rights reserved.
```