
ONNX to TensorRT method #278

Open
zhiqwang opened this issue Jul 22, 2022 · 9 comments

@zhiqwang

zhiqwang commented Jul 22, 2022

Hi @AlexeyAB and @Linaom1214

Very glad to see that #61 was merged. I see the new instructions say that ONNX models can be exported to TensorRT via Lin's repo, but from a maintenance perspective this doesn't seem like a good way to do it. Would it be possible for Lin to bring this export method into this repository as a command-line tool?

git clone https://github.com/Linaom1214/tensorrt-python.git
cd tensorrt-python
python export.py -o yolov7-tiny.onnx -e yolov7-tiny-nms.trt -p fp16

BTW, we can also use the trtexec tool shipped with TensorRT to export the serialized engine as below (for TensorRT 8.2.4+; it seems that TensorRT 8.4 deprecates the --workspace argument):

trtexec --onnx=yolov7-tiny.onnx --saveEngine=yolov7-tiny-nms.trt --workspace=8192 --fp16
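For TensorRT 8.4+, where --workspace is deprecated, a sketch of the equivalent invocation using the newer --memPoolSize flag (size in MiB; please verify against trtexec --help for your version) would be:

trtexec --onnx=yolov7-tiny.onnx --saveEngine=yolov7-tiny-nms.trt --memPoolSize=workspace:8192 --fp16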

Perhaps this approach would be simpler.

@AlexeyAB
Collaborator

@zhiqwang Hi,

Will trtexec generate a more optimized, faster TRT model?

Does pip install nvidia-tensorrt install the trtexec binary?


Additionally we can use:

@triple-Mu
Contributor

This repository https://github.com/triple-Mu/YOLO-TensorRT8 enables easier end2end registration, and in contrast there is no need to modify the code of this repository.
I think PR #273 is a better fit for yolov7, because it makes it very convenient to add new content and adapt to new modules.
We can also bake preprocessing into the ONNX graph for TensorRT.
Beyond that, I plan to add letterbox preprocessing to the ONNX graph as well; I am implementing this now and will sync it to that PR.

@triple-Mu
Contributor

The trtexec tool avoids having to write engine-building code yourself, and it has built-in performance analysis such as inference speed tests, CUDA kernel launch times, and so on. I have related examples in a notebook.
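For example, a typical benchmarking run against an already-built engine (engine file name assumed from the commands above) could look like this; --dumpProfile prints per-layer timings:

trtexec --loadEngine=yolov7-tiny-nms.trt --iterations=100 --dumpProfile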

@zhiqwang
Author

zhiqwang commented Jul 22, 2022

Hi @AlexeyAB ,

Will trtexec generate more optimized faster TRT model?

The TRT engine generated by trtexec is the same as the one built with the Python API (export.py in Lin's repo).
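For reference, a minimal sketch of what that Python API path does under the hood (TensorRT 8.x API; file names taken from the commands above):

import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
# ONNX parsing requires an explicit-batch network
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("yolov7-tiny.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # same effect as the -p fp16 / --fp16 flags
# build and write out the serialized engine
serialized = builder.build_serialized_network(network, config)
with open("yolov7-tiny-nms.trt", "wb") as f:
    f.write(serialized)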

Does pip install nvidia-tensorrt install trtexec file?

I guess not; trtexec is a C++ command-line tool. I guess most users will use the C++ version of TRT.

More convenient export from #273

I am relatively neutral on this one. onnx-graphsurgeon is provided officially by TensorRT, and it's cross-platform (we can even use it on the Apple M1, where it only relies on ONNX Runtime). It provides a friendlier API for editing ONNX; see the official examples at https://github.com/NVIDIA/TensorRT/tree/main/tools/onnx-graphsurgeon/examples and the custom examples at https://github.com/PINTO0309/simple-onnx-processing-tools for onnxgs.
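A minimal sketch of that editing workflow (assuming onnx and onnx-graphsurgeon are installed; file names are placeholders):

import onnx
import onnx_graphsurgeon as gs

# load the ONNX model into an editable graph
graph = gs.import_onnx(onnx.load("yolov7-tiny.onnx"))
# ... insert / remove / rewire nodes here ...
# drop dangling tensors and restore topological order before saving
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "yolov7-tiny-modified.onnx")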

and Calibrator for INT8 quantization from #57

This PR is great!

Just FYI, it seems trtexec can partially handle this use case:

trtexec --onnx=yolov7-tiny.onnx --saveEngine=yolov7-tiny-nms.trt --fp16 --int8 --calib=<file>

@triple-Mu
Contributor

(Quoting @zhiqwang's reply to @AlexeyAB above.)

It's true that graphsurgeon can do a lot, but that relies on people being familiar with its API. If you need to add functions such as preprocessing, graphsurgeon becomes difficult to use once many op changes are involved, and the code also becomes harder to read.

@philipp-schmidt
Contributor

+1 for using trtexec to convert.
It is not included in pip, but it is very easy to run via docker:

$ docker run -it --rm --gpus=all nvcr.io/nvidia/tensorrt:22.04-py3
root@0a998d3fb769:/workspace# cd tensorrt/bin/
root@0a998d3fb769:/workspace/tensorrt/bin# ls -al
total 464
drwxrwxrwx 2 root root   4096 Apr  4 16:44 .
drwxr-xr-x 4 root root   4096 Apr  4 16:44 ..
-rwxrwxrwx 1 root root 465712 Apr  4 16:44 trtexec
root@0a998d3fb769:/workspace/tensorrt/bin#

@philipp-schmidt
Contributor

Also, the ONNX-TRT script provided by @Linaom1214 unfortunately lacks a lot of options, e.g. implicit vs. explicit batching settings, optimization profiles, etc.
The script from #57, for example, is more complete and offers most of the flags that trtexec offers.

@philipp-schmidt
Contributor

philipp-schmidt commented Jul 22, 2022

ONNX to TensorRT with Docker, for reference. Docker with GPU support is the only dependency.
From PR #280

ONNX to TensorRT with docker

docker run -it --rm --gpus=all nvcr.io/nvidia/tensorrt:22.04-py3
# from new shell copy onnx to container
docker cp yolov7-tiny.onnx 898c16f38c99:/workspace/tensorrt/bin
# in container now
cd /workspace/tensorrt/bin
# convert onnx to tensorrt with min batch size 1, opt batch size 8 and max batch size 16
./trtexec --onnx=yolov7-tiny.onnx --minShapes=input:1x3x640x640 --optShapes=input:8x3x640x640 --maxShapes=input:16x3x640x640 --fp16 --workspace=4096 --saveEngine=yolov7-tiny.engine

@Linaom1214
Contributor

(Quoting @philipp-schmidt's comment above about the script's missing options.)

My test environment is Colab, where it is obviously difficult to install a complete TensorRT environment. trtexec is more recommended for a local environment.
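For what it's worth, the pip route mentioned above (an assumption based on the packages available at the time; it installs only the TensorRT Python bindings, not the trtexec binary) would be:

pip install nvidia-pyindex
pip install nvidia-tensorrt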
