
ONNX to TensorRT method #278

Open
zhiqwang opened this issue Jul 22, 2022 · 9 comments

@zhiqwang

zhiqwang commented Jul 22, 2022

Hi @AlexeyAB and @Linaom1214

Very glad to see that #61 was merged. I see the new instructions say that ONNX models can be exported to TensorRT via Lin's repo, but from a maintenance perspective this doesn't seem like a good way to do it. Would it be possible for Lin to bring this export method into this repository as a command-line tool?

git clone https://github.com/Linaom1214/tensorrt-python.git
cd tensorrt-python
python export.py -o yolov7-tiny.onnx -e yolov7-tiny-nms.trt -p fp16

BTW, we can also use the trtexec tool shipped with TensorRT to export the serialized engine as below (for TensorRT 8.2.4+; it seems that TensorRT 8.4 deprecates the --workspace argument):

trtexec --onnx=yolov7-tiny.onnx --saveEngine=yolov7-tiny-nms.trt --workspace=8192 --fp16
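For TensorRT 8.4+, where --workspace is deprecated, a sketch of the equivalent invocation using the newer --memPoolSize flag (size in MiB; please verify against trtexec --help for your version) would be:

trtexec --onnx=yolov7-tiny.onnx --saveEngine=yolov7-tiny-nms.trt --memPoolSize=workspace:8192 --fp16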

Perhaps this approach would be simpler.

@AlexeyAB
Collaborator

@zhiqwang Hi,

Will trtexec generate a more optimized, faster TRT model?

Does pip install nvidia-tensorrt install the trtexec binary?


Additionally we can use:

@triple-Mu
Contributor

This repository https://github.com/triple-Mu/YOLO-TensorRT8 enables easier end2end registration, and in contrast there is no need to modify the code of this repository.
I think PR #273 is a better fit for yolov7, because it makes it very convenient to add new content and adapt to new modules.
We can also bake preprocessing into the ONNX graph for TensorRT.
Beyond that, I plan to add letterbox preprocessing to the ONNX graph as well; I am implementing this now and will sync it to that PR.

@triple-Mu
Contributor

The trtexec tool avoids having to write engine-building code yourself, and it has built-in performance analysis such as inference speed tests, CUDA kernel launch times, and so on. I have related examples in a notebook.
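For example, a typical benchmarking run against an already-built engine (engine file name assumed from the commands above) could look like this; --dumpProfile prints per-layer timings:

trtexec --loadEngine=yolov7-tiny-nms.trt --iterations=100 --dumpProfile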

@zhiqwang
Author

zhiqwang commented Jul 22, 2022

Hi @AlexeyAB ,

Will trtexec generate more optimized faster TRT model?

The TRT engine generated by trtexec is the same as the one built with the Python API (export.py in Lin's repo).
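For reference, a minimal sketch of what that Python API path does under the hood (TensorRT 8.x API; file names taken from the commands above):

import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
# ONNX parsing requires an explicit-batch network
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("yolov7-tiny.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # same effect as the -p fp16 / --fp16 flags
# build and write out the serialized engine
serialized = builder.build_serialized_network(network, config)
with open("yolov7-tiny-nms.trt", "wb") as f:
    f.write(serialized)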

Does pip install nvidia-tensorrt install trtexec file?

I guess not; trtexec is a C++ command-line tool. I guess most users will use the C++ version of TRT.

More convenient export from #273

I am relatively neutral on this one. onnx-graphsurgeon is provided officially by TensorRT, and it's cross-platform (we can even use it on the Apple M1, where it only relies on ONNX Runtime). It provides a friendlier API for editing ONNX; see the official examples at https://github.com/NVIDIA/TensorRT/tree/main/tools/onnx-graphsurgeon/examples and the custom examples at https://github.com/PINTO0309/simple-onnx-processing-tools for onnxgs.
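A minimal sketch of that editing workflow (assuming onnx and onnx-graphsurgeon are installed; file names are placeholders):

import onnx
import onnx_graphsurgeon as gs

# load the ONNX model into an editable graph
graph = gs.import_onnx(onnx.load("yolov7-tiny.onnx"))
# ... insert / remove / rewire nodes here ...
# drop dangling tensors and restore topological order before saving
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "yolov7-tiny-modified.onnx")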

and Calibrator for INT8 quantization from #57

This PR is great!

Just FYI, it seems trtexec can partially handle this use case:

trtexec --onnx=yolov7-tiny.onnx --saveEngine=yolov7-tiny-nms.trt --fp16 --int8 --calib=<file>

@triple-Mu
Contributor

(Quoting @zhiqwang's reply to @AlexeyAB above.)

It's true that graphsurgeon can do a lot, but that relies on people being familiar with its API. If you need to add functions such as preprocessing, graphsurgeon becomes difficult to use once many op changes are involved, and the code also becomes harder to read.

@philipp-schmidt
Contributor

+1 for using trtexec to convert.
It is not included in pip, but it is very easy to run via docker:

$ docker run -it --rm --gpus=all nvcr.io/nvidia/tensorrt:22.04-py3
root@0a998d3fb769:/workspace# cd tensorrt/bin/
root@0a998d3fb769:/workspace/tensorrt/bin# ls -al
total 464
drwxrwxrwx 2 root root   4096 Apr  4 16:44 .
drwxr-xr-x 4 root root   4096 Apr  4 16:44 ..
-rwxrwxrwx 1 root root 465712 Apr  4 16:44 trtexec
root@0a998d3fb769:/workspace/tensorrt/bin#

@philipp-schmidt
Contributor

Also, the ONNX-TRT script provided by @Linaom1214 unfortunately lacks a lot of options, e.g. implicit vs. explicit batching settings, optimization profiles, etc.
The script from #57, for example, is more complete and offers most of the flags that trtexec offers.

@philipp-schmidt
Contributor

philipp-schmidt commented Jul 22, 2022

ONNX to TensorRT with Docker, for reference. Docker with GPU support is the only dependency.
From PR #280

ONNX to TensorRT with docker

docker run -it --rm --gpus=all nvcr.io/nvidia/tensorrt:22.04-py3
# from new shell copy onnx to container
docker cp yolov7-tiny.onnx 898c16f38c99:/workspace/tensorrt/bin
# in container now
cd /workspace/tensorrt/bin
# convert onnx to tensorrt with min batch size 1, opt batch size 8 and max batch size 16
./trtexec --onnx=yolov7-tiny.onnx --minShapes=input:1x3x640x640 --optShapes=input:8x3x640x640 --maxShapes=input:16x3x640x640 --fp16 --workspace=4096 --saveEngine=yolov7-tiny.engine

@Linaom1214
Contributor

(Quoting @philipp-schmidt's comment above about the script's missing options.)

My test environment is Colab, where it is obviously difficult to install a complete TensorRT environment. trtexec is more recommended for a local environment.
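For what it's worth, the pip route mentioned above (an assumption based on the packages available at the time; it installs only the TensorRT Python bindings, not the trtexec binary) would be:

pip install nvidia-pyindex
pip install nvidia-tensorrt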
