This project accelerates inference of Google's T5 model on NVIDIA GPUs using the TensorRT C++ API. It includes an inference demo and an HTTP API service, supports concurrent requests, and runs on Linux/Windows.
For acceleration on CPU, see the companion project https://github.com/apple2333cream/t5-ort-cpp.git
Original model repository: https://huggingface.co/google-t5/t5-base
- TensorRT 10.0.1 (in principle any version >= 8.9.6 works)
- cuDNN 8.9.4
- CUDA 12.4
- Python 3.10
pip install -r HuggingFace/requirements.txt
2.1 HuggingFace -> ONNX (a minimal export sketch follows this section)
2.2 ONNX -> TensorRT engine
cd ./HuggingFace
bash gen_t5_bs1_beam2.sh (fp16)
Note: the encoder and decoder are converted separately. If they were merged into a single exported model, BeamSearch would have to be hand-written during the TensorRT conversion (the next version will export a single model for inference).
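The repository's own Python scripts under ./HuggingFace implement both steps; the sketch below only illustrates step 2.1 for the encoder, and the names in it (the wrapper class, file names, opset) are assumptions rather than the repo's actual code:

```python
import torch
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("google-t5/t5-base").eval()

# Wrap the encoder so it returns a plain tensor instead of a ModelOutput,
# which keeps the ONNX graph's inputs/outputs simple. (Hypothetical wrapper.)
class Encoder(torch.nn.Module):
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder

    def forward(self, input_ids):
        return self.encoder(input_ids=input_ids).last_hidden_state

dummy = torch.ones(1, 8, dtype=torch.long)  # (batch, seq) dummy token ids
torch.onnx.export(
    Encoder(model.encoder),
    (dummy,),
    "t5_encoder.onnx",
    input_names=["input_ids"],
    output_names=["hidden_states"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "seq"},
        "hidden_states": {0: "batch", 1: "seq"},
    },
    opset_version=17,
)
# The decoder is exported analogously, with the encoder hidden states
# (and past key/values for fast decoding) as extra inputs.
```

For step 2.2, gen_t5_bs1_beam2.sh drives the ONNX -> engine build; roughly the same thing can be done with the TensorRT Python API, as sketched here (the input name and shape ranges are assumptions):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(0)  # explicit batch is the default in TRT 10
parser = trt.OnnxParser(network, logger)
with open("t5_encoder.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # the script builds an fp16 engine

# T5 inputs have dynamic batch/sequence dims, so an optimization profile is needed.
profile = builder.create_optimization_profile()
profile.set_shape("input_ids", (1, 1), (1, 64), (1, 512))  # min / opt / max
config.add_optimization_profile(profile)

with open("t5_encoder.plan", "wb") as f:
    f.write(builder.build_serialized_network(network, config))
```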
mkdir build && cd build
cmake ..
make -j8
- demo ./t5_engine --use_mode=0
- test ./t5_engine --use_mode=1
- api ./t5_engine --use_mode=2
Example service request:
curl -X POST -d '{ "RequestID": "65423221", "InputText": "translate English to French: I was a victim of a series of accidents." }' http://127.0.0.1:17653/T5/register
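The same request can be issued programmatically; here is a minimal Python sketch, assuming the service accepts a raw JSON body exactly as in the curl example (the response format is not documented here):

```python
import json
import requests

payload = {
    "RequestID": "65423221",
    "InputText": "translate English to French: I was a victim of a series of accidents.",
}
# Endpoint and field names are taken from the curl example above.
resp = requests.post("http://127.0.0.1:17653/T5/register", data=json.dumps(payload))
print(resp.status_code, resp.text)
```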
├── CMakeLists.txt
├── HuggingFace        # model-conversion code, adapted from https://github.com/kshitizgupta21/triton-trt-oss.git
├── main.cpp
├── onnx2tensorrt.sh
├── readme.md
├── src
├── third_party
Using t5-base as an example, TensorRT (fp16) delivers a 2.78x speedup over native PyTorch.
CPU memory usage: 2.2 GB
| Inference framework | GPU memory (GB) | Latency (ms) |
|---------------------|-----------------|--------------|
| torch               | 1.4             | 513          |
| tensorrt (fp32)     | 1.2             | 275          |
| tensorrt (fp16)     | 1.2             | 184          |
Test environment: V100, TensorRT 10.0.1, cuDNN 8.9.4, CUDA 12.4
- 20240728 v1.0.0:
  - Added T5 TensorRT C++ API inference code
QQ: 807876904
- [triton-trt-oss](https://github.com/kshitizgupta21/triton-trt-oss.git)
- [t5-ort-cpp](https://github.com/apple2333cream/t5-ort-cpp.git)
- [ONNX model zoo: T5](https://github.com/onnx/models/tree/main/validated/text/machine_comprehension/t5)