-
Notifications
You must be signed in to change notification settings - Fork 465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Backend] TRT backend & PP-Infer backend support pinned memory #403
Conversation
After integrating pinned memory, the inference latency is comparable with trtexec. It mainly optimized the D2H latency. The latency includes H2D, GPU inference and D2H. Tested on P40. The reason why yolov6s has only a little improvement is the output tensor shape of yolov6s(8400x85) is much smaller than yolov5s(2x25200x85) and yolov7(2x25200x85).
|
Also verified pinned memory in Paddle Inference GPU backend, tested on PPYoloE. The latency is almost the same as before. Because the output size is small(7200 + 4). |
* [Backend] Add override flag to lite backend * update doc * [Docs] Add Android C++ SDK build docs * [Doc] fix android_build_docs typos * Update CMakeLists.txt * Update android.md * Update quantize.md * Update quantize.md * Update quantize.md * Update README.md * [Doc] Add Android C++ SDK build docs (PaddlePaddle#409) * [Backend] Add override flag to lite backend * [Docs] Add Android C++ SDK build docs * [Doc] fix android_build_docs typos * Update CMakeLists.txt * Update android.md * [Backend] TRT backend & PP-Infer backend support pinned memory (PaddlePaddle#403) * TRT backend use pinned memory * refine fd tensor pinned memory logic * TRT enable pinned memory configurable * paddle inference support pinned memory * pinned memory pybindings Co-authored-by: Jason <[email protected]> * [Doc] Add PicoDet Android demo docs * [Bug Fix] release task scripts (PaddlePaddle#411) * Update py_run.bat * Update cpp_run.bat * Update compare_with_gt.py Increase score_diff and boxes_diff_ratio threshold * Update cpp_run.bat * Update release task scripts according to diffrent platforms * Delete CMAKE_CXX_COMPILER in cpp_run.bat * [Doc] add contributor for js application (PaddlePaddle#413) add contributor * [Doc] Update PicoDet Andorid demo docs * [Doc] Update PaddleClasModel Android demo docs * [Doc] Update fastdeploy android jni docs * [Doc] Update fastdeploy android jni usage docs Co-authored-by: jiangjiajun <[email protected]> Co-authored-by: leiqing <[email protected]> Co-authored-by: Wang Xinyu <[email protected]> Co-authored-by: huangjianhui <[email protected]> Co-authored-by: Double_V <[email protected]>
* [Backend] TRT backend & PP-Infer backend support pinned memory (#403) * TRT backend use pinned memory * refine fd tensor pinned memory logic * TRT enable pinned memory configurable * paddle inference support pinned memory * pinned memory pybindings Co-authored-by: Jason <[email protected]> * [Bug Fix] release task scripts (#411) * Update py_run.bat * Update cpp_run.bat * Update compare_with_gt.py Increase score_diff and boxes_diff_ratio threshold * Update cpp_run.bat * Update release task scripts according to diffrent platforms * Delete CMAKE_CXX_COMPILER in cpp_run.bat * [Doc] add contributor for js application (#413) add contributor * [Other] Refactor js submodule (#415) * Refactor js submodule * Remove change-log * Update ocr module * Update ocr-detection module * Update ocr-detection module * Remove change-log * [Doc] Add PicoDet & PaddleClas Android demo docs (#412) * [Backend] Add override flag to lite backend * [Docs] Add Android C++ SDK build docs * [Doc] fix android_build_docs typos * Update CMakeLists.txt * Update android.md * [Doc] Add PicoDet Android demo docs * [Doc] Update PicoDet Andorid demo docs * [Doc] Update PaddleClasModel Android demo docs * [Doc] Update fastdeploy android jni docs * [Doc] Update fastdeploy android jni usage docs Co-authored-by: Jason <[email protected]> * Update README.md * Update README_CN.md * Update README_CN.md * Update README_EN.md * [Doc] Add tutorial of supporting new models (#418) * first commit for yolov7 * pybind for yolov7 * CPP README.md * CPP README.md * modified yolov7.cc * README.md * python file modify * delete license in fastdeploy/ * repush the conflict part * README.md modified * README.md modified * file path modified * file path modified * file path modified * file path modified * file path modified * README modified * README modified * move some helpers to private * add examples for yolov7 * api.md modified * api.md modified * api.md modified * YOLOv7 * yolov7 release link * yolov7 release link * yolov7 release link * copyright * change some helpers to private * change variables to const and fix documents. * gitignore * Transfer some funtions to private member of class * Transfer some funtions to private member of class * Merge from develop (#9) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root <[email protected]> * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> * first commit for yolor * for merge * Develop (#11) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root <[email protected]> * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> * Yolor (#16) * Develop (#11) (#12) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root <[email protected]> * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> * Develop (#13) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root <[email protected]> * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> * documents * documents * documents * documents * documents * documents * documents * documents * documents * documents * documents * documents * Develop (#14) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root <[email protected]> * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> Co-authored-by: Jason <[email protected]> * add is_dynamic for YOLO series (#22) * modify ppmatting backend and docs * modify ppmatting docs * fix the PPMatting size problem * fix LimitShort's log * retrigger ci * modify PPMatting docs * modify the way for dealing with LimitShort * change develop_a_new_model.md dir Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> Co-authored-by: Jason <[email protected]> * [Doc] add readme for js packages (#421) * add contributor * add package readme * refine ocr readme * refine ocr readme Co-authored-by: Wang Xinyu <[email protected]> Co-authored-by: huangjianhui <[email protected]> Co-authored-by: Double_V <[email protected]> Co-authored-by: chenqianhe <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: leiqing <[email protected]> Co-authored-by: ziqi-jin <[email protected]> Co-authored-by: root <[email protected]>
* [Backend] TRT backend & PP-Infer backend support pinned memory (#403) * TRT backend use pinned memory * refine fd tensor pinned memory logic * TRT enable pinned memory configurable * paddle inference support pinned memory * pinned memory pybindings Co-authored-by: Jason <[email protected]> * [Bug Fix] release task scripts (#411) * Update py_run.bat * Update cpp_run.bat * Update compare_with_gt.py Increase score_diff and boxes_diff_ratio threshold * Update cpp_run.bat * Update release task scripts according to diffrent platforms * Delete CMAKE_CXX_COMPILER in cpp_run.bat * [Doc] add contributor for js application (#413) add contributor * [Other] Refactor js submodule (#415) * Refactor js submodule * Remove change-log * Update ocr module * Update ocr-detection module * Update ocr-detection module * Remove change-log * [Doc] Add PicoDet & PaddleClas Android demo docs (#412) * [Backend] Add override flag to lite backend * [Docs] Add Android C++ SDK build docs * [Doc] fix android_build_docs typos * Update CMakeLists.txt * Update android.md * [Doc] Add PicoDet Android demo docs * [Doc] Update PicoDet Andorid demo docs * [Doc] Update PaddleClasModel Android demo docs * [Doc] Update fastdeploy android jni docs * [Doc] Update fastdeploy android jni usage docs Co-authored-by: Jason <[email protected]> * Update README.md * Update README_CN.md * Update README_CN.md * Update README_EN.md * [Doc] Add tutorial of supporting new models (#418) * first commit for yolov7 * pybind for yolov7 * CPP README.md * CPP README.md * modified yolov7.cc * README.md * python file modify * delete license in fastdeploy/ * repush the conflict part * README.md modified * README.md modified * file path modified * file path modified * file path modified * file path modified * file path modified * README modified * README modified * move some helpers to private * add examples for yolov7 * api.md modified * api.md modified * api.md modified * YOLOv7 * yolov7 release link * yolov7 release link * yolov7 release link * copyright * change some helpers to private * change variables to const and fix documents. * gitignore * Transfer some funtions to private member of class * Transfer some funtions to private member of class * Merge from develop (#9) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root <[email protected]> * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> * first commit for yolor * for merge * Develop (#11) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root <[email protected]> * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> * Yolor (#16) * Develop (#11) (#12) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root <[email protected]> * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> * Develop (#13) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root <[email protected]> * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> * documents * documents * documents * documents * documents * documents * documents * documents * documents * documents * documents * documents * Develop (#14) * Fix compile problem in different python version (#26) * fix some usage problem in linux * Fix compile problem Co-authored-by: root <[email protected]> * Add PaddleDetetion/PPYOLOE model support (#22) * add ppdet/ppyoloe * Add demo code and documents * add convert processor to vision (#27) * update .gitignore * Added checking for cmake include dir * fixed missing trt_backend option bug when init from trt * remove un-need data layout and add pre-check for dtype * changed RGB2BRG to BGR2RGB in ppcls model * add model_zoo yolov6 c++/python demo * fixed CMakeLists.txt typos * update yolov6 cpp/README.md * add yolox c++/pybind and model_zoo demo * move some helpers to private * fixed CMakeLists.txt typos * add normalize with alpha and beta * add version notes for yolov5/yolov6/yolox * add copyright to yolov5.cc * revert normalize * fixed some bugs in yolox * fixed examples/CMakeLists.txt to avoid conflicts * add convert processor to vision * format examples/CMakeLists summary * Fix bug while the inference result is empty with YOLOv5 (#29) * Add multi-label function for yolov5 * Update README.md Update doc * Update fastdeploy_runtime.cc fix variable option.trt_max_shape wrong name * Update runtime_option.md Update resnet model dynamic shape setting name from images to x * Fix bug when inference result boxes are empty * Delete detection.py Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> Co-authored-by: Jason <[email protected]> * add is_dynamic for YOLO series (#22) * modify ppmatting backend and docs * modify ppmatting docs * fix the PPMatting size problem * fix LimitShort's log * retrigger ci * modify PPMatting docs * modify the way for dealing with LimitShort * change develop_a_new_model.md dir Co-authored-by: Jason <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: huangjianhui <[email protected]> Co-authored-by: Jason <[email protected]> * [Doc] add readme for js packages (#421) * add contributor * add package readme * refine ocr readme * refine ocr readme * [Bug Fix] Update Fastdeploy.cmake with find opencv package NO_DEFAULT_PATH (#424) Update OpenCV Cmake with NO_DEFAULT_PATH Co-authored-by: Wang Xinyu <[email protected]> Co-authored-by: huangjianhui <[email protected]> Co-authored-by: Double_V <[email protected]> Co-authored-by: chenqianhe <[email protected]> Co-authored-by: DefTruth <[email protected]> Co-authored-by: leiqing <[email protected]> Co-authored-by: ziqi-jin <[email protected]> Co-authored-by: root <[email protected]>
PR types(PR类型)
Backend
Describe
FDDeviceHostBuffer
FDDeviceHostBuffer
intoFDTensor
FDDeviceHostBuffer