Use EfficientNMS_TRT plugin when exporting TensorRT #288

Merged
zhiqwang merged 9 commits into main from EfficientNMS_TRT_PLUGIN
Jan 24, 2022

Conversation

@zhiqwang (Owner) commented Jan 24, 2022

The previous example works correctly after this change:

import os
import torch

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

assert torch.cuda.is_available()
device = torch.device('cuda')

from yolort.utils import get_image_from_url, read_image_to_tensor
from yolort.v5 import letterbox, attempt_download
from yolort.runtime import PredictorTRT
from yolort.runtime.trt_helper import EngineBuilder
from yolort.runtime.yolo_graphsurgeon import YOLOGraphSurgeon

# Define some parameters
img_size = 640
stride = 64
score_thresh = 0.35
iou_thresh = 0.45
detections_per_img = 100
half = False

# yolov5n6.pt is downloaded from 'https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5n6.pt'
model_path = "yolov5n6.pt"

checkpoint_path = attempt_download(model_path)
onnx_path = "yolov5n6.onnx"
engine_path = "yolov5n6.engine"

img_source = "https://huggingface.co/spaces/zhiqwang/assets/resolve/main/bus.jpg"
# img_source = "https://huggingface.co/spaces/zhiqwang/assets/resolve/main/zidane.jpg"
img_raw = get_image_from_url(img_source)

# Preprocessing: letterbox to a square input, convert to a tensor, add a batch dimension
image = letterbox(img_raw, new_shape=(img_size, img_size), stride=stride)[0]
image = read_image_to_tensor(image)
image = image[None]
image = image.to(device)
image = image.contiguous()

# Export to an ONNX model (uses the checkpoint path returned by attempt_download)
yolo_gs = YOLOGraphSurgeon(checkpoint_path, input_sample=image, version="r6.0", enable_dynamic=False)
# Embed the `EfficientNMS_TRT` at the end of `LogitsDecoder`.
yolo_gs.register_nms(score_thresh=score_thresh, nms_thresh=iou_thresh, detections_per_img=detections_per_img)

yolo_gs.save(onnx_path)

# Build TensorRT Engine
engine_builder = EngineBuilder()
engine_builder.create_network(onnx_path)
engine_builder.create_engine(engine_path, precision="fp32")

# Inference on TensorRT
engine = PredictorTRT(engine_path, device)
engine.warmup(img_size=image.shape, half=half)

# Run inference on the preprocessed image
detections = engine.run_on_image(image)
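
To make the three parameters passed to `register_nms` concrete, here is a minimal pure-Python sketch of greedy non-maximum suppression. It is only a reference for what `score_thresh`, `nms_thresh`/`iou_thresh`, and `detections_per_img` control. It is not the `EfficientNMS_TRT` implementation, which runs inside TensorRT, and it ignores class labels for simplicity. The helper names `box_iou` and `greedy_nms` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    score_thresh: float = 0.35,
    iou_thresh: float = 0.45,
    detections_per_img: int = 100,
) -> List[int]:
    """Return indices of the kept boxes, highest score first."""
    # Step 1: discard candidates below the score threshold.
    candidates = [i for i, s in enumerate(scores) if s >= score_thresh]
    # Step 2: visit the remaining candidates from highest to lowest score.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in candidates:
        # Step 3: stop once the per-image detection budget is used up.
        if len(keep) >= detections_per_img:
            break
        # Step 4: keep a box only if it does not overlap a kept box too much.
        if all(box_iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep
```

With the thresholds from the example above, a box that overlaps a higher-scoring box with IoU above 0.45 is suppressed, and any box scoring below 0.35 is never considered.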

Known limitations

Calling the EfficientNMS_TRT plugin requires TensorRT 8.2 or later. There also appears to be a bug in the plugin's float16 path (see NVIDIA/TensorRT#1758 (comment)), which was fixed in version 8.2.4.
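
The version constraints above can be checked before building an engine. The following is a minimal sketch, not part of yolort or TensorRT: `parse_version` and `check_precision` are hypothetical helpers, and falling back to fp32 on affected versions is one possible policy rather than something this PR implements.

```python
import re
from typing import Tuple

# Thresholds taken from the note above.
MIN_PLUGIN_VERSION = (8, 2, 0)  # EfficientNMS_TRT needs TensorRT 8.2 or later
MIN_FP16_VERSION = (8, 2, 4)    # float16 issue reported as fixed in 8.2.4


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a string such as '8.2.4.2' into (8, 2, 4), padding with zeros."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers)


def check_precision(trt_version: str, precision: str = "fp32") -> str:
    """Return a precision that is safe to request for the given version.

    Hypothetical policy: raise if the plugin is unavailable, and fall back
    to fp32 when fp16 is requested on a version older than 8.2.4.
    """
    version = parse_version(trt_version)
    if version < MIN_PLUGIN_VERSION:
        raise RuntimeError(
            f"TensorRT {trt_version} is too old for EfficientNMS_TRT; "
            "8.2 or later is required."
        )
    if precision == "fp16" and version < MIN_FP16_VERSION:
        return "fp32"
    return precision
```

In practice the version string could come from `tensorrt.__version__`, for example `check_precision(tensorrt.__version__, "fp16")`, and the result could be passed as the `precision` argument when creating the engine.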

@zhiqwang added the enhancement (New feature or request) and code quality (Code format and unit tests) labels on Jan 24, 2022

@CLAassistant commented Jan 24, 2022

CLA assistant check
All committers have signed the CLA.

@zhiqwang force-pushed the EfficientNMS_TRT_PLUGIN branch from 41f1e02 to a1f9059 on January 24, 2022 18:26
@zhiqwang force-pushed the EfficientNMS_TRT_PLUGIN branch from ce13ca5 to c72dd6a on January 24, 2022 18:28

@codecov bot commented Jan 24, 2022

Codecov Report

Merging #288 (ce13ca5) into main (d2db932) will not change coverage.
The diff coverage is n/a.

❗ Current head ce13ca5 differs from the pull request's most recent head a3f80fb. Consider uploading reports for commit a3f80fb to get more accurate results.

@@           Coverage Diff           @@
##             main     #288   +/-   ##
=======================================
  Coverage   94.01%   94.01%           
=======================================
  Files          11       11           
  Lines         718      718           
=======================================
  Hits          675      675           
  Misses         43       43           
Flag Coverage Δ
unittests 94.01% <ø> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d2db932...a3f80fb.

@zhiqwang merged commit 51f9d41 into main on Jan 24, 2022
@zhiqwang deleted the EfficientNMS_TRT_PLUGIN branch on January 24, 2022 18:59
@zhiqwang added the deployment (Inference acceleration for production) label on Jan 25, 2022
Labels: code quality (Code format and unit tests), deployment (Inference acceleration for production), enhancement (New feature or request)