
Commit

Minor fixes
zhiqwang committed Jan 26, 2022
1 parent 1e87ae3 commit 85cd952
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions notebooks/onnx-graphsurgeon-inference-tensorrt.ipynb
@@ -5,17 +5,17 @@
"id": "ab5ff80c",
"metadata": {},
"source": [
"# TensorRT Python Inference for yolort"
"# Deploying yolort on TensorRT"
]
},
{
"cell_type": "markdown",
"id": "ed6e21c1",
"metadata": {},
"source": [
"Unlike other TensorRT examples that deal with yolov5, we embed the whole post-processing into the Graph with `onnx-graghsurgeon`. We gain a lot with this whole pipeline. The ablation experiment results are below. The first one is the result without running `EfficientNMS_TRT `, and the second one is the result with `EfficientNMS_TRT` embedded. As you can see, the inference time is even reduced, we guess it is because the data copied to the device will be much less after doing `EfficientNMS_TRT`. (The mean Latency of D2H is reduced from `0.868048 ms` to `0.0102295 ms`, running on Nivdia Geforce GTX 1080ti, using TensorRT 8.2 with yolov5n6 and scaling images to `512x640`.)\n",
"Unlike other TensorRT examples that deal with yolov5, we embed the whole post-processing into the Graph with `onnx-graghsurgeon`. We gain a lot with this whole pipeline. The ablation experiment results are below. The first one is the result without running `EfficientNMS_TRT`, and the second one is the result with `EfficientNMS_TRT` embedded. As you can see, the inference time is even reduced, we guess it is because the data copied to the device will be much less after doing `EfficientNMS_TRT`. (The mean Latency of D2H is reduced from `0.868048 ms` to `0.0102295 ms`, running on Nivdia Geforce GTX 1080ti, using TensorRT 8.2 with yolov5n6 and scaling images to `512x640`.)\n",
"\n",
"And `onnx-graphsurgeon` is easy to install, you can just use the prebuilt wheels\n",
"And `onnx-graphsurgeon` is easy to install, you can just use their prebuilt wheels:\n",
"\n",
"```\n",
"python3 -m pip install onnx_graphsurgeon --index-url https://pypi.ngc.nvidia.com\n",
@@ -24,7 +24,7 @@
"The detailed results:\n",
"\n",
"```\n",
"[I] === Performance summary w/o NMS plugin ===\n",
"[I] === Performance summary w/o EfficientNMS_TRT plugin ===\n",
"[I] Throughput: 383.298 qps\n",
"[I] Latency: min = 3.66479 ms, max = 5.41199 ms, mean = 4.00543 ms, median = 3.99316 ms, percentile(99%) = 4.23831 ms\n",
"[I] End-to-End Host Latency: min = 3.76599 ms, max = 6.45874 ms, mean = 5.08597 ms, median = 5.07544 ms, percentile(99%) = 5.50839 ms\n",
@@ -40,7 +40,7 @@
"```\n",
"\n",
"```\n",
"[I] === Performance summary w/ NMS plugin ===\n",
"[I] === Performance summary w/ EfficientNMS_TRT plugin ===\n",
"[I] Throughput: 389.234 qps\n",
"[I] Latency: min = 2.81482 ms, max = 9.77234 ms, mean = 3.1062 ms, median = 3.07642 ms, percentile(99%) = 3.33548 ms\n",
"[I] End-to-End Host Latency: min = 2.82202 ms, max = 11.6749 ms, mean = 4.939 ms, median = 4.95587 ms, percentile(99%) = 5.45207 ms\n",
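The D2H improvement quoted above is consistent with how much smaller the output tensors become once NMS runs inside the engine: instead of the full prediction tensor, only four small fixed-size outputs are copied back to the host. A back-of-the-envelope check, assuming a yolov5n6-style head at `512x640` (strides 8/16/32/64, 3 anchors per cell, 80 classes) and `max_output_boxes=100`:

```python
# Rough size of the device-to-host copy with and without EfficientNMS_TRT.
# The head geometry below (strides, anchors per cell, class count) is an
# assumption about a yolov5n6-style model at 512x640, not measured from yolort.
FP32 = 4  # bytes per float32

# Without the plugin: the full prediction tensor goes back to the host.
num_boxes = 3 * sum((512 // s) * (640 // s) for s in (8, 16, 32, 64))  # 20400
raw_bytes = num_boxes * 85 * FP32  # 85 = 4 box coords + 1 objectness + 80 classes

# With the plugin: only the fixed-size NMS outputs are copied.
max_dets = 100
nms_bytes = (
    1 * 4                  # num_detections (int32)
    + max_dets * 4 * FP32  # detection_boxes
    + max_dets * FP32      # detection_scores
    + max_dets * 4         # detection_classes (int32)
)

print(f"raw: {raw_bytes / 1e6:.2f} MB, nms: {nms_bytes} B")
# -> raw: 6.94 MB, nms: 2404 B — a copy several orders of magnitude smaller,
#    matching the direction of the measured D2H latency drop.
```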
