diff --git a/notebooks/onnx-graphsurgeon-inference-tensorrt.ipynb b/notebooks/onnx-graphsurgeon-inference-tensorrt.ipynb
index d99ab6fb4..f93752a9d 100644
--- a/notebooks/onnx-graphsurgeon-inference-tensorrt.ipynb
+++ b/notebooks/onnx-graphsurgeon-inference-tensorrt.ipynb
@@ -5,7 +5,7 @@
    "id": "ab5ff80c",
    "metadata": {},
    "source": [
-    "# TensorRT Python Inference for yolort"
+    "# Deploying yolort on TensorRT"
    ]
   },
   {
@@ -13,9 +13,9 @@
    "id": "ed6e21c1",
    "metadata": {},
    "source": [
-    "Unlike other TensorRT examples that deal with yolov5, we embed the whole post-processing into the Graph with `onnx-graghsurgeon`. We gain a lot with this whole pipeline. The ablation experiment results are below. The first one is the result without running `EfficientNMS_TRT `, and the second one is the result with `EfficientNMS_TRT` embedded. As you can see, the inference time is even reduced, we guess it is because the data copied to the device will be much less after doing `EfficientNMS_TRT`. (The mean Latency of D2H is reduced from `0.868048 ms` to `0.0102295 ms`, running on Nivdia Geforce GTX 1080ti, using TensorRT 8.2 with yolov5n6 and scaling images to `512x640`.)\n",
+    "Unlike other TensorRT examples that deal with yolov5, we embed the whole post-processing into the graph with `onnx-graphsurgeon`, and this whole-pipeline approach brings a measurable gain. The ablation results are below: the first run is without `EfficientNMS_TRT`, and the second has `EfficientNMS_TRT` embedded. As you can see, the inference time actually drops; we attribute this to the much smaller amount of data copied back from the device once `EfficientNMS_TRT` has filtered the predictions. (The mean D2H latency falls from `0.868048 ms` to `0.0102295 ms` on an NVIDIA GeForce GTX 1080 Ti, using TensorRT 8.2 with yolov5n6 and images scaled to `512x640`.)\n",
     "\n",
-    "And `onnx-graphsurgeon` is easy to install, you can just use the prebuilt wheels\n",
+    "And `onnx-graphsurgeon` is easy to install; you can just use NVIDIA's prebuilt wheels:\n",
     "\n",
     "```\n",
     "python3 -m pip install onnx_graphsurgeon --index-url https://pypi.ngc.nvidia.com\n",
@@ -24,7 +24,7 @@
     "The detailed results:\n",
     "\n",
     "```\n",
-    "[I] === Performance summary w/o NMS plugin ===\n",
+    "[I] === Performance summary w/o EfficientNMS_TRT plugin ===\n",
     "[I] Throughput: 383.298 qps\n",
     "[I] Latency: min = 3.66479 ms, max = 5.41199 ms, mean = 4.00543 ms, median = 3.99316 ms, percentile(99%) = 4.23831 ms\n",
     "[I] End-to-End Host Latency: min = 3.76599 ms, max = 6.45874 ms, mean = 5.08597 ms, median = 5.07544 ms, percentile(99%) = 5.50839 ms\n",
@@ -40,7 +40,7 @@
     "```\n",
     "\n",
     "```\n",
-    "[I] === Performance summary w/ NMS plugin ===\n",
+    "[I] === Performance summary w/ EfficientNMS_TRT plugin ===\n",
     "[I] Throughput: 389.234 qps\n",
     "[I] Latency: min = 2.81482 ms, max = 9.77234 ms, mean = 3.1062 ms, median = 3.07642 ms, percentile(99%) = 3.33548 ms\n",
     "[I] End-to-End Host Latency: min = 2.82202 ms, max = 11.6749 ms, mean = 4.939 ms, median = 4.95587 ms, percentile(99%) = 5.45207 ms\n",
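
For reference, a minimal sketch of the embedding step this notebook describes: appending an `EfficientNMS_TRT` node to an exported ONNX graph with `onnx-graphsurgeon`. The file paths, the assumption that the exported graph ends in a boxes/scores output pair, and the threshold values are all illustrative, not taken from the notebook; the plugin's attributes and output signature follow the TensorRT EfficientNMS plugin.

```python
import numpy as np
import onnx
import onnx_graphsurgeon as gs

# Load an exported model (hypothetical path).
graph = gs.import_onnx(onnx.load("yolov5n6.onnx"))

# Assumption: the exported graph ends with decoded boxes and per-class
# scores, in that order. Adjust to the actual outputs of your export.
boxes, scores = graph.outputs  # (batch, num_boxes, 4) and (batch, num_boxes, num_classes)

# Declare the four outputs the EfficientNMS_TRT plugin produces.
batch_size, max_det = 1, 100
num_dets = gs.Variable("num_dets", dtype=np.int32, shape=(batch_size, 1))
det_boxes = gs.Variable("det_boxes", dtype=np.float32, shape=(batch_size, max_det, 4))
det_scores = gs.Variable("det_scores", dtype=np.float32, shape=(batch_size, max_det))
det_classes = gs.Variable("det_classes", dtype=np.int32, shape=(batch_size, max_det))

# Append the plugin node; TensorRT resolves it by op name at build time.
graph.layer(
    op="EfficientNMS_TRT",
    name="EfficientNMS",
    inputs=[boxes, scores],
    outputs=[num_dets, det_boxes, det_scores, det_classes],
    attrs={
        "plugin_version": "1",
        "background_class": -1,   # no dedicated background class
        "max_output_boxes": max_det,
        "score_threshold": 0.25,  # illustrative value
        "iou_threshold": 0.45,    # illustrative value
        "score_activation": 0,    # scores are already probabilities
        "box_coding": 0,          # boxes given as [x1, y1, x2, y2]
    },
)

# Replace the raw predictions with the NMS outputs and save.
graph.outputs = [num_dets, det_boxes, det_scores, det_classes]
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "yolov5n6_nms.onnx")
```

An engine built from such a model only copies the small fixed-size NMS outputs back to the host, which is the D2H saving the performance summaries above illustrate.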