From 5b6bd4c920f19343554acd1f661dbdedd8224ac8 Mon Sep 17 00:00:00 2001
From: TRTorch Github Bot
TRTorch Getting Started - ResNet 50
diff --git a/docs/_notebooks/lenet-getting-started.html b/docs/_notebooks/lenet-getting-started.html
index a2f3243b0e..6fb0bd1af1 100644
--- a/docs/_notebooks/lenet-getting-started.html
+++ b/docs/_notebooks/lenet-getting-started.html
@@ -784,7 +784,7 @@
TRTorch Getting Started - LeNet
diff --git a/docs/_notebooks/ssd-object-detection-demo.html b/docs/_notebooks/ssd-object-detection-demo.html
index f8353a24ba..8d8786976d 100644
--- a/docs/_notebooks/ssd-object-detection-demo.html
+++ b/docs/_notebooks/ssd-object-detection-demo.html
@@ -804,7 +804,7 @@
Object Detection with TRTorch (SSD)
diff --git a/docs/_sources/tutorials/ptq.rst.txt b/docs/_sources/tutorials/ptq.rst.txt
index 28d60acec3..48f42c3604 100644
--- a/docs/_sources/tutorials/ptq.rst.txt
+++ b/docs/_sources/tutorials/ptq.rst.txt
@@ -14,14 +14,17 @@ the TensorRT calibrator. With TRTorch we look to leverage existing infrastructur
calibrators easier.
LibTorch provides a ``DataLoader`` and ``Dataset`` API which streamlines preprocessing and batching input data.
-This section of the PyTorch documentation has more information https://pytorch.org/tutorials/advanced/cpp_frontend.html#loading-data.
+These APIs are exposed through both the C++ and Python interfaces, which makes them easier for end users to adopt.
+In the C++ interface, the ``torch::Dataset`` and ``torch::data::make_data_loader`` objects are used to construct datasets and perform pre-processing on them.
+The equivalent functionality in the Python interface uses ``torch.utils.data.Dataset`` and ``torch.utils.data.DataLoader``.
+These sections of the PyTorch documentation have more information: https://pytorch.org/tutorials/advanced/cpp_frontend.html#loading-data and https://pytorch.org/tutorials/recipes/recipes/loading_data_recipe.html.
TRTorch uses DataLoaders as the base of a generic calibrator implementation. So you will be able to reuse or quickly
implement a ``torch::Dataset`` for your target domain, place it in a DataLoader and create an INT8 Calibrator
which you can provide to TRTorch to run INT8 Calibration during compilation of your module.
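
A minimal sketch of this pattern through the Python interface might look like the following (the ``CalibrationDataset`` class here is illustrative, not part of TRTorch or PyTorch):

.. code-block:: python

    import torch
    from torch.utils.data import Dataset, DataLoader

    class CalibrationDataset(Dataset):
        """Hypothetical dataset wrapping pre-processed calibration tensors."""

        def __init__(self, images, labels):
            self.images = images
            self.labels = labels

        def __len__(self):
            return len(self.images)

        def __getitem__(self, idx):
            return self.images[idx], self.labels[idx]

    # A DataLoader built on this dataset can then back an INT8 calibrator (see below)
    calibration_dataloader = DataLoader(
        CalibrationDataset(torch.randn(64, 3, 32, 32), torch.zeros(64, dtype=torch.long)),
        batch_size=1,
        shuffle=False)
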
-.. _writing_ptq:
+.. _writing_ptq_cpp:
-How to create your own PTQ application
+How to create your own PTQ application in C++
----------------------------------------------
Here is an example interface of a ``torch::Dataset`` class for CIFAR10:
@@ -132,14 +135,76 @@ Then all that's required to set up the module for INT8 calibration is to set the f
auto trt_mod = trtorch::CompileGraph(mod, compile_spec);
If you have an existing Calibrator implementation for TensorRT, you may directly set the ``ptq_calibrator`` field to a pointer to your calibrator, and it will work as well.
-
From here, not much changes in terms of how execution works. You are still able to fully use LibTorch as the sole interface for inference. Data should remain
in FP32 precision when it's passed into ``trt_mod.forward``. There exists an example application in the TRTorch demo that takes you from training a VGG16 network on
CIFAR10 to deploying in INT8 with TRTorch here: https://github.com/NVIDIA/TRTorch/tree/master/cpp/ptq
+.. _writing_ptq_python:
+
+How to create your own PTQ application in Python
+------------------------------------------------
+
+The TRTorch Python API provides an easy and convenient way to use PyTorch dataloaders with TensorRT calibrators. The ``DataLoaderCalibrator`` class can be used to create
+a TensorRT calibrator by providing the desired configuration. The following code demonstrates an example of how to use it:
+
+.. code-block:: python
+
+ import torch
+ import torchvision
+ import torchvision.transforms as transforms
+
+ import trtorch
+
+ testing_dataset = torchvision.datasets.CIFAR10(root='./data',
+ train=False,
+ download=True,
+ transform=transforms.Compose([
+ transforms.ToTensor(),
+ transforms.Normalize((0.4914, 0.4822, 0.4465),
+ (0.2023, 0.1994, 0.2010))
+ ]))
+
+ testing_dataloader = torch.utils.data.DataLoader(testing_dataset,
+ batch_size=1,
+ shuffle=False,
+ num_workers=1)
+ calibrator = trtorch.ptq.DataLoaderCalibrator(testing_dataloader,
+ cache_file='./calibration.cache',
+ use_cache=False,
+ algo_type=trtorch.ptq.CalibrationAlgo.ENTROPY_CALIBRATION_2,
+ device=torch.device('cuda:0'))
+
+ compile_spec = {
+ "input_shapes": [[1, 3, 32, 32]],
+ "op_precision": torch.int8,
+ "calibrator": calibrator,
+ "device": {
+ "device_type": trtorch.DeviceType.GPU,
+ "gpu_id": 0,
+ "dla_core": 0,
+ "allow_gpu_fallback": False,
+ "disable_tf32": False
+ }
+ }
+ # `model` is assumed to be a TorchScript module (e.g. from torch.jit.trace)
+ trt_mod = trtorch.compile(model, compile_spec)
+
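+Once compiled, ``trt_mod`` behaves like any other TorchScript module. As with the C++ flow above, input data should remain in FP32; a brief usage sketch:
+
+.. code-block:: python
+
+ # Inference sketch: FP32 CUDA inputs, quantization happens inside the engine
+ input_data = torch.randn(1, 3, 32, 32).to("cuda")
+ result = trt_mod(input_data)
+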
+In cases where a pre-existing calibration cache file is available, ``CacheCalibrator`` can be used without any dataloaders. The following example demonstrates how
+to use ``CacheCalibrator`` to compile in INT8 mode.
+
+.. code-block:: python
+
+ calibrator = trtorch.ptq.CacheCalibrator("./calibration.cache")
+
+ compile_settings = {
+ "input_shapes": [[1, 3, 32, 32]],
+ "op_precision": torch.int8,
+ "calibrator": calibrator,
+ "max_batch_size": 32,
+ }
+
+ trt_mod = trtorch.compile(model, compile_settings)
+
+If you already have an existing calibrator class (implemented directly using the TensorRT API), you can directly set the calibrator field to an instance of your class, which can be very convenient.
+For a demo on how PTQ can be performed on a VGG network using the TRTorch API, you can refer to https://github.com/NVIDIA/TRTorch/blob/master/tests/py/test_ptq_dataloader_calibrator.py
+and https://github.com/NVIDIA/TRTorch/blob/master/tests/py/test_ptq_trt_calibrator.py
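+
+As a rough sketch (assuming TensorRT's Python API is importable as ``tensorrt``; the class and attribute names below are illustrative, not a TRTorch API), such a calibrator subclasses one of TensorRT's calibrator interfaces:
+
+.. code-block:: python
+
+ import tensorrt as trt
+
+ class MyCalibrator(trt.IInt8EntropyCalibrator2):
+     """Hypothetical calibrator written directly against the TensorRT API."""
+
+     def __init__(self, batches, cache_file):
+         super().__init__()  # required when subclassing TensorRT calibrators
+         self.batches = iter(batches)  # iterable of FP32 CUDA tensors
+         self.cache_file = cache_file
+         self.current = None
+
+     def get_batch_size(self):
+         return 1
+
+     def get_batch(self, names):
+         try:
+             # Keep a reference so the tensor stays alive while TensorRT reads it
+             self.current = next(self.batches)
+             return [self.current.data_ptr()]
+         except StopIteration:
+             return None  # signals that calibration data is exhausted
+
+     def read_calibration_cache(self):
+         return None  # no pre-existing cache in this sketch
+
+     def write_calibration_cache(self, cache):
+         with open(self.cache_file, "wb") as f:
+             f.write(cache)
+
+An instance of such a class can then be passed as the ``calibrator`` value in the compile spec, just like the calibrators shown above.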
+
Citations
^^^^^^^^^^^
Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images.
-Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
+Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
\ No newline at end of file
diff --git a/docs/objects.inv b/docs/objects.inv
index 5a05171d6d745bdbef096df5b6f181c4aacc6e9c..b33fd94bb89efd5f1bb3161839fee7eddcb0d534 100644
GIT binary patch
- From here, not much changes in terms of how execution works. You are still able to fully use LibTorch as the sole interface for inference. Data should remain
+From here, not much changes in terms of how execution works. You are still able to fully use LibTorch as the sole interface for inference. Data should remain
in FP32 precision when it’s passed into
trt_mod.forward
@@ -722,6 +761,103 @@
https://github.com/NVIDIA/TRTorch/tree/master/cpp/ptq
+ The TRTorch Python API provides an easy and convenient way to use PyTorch dataloaders with TensorRT calibrators. The DataLoaderCalibrator class can be used to create
+a TensorRT calibrator by providing the desired configuration. The following code demonstrates an example of how to use it:
+
+import torch
+import torchvision
+import torchvision.transforms as transforms
+
+import trtorch
+
+testing_dataset = torchvision.datasets.CIFAR10(root='./data',
+ train=False,
+ download=True,
+ transform=transforms.Compose([
+ transforms.ToTensor(),
+ transforms.Normalize((0.4914, 0.4822, 0.4465),
+ (0.2023, 0.1994, 0.2010))
+ ]))
+
+testing_dataloader = torch.utils.data.DataLoader(testing_dataset,
+ batch_size=1,
+ shuffle=False,
+ num_workers=1)
+calibrator = trtorch.ptq.DataLoaderCalibrator(testing_dataloader,
+ cache_file='./calibration.cache',
+ use_cache=False,
+ algo_type=trtorch.ptq.CalibrationAlgo.ENTROPY_CALIBRATION_2,
+ device=torch.device('cuda:0'))
+
+compile_spec = {
+ "input_shapes": [[1, 3, 32, 32]],
+ "op_precision": torch.int8,
+ "calibrator": calibrator,
+ "device": {
+ "device_type": trtorch.DeviceType.GPU,
+ "gpu_id": 0,
+ "dla_core": 0,
+ "allow_gpu_fallback": False,
+ "disable_tf32": False
+ }
+ }
+# `model` is assumed to be a TorchScript module (e.g. from torch.jit.trace)
+trt_mod = trtorch.compile(model, compile_spec)
+
+
+ In cases where a pre-existing calibration cache file is available, CacheCalibrator can be used without any dataloaders. The following example demonstrates how
+to use CacheCalibrator to compile in INT8 mode.
+
+calibrator = trtorch.ptq.CacheCalibrator("./calibration.cache")
+
+compile_settings = {
+ "input_shapes": [[1, 3, 32, 32]],
+ "op_precision": torch.int8,
+ "calibrator": calibrator,
+ "max_batch_size": 32,
+ }
+
+trt_mod = trtorch.compile(model, compile_settings)
+
+ If you already have an existing calibrator class (implemented directly using the TensorRT API), you can directly set the calibrator field to an instance of your class, which can be very convenient.
+For a demo on how PTQ can be performed on a VGG network using the TRTorch API, you can refer to
+ https://github.com/NVIDIA/TRTorch/blob/master/tests/py/test_ptq_dataloader_calibrator.py
+ and
+ https://github.com/NVIDIA/TRTorch/blob/master/tests/py/test_ptq_trt_calibrator.py