diff --git a/docsrc/index.rst b/docsrc/index.rst
index eaf547d60e..b3d90055cf 100644
--- a/docsrc/index.rst
+++ b/docsrc/index.rst
@@ -54,6 +54,11 @@ User Guide
    tutorials/_rendered_examples/dynamo/vgg16_ptq
    tutorials/_rendered_examples/dynamo/engine_caching_example
    tutorials/_rendered_examples/dynamo/refit_engine_example
+   tutorials/serving_torch_tensorrt_with_triton
+   tutorials/_rendered_examples/dynamo/torch_export_cudagraphs
+   tutorials/_rendered_examples/dynamo/converter_overloading
+   tutorials/_rendered_examples/dynamo/custom_kernel_plugins
+   tutorials/_rendered_examples/dynamo/mutable_torchtrt_module_example

 Dynamo Frontend
 ----------------
@@ -99,30 +104,28 @@ FX Frontend

    fx/getting_started_with_fx_path

-Tutorials
+Model Zoo
 ------------
-* :ref:`torch_tensorrt_tutorials`
-* :ref:`serving_torch_tensorrt_with_triton`
+* :ref:`torch_compile_resnet`
+* :ref:`torch_compile_transformer`
+* :ref:`torch_compile_stable_diffusion`
+* :ref:`torch_export_gpt2`
+* :ref:`torch_export_llama2`
 * :ref:`notebooks`

 .. toctree::
-   :caption: Tutorials
+   :caption: Model Zoo
    :maxdepth: 3
    :hidden:
-
-   tutorials/serving_torch_tensorrt_with_triton
-   tutorials/notebooks
+
    tutorials/_rendered_examples/dynamo/torch_compile_resnet_example
    tutorials/_rendered_examples/dynamo/torch_compile_transformers_example
    tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion
-   tutorials/_rendered_examples/dynamo/torch_export_cudagraphs
-   tutorials/_rendered_examples/dynamo/converter_overloading
-   tutorials/_rendered_examples/dynamo/custom_kernel_plugins
    tutorials/_rendered_examples/distributed_inference/data_parallel_gpt2
    tutorials/_rendered_examples/distributed_inference/data_parallel_stable_diffusion
-   tutorials/_rendered_examples/dynamo/mutable_torchtrt_module_example
    tutorials/_rendered_examples/dynamo/torch_export_gpt2
    tutorials/_rendered_examples/dynamo/torch_export_llama2
+   tutorials/notebooks

 Python API Documentation
 ------------------------
diff --git a/docsrc/tutorials/notebooks.rst b/docsrc/tutorials/notebooks.rst
index 14737a8f63..509676d83a 100644
--- a/docsrc/tutorials/notebooks.rst
+++ b/docsrc/tutorials/notebooks.rst
@@ -1,10 +1,9 @@
 .. _notebooks:

-Example notebooks
+Legacy notebooks
 ===================

-There exists a number of notebooks which cover specific using specific features and models
-with Torch-TensorRT
+There exists a number of notebooks which demonstrate different model conversions / features / frontends available within Torch-TensorRT

 Notebooks
 ------------
diff --git a/examples/README.rst b/examples/README.rst
index 7c21aad732..be67c27e61 100644
--- a/examples/README.rst
+++ b/examples/README.rst
@@ -1,7 +1,4 @@
 .. _torch_tensorrt_tutorials:

 Torch-TensorRT Tutorials
-===========================
-
-The user guide covers the basic concepts and usage of Torch-TensorRT.
-We also provide a number of tutorials to explore specific usecases and advanced concepts
+===========================
\ No newline at end of file
diff --git a/examples/dynamo/README.rst b/examples/dynamo/README.rst
index 12590ab4b5..60f1969be2 100644
--- a/examples/dynamo/README.rst
+++ b/examples/dynamo/README.rst
@@ -1,10 +1,6 @@
-.. _torch_compile:
+.. _torch_tensorrt_examples:

-Torch-TensorRT Examples
-====================================
-
-Please refer to the following examples which demonstrate the usage of different features of Torch-TensorRT. We also provide
-examples of Torch-TensorRT compilation of select computer vision and language models.
+Here we provide examples of Torch-TensorRT compilation of popular computer vision and language models.

 Dependencies
 ------------------------------------
@@ -16,18 +12,6 @@ Please install the following external dependencies (assuming you already have co

    pip install -r requirements.txt

-Compiler Features
--------------------------------------
-* :ref:`torch_compile_advanced_usage`: Advanced usage including making a custom backend to use directly with the ``torch.compile`` API
-* :ref:`torch_export_cudagraphs`: Using the Cudagraphs integration with `ir="dynamo"`
-* :ref:`converter_overloading`: How to write custom converters and overload existing ones
-* :ref:`custom_kernel_plugins`: Creating a plugin to use a custom kernel inside TensorRT engines
-* :ref:`refit_engine_example`: Refitting a compiled TensorRT Graph Module with updated weights
-* :ref:`mutable_torchtrt_module_example`: Compile, use, and modify TensorRT Graph Module with MutableTorchTensorRTModule
-* :ref:`vgg16_fp8_ptq`: Compiling a VGG16 model with FP8 and PTQ using ``torch.compile``
-* :ref:`engine_caching_example`: Utilizing engine caching to speed up compilation times
-* :ref:`engine_caching_bert_example`: Demonstrating engine caching on BERT
-
 Model Zoo
 ------------------------------------
 * :ref:`torch_compile_resnet`: Compiling a ResNet model using the Torch Compile Frontend for ``torch_tensorrt.compile``
diff --git a/examples/dynamo/torch_compile_resnet_example.py b/examples/dynamo/torch_compile_resnet_example.py
index 420c5390d3..f852d60158 100644
--- a/examples/dynamo/torch_compile_resnet_example.py
+++ b/examples/dynamo/torch_compile_resnet_example.py
@@ -1,7 +1,7 @@
 """
 .. _torch_compile_resnet:

-Compiling ResNet using the Torch-TensorRT `torch.compile` Backend
+Compiling ResNet with dynamic shapes using the `torch.compile` backend
 ==========================================================

 This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a ResNet model."""
diff --git a/examples/dynamo/torch_compile_stable_diffusion.py b/examples/dynamo/torch_compile_stable_diffusion.py
index a0b725572b..fe49da74d1 100644
--- a/examples/dynamo/torch_compile_stable_diffusion.py
+++ b/examples/dynamo/torch_compile_stable_diffusion.py
@@ -1,7 +1,7 @@
 """
 .. _torch_compile_stable_diffusion:

-Torch Compile Stable Diffusion
+Compiling Stable Diffusion model using the `torch.compile` backend
 ======================================================

 This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a Stable Diffusion model. A sample output is featured below:
diff --git a/examples/dynamo/torch_compile_transformers_example.py b/examples/dynamo/torch_compile_transformers_example.py
index 01d46e96f6..221ecd4fd1 100644
--- a/examples/dynamo/torch_compile_transformers_example.py
+++ b/examples/dynamo/torch_compile_transformers_example.py
@@ -1,10 +1,10 @@
 """
 .. _torch_compile_transformer:

-Compiling a Transformer using torch.compile and TensorRT
+Compiling BERT using the `torch.compile` backend
 ==============================================================

-This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a transformer-based model."""
+This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a BERT model."""

 # %%
 # Imports and Model Definition
diff --git a/examples/dynamo/torch_export_gpt2.py b/examples/dynamo/torch_export_gpt2.py
index f9229e420c..cea0f3adf2 100644
--- a/examples/dynamo/torch_export_gpt2.py
+++ b/examples/dynamo/torch_export_gpt2.py
@@ -1,10 +1,10 @@
 """
 .. _torch_export_gpt2:

-Compiling GPT2 using the Torch-TensorRT with dynamo backend
+Compiling GPT2 using the dynamo backend
 ==========================================================

-This interactive script is intended as a sample of the Torch-TensorRT workflow with dynamo backend on a GPT2 model."""
+This script illustrates Torch-TensorRT workflow with dynamo backend on popular GPT2 model."""

 # %%
 # Imports and Model Definition
@@ -88,13 +88,10 @@
     tokenizer.decode(trt_gen_tokens[0], skip_special_tokens=True),
 )

-# %%
-# The output sentences should look like
-# =============================
-# Pytorch model generated text: What is parallel programming ?
+# Prompt : What is parallel programming ?

-# The parallel programming paradigm is a set of programming languages that are designed to be used in parallel. The main difference between parallel programming and parallel programming is that
 # =============================
-# TensorRT model generated text: What is parallel programming ?
+# Pytorch model generated text: The parallel programming paradigm is a set of programming languages that are designed to be used in parallel. The main difference between parallel programming and parallel programming is that

-# The parallel programming paradigm is a set of programming languages that are designed to be used in parallel. The main difference between parallel programming and parallel programming is that
+# =============================
+# TensorRT model generated text: The parallel programming paradigm is a set of programming languages that are designed to be used in parallel. The main difference between parallel programming and parallel programming is that
diff --git a/examples/dynamo/torch_export_llama2.py b/examples/dynamo/torch_export_llama2.py
index 11a0c93276..5cfd1ed61c 100644
--- a/examples/dynamo/torch_export_llama2.py
+++ b/examples/dynamo/torch_export_llama2.py
@@ -1,10 +1,10 @@
 """
 .. _torch_export_llama2:

-Compiling Llama2 using the Torch-TensorRT with dynamo backend
+Compiling Llama2 using the dynamo backend
 ==========================================================

-This interactive script is intended as a sample of the Torch-TensorRT workflow with dynamo backend on a Llama2 model."""
+This script illustrates Torch-TensorRT workflow with dynamo backend on popular Llama2 model."""

 # %%
 # Imports and Model Definition
@@ -91,9 +91,11 @@
     )[0],
 )

-# %%
-# The output sentences should look like
+
+# Prompt : What is dynamic programming?
+
 # =============================
-# Pytorch model generated text: Dynamic programming is an algorithmic technique used to solve complex problems by breaking them down into smaller subproblems, solving each subproblem only once, and
+# Pytorch model generated text: Dynamic programming is an algorithmic technique used to solve complex problems by breaking them down into smaller subproblems, solving each subproblem only once, and
+
 # =============================
-# TensorRT model generated text: Dynamic programming is an algorithmic technique used to solve complex problems by breaking them down into smaller subproblems, solving each subproblem only once, and
+# TensorRT model generated text: Dynamic programming is an algorithmic technique used to solve complex problems by breaking them down into smaller subproblems, solving each subproblem only once, and