
Selectively enable different frontends #2693

Merged: 13 commits merged into main on Apr 17, 2024
Conversation

narendasan (Collaborator)

Description

Allows users to configure which frontends/features they want to include in their torch-tensorrt builds.

Features can be enabled like so:

NO_TORCHSCRIPT=1 pip install -e . ... # Includes the C++ runtime but no Torchscript frontend
PYTHON_ONLY=1 pip install -e . ... # No C++ dependencies at all, a pure python package 

A build's feature set can be accessed via the following attribute:

import torch_tensorrt

torch_tensorrt.ENABLED_FEATURES

where ENABLED_FEATURES is an instance of the namedtuple FeatureSet:

from collections import namedtuple

FeatureSet = namedtuple('FeatureSet', [
    "torchscript_frontend",
    "torch_tensorrt_runtime",
    "dynamo_frontend",
    "fx_frontend"
])
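
For example, downstream code can branch on these flags. A minimal sketch using only the fields listed above (the ir strings here are illustrative):

import torch_tensorrt

# Pick an IR based on what this particular build includes
if torch_tensorrt.ENABLED_FEATURES.torchscript_frontend:
    ir = "torchscript"
elif torch_tensorrt.ENABLED_FEATURES.dynamo_frontend:
    ir = "dynamo"
else:
    ir = "fx"

print(torch_tensorrt.ENABLED_FEATURES)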

In order to support optional features, a number of core types have been abstracted:

  • torch_tensorrt.Device has no direct dependencies on the torchscript core and can be translated to torch_tensorrt.ts.Device to access those features (see the sketch after this list)
  • Similarly, torch_tensorrt.Input behaves the same way
  • enums for dtype, DeviceType and memory_format have been defined and can translate from numpy, tensorrt, torch and torch_tensorrt._C (assuming torch_tensorrt.ENABLED_FEATURES.torchscript_frontend is True)
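
A minimal sketch of that translation, assuming the torchscript frontend is enabled; the _from() conversion follows the same pattern as the dtype API below, so the exact method name on torch_tensorrt.ts.Device is an assumption:

import torch_tensorrt

device = torch_tensorrt.Device(gpu_id=0)  # frontend-agnostic device description

if torch_tensorrt.ENABLED_FEATURES.torchscript_frontend:
    # Only touch the TorchScript-specific type when that frontend was built in
    ts_device = torch_tensorrt.ts.Device._from(device)  # assumed conversion helper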

Translating between different libraries' enums can now take the form:

import torch
import tensorrt as trt
import torch_tensorrt

# Convert, throw an error if no conversion exists
trt_dtype = torch_tensorrt.dtype._from(torch.float16).to(trt.DataType)  # returns trt.DataType.HALF

# Try to convert, do not throw an error if there is no conversion, just return None
trt_dtype = torch_tensorrt.dtype.try_from(torch.float16).to(trt.DataType)  # returns trt.DataType.HALF

# Alternatively, use a fallback type
trt_dtype = torch_tensorrt.dtype.unknown.to(trt.DataType, use_default=True)  # returns trt.DataType.FLOAT
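
Since try_from returns None when no conversion exists, callers can guard the second form before calling .to(). A minimal sketch with a hypothetical helper (to_trt_dtype is not part of the API):

import tensorrt as trt
import torch_tensorrt

def to_trt_dtype(candidate):
    """Convert a dtype-like object to a TensorRT dtype, falling back to FP32."""
    maybe_dtype = torch_tensorrt.dtype.try_from(candidate)
    if maybe_dtype is None:
        # No known conversion for this input
        return trt.DataType.FLOAT
    return maybe_dtype.to(trt.DataType)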

Fixes #1943
Fixes #2379

Type of change

Please delete options that are not relevant and/or add your own.

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR in so that relevant reviewers are notified

@github-actions github-actions bot added labels on Mar 15, 2024: component: tests, component: conversion, component: core, component: build system, component: api [Python], component: runtime, component: dynamo
@github-actions github-actions bot requested a review from peri044 March 15, 2024 23:03
github-actions[bot] (comment marked as outdated)

@github-actions github-actions bot added the component: converters label on Mar 20, 2024
@gs-olive (Collaborator) left a comment

Overall looks great; the overhaul of the dtype system and frontend selection at install is very helpful. I just need to verify on local installs as well. Left a few small comments.

py/torch_tensorrt/__init__.py (outdated review comment, resolved)
or torch_tensorrt.dtype.float in enabled_precisions
):
precision = torch.float32
if dtype.float16 in enabled_precisions or dtype.half in enabled_precisions:
Collaborator:

dtype.float16 and dtype.half seem to point to the same enum object. Should this be:

if dtype.float16 in enabled_precisions or torch.float16 in enabled_precisions:
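
For context, a standalone sketch (a hypothetical enum, not the actual torch_tensorrt definition) of why two aliased members make the second check redundant:

from enum import Enum

class dtype(Enum):
    float16 = 1
    half = 1  # alias: resolves to the same member as float16

assert dtype.half is dtype.float16       # aliases are the same object
assert dtype.half in {dtype.float16}     # so either membership check is equivalent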

precision = torch.float32
if dtype.float16 in enabled_precisions or dtype.half in enabled_precisions:
precision = dtype.float16
elif dtype.float32 in enabled_precisions or dtype.float in enabled_precisions:
Collaborator:

Same as above comment

Comment on lines 572 to 575
if dtype.float16 in enabled_precisions or dtype.half in enabled_precisions:
precision = dtype.float16
elif dtype.float32 in enabled_precisions or dtype.float in enabled_precisions:
precision = dtype.float32
Collaborator:

Same as above

py/torch_tensorrt/dynamo/_settings.py (outdated review comment, resolved)
py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py (outdated review comment, resolved)
narendasan added commits (each Signed-off-by: Naren Dasan <[email protected]>), including:

  • …pes in the python package to decouple frontends
  • torch.fx.passes.splitter_base._SplitterBase
logger.warning(
"Input graph is a Torchscript module but the ir provided is default (dynamo). Please set ir=torchscript to suppress the warning. Compiling the module with ir=torchscript"
"Input is a torchscript module but the ir was not specified (default=dynamo), please set ir=torchscript to suppress the warning."
)
return _IRType.ts
elif module_is_exportable:
Collaborator (Author):

Need to add feature check

@@ -261,7 +281,7 @@ def convert_method_to_trt_engine(
method_name: str = "forward",
inputs: Optional[Sequence[Input | torch.Tensor]] = None,
ir: str = "default",
enabled_precisions: Optional[Set[torch.dtype | dtype]] = None,
enabled_precisions: Optional[Set[torch.dtype]] = None,
Collaborator (Author):

Correct type annotation

narendasan added commits (Signed-off-by: Naren Dasan <[email protected]>)
@gs-olive (Collaborator) left a comment

Overall looks great - added some small style/clarification/logging comments

py/torch_tensorrt/_compile.py (outdated review comment, resolved)
py/torch_tensorrt/_compile.py (outdated review comment, resolved)
py/torch_tensorrt/_enums.py (outdated review comment, resolved)
py/torch_tensorrt/dynamo/conversion/converter_utils.py (outdated review comment, resolved)
@gs-olive (Collaborator):

Additionally, tested on Windows E2E models; it appears to cleanly dispatch to the correct runtime without needing to modify the use_python_runtime flag.

@gs-olive (Collaborator):

A few examples, such as the one below, still use precision and would fail with some of the changes.

narendasan added commits (Signed-off-by: Naren Dasan <[email protected]>)
@peri044 (Collaborator) left a comment

LGTM. Added minor comments

use_default: bool,
) -> Optional[Union[torch.dtype, trt.DataType, np.dtype, dtype]]:
try:
print(self)
Collaborator:

can remove this statement

Comment on lines 692 to 693
elif t == DeviceType:
return self
Collaborator:

I'm curious when we need to cast EngineCapability to DeviceType. Any examples?

use_fast_partitioner: bool = USE_FAST_PARTITIONER,
enable_experimental_decompositions: bool = ENABLE_EXPERIMENTAL_DECOMPOSITIONS,
enabled_precisions: Set[torch.dtype | dtype] | Tuple[torch.dtype | dtype] = (
dtype.float32,
Collaborator:

Shouldn't this be _defaults.ENABLED_PRECISIONS?


compilation_options = {
"precision": precision,
"enabled_precisions": enabled_precisions,
Collaborator:

It seems like you've handled the scenario where enabled_precisions is empty in the compile function. We can do the same here:

"enabled_precisions": (
            enabled_precisions if enabled_precisions else _defaults.ENABLED_PRECISIONS

@@ -64,7 +64,7 @@ include-package-data = false

Collaborator:

Where would this file be used?

@peri044 (Collaborator) commented Apr 17, 2024

Will the docs be updated with the instructions in a different PR?

narendasan added commits (Signed-off-by: Naren Dasan <[email protected]>)
@github-actions github-actions bot added the documentation label on Apr 17, 2024
@narendasan narendasan merged commit 9cf3356 into main Apr 17, 2024
7 checks passed
peri044 pushed a commit that referenced this pull request Apr 18, 2024
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
peri044 added a commit that referenced this pull request Apr 18, 2024
…2761)

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Co-authored-by: Naren Dasan <[email protected]>
zewenli98 pushed a commit that referenced this pull request Apr 26, 2024
…2761)

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Co-authored-by: Naren Dasan <[email protected]>
laikhtewari pushed a commit that referenced this pull request May 24, 2024
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Labels
cla signed, component: api [Python], component: build system, component: conversion, component: converters, component: core, component: dynamo, component: runtime, component: tests, documentation
Development

Successfully merging this pull request may close these issues.

Decoupling the torchscript and dynamo subpackages in python
✨[Feature] Remove C++ Dependency in Dynamo
4 participants