
Implemented basic pipeline for Refitting #2886

Merged: 71 commits merged into main on Jul 2, 2024
Conversation

cehongwang (Collaborator):

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that the relevant reviewers are notified

@github-actions bot added labels on Jun 4, 2024: component: conversion (Issues re: Conversion stage), component: api [Python] (Issues re: Python API), component: dynamo (Issues relating to the `torch.compile` or `torch._dynamo.export` paths)


def refit_trt_engine_from_module(
    exported_program: ExportedProgram,
Collaborator:

Remove the settings that don't do anything for refit.

def refit_trt_engine_from_module(
    exported_program: ExportedProgram,
    inputs: Tuple[Any, ...],
    engine: object,
Collaborator:

Eventually will become the compiled exported program

@@ -609,3 +610,126 @@ def convert_module_to_trt_engine(
    engine_bytearray = engine_bytes.getvalue()

    return engine_bytearray


def refit_trt_engine_from_module(
Collaborator:

Eventually something like

def refit_module_weights(
    compiled_module: ExportedProgram,
    new_weight_module: ExportedProgram
) -> torch.fx.GraphModule: 
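
For context, a hypothetical usage sketch of this proposed API (the model, the export step, and refit_module_weights itself follow the reviewer's suggestion here and are not the merged implementation):

import torch
import torch_tensorrt

class TinyModel(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)

inputs = (torch.randn(2, 8).cuda(),)

# Compile once; the returned program carries the TRT engines.
compiled_module = torch_tensorrt.compile(
    TinyModel().eval().cuda(), ir="dynamo", inputs=list(inputs)
)

# Later, export a module holding the updated weights and refit it into
# the compiled program instead of recompiling from scratch.
new_weight_module = torch.export.export(TinyModel().eval().cuda(), inputs)
refitted_gm = refit_module_weights(compiled_module, new_weight_module)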


enabled_precisions = {dtype._from(e) for e in enabled_precisions}

compilation_options = {
Collaborator:

We can store the compilation settings as metadata in the returned graph; then we can just read the compiled program to fill these settings in and match the original lowering.
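
A minimal sketch of this idea, assuming the settings live in the GraphModule's meta dict ("_compile_settings" is a hypothetical key, not the final one):

import torch

class Tiny(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + 1

gm = torch.fx.symbolic_trace(Tiny())

# At compile time, stash the settings used for lowering on the returned
# graph module.
gm.meta["_compile_settings"] = {"enabled_precisions": {torch.float16}}

# At refit time, read them back so the new weights go through the same
# lowering as the original compilation.
settings = gm.meta["_compile_settings"]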

Collaborator (Author):

Ask Dheeraj where to put the metadata for lowering.


mapping = get_refit_mapping(gm, input_list, settings)

TRT_LOGGER = trt.Logger(trt.Logger.ERROR)
Collaborator:

Let's move this stuff into a submodule or other file (torch_tensorrt/dynamo/_refit.py).


mapping = get_refit_mapping(gm, input_list, settings)

TRT_LOGGER = trt.Logger(trt.Logger.ERROR)
Collaborator:

Reuse global logger

TRT_LOGGER = _TRTLogger()

@@ -88,6 +89,61 @@ def interpret_module_to_result(
    return interpreter_result


def get_refit_mapping(
Collaborator:

Move this to the refit file

Collaborator:

def construct_refit_weight_mapping(
    new_weights_mod: torch.fx.GraphModule,
    compile_settings: CompilationSettings  # Info from the metadata of the compiled module
):
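
For reference, a sketch of how such a mapping might be consumed once built, using TensorRT's public Refitter API (apply_refit_mapping and the dict-of-arrays mapping format are illustrative assumptions, not the PR's code):

from typing import Dict

import numpy as np
import tensorrt as trt

def apply_refit_mapping(
    engine: trt.ICudaEngine,
    mapping: Dict[str, np.ndarray],
    logger: trt.ILogger,
) -> bool:
    # Push each {weight name -> new array} pair into the built engine.
    # The names must match the ones assigned when the network was first
    # constructed, which is why the naming scheme has to be deterministic.
    refitter = trt.Refitter(engine, logger)
    for name, array in mapping.items():
        refitter.set_named_weights(name, trt.Weights(np.ascontiguousarray(array)))
    # refit_cuda_engine() returns False if any refittable weight was not set.
    return refitter.refit_cuda_engine()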

serialized_engine, self._input_names, self._output_names, serialized_cache
)

def get_network_to_refit(
Collaborator:

Let's call this something like

def _construct_trt_network_def()

Collaborator:

The user would do something like

interpreter._construct_trt_network_def()
net = interpreter.ctx.net
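
A slightly fuller sketch of that flow; the TRTInterpreter constructor arguments here are assumptions, and only the two lines above are the reviewer's suggestion:

# Build the network definition without serializing an engine, then hand
# the live INetworkDefinition to the refit logic.
interpreter = TRTInterpreter(gm, input_specs, compilation_settings=settings)
interpreter._construct_trt_network_def()
net = interpreter.ctx.net

# Walk the network to collect refittable weights by layer name.
for i in range(net.num_layers):
    layer = net.get_layer(i)
    print(layer.name, layer.type)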

@narendasan (Collaborator):

1 AI: take

and copy it into https://github.com/pytorch/TensorRT/blob/3422c41f165c3cf0833468b0cb548149ca78e057/py/torch_tensorrt/dynamo/conversion/converter_utils.py. Modify to use the naming scheme that you have designed to work with refit.

Replace all uses of set_layer_name inside torch_tensorrt/dynamo/conversion/impl and torch_tensorrt/fx/converters with your implementation.

cc: @peri044
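
A sketch of what such a refit-aware set_layer_name could look like; the exact name format below is an assumption, not the scheme that was merged:

from typing import Optional, Union

import tensorrt as trt
from torch.fx.node import Target

def set_layer_name(
    layer: trt.ILayer,
    target: Union[Target, str],
    name: str,
    source_ir: Optional[str] = None,
) -> None:
    # Encode the op target and converter-assigned name into the layer name
    # deterministically, so rebuilding the network from the same graph
    # reproduces identical names and the refit mapping can find its weights.
    target_name = target if isinstance(target, str) else getattr(target, "__name__", str(target))
    prefix = source_ir if source_ir is not None else "unknown_ir"
    layer.name = f"[{layer.type.name}]-[{prefix}.{target_name}]-[{name}]"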

compiled_module = copy.copy(compiled_module)
# Iterate over all components that can be accelerated
# Generate the corresponding TRT Module for those
for name, _ in partitioned_module.named_children():
Collaborator:

We need to ensure that the new module's partition is the same as the compiled module's partition.

  1. We can check the number of subgraphs, perhaps the names as well (if deterministic)
  2. At compile time, compute the hash of the source fx graph (https://github.com/pytorch/pytorch/blob/fba21edf5b9aa14babb9c0bc860dc9c597eb8010/torch/_inductor/codecache.py#L670) and store it as an attribute in the TRTModule. Then compare the hash of the new graph to the one stored in the compiled subgraph module (see the sketch after this list)
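
A minimal sketch of option 2, assuming str(gm.graph) is an adequate canonical form to hash (the _inductor code cache linked above hashes richer state, and source_graph_hash is a hypothetical attribute):

import hashlib

import torch

def fx_graph_hash(gm: torch.fx.GraphModule) -> str:
    # Hash a printable canonical form of the graph. Comparing this value
    # against the hash stored on the compiled TRTModule tells us whether
    # the new module produced the same partition.
    return hashlib.sha256(str(gm.graph).encode("utf-8")).hexdigest()

# At compile time:  trt_module.source_graph_hash = fx_graph_hash(subgraph)
# At refit time:    assert fx_graph_hash(new_subgraph) == trt_module.source_graph_hash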

Collaborator:

@zewenli98 You might be interested in reusing this part for engine caching

Additional review threads on py/torch_tensorrt/dynamo/_refit.py and tests/py/dynamo/models/test_model_refit.py were marked resolved.
@peri044 (Collaborator) left a comment:

LGTM.

@narendasan merged commit 9f46d39 into main on Jul 2, 2024
55 of 61 checks passed
Labels
  • cla signed
  • component: api [Python] (Issues re: Python API)
  • component: conversion (Issues re: Conversion stage)
  • component: core (Issues re: The core compiler)
  • component: dynamo (Issues relating to the `torch.compile` or `torch._dynamo.export` paths)
  • component: runtime
  • component: tests (Issues re: Tests)
  • documentation (Improvements or additions to documentation)