Replies: 3 comments
-
Let's dig deeper into the internals. For example, we now expose the potential for conflicting settings in `torch_executed_modules` and `trt_executed_modules`.
How does this UX compare to going into the model code and doing an in-place replacement, vs trying to get Torch-TensorRT to accept the full model? One concern I have is that part of the process for Torch-TensorRT is to go through lowering, which will change the model; for partial compilation this changes both TensorRT and Torch blocks. At the very least it would seem we would need to go in and pull out the subgraphs corresponding to selected modules to maximize compatibility.
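For reference, the in-place replacement route mentioned above can already be done by compiling just the submodule and swapping it back into the parent model. A minimal sketch, assuming a hypothetical multi-modal model with a `resnet` attribute (the model class and input shape are placeholders, not from this thread):

```python
import torch
import torch_tensorrt

model = MyMultiModalModel().eval().cuda()  # hypothetical model with .resnet and .bert

# Compile only the vision backbone; everything else stays in PyTorch.
trt_resnet = torch_tensorrt.compile(
    model.resnet,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.float},
)

# In-place replacement: the rest of the model is untouched by lowering.
model.resnet = trt_resnet
```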
-
cc: @bowang007 @peri044
-
@narendasan Yes, things may get messed up if we can specify torch_executed_x and tensorrt_executed_x at the same time.

**Interface exclusiveness issue**

To avoid that situation, maybe it is clearer to make them mutually exclusive, with the `default_torch_execution` switch deciding which list is honored.
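One way to enforce that exclusiveness is to validate the settings up front. A sketch of such validation logic, assuming the parameter names proposed in this thread (`trt_executed_modules` and `default_torch_execution` do not exist in the released API):

```python
def validate_partition_settings(torch_executed_modules=None,
                                trt_executed_modules=None,
                                default_torch_execution=False):
    """Reject conflicting settings up front instead of silently resolving them.

    Hypothetical sketch: parameter names follow this discussion's proposal.
    """
    torch_executed_modules = torch_executed_modules or []
    trt_executed_modules = trt_executed_modules or []

    if torch_executed_modules and trt_executed_modules:
        raise ValueError(
            "torch_executed_modules and trt_executed_modules are mutually exclusive")
    if default_torch_execution and torch_executed_modules:
        raise ValueError(
            "torch_executed_modules has no effect when default_torch_execution=True; "
            "use trt_executed_modules instead")
    if not default_torch_execution and trt_executed_modules:
        raise ValueError(
            "trt_executed_modules requires default_torch_execution=True")
```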
**How to maximize compatibility when doing lowering**

When doing lowering, whether a node ends up in a TensorRT or a Torch block depends on both the marking of its enclosing module and the `default_torch_execution` setting. Here is how the lowering pass marks nodes when `default_torch_execution` is taken into account (the condition is an XOR of `mark.top()` and `default_torch_execution`):

```cpp
} else if ((!mark.top() && default_torch_execution) || (mark.top() && !default_torch_execution)) {
  // The node's mark disagrees with the default execution direction,
  // so it falls back to PyTorch execution.
  LOG_GRAPH("Marking " << util::node_info(n) << " to run in PyTorch");
  n->i_(c10::Symbol::attr("to_compile"), (int64_t) false);
}
```

You can check the details in PR #1122
-
According to the current design, every module is by default compiled to TensorRT except those modules included in `torch_executed_modules`. However, in some cases, for example a multi-modal model where ResNet and BERT are present, and BERT is optimized by a custom op implementation (FasterTransformer), developers may only want to compile ResNet and leave everything in BERT (embedding, custom op, FC) running in Torch. Although it is fine to explicitly assign values for `torch_executed_modules`, it would be much more convenient to have an interface like `trt_executed_modules`, and a switch-mode interface like `default_torch_execution`. When the switch is turned on, every module would by default run in Torch, and only modules explicitly included in `trt_executed_modules` would be compiled.
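Under the proposed semantics, the per-module decision reduces to a simple rule. A minimal sketch (again, `trt_executed_modules` and `default_torch_execution` are names proposed in this thread, not existing API):

```python
def should_compile_to_trt(module_name,
                          default_torch_execution=False,
                          torch_executed_modules=(),
                          trt_executed_modules=()):
    """Decide where a module runs under the proposed switch-mode semantics."""
    if default_torch_execution:
        # Switch on: everything runs in Torch unless explicitly opted in.
        return module_name in trt_executed_modules
    # Current behavior: everything compiles unless explicitly excluded.
    return module_name not in torch_executed_modules

# The multi-modal case above: only ResNet is compiled, all of BERT stays in Torch.
assert should_compile_to_trt("ResNet", default_torch_execution=True,
                             trt_executed_modules={"ResNet"})
assert not should_compile_to_trt("Bert", default_torch_execution=True,
                                 trt_executed_modules={"ResNet"})
```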