
feat: Add handling for ITensor mean and var in batch_norm #3099

Merged: 3 commits merged into main on Aug 22, 2024

Conversation

@chohk88 chohk88 (Collaborator) commented Aug 19, 2024

Description

Support ITensor type running_mean and running_var arguments for Batch Norm converter.
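For illustration, a hedged sketch of the kind of case this covers (the function name bn_with_dynamic_stats is hypothetical, not code from this PR): when the running statistics are graph inputs or are computed inside the traced graph, the Dynamo converter receives them as ITensors rather than frozen torch.Tensor / np.ndarray constants.

    import torch

    # Hypothetical scenario: the running statistics arrive as graph inputs,
    # so after tracing they reach the batch_norm converter as ITensors
    # instead of constant tensors.
    def bn_with_dynamic_stats(x, weight, bias, running_mean, running_var):
        out, _, _ = torch.ops.aten._native_batch_norm_legit_no_training(
            x, weight, bias, running_mean, running_var, 0.1, 1e-5
        )
        return out

    x = torch.randn(2, 3, 8, 8)
    y = bn_with_dynamic_stats(x, torch.ones(3), torch.zeros(3), torch.zeros(3), torch.ones(3))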

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that the relevant reviewers are notified

@chohk88 chohk88 self-assigned this Aug 19, 2024
@github-actions github-actions bot added component: tests Issues re: Tests component: conversion Issues re: Conversion stage component: converters Issues re: Specific op converters component: api [Python] Issues re: Python API component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths labels Aug 19, 2024
Comment on lines 37 to 40
  weight: Optional[Union[torch.Tensor, np.ndarray]],
  bias: Optional[Union[torch.Tensor, np.ndarray]],
- running_mean: Optional[Union[torch.Tensor, np.ndarray]],
- running_var: Optional[Union[torch.Tensor, np.ndarray]],
+ running_mean: Union[TRTTensor, Optional[Union[torch.Tensor, np.ndarray]]],
+ running_var: Union[TRTTensor, Optional[Union[torch.Tensor, np.ndarray]]],
Collaborator
Per the schema, these types seem to be:

weight: Optional[Union[TRTTensor, torch.Tensor, np.ndarray]],
bias: Optional[Union[TRTTensor, torch.Tensor, np.ndarray]],
running_mean: Union[TRTTensor, torch.Tensor, np.ndarray],
running_var: Union[TRTTensor, torch.Tensor, np.ndarray],

Collaborator
I just noticed that torch.ops.aten.native_batch_norm.default and torch.ops.aten.batch_norm.default reuse the function. Since they require running_mean and running_var to be optional, you can set all of these types to Optional[Union[TRTTensor, torch.Tensor, np.ndarray]].

Collaborator Author
Thank you for your comments! I've resolved the issue based on your feedback.

Comment on lines 55 to 64
if isinstance(running_mean, TRTTensor) or isinstance(running_var, TRTTensor):
    # Default values if weight, bias, running_mean, running_var are None
    if weight is None:
        weight = get_trt_tensor(ctx, 1.0, f"{name}_weight", input.dtype)
    if bias is None:
        bias = get_trt_tensor(ctx, 0.0, f"{name}_bias", input.dtype)
    if running_mean is None:
        running_mean = get_trt_tensor(ctx, 0.0, f"{name}_running_mean", input.dtype)
    if running_var is None:
        running_var = get_trt_tensor(ctx, 1.0, f"{name}_running_var", input.dtype)
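As a quick sanity check in plain PyTorch (not part of the PR): these defaults (weight=1, bias=0, mean=0, var=1) make the normalization an identity up to the eps term.

    import torch

    x = torch.randn(2, 3, 4, 4)
    out = torch.nn.functional.batch_norm(
        x,
        running_mean=torch.zeros(3),
        running_var=torch.ones(3),
        weight=torch.ones(3),
        bias=torch.zeros(3),
        training=False,
        eps=1e-5,
    )
    # Output differs from x only by the 1/sqrt(1 + eps) scaling.
    assert torch.allclose(out, x, atol=1e-4)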
Collaborator
  1. Do we need to cast these parameters to the type of input? There's probably a case where the input is an int type while weight, bias, running_mean, and/or running_var are float types, and it seems problematic to force-cast float to int. The dtype argument is optional, so you can just leave it blank.

  2. weight and bias could be ITensor as well, right?

@keehyuna keehyuna (Collaborator) commented Aug 20, 2024
>> Do we need to cast these parameters to the type of input?
I think this is required for strongly typed networks; different types are allowed in weakly typed networks. I noticed it by enabling strongly typed networks with some models. Here are some changes to keep the same type in ops. I only saw float32 and half float; I'm not sure whether float and int tensors in the same layer are possible.
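A small plain-PyTorch illustration of the mismatch described above (the cast shown is conceptually what the converter needs to do, not the torch_tensorrt code):

    import torch

    x = torch.randn(2, 3, 8, 8, dtype=torch.half)        # fp16 activations
    running_mean = torch.zeros(3, dtype=torch.float32)   # stats commonly stay fp32
    running_var = torch.ones(3, dtype=torch.float32)

    # Per the comment above, a strongly typed network will not accept mixed
    # fp16/fp32 operands in the same elementwise op, so the statistics (or the
    # input) must be cast to one common dtype before the layers are built.
    running_mean = running_mean.to(x.dtype)
    running_var = running_var.to(x.dtype)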

Collaborator Author
>> Do we need to cast these parameters to the type of input?
If input_val is a TRTTensor, the input is returned unchanged (code), so the type stays the same. If input_val is something else, create_constant converts the type using to_numpy (code). However, if the value is an np.ndarray or torch.Tensor, it keeps the original type (code and code). This means setting input.dtype has no effect.

I confirmed this behavior when I removed the input.dtype argument.

As for the issue @keehyuna mentioned, it's new to me, so I'll investigate.

>> weight and bias could be ITensor as well, right?
Additionally, according to the schema, weight and bias can't be ITensor, but the converter works fine. I've tested it, and it works successfully.
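A toy sketch of the dtype behavior described above (make_constant is a stand-in for illustration, not the actual get_trt_tensor / create_constant implementation):

    import numpy as np
    import torch

    def make_constant(value, dtype=None):
        # Mirrors the described behavior: np.ndarray / torch.Tensor values keep
        # their original dtype; a dtype hint only takes effect for Python scalars.
        if isinstance(value, torch.Tensor):
            return value.detach().cpu().numpy()
        if isinstance(value, np.ndarray):
            return value
        return np.array(value, dtype=dtype if dtype is not None else np.float32)

    print(make_constant(np.zeros(3, dtype=np.float32), dtype=np.float16).dtype)  # float32
    print(make_constant(0.0, dtype=np.float16).dtype)                            # float16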

Collaborator
>> weight and bias could be ITensor as well, right?
>> Additionally, according to the schema, weight and bias can't be ITensor, but the converter works fine. I've tested it, and it works successfully.

@chohk88 Did I miss something? The schema is:

- func: batch_norm(Tensor input, Tensor? weight, Tensor? bias, Tensor? running_mean, Tensor? running_var, bool training, float momentum, float eps, bool cudnn_enabled) -> Tensor
- func: _native_batch_norm_legit_no_training(Tensor input, Tensor? weight, Tensor? bias, Tensor running_mean, Tensor running_var, float momentum, float eps) -> (Tensor, Tensor, Tensor)

My understanding of Tensor? weight, Tensor? bias is that they could be None, ITensor, torch.Tensor, or np.ndarray. Please correct me if I'm wrong.

Collaborator Author
Oh, I misunderstood that. You're right: weight, bias, running_mean, and running_var can all be TRTTensor. I've combined the separate converters and added some test cases for this.
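For context, a minimal PyTorch reference of the computation the combined converter has to reproduce with elementwise TensorRT layers when the statistics are ITensors (this shows the math only, not the converter code):

    import torch

    def batch_norm_ref(x, weight, bias, running_mean, running_var, eps=1e-5):
        # Broadcast per-channel (C,) statistics over an (N, C, ...) input and
        # normalize elementwise: (x - mean) / sqrt(var + eps) * weight + bias
        shape = [1, -1] + [1] * (x.dim() - 2)
        mean = running_mean.reshape(shape)
        var = running_var.reshape(shape)
        scale = weight.reshape(shape) if weight is not None else 1.0
        shift = bias.reshape(shape) if bias is not None else 0.0
        return (x - mean) / torch.sqrt(var + eps) * scale + shift

    x, w, b = torch.randn(2, 3, 8, 8), torch.rand(3), torch.rand(3)
    ref = batch_norm_ref(x, w, b, torch.zeros(3), torch.ones(3))
    expected = torch.nn.functional.batch_norm(x, torch.zeros(3), torch.ones(3), w, b)
    torch.testing.assert_close(ref, expected, rtol=1e-4, atol=1e-5)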

@chohk88 chohk88 force-pushed the converter_batch_norm_with_TRTTensor_mean_var branch from e1773b4 to 9133df8 on August 20, 2024 03:42
@peri044 peri044 (Collaborator) left a comment
LGTM

@chohk88 chohk88 (Collaborator Author) commented Aug 21, 2024
>> LGTM

If the CI results show no more batch_norm errors, it's good to merge.

@chohk88 chohk88 (Collaborator Author) commented Aug 22, 2024

@zewenli98 Although it's beyond the scope of this PR, I noticed something regarding the converters for layer_norm, group_norm, and native_group_norm.

Currently, weight and bias are defined as:

weight: Optional[Union[torch.Tensor, np.ndarray]],
bias: Optional[Union[torch.Tensor, np.ndarray]],

Is this incorrect? Fortunately, unlike batch norm, there's no issue with adding eps to, or applying to_numpy on, a TRTTensor (or ITensor) weight or bias, so it doesn't seem to cause any problems with the converter. However, it seems the case where the default value for weight in layer_norm is None might not be handled properly.

Here are the schemas:
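(For reference, the schemas in question are roughly the following, reconstructed from memory of native_functions.yaml; the exact signatures may differ slightly, but each declares weight and bias as Tensor?.)

- func: layer_norm(Tensor input, SymInt[] normalized_shape, Tensor? weight=None, Tensor? bias=None, float eps=1e-05, bool cudnn_enable=True) -> Tensor
- func: group_norm(Tensor input, int num_groups, Tensor? weight=None, Tensor? bias=None, float eps=1e-05, bool cudnn_enabled=True) -> Tensor
- func: native_group_norm(Tensor input, Tensor? weight, Tensor? bias, SymInt N, SymInt C, SymInt HxW, int group, float eps) -> (Tensor, Tensor, Tensor)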

@peri044 peri044 (Collaborator) commented Aug 22, 2024

Merging this as the other failures are unrelated.

@peri044 peri044 merged commit 66511da into main Aug 22, 2024
49 of 67 checks passed
@zewenli98 (Collaborator) commented:
@chohk88 Thanks for pointing out the issue. I think your understanding is correct. Although nothing errors out now, they should be Optional[Union[TRTTensor, torch.Tensor, np.ndarray]]. Since Issue #3114 has been opened above to track it, could you modify them to Optional[Union[TRTTensor, torch.Tensor, np.ndarray]], like what you did for batch_norm?
