Weird Results After Exporting to TensorRT FP16 #104

Gabriellm2003 · 2023-10-26T20:12:17Z

I trained a model with a custom dataset using the PyTorch code from this repository. The training went well, and the Torch model worked as expected. After this test, I tried to export the model to ONNX. Again, everything went well, and the model worked as expected. Lastly, I tried to export the model to TensorRT. I exported two models, one using FP16 precision and the second using FP32 precision. There were no error logs during the export procedure.

When I tested the models, the FP32 one generated the same results as the ONNX model, while the FP16 one generated very distinct results compared to them. I noticed that the results from the FP16 model contained multiple (quite a few) bounding boxes for the same object. I found these differences between the models quite strange, considering that I did this procedure on a variety of different models, and the impact on the results was minimal. I suppose those differences could be removed by applying non-maximum suppression (NMS), but I didn't want to do that.

Does anyone know what might be causing this? Or at least how to fix it?

About some of the configurations that I used:

ONNX:

Exported using opset=16
onnx==1.14.0
onnxruntime==1.15.1
onnxsim==0.4.33
torch==2.0.1
torchvision==0.15.2

TensorRT:
I used the container nvcr.io/nvidia/tensorrt:23.01-py3, which includes:
TensorRT==8.5.2.2

lyuwenyu · 2023-10-27T10:48:40Z

I convert onnx to tensort engine, then using this code to do infernece, it works fine for me.

You also can try third-part tools to check your model firstly, see some resource in this discussion #95

Gabriellm2003 · 2023-10-27T17:29:22Z

I ran additional tests here. Considering the models that were generated in the same training procedure, the issue of multiple detections using the FP16 TensorRT model does not occur with the models generated after the initial epochs.
I am curious if this might be related to the torch.amp.
Upon checking, I found that the parameters I used for it during training are:

use_amp: False

scaler:
  type: GradScaler
  enabled: True

I wonder if this problem would be solved if I activate the use_amp parameter.

lyuwenyu · 2023-10-30T08:06:35Z

use_amp only relate to traing processing.

Gabriellm2003 · 2023-10-30T13:45:28Z

Yes. In the case, I wonder if this problem would be solved if I activate the use_amp and re-train the model.

lyuwenyu · 2023-10-31T05:32:03Z

I think it does not.

Gabriellm2003 · 2023-11-17T17:01:47Z

Sorry for the delay.
I double-checked the logs from the TensorRT (trtexec) conversion and found these warnings.

[W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[W] [TRT] parsers/onnx/onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[W] [TRT] parsers/onnx/onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
[W] Dynamic dimensions required for input: orig_target_sizes, but no shapes were provided. Automatically overriding shape to: 1x2
[W] [TRT] TensorRT encountered issues when converting weights between types and that could affect accuracy.
[W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
[W] [TRT] Check verbose logs for the list of affected weights.
[W] [TRT] - 1 weights are affected by this issue: Detected FP32 infinity values and converted them to corresponding FP16 infinity.
[W] [TRT] - 208 weights are affected by this issue: Detected subnormal FP16 values.
[W] [TRT] - 51 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
[W] [TRT] - 6 weights are affected by this issue: Detected finite FP32 values which would overflow in FP16 and converted them to the closest finite FP16 value.
[W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[W] * Throughput may be bound by Enqueue Time rather than GPU Compute and the GPU may be under-utilized.
[W]   If not already in use, --useCudaGraph (utilize CUDA graphs where possible) may increase the throughput.

The warnings are probably related to the problem that I am facing.
Do you have any idea how I can solve this issue?

shreejalt · 2023-12-18T08:12:51Z

I convert onnx to tensort engine, then using this code to do infernece, it works fine for me.

You also can try third-part tools to check your model firstly, see some resource in this discussion #95

How can we do inference using the tensorrt model?

shreejalt · 2024-01-19T05:34:40Z

@lyuwenyu
How to do inference with the TRTInferer? Do we need to directly pass the image or do we need to preprocess the image before passing the blob?

Also if we need to preprocess the image can you give me the exact code which we need to use before passing?

Gabriellm2003 closed this as completed Oct 27, 2023

Gabriellm2003 reopened this Oct 27, 2023

Peterande mentioned this issue Nov 9, 2024

tensorrt 转fp16精度严重下降 Peterande/D-FINE#44

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Weird Results After Exporting to TensorRT FP16 #104

Weird Results After Exporting to TensorRT FP16 #104

Gabriellm2003 commented Oct 26, 2023

lyuwenyu commented Oct 27, 2023

Gabriellm2003 commented Oct 27, 2023 •

edited

Loading

lyuwenyu commented Oct 30, 2023

Gabriellm2003 commented Oct 30, 2023

lyuwenyu commented Oct 31, 2023

Gabriellm2003 commented Nov 17, 2023

shreejalt commented Dec 18, 2023

shreejalt commented Jan 19, 2024

Weird Results After Exporting to TensorRT FP16 #104

Weird Results After Exporting to TensorRT FP16 #104

Comments

Gabriellm2003 commented Oct 26, 2023

lyuwenyu commented Oct 27, 2023

Gabriellm2003 commented Oct 27, 2023 • edited Loading

lyuwenyu commented Oct 30, 2023

Gabriellm2003 commented Oct 30, 2023

lyuwenyu commented Oct 31, 2023

Gabriellm2003 commented Nov 17, 2023

shreejalt commented Dec 18, 2023

shreejalt commented Jan 19, 2024

Gabriellm2003 commented Oct 27, 2023 •

edited

Loading