
Classification model resnet18_8xb32_in1k convert failed. #201

Closed · del-zhenwu opened this issue Mar 2, 2022 · 7 comments

@del-zhenwu (Collaborator)

Thanks for your bug report. We appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug

2022-03-02 11:35:10 [ WARNING] 2022-03-02 11:34:15,181 - mmdeploy - INFO - torch2onnx start.
2022-03-02:11:34:24,matplotlib.font_manager INFO     [font_manager.py:1443] generated new fontManager
/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448224956/work/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
2022-03-02 11:34:41,338 - mmdeploy - INFO - torch2onnx success.
2022-03-02 11:34:42,115 - mmdeploy - INFO - onnx2tensorrt of resnet18_8xb32_in1k_20210831-fbbb1da6.pth/end2end.onnx start.
2022-03-02 11:34:45,018 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /opt/mmdeploy/build/lib/libmmdeploy_tensorrt_ops.so
[TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +226, GPU +0, now: CPU 288, GPU 481 (MiB)
[TensorRT] WARNING: onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] INFO: [MemUsageSnapshot] Builder begin: CPU 376 MiB, GPU 481 MiB
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +158, GPU +68, now: CPU 534, GPU 549 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +132, GPU +86, now: CPU 666, GPU 635 (MiB)
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.2.1 but loaded cuDNN 8.2.0
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] INFO: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 976, GPU 743 (MiB)
[TensorRT] ERROR: 2: [ltWrapper.cpp::setupHeuristic::327] Error Code 2: Internal Error (Assertion cublasStatus == CUBLAS_STATUS_SUCCESS failed.)
2022-03-02:11:35:08,root ERROR    [utils.py:43] Failed to create TensorRT engine
2022-03-02 11:35:10,113 - mmdeploy - ERROR - onnx2tensorrt of resnet18_8xb32_in1k_20210831-fbbb1da6.pth/end2end.onnx failed.
 (util.py:25)
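For triage, logs in the format above make it easy to tell which pipeline stage broke (here torch2onnx succeeded and onnx2tensorrt failed on a cuBLAS assertion). Below is a minimal, stdlib-only sketch of such a log scan; `failed_stage` is a hypothetical helper written for this issue's log layout, not part of mmdeploy.

```python
import re

def failed_stage(log_text):
    """Return (stage, trt_errors): the first mmdeploy pipeline stage that
    logged a '... failed.' line, plus any [TensorRT] ERROR lines seen.
    Triage helper for logs shaped like the one above; not part of mmdeploy."""
    stage = None
    trt_errors = []
    for line in log_text.splitlines():
        if "[TensorRT] ERROR" in line:
            trt_errors.append(line.strip())
        m = re.search(r"mmdeploy - ERROR - (\w+) of .* failed", line)
        if m and stage is None:
            stage = m.group(1)
    return stage, trt_errors

# Condensed excerpt of the log from this issue:
log = """\
2022-03-02 11:34:41,338 - mmdeploy - INFO - torch2onnx success.
[TensorRT] ERROR: 2: [ltWrapper.cpp::setupHeuristic::327] Error Code 2: Internal Error (Assertion cublasStatus == CUBLAS_STATUS_SUCCESS failed.)
2022-03-02 11:35:10,113 - mmdeploy - ERROR - onnx2tensorrt of resnet18_8xb32_in1k_20210831-fbbb1da6.pth/end2end.onnx failed.
"""
stage, errors = failed_stage(log)
print(stage)  # -> onnx2tensorrt
```

On this log it correctly isolates the onnx2tensorrt stage and the underlying `CUBLAS_STATUS_SUCCESS` assertion, which is the line to search for when comparing environments.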

Reproduction

  1. What command or script did you run?
A placeholder for the command.
  2. Did you make any modifications on the code or config? Did you understand what you have modified?

Environment

2022-03-02 11:47:56,295 - mmdeploy - INFO - **********Environmental information**********
2022-03-02 11:47:58,692 - mmdeploy - INFO - sys.platform: linux
2022-03-02 11:47:58,693 - mmdeploy - INFO - Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
2022-03-02 11:47:58,693 - mmdeploy - INFO - CUDA available: True
2022-03-02 11:47:58,693 - mmdeploy - INFO - GPU 0: Tesla V100-PCIE-16GB
2022-03-02 11:47:58,693 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
2022-03-02 11:47:58,693 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 10.2, V10.2.89
2022-03-02 11:47:58,693 - mmdeploy - INFO - GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
2022-03-02 11:47:58,693 - mmdeploy - INFO - PyTorch: 1.9.0
2022-03-02 11:47:58,694 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.2-Product Build 20210312 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.2
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

2022-03-02 11:47:58,694 - mmdeploy - INFO - TorchVision: 0.10.0
2022-03-02 11:47:58,694 - mmdeploy - INFO - OpenCV: 4.5.4
2022-03-02 11:47:58,694 - mmdeploy - INFO - MMCV: 1.4.0
2022-03-02 11:47:58,694 - mmdeploy - INFO - MMCV Compiler: GCC 7.5
2022-03-02 11:47:58,694 - mmdeploy - INFO - MMCV CUDA Compiler: 10.2
2022-03-02 11:47:58,694 - mmdeploy - INFO - MMDeployment: 0.3.0+34879e6
2022-03-02 11:47:58,695 - mmdeploy - INFO - 

2022-03-02 11:47:58,695 - mmdeploy - INFO - **********Backend information**********
[2022-03-02 11:47:59.106] [mmdeploy] [info] Register 'DirectoryModel'
2022-03-02 11:47:59,141 - mmdeploy - INFO - onnxruntime: 1.10.0 ops_is_avaliable : True
2022-03-02 11:47:59,142 - mmdeploy - INFO - tensorrt: 8.0.3.4 ops_is_avaliable : True
2022-03-02 11:47:59,144 - mmdeploy - INFO - ncnn: None ops_is_avaliable : False
2022-03-02 11:47:59,146 - mmdeploy - INFO - pplnn_is_avaliable: False
2022-03-02 11:47:59,149 - mmdeploy - INFO - openvino_is_avaliable: True
2022-03-02 11:47:59,149 - mmdeploy - INFO - 

2022-03-02 11:47:59,149 - mmdeploy - INFO - **********Codebase information**********
2022-03-02 11:47:59,151 - mmdeploy - INFO - mmcls: 0.19.0
2022-03-02 11:47:59,152 - mmdeploy - INFO - mmdet: 2.20.0
2022-03-02 11:47:59,153 - mmdeploy - INFO - mmedit: 0.12.0
2022-03-02 11:47:59,155 - mmdeploy - INFO - mmocr: 0.4.1
2022-03-02 11:47:59,156 - mmdeploy - INFO - mmseg: 0.21.1

Error traceback

If applicable, paste the error traceback here.

A placeholder for traceback.

Bug fix

If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

@del-zhenwu (Collaborator, Author)

The same failure occurs with configs/seresnet/seresnet50_8xb32_in1k.py in Classification.

@RunningLeon RunningLeon self-assigned this Mar 2, 2022
@RunningLeon (Collaborator)

OK, we'll test it.

@RunningLeon (Collaborator)

@del-zhenwu Tested OK on my machine. Maybe something is wrong with the environment.

[TensorRT] ERROR: 2: [ltWrapper.cpp::setupHeuristic::327] Error Code 2: Internal Error (Assertion cublasStatus == CUBLAS_STATUS_SUCCESS failed.)

@RunningLeon (Collaborator)

python tools/deploy.py \
    configs/mmcls/classification_tensorrt_dynamic-224x224-224x224.py \
    ../mmclassification/configs/seresnet/seresnet50_8xb32_in1k.py \
    ../mmclassification/checkpoints/se-resnet50_batch256_imagenet_20200804-ae206104.pth \
    ../mmdetection/demo/demo.jpg \
    --work-dir ./work-dirs/mmcls/seresnet \
    --show \
    --device cuda
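For repeated conversion testing, the invocation above can be driven from a script. This is a sketch using `subprocess`; the paths are the ones quoted in this thread and assume the command is run from the mmdeploy repo root on a machine with TensorRT set up.

```python
import subprocess

# Argument list mirroring the tools/deploy.py command above;
# adjust the relative paths to your checkout layout before running.
cmd = [
    "python", "tools/deploy.py",
    "configs/mmcls/classification_tensorrt_dynamic-224x224-224x224.py",
    "../mmclassification/configs/seresnet/seresnet50_8xb32_in1k.py",
    "../mmclassification/checkpoints/se-resnet50_batch256_imagenet_20200804-ae206104.pth",
    "../mmdetection/demo/demo.jpg",
    "--work-dir", "./work-dirs/mmcls/seresnet",
    "--device", "cuda",
]
# "--show" is omitted here so the sketch also works on headless machines;
# append it to cmd to reproduce the command exactly.
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment to actually run the conversion
```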

@del-zhenwu (Collaborator, Author)

Tried with configs/mmcls/classification_tensorrt_dynamic-224x224-224x224.py and ../mmclassification/configs/seresnet/seresnet50_8xb32_in1k.py; the conversion still failed.

@RunningLeon (Collaborator)

@del-zhenwu Do you still have this issue?

@RunningLeon (Collaborator)

Fixed in #215.

hanrui1sensetime pushed a commit to hanrui1sensetime/mmdeploy that referenced this issue Nov 25, 2022
* update mmengine and mmdet version

* update