
Classification model resnet18_8xb32_in1k convert failed. #201

Closed · del-zhenwu opened this issue Mar 2, 2022 · 7 comments

@del-zhenwu (Collaborator)

Thanks for your bug report. We appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug

2022-03-02 11:35:10 [ WARNING] 2022-03-02 11:34:15,181 - mmdeploy - INFO - torch2onnx start.
2022-03-02:11:34:24,matplotlib.font_manager INFO     [font_manager.py:1443] generated new fontManager
/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /opt/conda/conda-bld/pytorch_1623448224956/work/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
2022-03-02 11:34:41,338 - mmdeploy - INFO - torch2onnx success.
2022-03-02 11:34:42,115 - mmdeploy - INFO - onnx2tensorrt of resnet18_8xb32_in1k_20210831-fbbb1da6.pth/end2end.onnx start.
2022-03-02 11:34:45,018 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /opt/mmdeploy/build/lib/libmmdeploy_tensorrt_ops.so
[TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +226, GPU +0, now: CPU 288, GPU 481 (MiB)
[TensorRT] WARNING: onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] INFO: [MemUsageSnapshot] Builder begin: CPU 376 MiB, GPU 481 MiB
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +158, GPU +68, now: CPU 534, GPU 549 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +132, GPU +86, now: CPU 666, GPU 635 (MiB)
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.2.1 but loaded cuDNN 8.2.0
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] INFO: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 976, GPU 743 (MiB)
[TensorRT] ERROR: 2: [ltWrapper.cpp::setupHeuristic::327] Error Code 2: Internal Error (Assertion cublasStatus == CUBLAS_STATUS_SUCCESS failed.)
2022-03-02:11:35:08,root ERROR    [utils.py:43] Failed to create TensorRT engine
2022-03-02 11:35:10,113 - mmdeploy - ERROR - onnx2tensorrt of resnet18_8xb32_in1k_20210831-fbbb1da6.pth/end2end.onnx failed.
 (util.py:25)
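For triage, logs in the format above make it easy to tell which pipeline stage broke (here torch2onnx succeeded and onnx2tensorrt failed on a cuBLAS assertion). Below is a minimal, stdlib-only sketch of such a log scan; `failed_stage` is a hypothetical helper written for this issue's log layout, not part of mmdeploy.

```python
import re

def failed_stage(log_text):
    """Return (stage, trt_errors): the first mmdeploy pipeline stage that
    logged a '... failed.' line, plus any [TensorRT] ERROR lines seen.
    Triage helper for logs shaped like the one above; not part of mmdeploy."""
    stage = None
    trt_errors = []
    for line in log_text.splitlines():
        if "[TensorRT] ERROR" in line:
            trt_errors.append(line.strip())
        m = re.search(r"mmdeploy - ERROR - (\w+) of .* failed", line)
        if m and stage is None:
            stage = m.group(1)
    return stage, trt_errors

# Condensed excerpt of the log from this issue:
log = """\
2022-03-02 11:34:41,338 - mmdeploy - INFO - torch2onnx success.
[TensorRT] ERROR: 2: [ltWrapper.cpp::setupHeuristic::327] Error Code 2: Internal Error (Assertion cublasStatus == CUBLAS_STATUS_SUCCESS failed.)
2022-03-02 11:35:10,113 - mmdeploy - ERROR - onnx2tensorrt of resnet18_8xb32_in1k_20210831-fbbb1da6.pth/end2end.onnx failed.
"""
stage, errors = failed_stage(log)
print(stage)  # -> onnx2tensorrt
```

On this log it correctly isolates the onnx2tensorrt stage and the underlying `CUBLAS_STATUS_SUCCESS` assertion, which is the line to search for when comparing environments.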

Reproduction

  1. What command or script did you run?
A placeholder for the command.
  2. Did you make any modifications on the code or config? Did you understand what you have modified?

Environment

2022-03-02 11:47:56,295 - mmdeploy - INFO - **********Environmental information**********
2022-03-02 11:47:58,692 - mmdeploy - INFO - sys.platform: linux
2022-03-02 11:47:58,693 - mmdeploy - INFO - Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
2022-03-02 11:47:58,693 - mmdeploy - INFO - CUDA available: True
2022-03-02 11:47:58,693 - mmdeploy - INFO - GPU 0: Tesla V100-PCIE-16GB
2022-03-02 11:47:58,693 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
2022-03-02 11:47:58,693 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 10.2, V10.2.89
2022-03-02 11:47:58,693 - mmdeploy - INFO - GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
2022-03-02 11:47:58,693 - mmdeploy - INFO - PyTorch: 1.9.0
2022-03-02 11:47:58,694 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.2-Product Build 20210312 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.2
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

2022-03-02 11:47:58,694 - mmdeploy - INFO - TorchVision: 0.10.0
2022-03-02 11:47:58,694 - mmdeploy - INFO - OpenCV: 4.5.4
2022-03-02 11:47:58,694 - mmdeploy - INFO - MMCV: 1.4.0
2022-03-02 11:47:58,694 - mmdeploy - INFO - MMCV Compiler: GCC 7.5
2022-03-02 11:47:58,694 - mmdeploy - INFO - MMCV CUDA Compiler: 10.2
2022-03-02 11:47:58,694 - mmdeploy - INFO - MMDeployment: 0.3.0+34879e6
2022-03-02 11:47:58,695 - mmdeploy - INFO - 

2022-03-02 11:47:58,695 - mmdeploy - INFO - **********Backend information**********
[2022-03-02 11:47:59.106] [mmdeploy] [info] Register 'DirectoryModel'
2022-03-02 11:47:59,141 - mmdeploy - INFO - onnxruntime: 1.10.0 ops_is_avaliable : True
2022-03-02 11:47:59,142 - mmdeploy - INFO - tensorrt: 8.0.3.4 ops_is_avaliable : True
2022-03-02 11:47:59,144 - mmdeploy - INFO - ncnn: None ops_is_avaliable : False
2022-03-02 11:47:59,146 - mmdeploy - INFO - pplnn_is_avaliable: False
2022-03-02 11:47:59,149 - mmdeploy - INFO - openvino_is_avaliable: True
2022-03-02 11:47:59,149 - mmdeploy - INFO - 

2022-03-02 11:47:59,149 - mmdeploy - INFO - **********Codebase information**********
2022-03-02 11:47:59,151 - mmdeploy - INFO - mmcls: 0.19.0
2022-03-02 11:47:59,152 - mmdeploy - INFO - mmdet: 2.20.0
2022-03-02 11:47:59,153 - mmdeploy - INFO - mmedit: 0.12.0
2022-03-02 11:47:59,155 - mmdeploy - INFO - mmocr: 0.4.1
2022-03-02 11:47:59,156 - mmdeploy - INFO - mmseg: 0.21.1

Error traceback

If applicable, paste the error traceback here.

A placeholder for traceback.

Bug fix

If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

@del-zhenwu (Collaborator, Author)

The same failure occurs with configs/seresnet/seresnet50_8xb32_in1k.py in Classification.

@RunningLeon RunningLeon self-assigned this Mar 2, 2022
@RunningLeon (Collaborator)

OK, we'll test it.

@RunningLeon (Collaborator)

@del-zhenwu Tested OK on my machine. Maybe something is wrong with the environment.

[TensorRT] ERROR: 2: [ltWrapper.cpp::setupHeuristic::327] Error Code 2: Internal Error (Assertion cublasStatus == CUBLAS_STATUS_SUCCESS failed.)

@RunningLeon (Collaborator)

python tools/deploy.py \
    configs/mmcls/classification_tensorrt_dynamic-224x224-224x224.py \
    ../mmclassification/configs/seresnet/seresnet50_8xb32_in1k.py \
    ../mmclassification/checkpoints/se-resnet50_batch256_imagenet_20200804-ae206104.pth \
    ../mmdetection/demo/demo.jpg \
    --work-dir ./work-dirs/mmcls/seresnet \
    --show \
    --device cuda
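For repeated conversion testing, the invocation above can be driven from a script. This is a sketch using `subprocess`; the paths are the ones quoted in this thread and assume the command is run from the mmdeploy repo root on a machine with TensorRT set up.

```python
import subprocess

# Argument list mirroring the tools/deploy.py command above;
# adjust the relative paths to your checkout layout before running.
cmd = [
    "python", "tools/deploy.py",
    "configs/mmcls/classification_tensorrt_dynamic-224x224-224x224.py",
    "../mmclassification/configs/seresnet/seresnet50_8xb32_in1k.py",
    "../mmclassification/checkpoints/se-resnet50_batch256_imagenet_20200804-ae206104.pth",
    "../mmdetection/demo/demo.jpg",
    "--work-dir", "./work-dirs/mmcls/seresnet",
    "--device", "cuda",
]
# "--show" is omitted here so the sketch also works on headless machines;
# append it to cmd to reproduce the command exactly.
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment to actually run the conversion
```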

@del-zhenwu (Collaborator, Author)

Tried with configs/mmcls/classification_tensorrt_dynamic-224x224-224x224.py and ../mmclassification/configs/seresnet/seresnet50_8xb32_in1k.py; the conversion still failed.

@RunningLeon (Collaborator)

@del-zhenwu Do you still have this issue?

@RunningLeon (Collaborator)

Fixed in #215.

hanrui1sensetime pushed a commit to hanrui1sensetime/mmdeploy that referenced this issue Nov 25, 2022
* update mmengine and mmdet version

* update