Quantized model INT8 is not able to do inference or convert to any other type of model structure #9979
Comments
👋 Hello! Thanks for asking about YOLOv5 🚀 benchmarks. YOLOv5 inference is officially supported in 11 formats, and all formats are benchmarked for identical accuracy and to compare speed every 24 hours by the YOLOv5 CI. 💡 ProTip: Export to ONNX or OpenVINO for up to 3x CPU speedup. See CPU Benchmarks.
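For reference, a minimal sketch of that export (assuming the standard export.py interface and a yolov5s.pt checkpoint in the repo root):

```bash
# Export to ONNX and OpenVINO for CPU inference (formats selected via --include)
python export.py --weights yolov5s.pt --include onnx openvino --imgsz 640
```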
Benchmarks

Benchmarks below run on a Colab Pro with the YOLOv5 tutorial notebook. To reproduce:

python utils/benchmarks.py --weights yolov5s.pt --imgsz 640 --device 0

[Benchmark tables for Colab Pro V100 GPU and Colab Pro CPU omitted]
Good luck 🍀 and let us know if you have any other questions!
@glenn-jocher, my intent was to understand the know-how of converting the model to quantized INT8 precision and running inference in INT8 precision mode.
@Sanath1998 It depends on the export format; some formats already support this, see the export.py code for details.
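As an illustration of an export path that already has INT8 support, a sketch assuming the standard export.py flags (the TFLite exporter accepts --int8 for post-training quantization):

```bash
# TFLite export with INT8 quantization (sketch; flags as defined in export.py)
python export.py --weights yolov5s.pt --include tflite --int8
```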
@glenn-jocher
Many formats support FP16 with the --half flag.
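For example, a sketch of an FP16 export (assuming, as in the standard export.py, that --half requires a CUDA device):

```bash
# TensorRT engine export in FP16 (sketch; --half needs a GPU, hence --device 0)
python export.py --weights yolov5s.pt --include engine --half --device 0
```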
@glenn-jocher Actually, the model size is the same before and after using the --half flag.
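One quick way to verify whether the exported file actually shrank is to compare sizes on disk (sketch; the file names below are hypothetical):

```python
from pathlib import Path

# Compare on-disk sizes of the FP32 and FP16 exports (hypothetical file names)
for name in ['yolov5s.onnx', 'yolov5s_half.onnx']:
    p = Path(name)
    if p.exists():
        print(f'{p.name}: {p.stat().st_size / 1e6:.1f} MB')
```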
👋 Hi, thanks for letting us know about this possible problem with YOLOv5 🚀. Not all formats support --half and --int8; see the export.py code for details.

We've created a few short guidelines below to help users provide what we need in order to start investigating a possible problem.

How to create a Minimal, Reproducible Example

When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:
For Ultralytics to provide assistance, your code should also be:
If you believe your problem meets all the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem. Thank you! 😃
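As a concrete illustration of such a minimum reproducible example, a sketch along these lines already reproduces the loader error reported in this issue (the checkpoint name is hypothetical):

```python
import torch
from collections import OrderedDict

# A checkpoint whose 'model' entry is a bare state_dict (OrderedDict) is enough
# to trigger the same failure that detect.py / val.py report:
torch.save({'model': OrderedDict(weight=torch.zeros(1))}, 'int8_ckpt.pt')
ckpt = torch.load('int8_ckpt.pt', map_location='cpu')
(ckpt.get('ema') or ckpt['model']).to('cpu')  # AttributeError: 'OrderedDict' object has no attribute 'to'
```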
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs. Access additional YOLOv5 🚀 resources:
Access additional Ultralytics ⚡ resources:
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed! Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!
Search before asking
Question
Hi @glenn-jocher
After quantizing the FLOAT32 model to INT8, I'm not able to convert the model to any other format, nor able to run inference using detect.py and val.py.
Error messages:
ckpt = (ckpt.get('ema') or ckpt['model']).to(device).float() # FP32 model
AttributeError: 'collections.OrderedDict' object has no attribute 'to'
[TRT] [E] ModelImporter.cpp:779: ERROR: images:232 In function importInput:
[8] Assertion failed: convertDtype(onnxDtype.elem_type(), &trtDtype) && "Failed to convert ONNX date type to TensorRT data type."
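The first traceback suggests the quantized checkpoint stores only a state_dict: ckpt['model'] is then an OrderedDict, which has no .to() method, whereas the loaders used by detect.py and val.py expect a full nn.Module under 'model' (or 'ema'). A minimal sketch of saving the whole module instead, using a toy dynamically quantized model as a stand-in for the real YOLOv5 quantization pipeline:

```python
import torch
import torch.nn as nn

# Toy stand-in for the FP32 model; replace with the actual YOLOv5 module.
float_model = nn.Sequential(nn.Linear(8, 8))
model_int8 = torch.quantization.quantize_dynamic(  # example INT8 dynamic quantization
    float_model, {nn.Linear}, dtype=torch.qint8
)

# Save the whole module, not just its state_dict, so ckpt['model'] is an nn.Module
# that the loaders can call .to() on:
torch.save({'model': model_int8}, 'toy_int8.pt')
```

Note that the loaders also call .float() on the loaded module, which assumes FP32-capable weights, so a genuinely quantized INT8 model may still need a custom loading and inference path; the TensorRT error likewise indicates the ONNX graph carries an input data type the importer cannot map.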
@glenn-jocher Can you please look into this? Looking forward to your reply.
Additional
No response