Quantized model INT8 is not able to do inference or convert to any other type of model structure #9979
Comments
👋 Hello! Thanks for asking about YOLOv5 🚀 benchmarks. YOLOv5 inference is officially supported in 11 formats, and all formats are benchmarked for identical accuracy and to compare speed every 24 hours by the YOLOv5 CI. 💡 ProTip: Export to ONNX or OpenVINO for up to 3x CPU speedup. See CPU Benchmarks.
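For reference, a minimal sketch of that export (assuming the standard export.py interface and a yolov5s.pt checkpoint in the repo root):

```bash
# Export to ONNX and OpenVINO for CPU inference (formats selected via --include)
python export.py --weights yolov5s.pt --include onnx openvino --imgsz 640
```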
Benchmarks

Benchmarks below run on a Colab Pro with the YOLOv5 tutorial notebook. To reproduce:

python utils/benchmarks.py --weights yolov5s.pt --imgsz 640 --device 0

[Benchmark tables for Colab Pro V100 GPU and Colab Pro CPU omitted]
Good luck 🍀 and let us know if you have any other questions!
@glenn-jocher, my intent was to understand the know-how of converting the model to quantized INT8 precision and running inference in INT8 precision mode.
@Sanath1998 It depends on the export format; some formats already support this, see the export.py code for details.
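As an illustration of an export path that already has INT8 support, a sketch assuming the standard export.py flags (the TFLite exporter accepts --int8 for post-training quantization):

```bash
# TFLite export with INT8 quantization (sketch; flags as defined in export.py)
python export.py --weights yolov5s.pt --include tflite --int8
```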
@glenn-jocher
Many formats support FP16 with the --half flag.
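For example, a sketch of an FP16 export (assuming, as in the standard export.py, that --half requires a CUDA device):

```bash
# TensorRT engine export in FP16 (sketch; --half needs a GPU, hence --device 0)
python export.py --weights yolov5s.pt --include engine --half --device 0
```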
@glenn-jocher Actually, the model size is the same before and after using the --half flag.
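One quick way to verify whether the exported file actually shrank is to compare sizes on disk (sketch; the file names below are hypothetical):

```python
from pathlib import Path

# Compare on-disk sizes of the FP32 and FP16 exports (hypothetical file names)
for name in ['yolov5s.onnx', 'yolov5s_half.onnx']:
    p = Path(name)
    if p.exists():
        print(f'{p.name}: {p.stat().st_size / 1e6:.1f} MB')
```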
👋 Hi, thanks for letting us know about this possible problem with YOLOv5 🚀. Not all formats support --half and --int8; see the export.py code for details.

We've created a few short guidelines below to help users provide what we need in order to start investigating a possible problem.

How to create a Minimal, Reproducible Example

When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:
For Ultralytics to provide assistance, your code should also be:
If you believe your problem meets all the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem. Thank you! 😃
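As a concrete illustration of such a minimum reproducible example, a sketch along these lines already reproduces the loader error reported in this issue (the checkpoint name is hypothetical):

```python
import torch
from collections import OrderedDict

# A checkpoint whose 'model' entry is a bare state_dict (OrderedDict) is enough
# to trigger the same failure that detect.py / val.py report:
torch.save({'model': OrderedDict(weight=torch.zeros(1))}, 'int8_ckpt.pt')
ckpt = torch.load('int8_ckpt.pt', map_location='cpu')
(ckpt.get('ema') or ckpt['model']).to('cpu')  # AttributeError: 'OrderedDict' object has no attribute 'to'
```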
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs. Access additional YOLOv5 🚀 resources:
Access additional Ultralytics ⚡ resources:
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed! Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!
Search before asking
Question
Hi @glenn-jocher
After quantizing the FLOAT32 model to INT8, I'm not able to convert the model to any other format, nor able to run inference using detect.py and val.py.
Error messages:
ckpt = (ckpt.get('ema') or ckpt['model']).to(device).float() # FP32 model
AttributeError: 'collections.OrderedDict' object has no attribute 'to'
[TRT] [E] ModelImporter.cpp:779: ERROR: images:232 In function importInput:
[8] Assertion failed: convertDtype(onnxDtype.elem_type(), &trtDtype) && "Failed to convert ONNX date type to TensorRT data type."
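The first traceback suggests the quantized checkpoint stores only a state_dict: ckpt['model'] is then an OrderedDict, which has no .to() method, whereas the loaders used by detect.py and val.py expect a full nn.Module under 'model' (or 'ema'). A minimal sketch of saving the whole module instead, using a toy dynamically quantized model as a stand-in for the real YOLOv5 quantization pipeline:

```python
import torch
import torch.nn as nn

# Toy stand-in for the FP32 model; replace with the actual YOLOv5 module.
float_model = nn.Sequential(nn.Linear(8, 8))
model_int8 = torch.quantization.quantize_dynamic(  # example INT8 dynamic quantization
    float_model, {nn.Linear}, dtype=torch.qint8
)

# Save the whole module, not just its state_dict, so ckpt['model'] is an nn.Module
# that the loaders can call .to() on:
torch.save({'model': model_int8}, 'toy_int8.pt')
```

Note that the loaders also call .float() on the loaded module, which assumes FP32-capable weights, so a genuinely quantized INT8 model may still need a custom loading and inference path; the TensorRT error likewise indicates the ONNX graph carries an input data type the importer cannot map.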
@glenn-jocher Can you please look into this? Looking forward to your reply.
Additional
No response