-
-
Notifications
You must be signed in to change notification settings - Fork 16.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Object Detection MLModel for iOS with output configuration of confidence scores & coordinates for the bounding box. #535
Comments
Hello @ajurav1, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook , Docker Image, and Google Cloud Quickstart Guide for example environments. If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you. If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:
For more information please visit https://www.ultralytics.com. |
Hi @ajurav1 Those tensors store network predictions that are not decoded. There are a few recommendations I made in #343 to decode these and there is even a NumPy sample code that does this for the ONNX model. There's also guidance here: https://docs.ultralytics.com/yolov5/tutorials/model_export Take a look in there and let me know if you require further support. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Hi @dlawrences , |
@ShreshthSaxena , there is a lot of useful information which you will be able to use on this blog: https://machinethink.net/blog/ I recommend acquiring the book as well. |
Hi @ajurav1 and @ShreshthSaxena, have you managed to convert |
@dlawrences In the article, the author says that the "Ultralytics YOLOv5 model has a Core ML version but it requires changes before you can use it with Vision." Did you manage to make it work with Vision? |
@kir486680 the CoreML model, as exported from this repository, presents the final feature maps, thus the predictions are not decoded, nor any NMS is applied. I have been able to create the required steps locally, yes, and it works directly with Vision. |
@dlawrences could you please share at least some clues pls? I implemented NMS but I do not know what to do with the output from the model. I seen some implementations here . Do you have something similar? |
@kir486680 @dlawrences any update on this? |
If you come across this issue, there's a script here for creating a CoreML model which outputs the expected values https://github.com/Workshed/yolov5-coreml-export |
@Workshed thanks for sharing your script for creating a CoreML model that outputs the expected values with YOLOv5. This will be helpful for others who are looking to work with YOLOv5 in CoreML. Keep up the great work! |
I have exported the mlmodel from export.py, and the model exported have a type of Neural Network and output configuration is
[<VNCoreMLFeatureValueObservation: 0x281d80240> 4702BA0E-857D-4CE6-88C1-4E47186E751F requestRevision=1 confidence=1.000000 "2308" - "MultiArray : Float32 1 x 3 x 20 x 20 x 85 array" (1.000000), <VNCoreMLFeatureValueObservation: 0x281d802d0> AE0A0580-7DA2-4991-98BB-CD26EE257C7A requestRevision=1 confidence=1.000000 "2327" - "MultiArray : Float32 1 x 3 x 40 x 40 x 85 array" (1.000000), <VNCoreMLFeatureValueObservation: 0x281d80330> 0253FD2B-10B0-4047-A001-624D1864D27C requestRevision=1 confidence=1.000000 "2346" - "MultiArray : Float32 1 x 3 x 80 x 80 x 85 array" (1.000000)]
I was looking for output type VNRecognizedObjectObservation in yoloV5 instead of VNCoreMLFeatureValueObservation.
So, my question is what information does this VNCoreMLFeatureValueObservation MultiArray hold (is it something like a UIImage or CGRect?, or something different?) and how can I convert this Multidimensional Array into a useful set of data that I can actually see confidence scores & coordinates?
The text was updated successfully, but these errors were encountered: