How to "finish" raw inference output (with respect to anchors), to get the bounding boxes #6136
Comments
Hi @glenn-jocher, wish you a happy new year! This is an update on my experiment.

The raw inference output is: torch.Size([1, 25200, 85])

The result is an instance of the Detections class, but the result is wrong. For example, detections_instance.print() gives:

image 1/1: 1 person, 1 car, 18 traffic lights, 1 tie, 46 sports balls, 1 bottle, 1 cup, 30 bowls, 1 banana, 13 apples, 9 broccolis, 24 dining tables, 3 mouses, 58 clocks

The forward() of the AutoShape class was modified so that imgs is an image tensor (img_tensor) of size torch.Size([1, 3, 640, 640]).

Best regards,
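(As an aside, assuming the default YOLOv5 strides of 8/16/32 and 3 anchors per grid cell, the 25200 rows in that shape are exactly the total anchor count for a 640×640 input; a quick sanity check:)

```python
# Sanity check (assumption: default strides 8/16/32, 3 anchors per cell).
strides = [8, 16, 32]
anchors_per_cell = 3
img_size = 640
total = sum(anchors_per_cell * (img_size // s) ** 2 for s in strides)
print(total)  # 25200 = 3 * (80*80 + 40*40 + 20*20)
```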
@hamedmh 👋 Hello! Thanks for asking about handling inference results. YOLOv5 🚀 PyTorch Hub models allow for simple model loading and inference in a Python environment.

Simple Inference Example

This example loads a pretrained YOLOv5s model from PyTorch Hub as model and passes an image for inference:

import torch
# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s') # or yolov5m, yolov5l, yolov5x, custom
# Images
img = 'https://ultralytics.com/images/zidane.jpg' # or file, Path, PIL, OpenCV, numpy, list
# Inference
results = model(img)
# Results
results.print()  # or .show(), .save(), .crop(), .pandas(), etc.
results.pandas().xyxy[0]
# xmin ymin xmax ymax confidence class name
# 0 749.50 43.50 1148.0 704.5 0.874023 0 person
# 1 433.50 433.50 517.5 714.5 0.687988 27 tie
# 2 114.75 195.75 1095.0 708.0 0.624512 0 person
# 3 986.00 304.00 1028.0 420.0 0.286865 27 tie

See the YOLOv5 PyTorch Hub Tutorial for details. Good luck 🍀 and let us know if you have any other questions!
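The DataFrame returned by results.pandas().xyxy[0] can be filtered with ordinary pandas operations. A small sketch using the rows from the output above (the data here is copied from that example, not freshly generated):

```python
import pandas as pd

# Rows copied from the example output above
df = pd.DataFrame(
    [
        [749.50, 43.50, 1148.0, 704.5, 0.874023, 0, 'person'],
        [433.50, 433.50, 517.5, 714.5, 0.687988, 27, 'tie'],
        [114.75, 195.75, 1095.0, 708.0, 0.624512, 0, 'person'],
        [986.00, 304.00, 1028.0, 420.0, 0.286865, 27, 'tie'],
    ],
    columns=['xmin', 'ymin', 'xmax', 'ymax', 'confidence', 'class', 'name'],
)

# Keep confident 'person' detections only
people = df[(df['name'] == 'person') & (df['confidence'] > 0.5)]
print(len(people))  # 2
```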
Hi @glenn-jocher, thank you for your answer! I use

model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

and get a Detections instance as an output (now that I have modified forward() to take a single image tensor as an input, as explained in my post above).

Best regards,
@hamedmh good news 😃! Your original issue may now be fixed ✅ in PR #6195. This PR adds support for YOLOv5 CoreML inference.

!python export.py --weights yolov5s.pt --include coreml  # CoreML export
!python detect.py --weights yolov5s.mlmodel # CoreML inference (MacOS-only)
!python val.py --weights yolov5s.mlmodel  # CoreML validation (MacOS-only)
model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s.mlmodel')  # CoreML PyTorch Hub model

To receive this update:
Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
@glenn-jocher Thank you for the update! I'll test it.
Hi @glenn-jocher, Now I have a more specific question regarding the same issue (post processing of predictions).
However, when I print the result:
My question is: How to understand and visualize this "raw" result? We have only one class (0). All the best, |
@hamedmh detect.py inference with trained weights is simple:
Hi there, I've got the same question as you, and I haven't found a solution. The output is three tensors; how can I convert them to bounding boxes?
@DeepLearnerYe you can post-process the raw model output using the non_max_suppression function:

import torch
from models.yolo import Model
from utils.general import non_max_suppression
# Load the model
model = Model('path/to/yolov5s.yaml', ch=3, nc=80)  # replace with actual config and class count
ckpt = torch.load('path/to/checkpoint.pt')  # replace with actual checkpoint path
model.load_state_dict(ckpt['model'].float().state_dict())  # YOLOv5 checkpoints store a full model, not a state_dict
# Perform inference
model.eval()  # in eval mode forward() returns (inference_output, raw_head_outputs)
img = torch.randn(1, 3, 640, 640)  # replace with actual input image
pred = model(img)[0]  # take the concatenated (1, 25200, 85) inference output
# Post-process the output
pred = non_max_suppression(pred, conf_thres=0.4, iou_thres=0.5)
# Output the bounding boxes
print(pred)

Please replace the placeholder paths and input image with your actual data. Let me know if you encounter any more issues!
Hi @glenn-jocher,
I use a "pure" Yolov5s model which outputs three tensors, such as: torch.Size([1, 3, 48, 80, 85]) , torch.Size([1, 3, 24, 40, 85]) , and torch.Size([1, 3, 12, 20, 85]).
I would like to convert them to bounding boxes.
I need to know which functions or equations can be used to get the bounding boxes.
Thanks!
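For what it's worth, the conversion YOLOv5 applies in Detect.forward() (models/yolo.py) is, per anchor and grid cell: xy = (2·sigmoid(t_xy) − 0.5 + grid) · stride, and wh = (2·sigmoid(t_wh))² · anchor. A minimal NumPy sketch of that decode step for a single head; the shapes match the stride-8 tensor above, but the anchor values here are illustrative assumptions, not the trained ones:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_head(raw, anchors, stride):
    """Decode one YOLOv5 head output of shape (n_anchors, ny, nx, 5 + n_classes)
    into pixel-space (x, y, w, h, obj, cls...) rows.
    Sketch of the math in Detect.forward(); not the library function itself."""
    na, ny, nx, _ = raw.shape
    y = sigmoid(raw)
    gx, gy = np.meshgrid(np.arange(nx), np.arange(ny))   # cell indices
    grid = np.stack((gx, gy), axis=-1)                   # (ny, nx, 2)
    xy = (y[..., 0:2] * 2.0 - 0.5 + grid) * stride       # box centers, pixels
    wh = (y[..., 2:4] * 2.0) ** 2 * anchors.reshape(na, 1, 1, 2)  # box sizes
    return np.concatenate((xy, wh, y[..., 4:]), axis=-1)

# Illustrative anchors (NOT the trained values) for the stride-8 head
anchors = np.array([[10.0, 13.0], [16.0, 30.0], [33.0, 23.0]])
boxes = decode_head(np.zeros((3, 48, 80, 85)), anchors, stride=8)
print(boxes.shape)  # (3, 48, 80, 85)
```

After decoding all three heads this way, flatten and concatenate the results and run NMS (e.g. utils.general.non_max_suppression) to obtain the final boxes.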
The original issue:
@Kieran31 see the PyTorch Hub tutorial for full inference examples on trained custom models.

Simple Example

This example loads a pretrained YOLOv5s model from PyTorch Hub as model and passes an image for inference. 'yolov5s' is the lightest and fastest YOLOv5 model. For details on all available models please see the README.

YOLOv5 Tutorials
Originally posted by @glenn-jocher in #5304 (comment)