# Question about training img size and postprocessing #14182
Hi, when I change the training image size, what does that mean for the postprocessing? Most AI chip companies in China do quantization in their own way to fit their chips, and the postprocessing layer is not friendly to their quantization, so they require the postprocessing layer to be removed. Thanks
---
@kronee0516 hi there,

Great questions! Let me address them one by one:

### Training Image Size

When you change the `imgsz` training argument, you change the resolution that images are resized to before being fed to the network. In YOLOv5 and YOLOv8, the model's input size is indeed set by the `imgsz` parameter. Regarding your specific use case with a camera capturing images at 768x384, you can indeed train a model with this input size to avoid rescaling. Simply set the `imgsz` argument accordingly when training, as sketched below.
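For instance, a minimal sketch (the checkpoint name and dataset YAML path are placeholders; `imgsz` sets the longer image side, and `rect=True` is one way to train on non-square images without padding them all the way to a square):

```python
from ultralytics import YOLO

# Start from a pretrained checkpoint (placeholder name)
model = YOLO('yolov8n.pt')

# Train at the camera's native long side; 'data.yaml' is a placeholder path
model.train(data='data.yaml', imgsz=768, rect=True, epochs=100)
```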
### Postprocessing and ONNX Export

For the postprocessing layer, it's common for AI chip companies to require custom quantization methods. To export a YOLO model to ONNX without the postprocessing layer, you can use the `nms=False` argument of `model.export()`:

```python
from ultralytics import YOLO

# Load your trained model
model = YOLO('path/to/your/model.pt')

# Export the model to ONNX without postprocessing
model.export(format='onnx', simplify=True, dynamic=True, opset=12, nms=False)
```

In this example:

- `format='onnx'` selects the ONNX export format.
- `simplify=True` simplifies the exported graph.
- `dynamic=True` enables dynamic input shapes.
- `opset=12` pins the ONNX opset version for toolchain compatibility.
- `nms=False` keeps non-maximum suppression out of the exported graph, leaving postprocessing to your own pipeline.
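Once exported this way, the ONNX graph ends at the raw prediction tensor, so decoding and NMS become your (or the chip vendor's) responsibility. As a quick sanity check before quantization, a minimal sketch with `onnxruntime` (the model path is a placeholder, and the dummy array stands in for a letterboxed, 0-1 normalized frame):

```python
import numpy as np
import onnxruntime as ort

# Load the exported model (placeholder path)
session = ort.InferenceSession('path/to/your/model.onnx', providers=['CPUExecutionProvider'])
input_name = session.get_inputs()[0].name

# Dummy NCHW float32 input in place of a real preprocessed image
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)

# With nms=False the graph produces a single raw output tensor
(pred,) = session.run(None, {input_name: dummy})
print(pred.shape)  # e.g. (1, 84, 8400) for an 80-class model
```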
For more detailed guidance, you can refer to our Model Export Documentation.

### Additional Resources

For further insights and tips on model training, you might find our Model Training Tips Guide helpful. If you encounter any issues or have further questions, please provide a reproducible example to help us assist you better. You can find more information on creating a minimum reproducible example here. Hope this helps! 😊
---
Hi @kronee0516,
I'm glad to hear that the previous response was helpful! Let's dive into your additional questions regarding the export and output structure of YOLOv8.
### YOLOv8 Output Structure
In YOLOv8, the output structure has been streamlined compared to YOLOv5. Instead of having multiple outputs for different strides, YOLOv8 consolidates the outputs into a single tensor. This tensor has the shape `(1, n+4, 8400)`, where:

- `n` is the number of classes.
- `4` represents the bounding box coordinates (x, y, width, height).
- `8400` is the total number of predictions, which is a result of the feature map sizes and strides combined (for a 640x640 input: 80x80 + 40x40 + 20x20 = 8400).
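To make the layout concrete, here is a rough NumPy sketch of how host-side code might unpack that tensor (names and thresholds are illustrative; boxes come out in input-image coordinates, and a standard NMS step, e.g. `cv2.dnn.NMSBoxes`, should follow):

```python
import numpy as np

def decode_predictions(pred, conf_thres=0.25):
    """Unpack a raw YOLOv8 output of shape (1, n+4, 8400)."""
    p = pred[0].T                      # -> (8400, n+4)
    boxes_xywh = p[:, :4]              # cx, cy, w, h
    scores = p[:, 4:]                  # per-class scores, shape (8400, n)

    class_ids = scores.argmax(axis=1)
    confidences = scores.max(axis=1)
    keep = confidences > conf_thres
    boxes_xywh = boxes_xywh[keep]

    # Convert cx, cy, w, h -> x1, y1, x2, y2
    boxes = np.empty_like(boxes_xywh)
    boxes[:, 0] = boxes_xywh[:, 0] - boxes_xywh[:, 2] / 2
    boxes[:, 1] = boxes_xywh[:, 1] - boxes_xywh[:, 3] / 2
    boxes[:, 2] = boxes_xywh[:, 0] + boxes_xywh[:, 2] / 2
    boxes[:, 3] = boxes_xywh[:, 1] + boxes_xywh[:, 3] / 2

    return boxes, class_ids[keep], confidences[keep]
```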
### Strides and Anchor Boxes

In YOLOv8, the concept of anchor boxes has been removed: the detection head is anchor-free, predicting box centers and sizes directly at each feature-map cell across the three stride levels (8, 16, and 32).
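To see where the 8,400 predictions come from, a quick check (assuming the default 640x640 input and strides of 8, 16, and 32):

```python
imgsz = 640
strides = (8, 16, 32)
cells = [(imgsz // s) ** 2 for s in strides]  # [6400, 1600, 400]
print(sum(cells))                             # 8400 predictions, one per cell
```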