
YoloV3 Tiny PRN Training #4858

Open
PaserSRL opened this issue Feb 12, 2020 · 12 comments
@PaserSRL

PaserSRL commented Feb 12, 2020

Hi all,
I'm going crazy trying to train YoloV3 Tiny PRN. I've read everything I could find about it, and I'm trying to merge all that information into a script that prepares an environment for training the model on the required objects.

First of all, YoloV3 Tiny PRN is really good, but I would like to remove objects I don't need (tie, pen, bottle...), so I created a script that executes the following steps:

./script --environment-name name --allowed-objects car,person,dog...

  1. Create a directory with a precise structure:
    ./environment-name/cfg/
    ./environment-name/dataset/images/train2014/
    ./environment-name/dataset/images/val2014/
    ./environment-name/dataset/labels/train2014/
    ./environment-name/dataset/labels/val2014/
    ./environment-name/output/

  2. Copy the COCO 2014 dataset from the source into my environment directory:
    coco_source (train images) -> ./environment-name/dataset/images/train2014/
    coco_source (val images) -> ./environment-name/dataset/images/val2014/

  3. Label index correction
    My goal is to remove unneeded objects from detection, so it is necessary to generate a new objects.names file in which each object gets a new index, remapped from the original COCO index according to my script parameters.
    So if the original is:
    0 - person
    1 - bicycle
    2 - car
    3 - motorbike
    and my allowed objects are:
    0 - person
    1 - car
    I need to rewrite the COCO labels with the new index for each object (car must go from index 2 to index 1) and remove all box coordinates belonging to objects that are not required.
    My script copies the labels from the source, edits them, and places them in:
    ./environment-name/dataset/labels/train2014/
    ./environment-name/dataset/labels/val2014/
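The remapping in step 3 can be sketched in Python (assuming Yolo-format label files where each line is `class_id x_center y_center width height`; the `old_to_new` mapping below is a hypothetical example keeping only person and car):

```python
# Remap COCO class indices to a reduced class set, dropping all other boxes.
# Hypothetical mapping: keep person (0) and car (2), renumbered to 0 and 1.
old_to_new = {0: 0, 2: 1}

def remap_label_file(text):
    """Rewrite one Yolo label file: keep only allowed classes, renumber them."""
    out_lines = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        cls = int(parts[0])
        if cls in old_to_new:  # boxes of other classes are dropped entirely
            out_lines.append(" ".join([str(old_to_new[cls])] + parts[1:]))
    return "\n".join(out_lines)
```

Running this over every file under dataset/labels/train2014/ and dataset/labels/val2014/ would produce labels consistent with the new objects.names.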

  4. Create objects.names file in ./environment-name/objects.names

  5. Create model.data file in ./environment-name/model.data with content:
    classes=$(number_of_classes)
    train = ./environment-name/list.txt
    valid = ./environment-name/val.txt
    names = ./environment-name/objects.names
    backup = ./environment-name/output

  6. Create a file list.txt with all images available in ./environment-name/dataset/images/train2014/

  7. Create a file val.txt with all images available in ./environment-name/dataset/images/val2014/
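Steps 6 and 7 can be sketched as a small helper (a minimal sketch; the function name and extension filter are my own choices):

```python
import os

def write_image_list(images_dir, list_path, exts=(".jpg", ".jpeg", ".png")):
    """Write one image path per line, as darknet's train/valid list files expect."""
    with open(list_path, "w") as f:
        for name in sorted(os.listdir(images_dir)):
            if name.lower().endswith(exts):
                f.write(os.path.join(os.path.abspath(images_dir), name) + "\n")
```

Called once for dataset/images/train2014/ (producing list.txt) and once for dataset/images/val2014/ (producing val.txt).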

  8. Copy the source yolov3-tiny-prn.cfg to ./environment-name/cfg/model.cfg and edit it following these rules:

  • change max_batches from 500200 to 2000*(number of allowed objects)
  • change steps from 400000,450000 to 80% and 90% of max_batches
  • change classes from 80 to (number of allowed objects)
  • change every "filters=255" to filters=((number of allowed objects)+5)*3
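The arithmetic in step 8 can be sketched as follows (a minimal helper using the 2000-iterations-per-class rule above; `cfg_params` is a hypothetical name):

```python
def cfg_params(num_classes):
    """Compute the cfg values for a reduced class count."""
    max_batches = 2000 * num_classes              # e.g. 4000 for 2 classes
    steps = (int(max_batches * 0.8),              # 80% of max_batches
             int(max_batches * 0.9))              # 90% of max_batches
    filters = (num_classes + 5) * 3               # conv layer before each [yolo] layer
    return max_batches, steps, filters

print(cfg_params(2))  # (4000, (3200, 3600), 21)
```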

9. Create a train.sh script that starts training with the following command:
./darknet detector train "$model_data_path" "$model_cfg_path" yolov3-tiny-prn.conv.15
According to this ticket: #4091 (comment)
It is necessary to use pre-trained weights, so I generated them from yolov3-tiny-prn.weights with the command:
./darknet partial cfg/yolov3-tiny-prn.cfg yolov3-tiny-prn.weights yolov3-tiny-prn.conv.15 15

Why is every model that I try to train very far from the results of the original yolov3-tiny-prn (lower mAP)?
What am I doing wrong?

This script is for my own purposes, but once it is finished and fully working I would like to share it with the community.

@AlexeyAB
Owner

  • Show chart.png with Loss and mAP
  • Check your dataset by using Yolo_mark
  • Run training with flag -show_imgs and show screenshots
  • Show your cfg-file

@WongKinYiu
Collaborator

If you only want to discard some of the classes without adding new classes, it is better to modify the load_weights function.
https://github.com/AlexeyAB/darknet/blob/master/src/parser.c#L1959

@PaserSRL
Author

PaserSRL commented Feb 13, 2020

Chart
chart

Screenshot
Schermata da 2020-02-12 17-26-22

CFG Files (extension changed due to GitHub restrictions)
model.cfg.txt
model.data.txt
model.names.txt

I trained the model with only 2 classes: person, car.
I used the entire COCO 2014 dataset.

PS: For this test I used 6000 iterations per class, so max_batches = 12000

@AlexeyAB
Owner

Why is every model that I try to train very far from the results of the original yolov3-tiny-prn (lower mAP)?
What am I doing wrong?

Default yolov3-tiny-prn model has mAP@50: person = 48%, car = 33%, so avg ~= 40%; you got just 3% less (37% mAP@50)

  • Try to train 50 000 - 100 000 iterations and change max_batches= and steps=, since you use ~100 000 images of MS COCO

  • Or just use the default cfg/weights-file and add dont_show before each class in the coco.names file except car and person.
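For the second option, the coco.names file would then look roughly like this (my illustration of the dont_show prefix described above, assuming it goes on the same line as the class name; not an authoritative syntax reference):

```
person
dont_show bicycle
car
dont_show motorbike
dont_show aeroplane
...
```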

@PaserSRL
Author

PaserSRL commented Feb 13, 2020

Default yolov3-tiny-prn model has mAP@50: person = 48%, car = 33%, so avg ~= 40%; you got just 3% less (37% mAP@50)

  • Try to train 50 000 - 100 000 iterations and change max_batches= and steps=, since you use ~100 000 images of MS COCO

I will do.

  • Or just use default cfg/weights-file and add dont_show before each class in the coco.names file except car and person.

Really? This is new to me!

My attempt to train the model comes from wanting to understand how to train it to detect fewer and smaller objects.
So the first step is to understand how to train it correctly =)

I need to detect a person of 15x40 px in a frame of 416x416 px:
detection_416x416

For the moment, to solve this, I make 2 detections for each frame, in this way:
detection_320x320

But this solution reduces fps by 50%, so if there is a way to detect small objects better (also by removing unnecessary classes) that would be nice.
I already found yolov3-tiny-3l, but PRN has better detection and also better performance.

Any suggestions?
(Thanks a lot for your help!!!)

@AlexeyAB
Owner

But this solution reduces fps by 50%, so if there is a way to detect small objects better (also by removing unnecessary classes) that would be nice.

Instead of this:
Train with width=416 height=416 in the cfg-file, then after training set width=576 height=320 in the cfg and do just 1 detection instead of 2 detections.

@PaserSRL
Author

PaserSRL commented Feb 13, 2020

But this solution reduces fps by 50%, so if there is a way to detect small objects better (also by removing unnecessary classes) that would be nice.

Instead of this:
Train with width=416 height=416 in the cfg-file, then after training set width=576 height=320 in the cfg and do just 1 detection instead of 2 detections.

The reason is that I haven't understood what the right image aspect ratio to supply to Yolo is.

If I have a 16:9 frame (570x320 for example):

  • Do I need to shrink my frame to 416x416 and set Yolo to 416x416?
  • Do I need to resize my frame to 416x234 (keeping the aspect ratio), fit it into a 416x416 image, and supply that to Yolo?
  • Do I need to keep my frame at 570x320 and change the Yolo resolution to 576x320 (trained at 416x416)?
  • Do I need to keep my frame at 570x320 and train Yolo at 576x320?

@AlexeyAB
Owner

Every image will be resized to the network size automatically.
You just have to know that the distortion of objects during Training and Detection should be approximately the same.

Follow the general rule to calculate network size and image size: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

General rule - your training dataset should include the same set of relative sizes of objects that you want to detect:

train_network_width * train_obj_width / train_image_width ~= detection_network_width * detection_obj_width / detection_image_width
train_network_height * train_obj_height / train_image_height ~= detection_network_height * detection_obj_height / detection_image_height

I.e. for each object in the Test dataset there must be at least 1 object in the Training dataset with the same class_id and about the same relative size:

object width as a percentage of the Training image ~= object width as a percentage of the Test image

That is, if only objects that occupied 80-90% of the image were present in the training set, then the trained network will not be able to detect objects that occupy 1-10% of the image.
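The rule above can be checked numerically; a minimal sketch (the function name and the 50% tolerance are my own choices, and the example reuses the 15 px-wide person in a 416x416 frame mentioned earlier, detected later at network width 576 on a 570 px-wide frame):

```python
def relative_size_ok(train_net, train_obj, train_img,
                     det_net, det_obj, det_img, tol=0.5):
    """Check that the relative object size at training time roughly matches
    the relative object size at detection time (AlexeyAB's general rule)."""
    train_rel = train_net * train_obj / train_img
    det_rel = det_net * det_obj / det_img
    return abs(train_rel - det_rel) <= tol * max(train_rel, det_rel)

print(relative_size_ok(416, 15, 416, 576, 15, 570))  # True: ~15 vs ~15.2
```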

@PaserSRL
Author

If I want to train the model at a different resolution, can I use yolov3-tiny-prn.conv.15 (pre-trained at 416x416)? Or do I need to start from scratch?

@AlexeyAB
Owner

If I want to train the model at a different resolution, can I use yolov3-tiny-prn.conv.15 (pre-trained at 416x416)

Yes.

@PaserSRL
Author

If I want to train the model at a different resolution, can I use yolov3-tiny-prn.conv.15 (pre-trained at 416x416)

Yes.

Thanks Alexey! You are a boss :)

@fused-byte

To keep the distortion at a similar level in training and testing, can we train the network with one pair of width & height values and use different width & height values during inference?
