
Any plan for adding oversampling function for imbalanced dataset? #1115

Closed
luvwinnie opened this issue Oct 11, 2020 · 19 comments
Labels
question (Further information is requested), Stale (stale and scheduled for closing soon)

Comments

@luvwinnie

❔ Question

Imbalanced datasets are common in real-world scenarios. Has this repo been tested with resampling techniques such as oversampling, undersampling, or SMOTE?
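For context, one common oversampling approach (distinct from SMOTE, and not something the question claims this repo provides) is class-weighted resampling, where minority-class samples are drawn more often. A minimal PyTorch sketch, with a hypothetical label tensor standing in for a real dataset:

```python
# Class-balanced oversampling via WeightedRandomSampler (hypothetical labels).
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

labels = torch.tensor([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])  # imbalanced: 8 vs 2
class_counts = torch.bincount(labels)                   # tensor([8, 2])
class_weights = 1.0 / class_counts.float()              # rarer class weighs more
sample_weights = class_weights[labels]                  # one weight per sample

# With replacement=True, minority samples are repeatedly re-drawn (oversampled)
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)
dataset = TensorDataset(torch.randn(10, 3), labels)
loader = DataLoader(dataset, batch_size=4, sampler=sampler)
```

Each epoch then sees the classes in roughly equal proportion, at the cost of repeating minority samples.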

@luvwinnie luvwinnie added the question label Oct 11, 2020
@github-actions
Contributor

github-actions bot commented Oct 11, 2020

Hello @luvwinnie, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook Open In Colab, Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom model or data training question, please note Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

  • Cloud-based AI systems operating on hundreds of HD video streams in realtime.
  • Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
  • Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

@glenn-jocher
Member

@luvwinnie I'm not familiar with SMOTE. We tested a few adaptations to class imbalances that helped in early training, but these overfit faster as well and resulted in lower final mAPs, so our current implementation has no specific class-imbalance adaptations in place when using default settings.

In any case, COCO and VOC suffer from severe class imbalances, and these train very well with the default settings.

@glenn-jocher
Member

glenn-jocher commented Oct 11, 2020

@luvwinnie ah, BTW, one technique to address custom datasets, including class imbalances, is simply to evolve hyperparameters, which include loss balancers and BCE positive weights for class and conf. See https://docs.ultralytics.com/yolov5

@luvwinnie
Author

@glenn-jocher Thank you for the reply! I would like to test hyperparameter evolution and SMOTE later. One thing: hyperparameter evolution seems to take very long. Can an evolution run be resumed later with any option?

@glenn-jocher
Member

@luvwinnie yes, evolution is an expensive habit, as you basically want to train 300 times or so. It can be stopped and resumed from the same evolve.txt, and you can also deploy multiple GPUs (to evolve in parallel to a single evolve.txt) and multiple nodes/VMs (to evolve from a central cloud-based evolve.txt).
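The shared-file coordination described above can be illustrated with a conceptual sketch (this is not YOLOv5's actual implementation, and the file name and row format here are hypothetical): each worker appends a result row to one file, and a resumed or parallel run simply reads the best row found so far.

```python
# Conceptual sketch of parallel workers sharing one results file.
import random

PATH = "evolve_demo.txt"  # hypothetical stand-in for a shared results file

def run_worker(path, worker_id):
    fitness = random.random()  # stand-in for a finished training's fitness
    with open(path, "a") as f:  # append-only: workers never overwrite each other
        f.write(f"{fitness:.6f} worker{worker_id}\n")

def best_so_far(path):
    with open(path) as f:
        rows = [line.split() for line in f if line.strip()]
    return max(rows, key=lambda r: float(r[0]))

for w in range(4):  # four workers, e.g. one per GPU
    run_worker(PATH, w)

print("best fitness so far:", best_so_far(PATH)[0])
```

Because the file only ever grows, stopping and restarting loses nothing: the next run seeds from whatever the file already contains.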

@luvwinnie
Author

@glenn-jocher Thank you! I tried python -m torch.distributed.launch --nproc_per_node 2 train.py, but it shows the following error: AssertionError: DDP mode not implemented for --evolve. Can multiple GPUs not be used with --evolve?

@glenn-jocher
Member

glenn-jocher commented Oct 11, 2020

@luvwinnie evolving multi-GPU is done with one GPU per process. I've updated the hyperparameter evolution tutorial https://docs.ultralytics.com/yolov5/tutorials/hyperparameter_evolution with an example.

# Single-GPU
python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve

# Multi-GPU
for i in 0 1 2 3; do
  python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve --device $i
done

EDIT: this shows a bash for loop, but in practice you'd want to run these in detached screens or simply in new terminal windows, one screen/window per CUDA device, with --device 0, --device 1, etc.

@luvwinnie
Author

luvwinnie commented Oct 11, 2020

@glenn-jocher thank you for the reply! Wouldn't it be better to use nohup and background tasks, as follows?

#!/bin/bash
### EDITED ###
for i in 0 1 2 3; do
  nohup python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve --device $i > evolve_gpu_$i.log &
done
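The nohup-and-background pattern above can be sanity-checked without a GPU or a YOLOv5 checkout by swapping in a trivial stand-in command (echo here replaces train.py; the demo_gpu_*.log names are hypothetical):

```shell
# Stand-in demo of the nohup + background pattern (echo replaces train.py)
for i in 0 1; do
  nohup sh -c "echo evolving on device $i" > demo_gpu_$i.log 2>&1 &
done
wait  # block until both background jobs complete
cat demo_gpu_0.log demo_gpu_1.log
```

The real training loop works the same way: one detached background process per device, each with its own log file.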

@glenn-jocher
Member

@luvwinnie ah, yes that's the effect I was going for. When running docker images in detached -d mode you can use a simple for loop as I had initially, but for running directly you need to detach the command somehow. In the past I used screens for this: https://superuser.com/questions/454907/how-to-execute-a-command-in-screen-and-detach

Does this nohup command do the same? Is the & character at the end required to complete the nohup command?

@luvwinnie
Author

luvwinnie commented Oct 11, 2020

@glenn-jocher yes. In my environment I didn't use Docker to detach the session. To run as a background process we need the & character; nohup is used to make sure the process keeps running even after the terminal session is closed.

EDIT: I have updated the code to redirect the printed output to log files so that the tail -f command can be used to check progress.

@glenn-jocher
Member

@luvwinnie very cool! I've updated the tutorial with your nohup command.

How would you check in on a thread to see that it's logging (to make sure it hasn't crashed, etc.)?

@luvwinnie
Author

luvwinnie commented Oct 11, 2020

@glenn-jocher Thank you! Normally I just redirect stdout to a log file; as in my edited command, each process redirects its stdout to a separate log file named evolve_gpu_$i.log. Say I have 2 GPUs; I use the following commands to check progress, although it is not very clean:

terminal 1:$ tail -f evolve_gpu_0.log
terminal 2:$ tail -f evolve_gpu_1.log

@Samjith888

@luvwinnie yes, evolution is an expensive habit, as you basically want to train 300 times or so. It can be stopped and resumed from the same evolve.txt, and you can also deploy multiple gpus (to evolve in parallel to a single evolve.txt) and multiple nodes/VMs (to evolve from a central cloud based evolve.txt).

Hyperparameter evolution stopped at the 30th step; how can I resume from there?

@glenn-jocher
Member

@Samjith888 you just rerun your same --evolve command (don't use --resume, that's only for normal training).

If an evolve.txt file already exists in your yolov5 directory, it resumes from there.

@luvwinnie
Author

luvwinnie commented Oct 20, 2020

@glenn-jocher I have run evolution and am currently at generation 287; it shows the results below. Does this mean evolving doesn't help much? From my understanding, data/hyp.finetune.yaml is one of the yolov5m evolved results, which means my metrics should come out about the same as data/hyp.finetune.yaml. Am I wrong? Or is data/hyp.finetune.yaml the final result of training with the evolved config?

Hyperparameter Evolution Results
Generations: 287
Metrics: 0.421 0.681 0.604 0.366 0.0192 0.0101 0.00248

@glenn-jocher
Member

@luvwinnie ok great. These are your evolved metrics. You can paste the labels from finetune.yaml, i.e.

#                   P         R     mAP.5 mAP.5:.95       box       obj       cls

You should compare these to your baseline results you had before you started evolving.
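Pairing the labels from finetune.yaml with the metrics row reported above (both values taken verbatim from this thread), a quick way to print them side by side:

```python
# Label the seven evolution metrics reported in this thread
labels = ["P", "R", "mAP.5", "mAP.5:.95", "box", "obj", "cls"]
metrics = [0.421, 0.681, 0.604, 0.366, 0.0192, 0.0101, 0.00248]
for name, value in zip(labels, metrics):
    print(f"{name:>10}: {value}")
```

This makes it easy to read off, for example, that the evolved run reached mAP@0.5 of 0.604 and mAP@0.5:0.95 of 0.366 for comparison against the baseline.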

@luvwinnie
Author

@glenn-jocher Thank you for the reply. I have a baseline trained without --evolve from the pretrained yolov5s.pt. Do you mean I should compare the baseline with the result of training with this command?

python -u train.py --img=640 --batch=16 --epochs=300  --data=dataset/custom.yaml --cfg=models/yolov5s.yaml --weights=weights/yolov5s.pt --hyp hyp_evolved.yaml

@glenn-jocher
Member

glenn-jocher commented Oct 20, 2020

@luvwinnie sure. All you do is point the same baseline command to your new hyp_evolved.yaml to get your updated results. You compare 3 to 1 obviously.

  1. train.py ...
  2. train.py ... --evolve
  3. train.py ... --hyp hyp_evolved.yaml

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale label Nov 20, 2020