Train different tasks at same time #20

Closed
acai66 opened this issue Jun 6, 2020 · 6 comments
Labels
enhancement (New feature or request), Stale (Stale and schedule for closing soon)

Comments

@acai66

acai66 commented Jun 6, 2020

🚀 Feature

Train different tasks at the same time.

Motivation

A machine often has multiple GPUs, so we should be able to train different models at the same time. However, outputs and results are currently stored in the same directory, so concurrent runs may conflict.

Pitch

Split outputs and results, including weights, into separate directories.

Alternatives

Additional context

I made a temporary change to train.py so I can train different tasks, but I really hope this function will be officially supported.
Thanks.

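    import os

    # opt.name is the run name from the parsed options; each name gets its own dirs
    wdir = os.path.join('weights', opt.name) + os.sep  # weights dir
    os.makedirs(wdir, exist_ok=True)  # also creates the parent 'weights' dir if missing
    last = wdir + 'last.pt'
    best = wdir + 'best.pt'
    results_dir = os.path.join('logs', opt.name) + os.sep  # results dir
    os.makedirs(results_dir, exist_ok=True)  # must exist before results.txt is written
    results_file = results_dir + 'results.txt'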
@acai66 acai66 added the enhancement New feature or request label Jun 6, 2020
@github-actions
Contributor

github-actions bot commented Jun 6, 2020

Hello @acai66, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Google Colab Notebook, Docker Image, and GCP Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

  • Cloud-based AI surveillance systems operating on hundreds of HD video streams in realtime.
  • Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
  • Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

@glenn-jocher
Member

@acai66 yes you make a good point. We use multiple docker containers on a single machine to exploit multiple single-gpu trainings simultaneously.

Without docker containers you might simply copy the directory, one per gpu.
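A minimal sketch of the copy-per-GPU approach, assuming the standard `CUDA_VISIBLE_DEVICES` environment variable is used to pin each copy to one GPU. The helper name `prepare_gpu_copies` and the `gpu<N>` directory naming are hypothetical, chosen only for illustration:

```python
import os
import shutil


def prepare_gpu_copies(src_dir, dest_root, gpu_ids):
    """Copy src_dir once per GPU and return (workdir, env) pairs for launching."""
    jobs = []
    for gpu in gpu_ids:
        workdir = os.path.join(dest_root, f"gpu{gpu}")  # hypothetical naming scheme
        shutil.copytree(src_dir, workdir)  # raises if workdir already exists
        # Copy the current environment and expose only this GPU to the process.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        jobs.append((workdir, env))
    return jobs
```

Each pair could then be passed to something like `subprocess.Popen([sys.executable, "train.py"], cwd=workdir, env=env)`, so every training writes its outputs inside its own copy. Note that `copytree` duplicates everything under `src_dir`, including any datasets stored there.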

For a more comprehensive solution, we might be better off depositing all run-related items (jpgs, results.txt, checkpoints etc.) into the unique ./runs directory already created automatically by tensorboard when a training run starts. What do you think?

@acai66
Author

acai66 commented Jun 7, 2020

good idea, thanks

@glenn-jocher
Member

The unique directory is defined in yolov5/train.py, lines 394 to 399 in b810b21:

# Train
if not opt.evolve:
    tb_writer = SummaryWriter(comment=opt.name)
    print('Start Tensorboard with "tensorboard --logdir=runs", view at http://localhost:6006/')
    train(hyp)

tb_writer.log_dir
Out[3]: 'runs/Jun07_09-10-55_Glenns-MBP.attlocal.net'
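As a rough sketch of the idea discussed above, the run-related files could all be rooted in that `log_dir`. The helpers `unique_run_dir` and `run_paths`, and the `weights` subdirectory layout, are assumptions for illustration and not code from train.py. `unique_run_dir` only imitates the timestamp-plus-hostname naming seen in the output above, so the example runs without tensorboard installed; in train.py one would pass `tb_writer.log_dir` to `run_paths` instead.

```python
import os
import socket
from datetime import datetime


def unique_run_dir(comment="", root="runs"):
    """Build a run directory name like 'runs/Jun07_09-10-55_<hostname><comment>'."""
    stamp = datetime.now().strftime("%b%d_%H-%M-%S")
    return os.path.join(root, f"{stamp}_{socket.gethostname()}{comment}")


def run_paths(log_dir):
    """Return per-run file locations, all rooted in log_dir (hypothetical layout)."""
    wdir = os.path.join(log_dir, "weights")
    os.makedirs(wdir, exist_ok=True)  # creates log_dir as well if it is missing
    return {
        "last": os.path.join(wdir, "last.pt"),
        "best": os.path.join(wdir, "best.pt"),
        "results_file": os.path.join(log_dir, "results.txt"),
    }
```

Because every path derives from the one unique directory, two trainings started from the same checkout would not overwrite each other's checkpoints or results, provided their directory names differ (for example, by start time or by the `comment` suffix).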

@glenn-jocher
Member

@acai66 see #104, this PR seems to address many of your concerns. Perhaps you could look it over and give feedback to the PR author.

@github-actions
Contributor

github-actions bot commented Aug 1, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale Stale and schedule for closing soon label Aug 1, 2020
zldrobit pushed a commit to zldrobit/yolov5 that referenced this issue Sep 3, 2022
manole-alexandru added a commit to manole-alexandru/yolov5-uolo that referenced this issue Apr 25, 2023
manole-alexandru added a commit to manole-alexandru/yolov5-uolo that referenced this issue Apr 25, 2023