[Enhancement] api train support cpu training for mmcv<1.4.4 #1161

Merged: 6 commits merged into open-mmlab:dev-0.24 on Mar 2, 2022

Conversation

EasonQYS
Contributor

[Enhance] support training api on cpu

Motivation

Support the training API on CPU.

Modification

Add a parameter 'device' and an if-else branch in train_model(...).
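A minimal sketch of what such a change could look like (hedged: the function signature, the cfg fields, and the default device value here are assumptions for illustration, not the exact diff of this PR):

```python
# Illustrative sketch only, not the exact code added by this PR.
from mmcv.parallel import MMDataParallel


def train_model(model, dataset, cfg, device='cuda', **kwargs):
    """Prepare the model for training on the requested device."""
    if device == 'cpu':
        # CPU path: keep the bare module; no GPU wrapper is required.
        model = model.cpu()
    elif device == 'cuda':
        # Usual non-distributed single-machine GPU path.
        model = MMDataParallel(
            model.cuda(cfg.gpu_ids[0]), device_ids=cfg.gpu_ids)
    else:
        raise ValueError(f'unsupported device: {device}')
    # ... build the optimizer and runner, then start training as before ...
```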

Checklist

Before PR:

  • I have read and followed the workflow indicated in the CONTRIBUTING.md to create this PR.
  • Pre-commit or linting tools indicated in CONTRIBUTING.md are used to fix the potential lint issues.
  • Bug fixes are covered by unit tests; the case that causes the bug should be added to the unit tests.
  • New functionalities are covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, including docstring or example tutorials.

After PR:

  • CLA has been signed and all committers have signed the CLA in this PR.

[Enhance] api train support cpu training
@CLAassistant

CLAassistant commented Jan 29, 2022

CLA assistant check
All committers have signed the CLA.

@jin-s13
Collaborator

jin-s13 commented Jan 29, 2022

Thanks for your contribution!

Not sure if it is really necessary to support CPU training. It normally takes a lot of time. Haha 😆

@jin-s13 jin-s13 requested a review from ly015 January 29, 2022 07:40
@codecov

codecov bot commented Jan 30, 2022

Codecov Report

❗ No coverage uploaded for pull request base (dev-0.24@18af415).
The diff coverage is n/a.

❗ Current head cd89d70 differs from the pull request's most recent head 5435695. Consider uploading reports for commit 5435695 to get more accurate results.

@@             Coverage Diff             @@
##             dev-0.24    #1161   +/-   ##
===========================================
  Coverage            ?   82.67%           
===========================================
  Files               ?      204           
  Lines               ?    16365           
  Branches            ?     2943           
===========================================
  Hits                ?    13529           
  Misses              ?     2092           
  Partials            ?      744           
Flag        Coverage Δ
unittests   82.60% <0.00%> (?)

Flags with carried forward coverage won't be shown.


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 18af415...5435695.

@EasonQYS
Contributor Author

Thanks for your contribution!

Not sure if it is really necessary to support CPU training. It normally takes a lot of time. Haha 😆

Thanks!
In my view, many students (and their families) and developers are interested in AI and CV but are not equipped with a good enough PC; they should not be left out of consideration. So I recommend supporting CPU training.

@ly015
Member

ly015 commented Feb 9, 2022

Thank you very much for your contribution!

In fact, we have just added support for CPU training/testing in #1157, enabled by new features in mmcv 1.4.4.

Nevertheless, I think it would be nice to also support earlier mmcv versions. So would you mind modifying this PR and adding an mmcv version check to determine whether to use MMDataParallel on CPU? Here is a reference.
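A rough sketch of the suggested version gate (an illustration of the idea only, not the code that was finally merged; digit_version is a helper shipped with mmcv, while the wrap_model name and cfg fields are assumed for illustration):

```python
import mmcv
from mmcv import digit_version
from mmcv.parallel import MMDataParallel


def wrap_model(model, cfg, device='cuda'):
    """Move the model to the requested device and wrap it when possible.

    With mmcv >= 1.4.4, MMDataParallel can hold a CPU module (its forward
    simply calls the wrapped module), so the same interface is kept for CPU
    and GPU. Older mmcv versions fall back to the bare CPU module.
    """
    if device == 'cpu':
        model = model.cpu()
        if digit_version(mmcv.__version__) >= digit_version('1.4.4'):
            # Assumes no CUDA device is visible, so no GPU ids are picked up.
            model = MMDataParallel(model)
        return model
    # Usual non-distributed GPU path.
    return MMDataParallel(
        model.cuda(cfg.gpu_ids[0]), device_ids=cfg.gpu_ids)
```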

@ly015 ly015 changed the title [Enhance] api train support cpu training [Enhancement] api train support cpu training for mmcv<1.4.4 Feb 9, 2022
@EasonQYS
Contributor Author

EasonQYS commented Feb 9, 2022

Thank you very much for your contribution!

In fact, we have just added support for CPU training/testing in #1157, enabled by new features in mmcv 1.4.4.

Nevertheless, I think it would be nice to also support earlier mmcv versions. So would you mind modifying this PR and adding an mmcv version check to determine whether to use MMDataParallel on CPU? Here is a reference.

@ly015
Thank you for telling me about that, and I am glad to help!

However, there is one thing I am confused about: if I just use a single CPU to train the model, why should I wrap the data for parallelism? Is it a standard practice, or can there be more than one CPU?

Thank you again!

@ly015
Member

ly015 commented Feb 9, 2022

Thank you very much for your contribution!
In fact, we have just added support for CPU training/testing in #1157, enabled by new features in mmcv 1.4.4.
Nevertheless, I think it would be nice to also support earlier mmcv versions. So would you mind modifying this PR and adding an mmcv version check to determine whether to use MMDataParallel on CPU? Here is a reference.

@ly015 Thank you for telling me about that, and I am glad to help!

However, there is one thing I am confused about: if I just use a single CPU to train the model, why should I wrap the data for parallelism? Is it a standard practice, or can there be more than one CPU?

Thank you again!

MMDataParallel does not really perform data parallelism in CPU mode; it is equivalent to a simple forward of the wrapped PyTorch module. This design keeps a unified interface for both CPU and GPU environments.
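A tiny illustration of that point, assuming a CPU-only environment (no visible GPUs) and an mmcv version whose MMDataParallel accepts a CPU module:

```python
import torch
from mmcv.parallel import MMDataParallel

net = torch.nn.Linear(4, 2)
wrapped = MMDataParallel(net)  # no GPUs visible, so no device ids are set

x = torch.randn(3, 4)
# In CPU mode the wrapper just forwards to the underlying module,
# so both calls produce the same output.
assert torch.allclose(wrapped(x), net(x))
```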

@EasonQYS
Contributor Author

EasonQYS commented Feb 9, 2022

MMDataParallel does not really perform data parallelism in CPU mode; it is equivalent to a simple forward of the wrapped PyTorch module. This design keeps a unified interface for both CPU and GPU environments.

Thanks for letting me know!

mmpose/apis/train.py: review comment (outdated, resolved)
@ly015 ly015 changed the base branch from dev-0.23 to dev-0.24 March 1, 2022 05:42
@ly015 ly015 requested a review from jin-s13 March 1, 2022 05:50
@ly015 ly015 merged commit b618970 into open-mmlab:dev-0.24 Mar 2, 2022
ly015 added a commit that referenced this pull request Mar 2, 2022
ly015 added a commit that referenced this pull request Mar 7, 2022
shuheilocale pushed a commit to shuheilocale/mmpose that referenced this pull request May 6, 2023
ajgrafton pushed a commit to ajgrafton/mmpose that referenced this pull request Mar 6, 2024