[Enhancement] api train support cpu training for mmcv<1.4.4 #1161

Merged: 6 commits merged into open-mmlab:dev-0.24 on Mar 2, 2022

Conversation

EasonQYS
Contributor

[Enhance] support training api on cpu

Motivation

Support the training API on CPU.

Modification

Add a parameter 'device' and an if-else branch in train_model(...).
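A minimal sketch of what such a change could look like (hedged: the function signature, the cfg fields, and the default device value here are assumptions for illustration, not the exact diff of this PR):

```python
# Illustrative sketch only, not the exact code added by this PR.
from mmcv.parallel import MMDataParallel


def train_model(model, dataset, cfg, device='cuda', **kwargs):
    """Prepare the model for training on the requested device."""
    if device == 'cpu':
        # CPU path: keep the bare module; no GPU wrapper is required.
        model = model.cpu()
    elif device == 'cuda':
        # Usual non-distributed single-machine GPU path.
        model = MMDataParallel(
            model.cuda(cfg.gpu_ids[0]), device_ids=cfg.gpu_ids)
    else:
        raise ValueError(f'unsupported device: {device}')
    # ... build the optimizer and runner, then start training as before ...
```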

Checklist

Before PR:

  • I have read and followed the workflow indicated in the CONTRIBUTING.md to create this PR.
  • Pre-commit or linting tools indicated in CONTRIBUTING.md are used to fix the potential lint issues.
  • Bug fixes are covered by unit tests; the case that causes the bug should be added to the unit tests.
  • New functionalities are covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, including docstring or example tutorials.

After PR:

  • CLA has been signed and all committers have signed the CLA in this PR.

[Enhance] api train support cpu training
@CLAassistant

CLAassistant commented Jan 29, 2022

CLA assistant check
All committers have signed the CLA.

@jin-s13
Collaborator

jin-s13 commented Jan 29, 2022

Thanks for your contribution!

Not sure if it is really necessary to support CPU training. It normally takes a lot of time. Haha 😆

@jin-s13 jin-s13 requested a review from ly015 January 29, 2022 07:40
@codecov

codecov bot commented Jan 30, 2022

Codecov Report

❗ No coverage uploaded for pull request base (dev-0.24@18af415).
The diff coverage is n/a.

❗ Current head cd89d70 differs from the pull request's most recent head 5435695. Consider uploading reports for commit 5435695 to get more accurate results.

@@             Coverage Diff             @@
##             dev-0.24    #1161   +/-   ##
===========================================
  Coverage            ?   82.67%           
===========================================
  Files               ?      204           
  Lines               ?    16365           
  Branches            ?     2943           
===========================================
  Hits                ?    13529           
  Misses              ?     2092           
  Partials            ?      744           
Flag        Coverage Δ
unittests   82.60% <0.00%> (?)

Flags with carried forward coverage won't be shown.


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 18af415...5435695.

@EasonQYS
Contributor Author

Thanks for your contribution!

Not sure if it is really necessary to support CPU training. It normally takes a lot of time. Haha 😆

Thanks!
In my view, many students (and their families) and developers are interested in AI and CV but are not equipped with a good enough PC; they should not be left out of consideration. So I recommend supporting CPU training.

@ly015
Member

ly015 commented Feb 9, 2022

Thank you very much for your contribution!

In fact, we have just added support for CPU training/testing in #1157, enabled by new features in mmcv 1.4.4.

Nevertheless, I think it would be nice to also support earlier mmcv versions. So would you mind modifying this PR and adding an mmcv version check to determine whether to use MMDataParallel on CPU? Here is a reference.
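A rough sketch of the suggested version gate (an illustration of the idea only, not the code that was finally merged; digit_version is a helper shipped with mmcv, while the wrap_model name and cfg fields are assumed for illustration):

```python
import mmcv
from mmcv import digit_version
from mmcv.parallel import MMDataParallel


def wrap_model(model, cfg, device='cuda'):
    """Move the model to the requested device and wrap it when possible.

    With mmcv >= 1.4.4, MMDataParallel can hold a CPU module (its forward
    simply calls the wrapped module), so the same interface is kept for CPU
    and GPU. Older mmcv versions fall back to the bare CPU module.
    """
    if device == 'cpu':
        model = model.cpu()
        if digit_version(mmcv.__version__) >= digit_version('1.4.4'):
            # Assumes no CUDA device is visible, so no GPU ids are picked up.
            model = MMDataParallel(model)
        return model
    # Usual non-distributed GPU path.
    return MMDataParallel(
        model.cuda(cfg.gpu_ids[0]), device_ids=cfg.gpu_ids)
```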

@ly015 ly015 changed the title [Enhance] api train support cpu training [Enhancement] api train support cpu training for mmcv<1.4.4 Feb 9, 2022
@EasonQYS
Contributor Author

EasonQYS commented Feb 9, 2022

Thank you very much for your contribution!

In fact, we have just added support for CPU training/testing in #1157, enabled by new features in mmcv 1.4.4.

Nevertheless, I think it would be nice to also support earlier mmcv versions. So would you mind modifying this PR and adding an mmcv version check to determine whether to use MMDataParallel on CPU? Here is a reference.

@ly015
Thank you for telling me about that, and I am glad to help!

However, there is one thing I am confused about: if I just use a single CPU to train the model, why should I wrap the data for parallelism? Is it a standard practice, or can there be more than one CPU?

Thank you again!

@ly015
Member

ly015 commented Feb 9, 2022

Thank you very much for your contribution!
In fact, we have just added support for CPU training/testing in #1157, enabled by new features in mmcv 1.4.4.
Nevertheless, I think it would be nice to also support earlier mmcv versions. So would you mind modifying this PR and adding an mmcv version check to determine whether to use MMDataParallel on CPU? Here is a reference.

@ly015 Thank you for telling me about that, and I am glad to help!

However, there is one thing I am confused about: if I just use a single CPU to train the model, why should I wrap the data for parallelism? Is it a standard practice, or can there be more than one CPU?

Thank you again!

MMDataParallel does not really perform data parallelism in CPU mode; it is equivalent to a simple forward of the wrapped PyTorch module. This design keeps a unified interface for both CPU and GPU environments.
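A tiny illustration of that point, assuming a CPU-only environment (no visible GPUs) and an mmcv version whose MMDataParallel accepts a CPU module:

```python
import torch
from mmcv.parallel import MMDataParallel

net = torch.nn.Linear(4, 2)
wrapped = MMDataParallel(net)  # no GPUs visible, so no device ids are set

x = torch.randn(3, 4)
# In CPU mode the wrapper just forwards to the underlying module,
# so both calls produce the same output.
assert torch.allclose(wrapped(x), net(x))
```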

@EasonQYS
Contributor Author

EasonQYS commented Feb 9, 2022

MMDataParallel does not really perform data parallelism in CPU mode; it is equivalent to a simple forward of the wrapped PyTorch module. This design keeps a unified interface for both CPU and GPU environments.

Thanks for letting me know!

mmpose/apis/train.py: review comment (outdated, resolved)
@ly015 ly015 changed the base branch from dev-0.23 to dev-0.24 March 1, 2022 05:42
@ly015 ly015 requested a review from jin-s13 March 1, 2022 05:50
@ly015 ly015 merged commit b618970 into open-mmlab:dev-0.24 Mar 2, 2022
ly015 added a commit that referenced this pull request Mar 2, 2022
ly015 added a commit that referenced this pull request Mar 7, 2022
shuheilocale pushed a commit to shuheilocale/mmpose that referenced this pull request May 6, 2023
ajgrafton pushed a commit to ajgrafton/mmpose that referenced this pull request Mar 6, 2024