[hivemind.Optimizer] TrainingStateAverager #407

justheuristic · 2021-11-08T20:26:52Z

This PR implements a component of hivemind.Optimizer ( #400 ) that holds the training state and supports (delayed) optimizer steps and averaging rounds.
Unlike TrainingAverager, this class does not need data locks, since it only updates model parameters during .step.

implement TrainingStateAverager
test different initializations
test load_state_from_peers
test synchronous offloaded
test asynchronous updates (with "slow" optimizer)
include scheduler in load_state_from_peers
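
To illustrate the "delayed optimizer step without data locks" idea from the description, here is a minimal sketch. It is not the code from this PR and does not use hivemind's API: `DelayedStepState`, `slow_sgd`, and the `delay` argument are hypothetical names, parameters are plain Python floats, and a single background thread stands in for the "slow" optimizer. The point it shows is that the background job works on a snapshot and the result is written back only inside `step`, so readers between steps never see half-updated parameters.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence

UpdateFn = Callable[[List[float], List[float]], List[float]]


def slow_sgd(params: List[float], grads: List[float], lr: float = 0.5) -> List[float]:
    """A deliberately slow SGD update that returns new values instead of mutating."""
    time.sleep(0.05)  # pretend the optimizer is expensive (e.g. offloaded)
    return [p - lr * g for p, g in zip(params, grads)]


class DelayedStepState:
    """Hypothetical toy state holder; NOT hivemind's TrainingStateAverager."""

    def __init__(self, params: Sequence[float], update_fn: UpdateFn):
        self.params = list(params)
        self._update_fn = update_fn
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._pending: Optional[Future] = None

    def step(self, grads: Optional[Sequence[float]] = None, delay: bool = False) -> None:
        # Results of a previously scheduled update are applied here and only here.
        if self._pending is not None:
            self.params = self._pending.result()  # blocks until the update finishes
            self._pending = None
        if grads is None:
            return
        # The update always works on copies, never on self.params directly.
        snapshot, grads = list(self.params), list(grads)
        if delay:
            self._pending = self._executor.submit(self._update_fn, snapshot, grads)
        else:
            self.params = self._update_fn(snapshot, grads)

    def shutdown(self) -> None:
        self._executor.shutdown(wait=True)
```

With `delay=True`, `params` stays unchanged until the next `step` call; calling `step()` with no gradients only applies whatever update is pending.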

Co-authored-by: Roman Zhytar <[email protected]>

codecov · 2021-11-08T20:29:25Z

Codecov Report

Merging #407 (cf40b94) into master (8fa0a8e) will increase coverage by 0.31%.
The diff coverage is 90.53%.

@@            Coverage Diff             @@
##           master     #407      +/-   ##
==========================================
+ Coverage   83.84%   84.15%   +0.31%     
==========================================
  Files          74       75       +1     
  Lines        6783     7121     +338     
==========================================
+ Hits         5687     5993     +306     
- Misses       1096     1128      +32
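
The percentages in this report can be re-derived from the hit and line counts above. The short check below assumes that the displayed values are truncated to two decimals rather than rounded (this matches the numbers shown here, but it is an assumption about the report's formatting), and that the diff coverage equals new hits divided by new lines, which holds for this report because the PR only adds one new file. The helper name `truncated_percent` is made up for this example.

```python
def truncated_percent(part: int, whole: int) -> float:
    """Percentage truncated (not rounded) to two decimals, using integer math."""
    return (part * 10_000 // whole) / 100


before = truncated_percent(5687, 6783)  # master: hits / lines
after = truncated_percent(5993, 7121)   # after merging #407
patch = truncated_percent(5993 - 5687, 7121 - 6783)  # new hits / new lines

# Difference computed in hundredths of a percent to avoid float noise.
delta_bp = 5993 * 10_000 // 7121 - 5687 * 10_000 // 6783

print(before, after, patch, delta_bp / 100)
```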

Impacted Files	Coverage Δ
hivemind/optim/experimental/state_averager.py	`90.53% <90.53%> (ø)`
hivemind/utils/mpfuture.py	`94.03% <0.00%> (-0.92%)`	⬇️
hivemind/averaging/matchmaking.py	`84.19% <0.00%> (-0.31%)`	⬇️
hivemind/dht/node.py	`91.44% <0.00%> (+0.71%)`	⬆️

Co-authored-by: Roman Zhytar <[email protected]> Co-authored-by: Anton Sinitsin <[email protected]> Co-authored-by: Max Ryabinin <[email protected]>

tests/test_optimizer.py

justheuristic · 2021-11-15T12:38:43Z

Note to a potential @mryab : despite running 20 optimizer steps, the test runs in ~1.3s on my laptop and ~2s in the CI.

hivemind/optim/experimental/state_averager.py

Co-authored-by: Alexander Borzunov <[email protected]>

justheuristic · 2021-11-15T13:54:55Z

Note: this implements the delayed optimizer step as part of #394.

justheuristic and others added 2 commits November 8, 2021 23:21

state averager

10622fe

Co-authored-by: Roman Zhytar <[email protected]>

state averager

654b48d

Co-authored-by: Roman Zhytar <[email protected]>

justheuristic requested a review from borzunov November 8, 2021 20:26

justheuristic and others added 12 commits November 8, 2021 23:30

test load_state_from_peers

26cd8c8

black-isort

4e1b738

black-isort

6759b3a

reduce waiting time

c2efb5b

finalize PR

8dfb2e8

finalize PR

fc81833

Co-authored-by: Roman Zhytar <[email protected]> Co-authored-by: Anton Sinitsin <[email protected]> Co-authored-by: Max Ryabinin <[email protected]>

Add an option to not update gradients.

b66b6b6

more capitalization for the god of capitalization (@mryab)

6eac241

more capitalization for the god of capitalization (@mryab)

318e836

remove debugprint

6a6c33b

test increment

1b09ade

test synchronous averaging

c18f86e

borzunov reviewed Nov 15, 2021

tests/test_optimizer.py

tests/test_optimizer.py Outdated

tests/test_optimizer.py Outdated

tests/test_optimizer.py

borzunov added 3 commits November 15, 2021 15:33

review

bbeed61

review

7691a35

review

14f93a2

borzunov and others added 9 commits November 15, 2021 15:43

why so serious?

3c032dd

missing docstrings

a9fd014

missing docstrings

0fda341

missing docstrings

ef3d2b5

test sync_epoch_when_averaging

339b4a7

common kwargs

8ea6f45

docstr

f87e28c

docstr

52cccd8

docstr

364cec1

borzunov reviewed Nov 15, 2021

justheuristic and others added 7 commits November 15, 2021 16:05

Update hivemind/optim/experimental/state_averager.py

16ef974

Co-authored-by: Alexander Borzunov <[email protected]>

Update hivemind/optim/experimental/state_averager.py

98e5251

Co-authored-by: Alexander Borzunov <[email protected]>

Update hivemind/optim/experimental/state_averager.py

4b7124a

Co-authored-by: Alexander Borzunov <[email protected]>

rollback

354742f

Merge branch 'state_averaging' of github.com:learning-at-home/hivemind into state_averaging

d6addac

pep8

5ff13b5

line parity

2caff85

borzunov reviewed Nov 15, 2021

hivemind/optim/experimental/state_averager.py

hivemind/optim/experimental/state_averager.py Outdated

hivemind/optim/experimental/state_averager.py Outdated

justheuristic and others added 4 commits November 15, 2021 16:28

Update hivemind/optim/experimental/state_averager.py

4a5b26d

Co-authored-by: Alexander Borzunov <[email protected]>

line parity

0374c1d

Merge remote-tracking branch 'origin/state_averaging' into state_averaging

283cade

review

d3bb889

borzunov approved these changes Nov 15, 2021

hivemind/optim/experimental/state_averager.py Outdated

hivemind/optim/experimental/state_averager.py Outdated

hivemind/optim/experimental/state_averager.py Outdated

justheuristic and others added 3 commits November 15, 2021 16:52

Update hivemind/optim/experimental/state_averager.py

7266788

Co-authored-by: Alexander Borzunov <[email protected]>

Update hivemind/optim/experimental/state_averager.py

7074785

Co-authored-by: Alexander Borzunov <[email protected]>

Update hivemind/optim/experimental/state_averager.py

0ee16c8

Co-authored-by: Alexander Borzunov <[email protected]>

justheuristic added 2 commits November 15, 2021 16:58

review

09c5817

Merge remote-tracking branch 'origin/state_averaging' into state_averaging

cf40b94

justheuristic merged commit 02eee92 into master Nov 15, 2021

justheuristic deleted the state_averaging branch November 15, 2021 14:06

justheuristic mentioned this pull request Nov 15, 2021

Implement core functionality of hivemind.Optimizer #403

Merged

6 tasks
