ParameterAveragingOptimizer: support scheduler #239

Closed · 12 tasks
justheuristic opened this issue Apr 21, 2021 · 1 comment
Labels: enhancement (New feature or request)
justheuristic commented Apr 21, 2021

  • switch self.averager from DecentralizedAverager to TrainingAverager
    (see its parameters:
    :param average_parameters: whether or not to average model parameters in self.step(...)
    :param average_gradients: whether or not to average model gradients in self.step(...)
    :param average_opt_statistics: if specified, average optimizer statistics with the corresponding names in the state dict
    )
  • add an .epoch attribute to track local progress
  • add an optional scheduler parameter
  • make a background thread triggered by an event on each .step (see progress_reporter for an example, and the sketch after this list)
    • AFTER each step, add +1 to local steps and trigger the progress event for the progress reporter
    • this thread should report (current steps, epoch) to the collaboration
      • use f"{self.prefix}.progress" as the key and self.averager.endpoint as the subkey
    • fetch steps and epoch from all the peers (see fetch_collaboration_state)
    • if some peer has a higher epoch than ours, adopt that epoch
    • if the peers on the latest epoch have collectively accumulated T steps, self.epoch += 1
  • update the scheduler to the current epoch
  • save/load epoch with state_dict/load_state_dict
  • if the remote epoch is above the current epoch by more than M (a parameter, e.g. 3), call load_state_from_peers
  • optionally rename simple.py to something more appropriate, e.g. asgd.py or averaged_opt.py (other ideas welcome)
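For concreteness, here is a minimal, non-authoritative sketch of the epoch/progress bookkeeping outlined above. The class and constructor shape, the target_steps_per_epoch ("T") and max_epoch_gap ("M") names, and the exact DHT value layout are illustrative assumptions; only the key/subkey scheme comes from the issue text, and the TrainingAverager / load_state_from_peers calls are assumed to match hivemind's internals of that era.

    import threading
    from hivemind.utils import get_dht_time  # assumed import path

    class ParameterAveragingOptimizer:
        """Sketch: wrap a torch optimizer, track (local steps, epoch) via the DHT."""

        def __init__(self, opt, averager, dht, prefix, target_steps_per_epoch,
                     scheduler=None, max_epoch_gap=3):
            self.opt, self.averager, self.dht, self.prefix = opt, averager, dht, prefix
            self.scheduler = scheduler
            self.local_steps, self.epoch = 0, 0
            self.target_steps_per_epoch = target_steps_per_epoch  # "T" from the issue
            self.max_epoch_gap = max_epoch_gap                    # "M" from the issue
            self._step_event = threading.Event()
            threading.Thread(target=self._report_progress, daemon=True).start()

        def step(self, *args, **kwargs):
            loss = self.opt.step(*args, **kwargs)
            self.local_steps += 1   # AFTER each step: +1 to local steps ...
            self._step_event.set()  # ... and wake the progress reporter
            return loss

        def _report_progress(self):
            while True:
                self._step_event.wait()
                self._step_event.clear()
                # report (current steps, epoch) to the collaboration,
                # one subkey per peer, under the key suggested in the issue
                self.dht.store(key=f"{self.prefix}.progress",
                               subkey=self.averager.endpoint,
                               value=(self.local_steps, self.epoch),
                               expiration_time=get_dht_time() + 60)
                self._sync_epoch()

        def _sync_epoch(self):
            # fetch (steps, epoch) from all peers; with subkeys, DHT.get returns
            # a dict of subkey -> ValueWithExpiration (layout assumed here)
            found = self.dht.get(f"{self.prefix}.progress", latest=True)
            if found is None or not found.value:
                return
            progress = [entry.value for entry in found.value.values()]
            remote_epoch = max(epoch for _, epoch in progress)
            if remote_epoch > self.epoch + self.max_epoch_gap:
                self.load_state_from_peers()  # we fell too far behind; catch up
            self.epoch = max(self.epoch, remote_epoch)
            steps_at_latest = sum(steps for steps, epoch in progress if epoch == self.epoch)
            if steps_at_latest >= self.target_steps_per_epoch:
                self.epoch += 1
            if self.scheduler is not None:
                while self.scheduler.last_epoch < self.epoch:  # update scheduler to current epoch
                    self.scheduler.step()

        def state_dict(self):
            return dict(optimizer=self.opt.state_dict(), epoch=self.epoch)

        def load_state_dict(self, state):
            self.opt.load_state_dict(state["optimizer"])
            self.epoch = state["epoch"]

        def load_state_from_peers(self):
            self.averager.load_state_from_peers()  # delegated; exact semantics assumed

Waking the reporter with an event (rather than polling) keeps .step itself cheap: all DHT traffic and epoch arithmetic happen off the training thread.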
justheuristic (author) commented:

The functionality was originally implemented by @xtinkt in #252, then reintegrated by me in #400.

The exact behavior of ParameterAveragingOptimizer can be achieved as follows:

hivemind.Optimizer(..., use_local_updates=True, target_batch_size=AVERAGE_EVERY_THIS_MANY_SAMPLES)

For a more detailed explanation, please refer to
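For reference, a hedged end-to-end sketch of the snippet above: the toy model, run_id, and batch sizes are illustrative, while the keyword arguments themselves follow hivemind.Optimizer's documented API.

    import torch
    import hivemind

    model = torch.nn.Linear(16, 2)  # toy model, for illustration only
    dht = hivemind.DHT(start=True)  # a real run would pass initial_peers=...

    opt = hivemind.Optimizer(
        dht=dht,
        run_id="my_run",  # hypothetical experiment name
        optimizer=lambda params: torch.optim.SGD(params, lr=0.1),
        params=model.parameters(),
        use_local_updates=True,   # apply updates locally, average parameters in the background
        target_batch_size=4096,   # AVERAGE_EVERY_THIS_MANY_SAMPLES from the snippet above
        batch_size_per_step=32,   # samples contributed by each local step
    )

    # one local step; parameter averaging kicks in once the collaboration
    # has collectively accumulated target_batch_size samples
    loss = model(torch.randn(32, 16)).pow(2).mean()
    loss.backward()
    opt.step()
    opt.zero_grad()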
