Fix offloaded optimizer with single peer #450
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #450      +/-   ##
==========================================
+ Coverage   83.72%   84.10%   +0.37%
==========================================
  Files          78       78
  Lines        7928     7931       +3
==========================================
+ Hits         6638     6670      +32
+ Misses       1290     1261      -29
@@ -618,6 +618,12 @@ def _load_averaged_gradients_into_optimizer_(self):

        self.grad_averager.notify_used_averaged_gradients()

    def _load_local_gradients_into_optimizer(self):
        """Fallback to using local gradients in the optimizer (instead of averaged gradients)"""
        logger.log(self.status_loglevel, f"Proceeding with local gradients")
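For orientation, here is a minimal sketch of the control flow this new method slots into. The class and method bodies below are illustrative stand-ins, not hivemind's actual implementation; only the method names mirror the diff above.

import logging

logger = logging.getLogger(__name__)
STATUS_LOGLEVEL = logging.INFO  # hivemind exposes this as a configurable status_loglevel


class OffloadedOptimizerSketch:
    """Illustrative stand-in for the relevant slice of hivemind.Optimizer (not the real class)."""

    def _apply_gradients(self, averaging_succeeded: bool) -> None:
        # With offload_optimizer=True, the optimizer steps on separate offloaded
        # buffers, so gradients must be copied into those buffers before each step.
        if averaging_succeeded:
            self._load_averaged_gradients_into_optimizer_()
        else:
            # The fallback added in this PR: if averaging did not run (e.g. a lone
            # peer had nobody to average with), reuse the peer's own local gradients.
            self._load_local_gradients_into_optimizer()

    def _load_averaged_gradients_into_optimizer_(self) -> None:
        logger.log(STATUS_LOGLEVEL, "Using averaged gradients")  # placeholder body

    def _load_local_gradients_into_optimizer(self) -> None:
        """Fallback to using local gradients in the optimizer (instead of averaged gradients)"""
        logger.log(STATUS_LOGLEVEL, "Proceeding with local gradients")  # placeholder body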
Please add a comment that this can be optimized in case of one peer (if we'd ever need to optimize this case).
just did it
The bug was originally found by @elricwan and @finger92 (seemingly independently) and reported in #447
Here's what caused the issue:
The bug was introduced in #440 and affects the following setups (all 3 must be true):
Here's the behavior after the fix is introduced:
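For reference, a single-peer setup that exercises this code path might look like the sketch below. The run_id, batch sizes, and toy model are illustrative, and this is not the exact reproduction from #447; the key ingredients are offload_optimizer=True and a peer with nobody to average with.

import torch
import hivemind

# A lone peer: it bootstraps its own DHT and will find no other peers to average with.
dht = hivemind.DHT(start=True)
model = torch.nn.Linear(16, 2)

opt = hivemind.Optimizer(
    dht=dht,
    run_id="single_peer_demo",                          # illustrative run id
    params=model.parameters(),
    optimizer=lambda params: torch.optim.Adam(params),
    offload_optimizer=True,                             # the setting affected by the bug
    target_batch_size=32,
    batch_size_per_step=4,
)

for _ in range(16):
    loss = model(torch.randn(4, 16)).pow(2).mean()
    loss.backward()
    opt.step()       # with the fix, a skipped averaging round falls back to local gradients
    opt.zero_grad()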