deepspeed-chat: handle overflow for bf16_optimizer
DeepSpeed's bf16_optimizer does not have an overflow attribute.
This is OK because bf16 has the same exponent range as fp32 and is not
expected to overflow.
Therefore, for bf16, always report no overflow.

Change-Id: I66a2204f3af81e52e7fa8d024afafdbbc7494327
Signed-off-by: Moshe Island <[email protected]>
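
The range claim in the commit message can be checked with a little arithmetic. The following is a minimal sketch, not part of the commit: it assumes the standard bit layouts (fp16: 5 exponent and 10 mantissa bits, bf16: 8 and 7, fp32: 8 and 23), and the helper name `max_finite` is made up for illustration.

```python
def max_finite(exponent_bits: int, mantissa_bits: int) -> float:
    """Largest finite value of an IEEE-754-style binary float format."""
    # The all-ones exponent is reserved for inf/NaN, so the largest usable
    # unbiased exponent equals the bias, 2**(exponent_bits - 1) - 1.
    max_exponent = 2 ** (exponent_bits - 1) - 1
    # Largest significand is 1.111...1 in binary, i.e. 2 - 2**-mantissa_bits.
    return (2 - 2.0 ** -mantissa_bits) * 2.0 ** max_exponent


# (exponent bits, mantissa bits) per format; assumed standard layouts.
FORMATS = {"fp16": (5, 10), "bf16": (8, 7), "fp32": (8, 23)}
MAX_FINITE = {name: max_finite(e, m) for name, (e, m) in FORMATS.items()}

for name, value in MAX_FINITE.items():
    print(f"{name}: {value:.4e}")
```

fp16 tops out at 65504, which is why fp16 training needs loss scaling and overflow tracking. bf16 shares fp32's 8 exponent bits, so its largest finite value is within about half a percent of fp32's and overflow is not a practical concern.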
mosheisland committed Oct 3, 2023
1 parent 58e4e9c commit e444aa8
Showing 1 changed file with 5 additions and 0 deletions.
@@ -244,6 +244,11 @@ def train_rlhf(self, inputs):
         return actor_loss, critic_loss

     def get_overflow(self):
+        # Overflow is not expected when using bf16
+        # Therefore, DeepSpeed's BF16_Optimizer does not maintain an overflow indication
+        if self.args.dtype == "bf16":
+            return False, False
+
         actor_overflow = self.actor_model.optimizer.overflow
         critic_overflow = self.critic_model.optimizer.overflow
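
The behavior of the patched method can be tried out in isolation. This is a hedged sketch, not the code from this commit: `get_overflow` is rewritten as a free function, the `SimpleNamespace` objects are hypothetical stand-ins for DeepSpeed's optimizers, and `get_overflow_by_attribute` is an alternative design that the commit does not use.

```python
from types import SimpleNamespace


def get_overflow(dtype, actor_optimizer, critic_optimizer):
    """Mirror of the patched logic: bf16 always reports no overflow."""
    if dtype == "bf16":
        # The bf16 optimizer keeps no overflow flag, so do not read one.
        return False, False
    return actor_optimizer.overflow, critic_optimizer.overflow


def get_overflow_by_attribute(actor_optimizer, critic_optimizer):
    """Alternative (assumption, not in the commit): fall back to False
    whenever an optimizer lacks an `overflow` attribute."""
    return (
        getattr(actor_optimizer, "overflow", False),
        getattr(critic_optimizer, "overflow", False),
    )


# Hypothetical stand-ins: an fp16-style optimizer exposes `overflow`,
# a bf16-style optimizer does not.
fp16_opt = SimpleNamespace(overflow=True)
bf16_opt = SimpleNamespace()

print(get_overflow("fp16", fp16_opt, fp16_opt))
print(get_overflow("bf16", bf16_opt, bf16_opt))
print(get_overflow_by_attribute(fp16_opt, bf16_opt))
```

Checking the configured dtype, as the commit does, keeps the intent explicit. The attribute-based fallback is shorter, but it would also silently hide a missing flag on an optimizer that is expected to have one.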
