deepspeed-chat: handle overflow for bf16_optimizer #745

Merged: 1 commit, Oct 3, 2023

mosheisland
Contributor

DeepSpeed's bf16_optimizer does not have an overflow attribute. This is fine because the bf16 dtype has the same range as fp32 and is not expected to overflow.
Therefore, for bf16, always report no overflow.

Change-Id: I66a2204f3af81e52e7fa8d024afafdbbc7494327
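
For readers skimming the thread, the idea can be sketched roughly as below. This is not the merged diff: `optimizer_overflowed`, `FakeFp16Optimizer`, and `FakeBf16Optimizer` are hypothetical names used only for illustration, and the stand-in classes model just the one property described above (the fp16 optimizer exposes an `overflow` attribute, the bf16 one does not).

```python
class FakeFp16Optimizer:
    """Stand-in for an fp16 optimizer that tracks gradient overflow."""

    def __init__(self, overflow: bool):
        self.overflow = overflow


class FakeBf16Optimizer:
    """Stand-in for a bf16 optimizer: deliberately has no `overflow` attribute."""


def optimizer_overflowed(optimizer, dtype: str) -> bool:
    """Return True if the optimizer reports an overflow on the last step.

    Hypothetical helper: for bf16 it always reports no overflow, because
    the optimizer does not maintain that attribute.
    """
    if dtype == "bf16":
        # Overflow is not expected with bf16, so do not touch the attribute.
        return False
    # For other dtypes, read the flag and default to False if it is missing.
    return bool(getattr(optimizer, "overflow", False))


if __name__ == "__main__":
    print(optimizer_overflowed(FakeFp16Optimizer(overflow=True), "fp16"))
    print(optimizer_overflowed(FakeBf16Optimizer(), "bf16"))
```

Checking the dtype first means a training loop never reads a missing attribute, which would otherwise raise `AttributeError` when bf16 is enabled.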

@mosheisland
Contributor Author

The formatting error is not due to this commit.
"applications/DeepSpeed-Chat/training/utils/ds_utils.py:6:1: F401 'torch' imported but unused"

@tjruwase
Contributor

tjruwase commented Oct 2, 2023

The formatting error is not due to this commit. "applications/DeepSpeed-Chat/training/utils/ds_utils.py:6:1: F401 'torch' imported but unused"

Yes, you are correct. I actually have a pending PR to take care of this.

DeepSpeed's bf16_optimizer does not have an overflow attribute.
This is fine because the bf16 dtype has the same range as fp32 and is not
expected to overflow.
Therefore, for bf16, always report no overflow.

Change-Id: I66a2204f3af81e52e7fa8d024afafdbbc7494327
Signed-off-by: Moshe Island <[email protected]>
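
The range claim in the commit message can be sanity-checked with a small standalone sketch. It relies only on the standard bit layouts (bf16 is the upper 16 bits of an fp32 value, so it keeps the 8-bit exponent; fp16 has a 5-bit exponent). The helper name `bits_to_f32` and the constants are illustrative, not part of DeepSpeed.

```python
import struct


def bits_to_f32(bits: int) -> float:
    """Interpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


# Largest finite fp32: exponent 0xFE, all 23 mantissa bits set.
FP32_MAX = bits_to_f32(0x7F7FFFFF)

# Largest finite bf16: same sign/exponent layout as fp32, only 7 mantissa
# bits. Padding the 16-bit pattern with zeros gives the equivalent fp32 value.
BF16_MAX = bits_to_f32(0x7F7F << 16)

# Largest finite fp16: 5 exponent bits, 10 mantissa bits.
FP16_MAX = (2 - 2**-10) * 2**15

if __name__ == "__main__":
    print(f"fp32 max: {FP32_MAX:.4e}")
    print(f"bf16 max: {BF16_MAX:.4e}")
    print(f"fp16 max: {FP16_MAX:.4e}")
```

The bf16 and fp32 maxima are both on the order of 1e38, while fp16 tops out at 65504, which is why overflow tracking matters for fp16 but not for bf16.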
@tjruwase tjruwase merged commit 2f99dcd into microsoft:master Oct 3, 2023
2 checks passed
@mosheisland mosheisland deleted the 2_overflow_bf16 branch October 4, 2023 06:59
2 participants