
[ENHANCEMENT] Investigate fp8 Training #23

Open · 3 tasks
Quentin-Anthony opened this issue Nov 27, 2023 · 0 comments

@Quentin-Anthony (Collaborator):
fp8 would be great to use, but we're still seeing comparatively high loss with it relative to bf16. I would like to give it another shot with MS-AMP.

This task is to take a 1p3B-8E config (the same as our bf16 baseline) and try to get it running efficiently in fp8:

  • Convert our 1p3B-8E config to use fp8 and run for 5-10k iterations (however long it takes to see degradation); see the Transformer Engine sketch below
  • Investigate https://arxiv.org/abs/2310.18313 and https://github.com/Azure/MS-AMP to estimate how much effort it would take to apply to our MLM; see the MS-AMP sketch below
  • If it's feasible, apply MS-AMP to our MLM and check whether we can match bf16 validation loss over 5-10k iterations; see the loss-comparison sketch below
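
For the first bullet, a minimal sketch of what the fp8 path could look like, assuming the training loop goes through NVIDIA Transformer Engine's PyTorch API; the layer size and recipe values below are illustrative assumptions, not our actual 1p3B-8E settings:

```python
# Sketch: wrap compute in Transformer Engine's fp8 autocast with a
# delayed-scaling recipe. Sizes and recipe values are placeholders.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# HYBRID format uses E4M3 for forward-pass tensors and E5M2 for gradients;
# scaling factors come from a rolling history of per-tensor amax values.
fp8_recipe = recipe.DelayedScaling(
    margin=0,
    fp8_format=recipe.Format.HYBRID,
    amax_history_len=16,
    amax_compute_algo="max",
)

layer = te.Linear(4096, 4096, bias=True).cuda()
inp = torch.randn(128, 4096, device="cuda")

# Only modules executed inside this context run their GEMMs in fp8; master
# weights and the optimizer remain in higher precision.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = layer(inp)
out.sum().backward()
```

If the degradation we saw before is scaling-related, the recipe knobs (`margin`, `amax_history_len`, `amax_compute_algo`) are the first things worth sweeping before reaching for MS-AMP.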
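For the MS-AMP bullets, the integration surface looks small: the repo's documented entry point wraps an existing model/optimizer pair in place. A minimal sketch, assuming a plain PyTorch module as a stand-in for our MLM (the opt_level summaries in the comments paraphrase the MS-AMP README):

```python
# Sketch: MS-AMP's initialize() swaps in fp8-aware weights and optimizer
# state. The toy model below is a stand-in for our MLM, not its real code.
import torch
import msamp

model = torch.nn.Linear(4096, 4096).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# opt_level controls how aggressively fp8 is used (roughly): O1 puts weight
# gradients and their all-reduce in fp8; O2 additionally stores optimizer
# states in low precision; O3 targets distributed frameworks like DeepSpeed.
model, optimizer = msamp.initialize(model, optimizer, opt_level="O2")

inp = torch.randn(128, 4096, device="cuda")
model(inp).sum().backward()
optimizer.step()
optimizer.zero_grad()
```

Estimating the effort for our MLM mostly comes down to whether our model and optimizer survive this wrapping untouched, or whether the MoE layers need special handling.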
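For the final comparison, diffing logged validation losses should answer the "can we match bf16" question; a sketch assuming hypothetical per-run CSV logs with `step,val_loss` rows (the filenames and format are made up for illustration):

```python
# Sketch: compare fp8 vs bf16 validation-loss logs and print the relative
# gap per step. Log filenames and the "step,val_loss" format are assumptions.
import csv

def load_losses(path):
    with open(path) as f:
        return {int(step): float(loss) for step, loss in csv.reader(f)}

bf16 = load_losses("val_loss_bf16.csv")
fp8 = load_losses("val_loss_fp8.csv")

# A gap that persists (or grows) over 5-10k iterations is the degradation
# the first bullet is watching for.
for step in sorted(bf16.keys() & fp8.keys()):
    gap = (fp8[step] - bf16[step]) / bf16[step]
    print(f"step {step}: bf16={bf16[step]:.4f}  fp8={fp8[step]:.4f}  gap={gap:+.2%}")
```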

@yury-tokpanov -- I would like you to lead this, but I'll be closely involved.
