Curriculum learning support #132
Thank you, @conglongli!
I added a few suggestions to the tutorial's wording and one question.
Next, I'll test it out.
Co-authored-by: Stas Bekman <[email protected]>
I added a failing test but can't push it to this PR, since your branch doesn't allow maintainer edits.
…ed-1 into conglongli/main
Just trying to make the code quicker to grasp.
Everything is working now, but we need to wait for microsoft/DeepSpeed#1443 to be merged before we can merge this.
Thank you, @conglongli
CI seems to check out the main branch's head, which is absolutely wrong. CI really needs some attention. But the tests pass, so I'm merging this.
Curriculum learning (CL) integration for the big science project. Currently it is only integrated for GPT pre-training. The PR includes an example training script and a tutorial. Before merging, @stas00 needs to test it on real big science experiments, and it probably also needs a code review from @ShadenSmith, since some of the changes relate to DeepSpeed pipeline parallelism support.
Corresponding changes needed on the DeepSpeed side: microsoft/DeepSpeed#1440
Blocking event: