deepspeed-chat: Support zero3 params initialization in the last LN (#839)

Zero3 requires gathering partitioned parameters before they can be accessed. We enable that mechanism for the initialization of the last LN (LayerNorm) weight and bias.

Co-authored-by: Olatunji Ruwase <[email protected]>
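
The gather-then-initialize pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the helper names `gather_context` and `reset_last_layer_norm` and the `rank` argument are hypothetical, and the sketch assumes the last LN exposes `weight` and `bias` tensors. `deepspeed.zero.GatheredParameters` with `modifier_rank` is DeepSpeed's context manager for temporarily gathering partitioned parameters; it is imported lazily here so the non-zero3 path works without DeepSpeed installed.

```python
from contextlib import nullcontext


def gather_context(params, zero_stage):
    """Return a context manager under which `params` are fully accessible.

    Hypothetical helper: only ZeRO stage 3 partitions the parameters
    themselves, so other stages need no gathering.
    """
    if zero_stage == 3:
        # Lazy import so the sketch also runs where DeepSpeed is absent.
        import deepspeed

        # modifier_rank=0: in-place changes made by rank 0 inside the
        # context are broadcast to the other ranks when it exits.
        return deepspeed.zero.GatheredParameters(params, modifier_rank=0)
    return nullcontext()


def reset_last_layer_norm(layer_norm, zero_stage, rank=0):
    """Reset the LN weight to ones and the bias to zeros.

    `rank` is assumed to be this process's rank, e.g. obtained from
    `deepspeed.comm.get_rank()` in a distributed run.
    """
    params = [layer_norm.weight, layer_norm.bias]
    with gather_context(params, zero_stage):
        # Under zero3 only the modifier rank writes; otherwise every
        # process holds the full parameter and initializes it locally.
        if zero_stage != 3 or rank == 0:
            layer_norm.weight.data.fill_(1.0)
            layer_norm.bias.data.zero_()
```

With `zero_stage` set to anything other than 3, the call reduces to an ordinary in-place reset of a full (unpartitioned) LayerNorm. Under zero3, accessing the weight and bias outside the gathered context would only see this rank's partition placeholder, which is why the initialization has to happen inside it.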