[Deepspeed zero3] lazy weights init #12272

Open
stas00 opened this issue Jun 20, 2021 · 0 comments
Labels: DeepSpeed, WIP


stas00 commented Jun 20, 2021

I'm pretty sure we need to follow up on the lazy weights init feature (#11471)
and, under ZeRO-3, wrap the weight initialization in deepspeed.zero.GatheredParameters here (or inside _init_weights):

https://github.com/huggingface/transformers/pull/11471/files#diff-6b72b98c4c2dcfc6cc606843917733f5d858374fbc22a735ff483bbc0c1e63eaR1275-R1276

We also need a test for this.
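
A rough sketch of what that wrapping could look like, not the actual patch. Under ZeRO-3 each rank holds only a shard of every parameter, so an in-place init has to gather the full tensors first. The helper names `gathered_params_context` and `init_missing_weights`, and the `zero3_enabled` flag, are hypothetical and used only for illustration; `deepspeed.zero.GatheredParameters` with `modifier_rank` is the real DeepSpeed context manager. The sketch assumes each module exposes a PyTorch-style `parameters(recurse=False)`.

```python
from contextlib import nullcontext


def gathered_params_context(params, zero3_enabled, modifier_rank=0):
    """Return a context in which `params` can be initialized in place.

    Hypothetical helper: outside ZeRO-3 the parameters are already whole,
    so a no-op context is returned.
    """
    if not zero3_enabled:
        return nullcontext()
    # Imported lazily so the helper also works where DeepSpeed is not installed.
    import deepspeed

    # With modifier_rank=0, changes made on rank 0 are propagated to the other
    # ranks when the context exits and the parameters are partitioned again.
    return deepspeed.zero.GatheredParameters(list(params), modifier_rank=modifier_rank)


def init_missing_weights(modules, init_fn, zero3_enabled=False):
    """Apply `init_fn` to each module, gathering its own parameters first if needed."""
    for module in modules:
        # Assumes a torch.nn.Module-like object.
        params = module.parameters(recurse=False)
        with gathered_params_context(params, zero3_enabled):
            init_fn(module)
```

In a model this would be called with the list of modules whose weights were not loaded from the checkpoint and with `init_fn` set to the model's `_init_weights`. Gathering one module at a time keeps peak memory lower than gathering the whole model at once.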

stas00 self-assigned this Jun 20, 2021
stas00 added the WIP label Jul 21, 2021
huggingface deleted a comment from github-actions bot Jul 21, 2021