Add parallelize method to GPT-neo models #11054

Closed
anamtamais opened this issue Apr 4, 2021 · 2 comments

🚀 Feature request

Add a parallelize method to the GPT-neo models, so that we can finetune them with model parallelism on less expensive GPUs.

Motivation

I want to finetune a GPT-neo model with model parallelism so that I can do it on less expensive GPUs. This is not yet implemented for GPT-neo. Since higher-end GPUs are too expensive, it would be better to distribute the model across several less expensive GPUs than to rely on a single very expensive one. It would also let us iterate with larger batches, which can have a big impact on model fitting.

I would be very glad if you could add this. I think it would enable the finetuning of special-purpose GPT-neo language models.
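To make the request concrete, here is a minimal sketch of the kind of device map a parallelize method would need: consecutive transformer blocks assigned to each GPU. The helper name `even_device_map` is hypothetical, and the 32-layer count is an assumption about the larger GPT-neo checkpoint, not something stated in this issue.

```python
from typing import Dict, List


def even_device_map(n_layers: int, n_devices: int) -> Dict[int, List[int]]:
    """Hypothetical helper: split block indices into contiguous, near-equal chunks."""
    if n_devices < 1 or n_layers < n_devices:
        raise ValueError("need at least one layer per device")
    base, extra = divmod(n_layers, n_devices)
    device_map: Dict[int, List[int]] = {}
    start = 0
    for device in range(n_devices):
        # The first `extra` devices take one additional block each.
        size = base + (1 if device < extra else 0)
        device_map[device] = list(range(start, start + size))
        start += size
    return device_map


# Assumed: 32 transformer blocks spread over 4 GPUs.
device_map = even_device_map(n_layers=32, n_devices=4)
```

A dict of this shape (device id mapped to a list of block indices) is what the existing naive model parallelism for GPT-2 and T5 takes as its device map; GPT-neo has no equivalent method, which is what this request is about.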

Contributor

stas00 commented Apr 5, 2021

Hi @anamtamais,

We decided that it's not worth investing time in porting the naive MP solution to models beyond t5+gpt2, since that solution doesn't scale well resource-wise. ZeRO, which has 2 existing implementations (fairscale and DeepSpeed), is by far the more scalable solution, in particular now that ZeRO stage 3 has been released. You don't need high-end GPUs for ZeRO.
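For a rough sense of why high-end GPUs are not required, here is a back-of-the-envelope estimate of model-state memory per GPU. It assumes mixed-precision Adam at about 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer states) and ignores activations and temporary buffers. The function name and the 2.7e9 parameter count are illustrative assumptions.

```python
BYTES_PER_PARAM = 2 + 2 + 12  # fp16 weights + fp16 grads + fp32 Adam states


def model_state_gib(n_params: float, n_gpus: int, partitioned: bool) -> float:
    """Per-GPU model-state memory in GiB (activations not included).

    partitioned=False: every GPU holds a full replica (plain data parallel).
    partitioned=True: states are sharded across GPUs, as in ZeRO stage 3.
    """
    total_bytes = n_params * BYTES_PER_PARAM
    per_gpu_bytes = total_bytes / n_gpus if partitioned else total_bytes
    return per_gpu_bytes / 2**30


n_params = 2.7e9  # assumed size of the larger GPT-neo model
replicated = model_state_gib(n_params, n_gpus=4, partitioned=False)
sharded = model_state_gib(n_params, n_gpus=4, partitioned=True)
print(f"full replica per GPU: {replicated:.1f} GiB")
print(f"ZeRO-3 shard per GPU: {sharded:.1f} GiB")
```

Under these assumptions the per-GPU footprint for model states shrinks linearly with the number of GPUs, while all GPUs stay busy on their own slice of the batch. Naive MP also splits the weights, but only one GPU is computing at any given time.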

We have everything ready on our side (#10753) and are just waiting for the DeepSpeed team to merge several PRs and make a new release. If you want to try it right away, you can use the 2 branches posted in #11044.
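As a starting point once the integration is available, a minimal DeepSpeed config that turns on ZeRO stage 3 might look like the sketch below. Only a few top-level keys are shown, the batch-size values are placeholders, and the exact set of supported options depends on the DeepSpeed release you install, so treat this as an assumption to check against the DeepSpeed docs.

```python
import json

# Minimal sketch of a DeepSpeed config enabling ZeRO stage 3.
# Values are placeholders, not tuned settings.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 3},
}

# Serialize to JSON; save this text to a file such as ds_config.json.
config_text = json.dumps(ds_config, indent=2)
print(config_text)
```

The saved file would then be passed to the training script through the Trainer's `--deepspeed` argument, with the script launched by the `deepspeed` launcher.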

Also there are notes comparing the different scalability solutions here: #9766

If you have any questions please let me know.


github-actions bot commented May 5, 2021

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.


2 participants