
Support ZeRO-Offload #337

Closed
Spenhouet opened this issue Jan 28, 2021 · 4 comments
Labels: enhancement (New feature or request), in_progress (this issue is being worked on)

@Spenhouet

🚀 Feature

DeepSpeed now has the ZeRO-Offload functionality which seems very powerful: https://www.deepspeed.ai/tutorials/zero-offload/

Motivation

I would like to make use of this functionality via PyTorch Lightning, but PyTorch Lightning seems to depend on fairscale for ZeRO, so I would guess that fairscale would first need to support ZeRO-Offload.

Pitch

Please add support for ZeRO-Offload.

@blefaudeux
Contributor

Planned and work in progress :)

@msbaines msbaines added the enhancement New feature or request label Feb 9, 2021
@blefaudeux blefaudeux added the in_progress this issue is being worked on label Mar 5, 2021
@ibro45

ibro45 commented Mar 28, 2021

Hi, I see that a PR for ZeRO-Offload has been implemented for Sequential models in #432 and that it will be in the next release. However, do you expect to have an implementation that works with non-Sequential models soon too? Thank you!

@anj-s
Contributor

anj-s commented Mar 30, 2021

> Hi, I see that a PR for ZeRO-Offload has been implemented for Sequential models in #432 and that it will be in the next release. However, do you expect to have an implementation that works with non-Sequential models soon too? Thank you!

Work in progress :) Will update this issue with an ETA soon. Thanks!

@anj-s
Contributor

anj-s commented Oct 18, 2021

Closing this since ZeRO-Offload is available as part of the FSDP API in FairScale. OffloadModel currently works for sequential models, but you can try using it for non-sequential models via the auto-shard functionality. This is an experimental feature, and its success in sharding your model depends largely on how compatible the forward pass is with torch.fx tracing.
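
For reference, a minimal sketch of the sequential-model path, following the FairScale OffloadModel tutorial. The argument names are taken from that tutorial and may differ between releases; the layer sizes and training loop are purely illustrative:

```python
import torch
from fairscale.experimental.nn.offload import OffloadModel

# A plain nn.Sequential model, which OffloadModel can slice and offload to CPU.
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 1024),
)

offload_model = OffloadModel(
    model=model,                         # model to split into slices
    device=torch.device("cuda"),         # device used for compute
    offload_device=torch.device("cpu"),  # device where idle slices are kept
    num_slices=3,                        # number of model slices
    checkpoint_activation=True,          # recompute activations to save memory
    num_microbatches=1,                  # micro-batching of the input
)

optimizer = torch.optim.SGD(offload_model.parameters(), lr=0.001)

inputs = torch.randn(8, 1024).cuda()
labels = torch.randn(8, 1024).cuda()

output = offload_model(inputs)
loss = torch.nn.functional.mse_loss(output, labels)
loss.backward()
optimizer.step()
```

For the FSDP route, CPU offload of parameters is controlled by FSDP's offload flag; the exact flag name has changed across FairScale versions, so check the docs for the release you are using.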
