Support ZeRO-Offload #337
🚀 Feature
DeepSpeed now has the ZeRO-Offload functionality, which seems very powerful: https://www.deepspeed.ai/tutorials/zero-offload/
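For reference, a minimal sketch of what enabling ZeRO-Offload looks like on the DeepSpeed side, based on the tutorial linked above. The config values here (batch size, model, stage) are illustrative, and the exact flag names vary across DeepSpeed versions (newer releases use an `offload_optimizer` block instead of `cpu_offload`), so check the docs for your version:

```python
# Sketch: enabling ZeRO-Offload through DeepSpeed's config dict.
# Values are illustrative, not a recommended setup.
import torch
import deepspeed

ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,           # ZeRO stage 2 partitions optimizer state and gradients
        "cpu_offload": True,  # ZeRO-Offload: keep optimizer state/step on the CPU
    },
}

model = torch.nn.Linear(1024, 1024)  # placeholder model for illustration

# deepspeed.initialize wraps the model and builds the ZeRO optimizer
# from the config; it returns (engine, optimizer, dataloader, scheduler).
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```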
Motivation
I would like to make use of this functionality via PyTorch Lightning, but PyTorch Lightning seems to depend on fairscale for ZeRO, so I would guess that fairscale would first need to support ZeRO-Offload.
Pitch
Please add support for ZeRO-Offload.
Comments

Planned and work in progress :)

Hi, I see that a PR for ZeRO-Offload has been implemented for sequential models (#432) and that it will be in the next release. However, do you expect an implementation that works with non-sequential models soon too? Thank you!

Work in progress :) Will update this issue with an ETA soon. Thanks!

Closing this since ZeRO-Offload is available as part of the FSDP API in FairScale. OffloadModel currently works for sequential models, but you can try it with non-sequential models via the auto-shard functionality. This is an experimental feature, and its success in sharding your model depends heavily on how compatible the forward pass is with torch.fx tracing. A sketch of the OffloadModel usage is shown below.
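As a rough sketch of what the closing comment describes, wrapping a sequential model in FairScale's experimental OffloadModel might look like the following. The model, shapes, and parameter values are illustrative, and since the API is experimental it may differ between FairScale versions:

```python
# Sketch: wrapping an nn.Sequential model with FairScale's experimental
# OffloadModel (illustrative values; the API may change between versions).
import torch
import torch.nn as nn
from fairscale.experimental.nn.offload import OffloadModel

model = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.ReLU(),
    nn.Linear(4096, 1024),
)

offload_model = OffloadModel(
    model=model,                         # sequential models are supported directly
    device=torch.device("cuda"),         # device where the active shard computes
    offload_device=torch.device("cpu"),  # device where idle shards are parked
    num_slices=3,                        # number of shards to split the model into
    checkpoint_activation=True,          # recompute activations to save memory
    num_microbatches=1,
)

out = offload_model(torch.randn(8, 1024).cuda())
```

For non-sequential models, the auto-shard path mentioned in the closing comment reportedly traces the model with torch.fx before slicing it, so whether it works depends on the forward pass being traceable (e.g., no data-dependent control flow).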