Remove the HivemindStrategy
#16407
⛈️ Required checks status: Has failure 🔴
Groups summary
🟢 pytorch_lightning: Tests workflow
🟢 pytorch_lightning: Azure GPU
🟢 pytorch_lightning: Benchmarks
🟢 pytorch_lightning: Azure HPU
🟢 pytorch_lightning: Azure IPU
🟢 pytorch_lightning: Docs
🔴 pytorch_lightning: Docker
🟢 mypy
🟢 install
Thank you for your contribution! 💜
I vote for moving it into a separate repository, as it is the only thing we have that supports training on heterogeneous clusters of variable sizes (spot instances).
In addition to @justusschock's suggestion, this could also make a great tutorial on how to build a custom strategy that integrates a library like this, perhaps in the context of Fabric.
I think it should be preserved in a separate repo for the time being.
Marking as a draft until there's a decision.
I also vote for moving it out (and having it in ecosystem-ci), and I second @awaelchli's suggestion about demonstrating how to maintain a strategy for others who will want or need to do it.
Remove the collaborative strategy
What does this PR do?
Removes the HivemindStrategy().
The code for this strategy is conveniently self-contained. We will move the implementation to a separate repository (TBD), where support for newer PL versions will be tested.
Related Lightning-Universe/lightning-Hivemind#15 (to be transferred)
Does your PR introduce any breaking changes? If yes, please list them.
Removes the HivemindStrategy().
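For downstream code affected by this breaking change, one option is to check whether the installed version still ships the strategy before using it. The following is a minimal sketch, not part of this PR. It assumes the class was exported from pytorch_lightning.strategies in releases before the removal, and load_hivemind_strategy is a hypothetical helper name. The new home of the strategy is still TBD, so no fallback import is shown.

```python
import importlib


def load_hivemind_strategy():
    """Return the HivemindStrategy class if the installed PL still ships it, else None.

    Hypothetical helper for illustration; it is not part of Lightning.
    """
    try:
        # Assumption: releases before this PR export the class from this module.
        strategies = importlib.import_module("pytorch_lightning.strategies")
    except ImportError:
        # pytorch_lightning is not installed (or its import failed).
        return None
    # After the removal the attribute no longer exists, so default to None.
    return getattr(strategies, "HivemindStrategy", None)


def describe_availability():
    """Report in one sentence whether the strategy can be used in this environment."""
    strategy_cls = load_hivemind_strategy()
    if strategy_cls is None:
        return "HivemindStrategy is not available in this environment"
    return "HivemindStrategy is available as " + strategy_cls.__name__
```

Callers can branch on the return value of load_hivemind_strategy() and fail early with a clear message instead of hitting an ImportError in the middle of a training script.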
cc @justusschock @awaelchli @Borda