Plans for mpi-operator in Kubeflow 0.5? #66
Comments
We are planning some work internally (at NVIDIA) to better support
gang scheduling, bare-metal and networking performance, etc. Not sure how that
would fit into the Kubeflow release schedule. What's the target date for
0.5?
…On Mon, Jan 7, 2019 at 5:42 AM Jeremy Lewi ***@***.***> wrote:
What are the plans for the mpi-operator in 0.5?
/cc @everpeace <https://github.com/everpeace> @rongou
<https://github.com/rongou>
|
@rongou, are you going to write a new implementation of gang scheduling or leverage kube-batch? |
This work is mostly being done by the NVIDIA GPU Cloud (NGC) team. They have an internal scheduler, but they are also looking at kube-batch, so I guess it's still to be determined. |
Got that; if there's anything I can help with, please let me know :) |
BTW, if the scheduler is internal, how would others use it? |
Not sure if these will fit in Kubeflow 0.5, but I just want to add a couple of related issues for discussion here:
|
Probably, some users using no GPU would like to configure custom |
@k82cn they have plans to eventually open source it, but for now it's on NGC only. |
Got that :) |
@everpeace Thanks. That's good to know. Let's continue non-GPU specific discussion on the PR #75. |
@rongou, are there any plans to update the API spec? For example, the job launcher and the workers currently share the same pod spec. I suggest using separate role specs for the launcher and the workers. The reasons are:
Thanks. |
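For readers following along, here is a minimal sketch of what the role-split suggestion above might look like. It is an illustration only: `RoleSpec`, `MPIJobSketch`, `total_gpus`, the image name, and the idea that the launcher requests no GPU are all hypothetical and are not taken from the mpi-operator API or from this thread.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RoleSpec:
    """Hypothetical per-role settings; not the actual mpi-operator API."""
    replicas: int
    image: str
    resources: Dict[str, str] = field(default_factory=dict)


@dataclass
class MPIJobSketch:
    """Hypothetical job spec with separate launcher and worker roles."""
    launcher: RoleSpec
    worker: RoleSpec

    def total_gpus(self) -> int:
        # Sum GPU requests across both roles; a role without a GPU entry counts as 0.
        total = 0
        for role in (self.launcher, self.worker):
            per_pod = int(role.resources.get("nvidia.com/gpu", "0"))
            total += per_pod * role.replicas
        return total


# Assumed example: a CPU-only launcher and four workers with two GPUs each.
job = MPIJobSketch(
    launcher=RoleSpec(replicas=1, image="example/mpi:latest", resources={"cpu": "1"}),
    worker=RoleSpec(replicas=4, image="example/mpi:latest", resources={"nvidia.com/gpu": "2"}),
)
print(job.total_gpus())  # prints 8
```

With one shared pod spec, the launcher in this sketch would have to carry the same resource requests as the workers; giving each role its own spec lets them differ.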
Thanks, got it! |
What are the plans for the mpi-operator in 0.5?
/cc @everpeace @rongou