Does the kubectl-delivery image have an arm/aarch version? #1857
Comments
Yes, that image isn't built automatically. Building it might be worthwhile, so feel free to open a PR. However, I would suggest using MPIJob v2 (https://github.com/kubeflow/mpi-operator) instead of MPIJob v1.
/kind question
Hello @tenzen-y, it appears that the Dockerfile for this image does not exist in this repository, and that it was removed from the mpi-operator in this change (https://github.com/kubeflow/mpi-operator/pull/494/files). What is the replacement for this image when using MPIJob v2? Would it be passed as an argument in the CRD definition (#1525)? Another question: is it mandatory to use a scheduling plugin like the one provided by the Volcano project? Thanks.
Oh, yes. It seems that we need to copy the Dockerfile to this repository (kubeflow/training-operator).
We have two MPIJobs, and they are hosted in separate operators (repositories):
Then, MPIJob v1 uses |
The training-operator supports Volcano gang scheduling, and you can refer to the following docs on how to use the Volcano scheduler: https://www.kubeflow.org/docs/components/training/job-scheduling However, we have so far confirmed only Volcano gang scheduling, so I'm not sure whether the training-operator works well with the other Volcano scheduler plugins.
Hello @tenzen-y. I am looking into the mpi-operator repository. Are there any guidelines on how to support SSH in the images used by this operator (I am thinking about custom images)? I see only one set of images that adds SSH (https://github.com/kubeflow/mpi-operator/tree/master/build/base). Is there any documentation about the contract expected by the mpi-operator? Thanks again =)
Can you create a separate issue on the mpi-operator repository?
/close If you have any other questions about the training-operator, feel free to open new issues.
@tenzen-y: Closing this issue. In response to this:
Hello everyone. I am trying to run the training-operator on a small test cluster of Raspberry Pi 4 (rpi4) nodes. The training-operator has been installed and appears to be working. However, when I tried to run a small test, I got an error in the launcher container.
The kubectl-delivery image appears to have been last updated two years ago and only lists amd64 architectures:
https://hub.docker.com/r/mpioperator/kubectl-delivery/tags
The log of the launcher container is:
Is this expected?
thanks
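For anyone who wants to check the architectures without clicking through the tags page, here is a minimal sketch in Python. It assumes the tag listing is available as JSON shaped like a Docker Hub tags response, with a `results` list whose entries carry a `name` and an `images` list with an `architecture` field. The function name `architectures_by_tag` and the sample payload are hypothetical and for illustration only; they are not real data for this image.

```python
import json


def architectures_by_tag(payload: dict) -> dict:
    """Map each tag name to the set of CPU architectures listed for it.

    Assumes a payload shaped like {"results": [{"name": ..., "images":
    [{"architecture": ...}, ...]}, ...]}. Missing keys are tolerated.
    """
    result = {}
    for tag in payload.get("results", []):
        archs = {
            image["architecture"]
            for image in tag.get("images", [])
            if image.get("architecture")
        }
        result[tag.get("name", "<unnamed>")] = archs
    return result


# Hypothetical sample payload, NOT the real listing for kubectl-delivery.
SAMPLE = json.loads("""
{
  "results": [
    {"name": "example-tag",
     "images": [{"architecture": "amd64", "os": "linux"}]}
  ]
}
""")

if __name__ == "__main__":
    for name, archs in architectures_by_tag(SAMPLE).items():
        note = "has arm64" if "arm64" in archs else "no arm64 build listed"
        print(name, sorted(archs), note)
```

If a tag lists only `amd64`, as in the sample, the image cannot start natively on an arm64 node such as an rpi4, so the image would have to be rebuilt for arm64.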