MPI Operator plugin interface (flyteorg#217)
* Added mpi plugin

Signed-off-by: Yuvraj <code@evalsocket.dev>

* Rename variable name

Signed-off-by: Yuvraj <code@evalsocket.dev>

* Added docs for mpi

Signed-off-by: Yuvraj <code@evalsocket.dev>
yindia authored Oct 5, 2021
1 parent e15cc50 commit 1967714
Showing 11 changed files with 1,893 additions and 0 deletions.
24 changes: 24 additions & 0 deletions gen/pb-cpp/flyteidl/plugins/mpi.grpc.pb.cc

47 changes: 47 additions & 0 deletions gen/pb-cpp/flyteidl/plugins/mpi.grpc.pb.h

461 changes: 461 additions & 0 deletions gen/pb-cpp/flyteidl/plugins/mpi.pb.cc

257 changes: 257 additions & 0 deletions gen/pb-cpp/flyteidl/plugins/mpi.pb.h
107 changes: 107 additions & 0 deletions gen/pb-go/flyteidl/plugins/mpi.pb.go
110 changes: 110 additions & 0 deletions gen/pb-go/flyteidl/plugins/mpi.pb.validate.go
737 changes: 737 additions & 0 deletions gen/pb-java/flyteidl/plugins/Mpi.java

85 changes: 85 additions & 0 deletions gen/pb_python/flyteidl/plugins/mpi_pb2.py
3 changes: 3 additions & 0 deletions gen/pb_python/flyteidl/plugins/mpi_pb2_grpc.py
@@ -0,0 +1,3 @@
# Generated by the gRPC Python protocol compiler plugin. DO NOT EDIT!
import grpc

42 changes: 42 additions & 0 deletions protos/docs/plugins/plugins.rst
@@ -48,6 +48,48 @@ will be executed concurrently.



.. _ref_flyteidl/plugins/mpi.proto:

flyteidl/plugins/mpi.proto
==================================================================





.. _ref_flyteidl.plugins.DistributedMPITrainingTask:

DistributedMPITrainingTask
------------------------------------------------------------------

MPI operator proposal https://github.com/kubeflow/community/blob/master/proposals/mpi-operator-proposal.md
Custom proto for plugin that enables distributed training using https://github.com/kubeflow/mpi-operator



.. csv-table:: DistributedMPITrainingTask type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "num_workers", ":ref:`ref_int32`", "", "Number of workers spawned in the cluster for this job."
   "num_launcher_replicas", ":ref:`ref_int32`", "", "Number of launcher replicas spawned in the cluster for this job. The launcher pod invokes mpirun and communicates with worker pods through MPI."
   "slots", ":ref:`ref_int32`", "", "Number of slots per worker used in the hostfile, i.e. the available slots (GPUs) in each pod."





<!-- end messages -->

<!-- end enums -->

<!-- end HasExtensions -->

<!-- end services -->




.. _ref_flyteidl/plugins/presto.proto:

flyteidl/plugins/presto.proto
20 changes: 20 additions & 0 deletions protos/flyteidl/plugins/mpi.proto
@@ -0,0 +1,20 @@
syntax = "proto3";

package flyteidl.plugins;

option go_package = "github.com/flyteorg/flyteidl/gen/pb-go/flyteidl/plugins";

// MPI operator proposal https://github.com/kubeflow/community/blob/master/proposals/mpi-operator-proposal.md
// Custom proto for plugin that enables distributed training using https://github.com/kubeflow/mpi-operator
message DistributedMPITrainingTask {
    // Number of workers spawned in the cluster for this job.
    int32 num_workers = 1;

    // Number of launcher replicas spawned in the cluster for this job.
    // The launcher pod invokes mpirun and communicates with worker pods through MPI.
    int32 num_launcher_replicas = 2;

    // Number of slots per worker used in the hostfile,
    // i.e. the available slots (GPUs) in each pod.
    int32 slots = 3;
}
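
To illustrate how the three fields relate, here is a minimal Python sketch. It does not use the generated `mpi_pb2` module; `MPITaskSpec`, `total_slots`, and `hostfile_lines` are hypothetical names introduced only for this example, and the `<job>-worker-<i>` host naming is an assumption rather than something this commit defines. The `host slots=N` line format follows the Open MPI hostfile convention.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MPITaskSpec:
    """Hypothetical stand-in mirroring the fields of DistributedMPITrainingTask."""
    num_workers: int            # worker pods spawned for the job
    num_launcher_replicas: int  # launcher pods that invoke mpirun
    slots: int                  # slots (e.g. GPUs) available in each worker pod


def total_slots(spec: MPITaskSpec) -> int:
    """Total MPI slots across all workers: num_workers * slots."""
    return spec.num_workers * spec.slots


def hostfile_lines(spec: MPITaskSpec, job_name: str) -> List[str]:
    """Build one hostfile line per worker.

    The '<job_name>-worker-<i>' host names are an assumed naming scheme.
    """
    return [
        f"{job_name}-worker-{i} slots={spec.slots}"
        for i in range(spec.num_workers)
    ]


if __name__ == "__main__":
    spec = MPITaskSpec(num_workers=3, num_launcher_replicas=1, slots=2)
    print(total_slots(spec))
    print("\n".join(hostfile_lines(spec, "train")))
```

Under these assumptions, a job with 3 workers and 2 slots each exposes 6 slots to `mpirun`, which is the upper bound on the number of processes it can place without oversubscribing.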
