Package v1 is the v1 version of the API. It contains the API schema definitions for the kubeflow.org v1 API group.
Field | Description |
---|---|
|
|
|
|
|
|
|
|
|
RDZVConf contains additional rendezvous configuration (<key1>=<value1>,<key2>=<value2>,…). |
|
Start a local standalone rendezvous backend, represented by a C10d TCP store on port 29400. Useful when launching a single-node, multi-worker job. If specified, --rdzv_backend, --rdzv_endpoint, and --rdzv_id are auto-assigned; any explicitly set values are ignored. |
|
Number of workers per node; supported values: [auto, cpu, gpu, int]. |
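
The rendezvous and worker-count descriptions above can be illustrated with a small sketch. It is not the operator's implementation: the helper names (`format_rdzv_conf`, `validate_nproc_per_node`, `build_launch_flags`) are hypothetical, the `--standalone`, `--rdzv_conf`, and `--nproc_per_node` flag spellings are assumed from the PyTorch elastic launcher, and the rule that an integer worker count must be at least 1 is an assumption, not something this reference states.

```python
from typing import List, Mapping, Optional, Union

# Named values listed above for "number of workers per node".
NAMED_NPROC_VALUES = ("auto", "cpu", "gpu")


def format_rdzv_conf(conf: Mapping[str, str]) -> str:
    """Join pairs as <key1>=<value1>,<key2>=<value2>,... in mapping order."""
    return ",".join(f"{key}={value}" for key, value in conf.items())


def validate_nproc_per_node(value: Union[int, str]) -> str:
    """Accept auto, cpu, gpu, or an integer; return the value as a string."""
    if isinstance(value, bool):  # bool is a subclass of int; reject it explicitly
        raise ValueError(f"unsupported workers-per-node value: {value!r}")
    if isinstance(value, int):
        if value < 1:  # assumption: a worker count must be at least 1
            raise ValueError(f"unsupported workers-per-node value: {value!r}")
        return str(value)
    if value in NAMED_NPROC_VALUES or (value.isdigit() and int(value) >= 1):
        return value
    raise ValueError(f"unsupported workers-per-node value: {value!r}")


def build_launch_flags(
    standalone: bool,
    nproc_per_node: Union[int, str],
    rdzv_conf: Optional[Mapping[str, str]] = None,
    rdzv_backend: Optional[str] = None,
    rdzv_endpoint: Optional[str] = None,
    rdzv_id: Optional[str] = None,
) -> List[str]:
    """Build launcher-style flags that follow the semantics described above."""
    flags = [f"--nproc_per_node={validate_nproc_per_node(nproc_per_node)}"]
    if standalone:
        # Standalone mode: backend, endpoint, and id are auto-assigned,
        # so any explicitly passed values are dropped.
        flags.append("--standalone")
    else:
        explicit = (
            ("rdzv_backend", rdzv_backend),
            ("rdzv_endpoint", rdzv_endpoint),
            ("rdzv_id", rdzv_id),
        )
        flags.extend(f"--{name}={val}" for name, val in explicit if val is not None)
    if rdzv_conf:
        flags.append(f"--rdzv_conf={format_rdzv_conf(rdzv_conf)}")
    return flags
```

Under these assumptions, `build_launch_flags(True, "auto", rdzv_backend="etcd")` drops the explicit backend and returns only the worker-count and standalone flags.
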
PyTorchJob represents a PyTorchJob resource.
Field | Description |
---|---|
|
|
|
|
|
Standard Kubernetes type metadata. |
|
Refer to Kubernetes API documentation for fields of |
|
Specification of the desired state of the PyTorchJob. |
|
Most recently observed status of the PyTorchJob. Read-only (modified by the system). |
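
Because the status is described as read-only and system-managed, a client that edits a fetched PyTorchJob may want to keep the desired state separate from the observed state. The sketch below shows one way to do that. The helper name `without_status` is hypothetical, and the `metadata`, `spec`, and `status` key names are assumed from common Kubernetes serialization; they are not stated in the table above.

```python
import copy
from typing import Any, Dict


def without_status(job: Dict[str, Any]) -> Dict[str, Any]:
    """Return a deep copy of a PyTorchJob object with the observed status removed."""
    desired = copy.deepcopy(job)  # leave the caller's object untouched
    desired.pop("status", None)   # status is system-managed; assumed key name
    return desired


# Hypothetical object as a client might have fetched it.
observed = {
    "metadata": {"name": "example-job"},
    "spec": {},
    "status": {"conditions": []},
}
desired = without_status(observed)
```

Whether dropping the status is actually required depends on the client and the API server; the sketch only illustrates the split between the specification and the status.
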
PyTorchJobList is a list of PyTorchJobs.
Field | Description |
---|---|
|
|
|
|
|
Standard type metadata. |
|
Refer to Kubernetes API documentation for fields of |
|
List of PyTorchJobs. |
PyTorchJobSpec describes the desired state of the PyTorchJob.
Field | Description |
---|---|
|
RunPolicy encapsulates various runtime policies of the distributed training job, for example how to clean up resources and how long the job can stay active. |
|
|
|
A map of PyTorchReplicaType (type) to ReplicaSpec (value). Specifies the PyTorch cluster configuration. For example: {"Master": PyTorchReplicaSpec, "Worker": PyTorchReplicaSpec} |
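
As an illustration of the replica map described above, here is a minimal sketch that builds a specification with one "Master" and several "Worker" entries as plain Python dictionaries. The helper names are hypothetical. The key names (`pytorchReplicaSpecs`, `replicas`, `template`), the container name, and the image are assumptions made for the example and are not given in this section.

```python
from typing import Any, Dict


def replica_spec(replicas: int, image: str) -> Dict[str, Any]:
    """Build one replica entry; the key names here are assumed, not from the table."""
    return {
        "replicas": replicas,
        "template": {
            "spec": {
                "containers": [{"name": "pytorch", "image": image}],
            }
        },
    }


def pytorch_job_spec(image: str, workers: int) -> Dict[str, Any]:
    """Map replica types ("Master", "Worker") to their replica specs."""
    return {
        "pytorchReplicaSpecs": {  # assumed serialized name of the replica map
            "Master": replica_spec(1, image),
            "Worker": replica_spec(workers, image),
        }
    }


# Hypothetical image reference, used only for illustration.
spec = pytorch_job_spec("example.com/train:latest", workers=3)
```

Keeping the per-replica builder separate from the map makes it easy to give the master and the workers different settings later.
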