API Reference

Packages

kubeflow.org/v1

Package v1 contains API schema definitions for the kubeflow.org v1 API group.

Resource Types

Definitions

ElasticPolicy

Appears In: PyTorchJobSpec
Field Description

backend RDZVBackend

rdzvPort integer

rdzvHost string

rdzvId string

rdzvConf RDZVConf array

RDZVConf contains additional rendezvous configuration (<key1>=<value1>,<key2>=<value2>,…).

standalone boolean

Start a local standalone rendezvous backend, represented by a C10d TCP store on port 29400. Useful when launching a single-node, multi-worker job. If specified, --rdzv_backend, --rdzv_endpoint, and --rdzv_id are auto-assigned; any explicitly set values are ignored.

nProcPerNode integer

Number of workers per node; supported values: [auto, cpu, gpu, int].
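The effect of standalone on the other rendezvous fields can be sketched in Python. This is only an illustration of the rule stated in the field description, not the operator's implementation: the function name effective_rendezvous, the "localhost" host, the UUID-based ID, and the sample host name are all assumptions made for the example.

```python
import uuid


def effective_rendezvous(policy: dict) -> dict:
    """Return the rendezvous settings that would take effect for an elasticPolicy dict."""
    if policy.get("standalone"):
        # Standalone mode: explicitly set rdzv* values are ignored and replaced.
        return {
            "backend": "c10d",
            "endpoint": "localhost:29400",  # host is assumed; port is from the field description
            "id": uuid.uuid4().hex,  # assumed stand-in for an auto-assigned ID
        }
    return {
        "backend": policy.get("backend"),
        "endpoint": f"{policy.get('rdzvHost')}:{policy.get('rdzvPort')}",
        "id": policy.get("rdzvId"),
    }


# Hypothetical values, used only to show the difference between the two modes.
explicit = {
    "backend": "c10d",
    "rdzvHost": "rdzv.example.internal",
    "rdzvPort": 29500,
    "rdzvId": "job-42",
}
standalone = {**explicit, "standalone": True}

print(effective_rendezvous(explicit))
print(effective_rendezvous(standalone))
```

With standalone set, the explicit host, port, and ID in the same policy have no effect on the result.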

PyTorchJob

PyTorchJob represents a PyTorchJob resource.

Appears In: PyTorchJobList
Field Description

apiVersion string

kubeflow.org/v1

kind string

PyTorchJob

TypeMeta TypeMeta

Standard Kubernetes type metadata.

metadata ObjectMeta

Refer to Kubernetes API documentation for fields of metadata.

spec PyTorchJobSpec

Specification of the desired state of the PyTorchJob.

status JobStatus

Most recently observed status of the PyTorchJob. Read-only (modified by the system).
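As a rough illustration of how the top-level fields fit together, the following Python sketch assembles a PyTorchJob manifest as a plain dictionary and prints it as JSON. The helper name make_pytorch_job, the job name, the namespace, and the image are hypothetical. The ReplicaSpec fields (replicas, template) and the container name "pytorch" are not defined in this section and are assumed from the wider Kubeflow API.

```python
import json


def make_pytorch_job(name: str, namespace: str, spec: dict) -> dict:
    """Wrap a PyTorchJobSpec dict in the PyTorchJob envelope."""
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "PyTorchJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
        # "status" is deliberately omitted: it is read-only and set by the system.
    }


# Minimal spec with a single Master replica (ReplicaSpec fields are assumed).
minimal_spec = {
    "pytorchReplicaSpecs": {
        "Master": {
            "replicas": 1,
            "template": {
                "spec": {
                    "containers": [
                        {"name": "pytorch", "image": "example.com/train:latest"}
                    ]
                }
            },
        }
    }
}

job = make_pytorch_job("demo-job", "default", minimal_spec)
print(json.dumps(job, indent=2))
```

The resulting JSON could be saved to a file and submitted like any other Kubernetes manifest.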

PyTorchJobList

PyTorchJobList is a list of PyTorchJobs.

Field Description

apiVersion string

kubeflow.org/v1

kind string

PyTorchJobList

TypeMeta TypeMeta

Standard type metadata.

metadata ListMeta

Refer to Kubernetes API documentation for fields of metadata.

items PyTorchJob array

List of PyTorchJobs.

PyTorchJobSpec

PyTorchJobSpec is a desired state description of the PyTorchJob.

Appears In: PyTorchJob
Field Description

runPolicy RunPolicy

RunPolicy encapsulates various runtime policies of the distributed training job, for example how to clean up resources and how long the job can stay active.

elasticPolicy ElasticPolicy

pytorchReplicaSpecs object (keys:ReplicaType, values:ReplicaSpec)

A map of PyTorchReplicaType (key) to ReplicaSpec (value) that specifies the PyTorch cluster configuration. For example: { "Master": PyTorchReplicaSpec, "Worker": PyTorchReplicaSpec }
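A PyTorchJobSpec with both replica types might be built as in the following Python sketch. The helper names, the image, and the ReplicaSpec and RunPolicy fields used here (replicas, template, cleanPodPolicy, activeDeadlineSeconds) are assumptions taken from the wider Kubeflow API rather than from this section. Treating "Master" and "Worker" as the only valid keys is likewise an assumption based on the example above.

```python
from typing import Optional

# Keys taken from the example in the field description; assumed to be the full set.
ALLOWED_REPLICA_TYPES = {"Master", "Worker"}


def replica_spec(replicas: int, image: str) -> dict:
    """Build one ReplicaSpec value (field names are assumed, see the lead-in)."""
    return {
        "replicas": replicas,
        "template": {"spec": {"containers": [{"name": "pytorch", "image": image}]}},
    }


def make_spec(image: str, workers: int, elastic_policy: Optional[dict] = None) -> dict:
    """Build a PyTorchJobSpec dict with one Master and an optional set of Workers."""
    spec = {
        # Assumed RunPolicy fields: cleanup behavior and maximum active time.
        "runPolicy": {"cleanPodPolicy": "Running", "activeDeadlineSeconds": 3600},
        "pytorchReplicaSpecs": {"Master": replica_spec(1, image)},
    }
    if workers > 0:
        spec["pytorchReplicaSpecs"]["Worker"] = replica_spec(workers, image)
    if elastic_policy is not None:
        spec["elasticPolicy"] = elastic_policy
    return spec


def unknown_replica_types(spec: dict) -> list:
    """Return any pytorchReplicaSpecs keys outside the allowed set, sorted."""
    return sorted(set(spec.get("pytorchReplicaSpecs", {})) - ALLOWED_REPLICA_TYPES)


spec = make_spec("example.com/train:latest", workers=2)
print(sorted(spec["pytorchReplicaSpecs"]))
print(unknown_replica_types(spec))
```

A check such as unknown_replica_types can catch a misspelled key (for example "worker") before the manifest is submitted.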

RDZVBackend (string)

Appears In: ElasticPolicy

RDZVConf

Appears In: ElasticPolicy
Field Description

key string

value string
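The rdzvConf field of ElasticPolicy describes its entries as <key>=<value> pairs separated by commas. A minimal Python sketch of that rendering follows. The function name render_rdzv_conf and the sample keys are illustrative assumptions, not part of the API.

```python
def render_rdzv_conf(entries: list) -> str:
    """Join RDZVConf entries into a '<key1>=<value1>,<key2>=<value2>' string."""
    return ",".join(f"{entry['key']}={entry['value']}" for entry in entries)


# Sample entries; the keys are placeholders chosen for the example.
conf = [
    {"key": "timeout", "value": "600"},
    {"key": "read_timeout", "value": "60"},
]

print(render_rdzv_conf(conf))  # prints: timeout=600,read_timeout=60
```

Both key and value are strings in the schema, so numeric settings are quoted in the manifest.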