Developer Guide

There are two versions of the TF operator: one for v1alpha2 (to be deprecated) and one for v1beta1.

Building the operator

Create a symbolic link inside your GOPATH that points to the location where you checked out the code:

mkdir -p ${GOPATH}/src/github.com/kubeflow
ln -sf ${GIT_TRAINING} ${GOPATH}/src/github.com/kubeflow/tf-operator
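Before building, it can help to confirm that the link resolves to your checkout. The following is a minimal sketch, assuming a shell where `readlink` is available; `link_points_to` is a hypothetical helper and not part of the repository.

```shell
# Hypothetical helper (not part of tf-operator):
# succeed if $1 is a symbolic link whose stored target is exactly $2.
link_points_to() {
  [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}

# Usage, assuming GOPATH and GIT_TRAINING are set as in the commands above:
# link_points_to "${GOPATH}/src/github.com/kubeflow/tf-operator" "${GIT_TRAINING}" \
#   && echo "link OK"
```

The comparison is against the literal target stored in the link, so pass the same path you gave to `ln -sf`.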

Resolve dependencies (if you don't have dep installed, check how to do it here)

Install dependencies

dep ensure

Build it

go install github.com/kubeflow/tf-operator/cmd/tf-operator.v1beta1

If you want to build the operator for v1alpha2, please use the command here:

go install github.com/kubeflow/tf-operator/cmd/tf-operator.v2

Building all the artifacts

pipenv is recommended for managing your local Python environment. You can find setup information on the pipenv website.

To build the following artifacts:

  • Docker image for the operator
  • Helm chart for deploying it

You can run

# To set up pipenv, change into the directory that contains the Pipfile
cd py
pipenv install
pipenv shell
cd ..
python -m py.release local --registry=${REGISTRY}
  • The docker image will be tagged into your registry
  • The helm chart will be created in ./bin

Running the Operator Locally

Running the operator locally (as opposed to deploying it on a K8s cluster) is convenient for debugging/development.

Run a Kubernetes cluster

First, you need to run a Kubernetes cluster locally. There are lots of choices:

local-up-cluster.sh runs a single-node Kubernetes cluster directly on your machine, whereas Minikube runs a single-node Kubernetes cluster inside a VM. Both are compatible with the controller, but the Kubernetes version must be 1.8 or above.

Note: If you use local-up-cluster.sh, make sure that kube-dns is up; see kubernetes/kubernetes#47739 for more details.
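The version requirement above can be checked with a small helper. This is a sketch only: `k8s_version_ok` is a hypothetical function, and it assumes you pass it a version string of the form `v1.10.3` or `1.8.0`, for example one copied from the server version that `kubectl version` reports.

```shell
# Hypothetical helper: succeed if the given Kubernetes version is 1.8 or above.
k8s_version_ok() {
  ver=${1#v}                # strip a leading "v", if any
  major=${ver%%.*}          # text before the first dot
  rest=${ver#*.}            # text after the first dot
  minor=${rest%%[!0-9]*}    # leading digits of the minor field
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "${minor:-0}" -ge 8 ]; }
}

# Usage:
# k8s_version_ok "v1.10.3" && echo "cluster version is new enough"
```

The fields are compared numerically, which avoids the mistake of treating `1.10` as older than `1.8` in a plain string comparison.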

Configure KUBECONFIG and KUBEFLOW_NAMESPACE

We can configure the operator to run locally using the configuration available in your kubeconfig to communicate with a K8s cluster. Set your environment:

export KUBECONFIG=$(echo ~/.kube/config)
export KUBEFLOW_NAMESPACE=your_namespace  # replace your_namespace with the namespace to use
  • KUBEFLOW_NAMESPACE is the namespace in which the operator creates its internal resources (e.g. the resource lock) when deployed on Kubernetes. It is optional; the default namespace is used if it is not set.
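To see which namespace will be in effect for the current shell, the fallback described above can be mirrored with parameter expansion. This is an illustrative sketch, not the operator's own logic: `effective_namespace` is a hypothetical helper, and treating an empty value the same as an unset one is an assumption of this sketch.

```shell
# Hypothetical helper: print KUBEFLOW_NAMESPACE, or "default" when the
# variable is unset or empty (the empty case is an assumption of this sketch).
effective_namespace() {
  printf '%s\n' "${KUBEFLOW_NAMESPACE:-default}"
}

# Usage:
# echo "operator resources will go to: $(effective_namespace)"
```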

Create the TFJob CRD

After the cluster is up, the TFJob CRD should be created on the cluster.

# If you are using v1beta1
kubectl create -f ./examples/crd/crd-v1beta1.yaml

Or

# If you are using v1alpha2
kubectl create -f ./examples/crd/crd-v1alpha2.yaml

Run Operator

Now we are ready to run the operator locally:

tf-operator

To verify that the local operator is working, create an example job; you should see the operator create the corresponding jobs.

# If you are using v1beta1
cd ./examples/v1beta1/dist-mnist
docker build -f Dockerfile -t kubeflow/tf-dist-mnist-test:1.0 .
kubectl create -f ./tf_job_mnist.yaml

Or

# If you are using v1alpha2
cd ./examples/v1alpha2/dist-mnist
docker build -f Dockerfile -t kubeflow/tf-dist-mnist-test:1.0 .
kubectl create -f ./tf_job_mnist.yaml

Go version

On Ubuntu, the default go package appears to be gccgo-go, which has problems (see issue). The golang-go package is also really old, so install Go from the golang tarballs instead.
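A quick way to tell which toolchain is on your PATH is to inspect the output of `go version`. The sketch below assumes that gccgo builds include the string `gccgo` in that output; `is_gccgo` is a hypothetical helper.

```shell
# Hypothetical helper: succeed if a `go version` line mentions gccgo.
# Assumption: gccgo-based toolchains include "gccgo" in that line.
is_gccgo() {
  case "$1" in
    *gccgo*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
# is_gccgo "$(go version)" && echo "gccgo detected; consider installing from the golang tarballs"
```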

Code Style

Python

  • Use yapf to format Python code

  • yapf style is configured in .style.yapf file

  • To autoformat code

    yapf -i py/**/*.py
  • To sort imports

    isort path/to/module.py
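The `py/**/*.py` pattern above only recurses into subdirectories when the shell supports recursive globbing (in bash, when the `globstar` option is enabled). As a shell-independent alternative, the files can be collected with `find`. This is a sketch; `list_python_files` is a hypothetical helper.

```shell
# Hypothetical helper: print every .py file under directory $1, one per line,
# without relying on "**" glob support in the shell.
list_python_files() {
  find "$1" -type f -name '*.py'
}

# Usage (assumes file names without spaces or newlines):
# list_python_files py | xargs yapf -i
```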