*: Refactor the structure (#65)
* cmd: Add CLI

Signed-off-by: Ce Gao <[email protected]>

* scripts: Move the scripts to the directory

Signed-off-by: Ce Gao <[email protected]>

* manager: Refactor

Signed-off-by: Ce Gao <[email protected]>

* mock: Refactor

Signed-off-by: Ce Gao <[email protected]>

* earlystopping: Refactor

Signed-off-by: Ce Gao <[email protected]>

* build.sh: Fix

Signed-off-by: Ce Gao <[email protected]>

* kubernetes: Remove

Signed-off-by: Ce Gao <[email protected]>

* suggestion: Refactor

Signed-off-by: Ce Gao <[email protected]>

* examples: Rename conf to examples

Signed-off-by: Ce Gao <[email protected]>

* api: Refactor

Signed-off-by: Ce Gao <[email protected]>

* *: Fix

Signed-off-by: Ce Gao <[email protected]>

* build.sh: Remove comments

Signed-off-by: Ce Gao <[email protected]>
gaocegege authored and k8s-ci-robot committed Apr 22, 2018
1 parent 895aaab commit 3157a7a
Showing 85 changed files with 214 additions and 152 deletions.
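Most of the Go hunks on this page change only an import path, moving a package from the repository root to `pkg/`. A minimal sketch of that mapping follows. `rewriteImport` and `movedDirs` are hypothetical names introduced for illustration (they are not part of the commit), and the directory list is read off the hunks shown on this page, so it may be incomplete.

```go
package main

import (
	"fmt"
	"strings"
)

// modulePrefix is the import root that appears in every Go hunk of this commit.
const modulePrefix = "github.com/kubeflow/katib/"

// movedDirs lists top-level directories that the visible hunks show moving
// under pkg/. Assumption: derived from this page only, possibly incomplete.
var movedDirs = []string{"api", "db", "earlystopping", "manager", "mock", "suggestion"}

// rewriteImport (hypothetical helper) maps a pre-refactor import path to its
// post-refactor location. Paths outside the repository, or already under
// pkg/, are returned unchanged.
func rewriteImport(path string) string {
	if !strings.HasPrefix(path, modulePrefix) {
		return path
	}
	rest := strings.TrimPrefix(path, modulePrefix)
	for _, dir := range movedDirs {
		// Match the directory itself or any package below it, but not a
		// sibling that merely shares the prefix (e.g. "apiserver").
		if rest == dir || strings.HasPrefix(rest, dir+"/") {
			return modulePrefix + "pkg/" + rest
		}
	}
	return path
}

func main() {
	for _, p := range []string{
		"github.com/kubeflow/katib/api",
		"github.com/kubeflow/katib/manager/worker_interface/dlk",
		"github.com/kubeflow/katib/pkg/db",
		"google.golang.org/grpc",
	} {
		fmt.Printf("%s -> %s\n", p, rewriteImport(p))
	}
}
```

The Python hunks follow the same pattern with dotted module names (`api.python` becomes `pkg.api.python`), which this sketch does not cover.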
4 changes: 2 additions & 2 deletions README.md
@@ -95,10 +95,10 @@ Katib provides a Web UI based on ModelDB(https://github.com/mitdbg/modeldb). The
In addition to TensorFlow, other deep learning frameworks (e.g. PyTorch, MXNet) support TensorBoard format logging.
Katib integrates with TensorBoard easily. To use TensorBoard from Katib, we define a persistent volume claim and set the mount config for the Study. Katib searches each trial log in `{pvc mount path}/logs/{Study ID}/{Trial ID}`.
`{{STUDY_ID}}` and `{{TRIAL_ID}}` in the Studyconfig file are replaced the corresponding value when creating each job.
-See example `conf/tf-nmt.yml` which is a config for parameter tuning of [tensorflow/nmt](https://github.com/tensorflow/nmt).
+See example `examples/tf-nmt.yml` which is a config for parameter tuning of [tensorflow/nmt](https://github.com/tensorflow/nmt).
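
The README text in this hunk says `{{STUDY_ID}}` and `{{TRIAL_ID}}` are replaced with the corresponding values when each job is created. A minimal sketch of such a substitution, assuming plain string replacement over the command arguments: `expandPlaceholders` is a hypothetical helper, not Katib's actual implementation, and the IDs in `main` are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// expandPlaceholders (hypothetical) returns a copy of args with the
// {{STUDY_ID}} and {{TRIAL_ID}} placeholders replaced by the given IDs.
func expandPlaceholders(args []string, studyID, trialID string) []string {
	r := strings.NewReplacer("{{STUDY_ID}}", studyID, "{{TRIAL_ID}}", trialID)
	out := make([]string, len(args))
	for i, a := range args {
		out[i] = r.Replace(a)
	}
	return out
}

func main() {
	// The --out_dir argument is taken from the tf-nmt study config logged below;
	// the IDs are illustrative values only.
	args := []string{
		"--out_dir=/nfs-mnt/logs/{{STUDY_ID}}_{{TRIAL_ID}}",
		"--batch_size=128",
	}
	fmt.Println(expandPlaceholders(args, "study-1", "trial-7"))
}
```

Arguments without placeholders pass through unchanged, so the same helper can be applied to the whole command line.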

```bash
-./katib-cli -s gpu-node2:30678 -f ../conf/tf-nmt.yml Createstudy
+./katib-cli -s gpu-node2:30678 -f ../examples/tf-nmt.yml Createstudy
2018/04/03 05:52:11 connecting gpu-node2:30678
2018/04/03 05:52:11 study conf{tf-nmt root MINIMIZE 0 configs:<name:"--num_train_steps" parameter_type:INT feasible:<max:"1000" min:"1000" > > configs:<name:"--dropout" parameter_type:DOUBLE feasible:<max:"0.3" min:"0.1" > > configs:<name:"--beam_width" parameter_type:INT feasible:<max:"15" min:"5" > > configs:<name:"--num_units" parameter_type:INT feasible:<max:"1026" min:"256" > > configs:<name:"--attention" parameter_type:CATEGORICAL feasible:<list:"luong" list:"scaled_luong" list:"bahdanau" list:"normed_bahdanau" > > configs:<name:"--decay_scheme" parameter_type:CATEGORICAL feasible:<list:"luong234" list:"luong5" list:"luong10" > > configs:<name:"--encoder_type" parameter_type:CATEGORICAL feasible:<list:"bi" list:"uni" > > [] random median [name:"SuggestionNum" value:"10" name:"MaxParallel" value:"6" ] [] test_ppl [ppl bleu_dev bleu_test] yujioshima/tf-nmt:latest-gpu [python -m nmt.nmt --src=vi --tgt=en --out_dir=/nfs-mnt/logs/{{STUDY_ID}}_{{TRIAL_ID}} --vocab_prefix=/nfs-mnt/learndatas/wmt15_en_vi/vocab --train_prefix=/nfs-mnt/learndatas/wmt15_en_vi/train --dev_prefix=/nfs-mnt/learndatas/wmt15_en_vi/tst2012 --test_prefix=/nfs-mnt/learndatas/wmt15_en_vi/tst2013 --attention_architecture=standard --attention=normed_bahdanau --batch_size=128 --colocate_gradients_with_ops=true --eos=</s> --forget_bias=1.0 --init_weight=0.1 --learning_rate=1.0 --max_gradient_norm=5.0 --metrics=bleu --share_vocab=false --num_buckets=5 --optimizer=sgd --sos=<s> --steps_per_stats=100 --time_major=true --unit_type=lstm --src_max_len=50 --tgt_max_len=50 --infer_batch_size=32] 1 default-scheduler pvc:"nfs" path:"/nfs-mnt" }
2018/04/03 05:52:11 req Createstudy
12 changes: 0 additions & 12 deletions build.sh

This file was deleted.

4 changes: 2 additions & 2 deletions cli/Dockerfile → cmd/cli/Dockerfile
@@ -1,9 +1,9 @@
FROM golang:alpine AS build-env
# The GOPATH in the image is /go.
ADD . /go/src/github.com/kubeflow/katib
-WORKDIR /go/src/github.com/kubeflow/katib/cli
+WORKDIR /go/src/github.com/kubeflow/katib/cmd/cli
RUN go build -o katib-cli

FROM alpine:3.7
WORKDIR /app
-COPY --from=build-env /go/src/github.com/kubeflow/katib/cli/katib-cli /app/
+COPY --from=build-env /go/src/github.com/kubeflow/katib/cmd/cli/katib-cli /app/
2 changes: 1 addition & 1 deletion cli/main.go → cmd/cli/main.go
@@ -12,7 +12,7 @@ import (

"google.golang.org/grpc"

-pb "github.com/kubeflow/katib/api"
+pb "github.com/kubeflow/katib/pkg/api"
)

var server = flag.String("s", "127.0.0.1:6789", "server address")
@@ -1,10 +1,10 @@
FROM golang:alpine AS build-env
# The GOPATH in the image is /go.
ADD . /go/src/github.com/kubeflow/katib
-WORKDIR /go/src/github.com/kubeflow/katib/earlystopping/medianstopping
+WORKDIR /go/src/github.com/kubeflow/katib/cmd/earlystopping/medianstopping
RUN go build -o medianstopping

FROM alpine:3.7
WORKDIR /app
-COPY --from=build-env /go/src/github.com/kubeflow/katib/earlystopping/medianstopping /app/
+COPY --from=build-env /go/src/github.com/kubeflow/katib/cmd/earlystopping/medianstopping /app/
ENTRYPOINT ["./medianstopping"]
@@ -7,8 +7,8 @@ import (
"google.golang.org/grpc"
"google.golang.org/grpc/reflection"

-pb "github.com/kubeflow/katib/api"
-"github.com/kubeflow/katib/earlystopping"
+pb "github.com/kubeflow/katib/pkg/api"
+"github.com/kubeflow/katib/pkg/earlystopping"
)

func main() {
12 changes: 12 additions & 0 deletions cmd/manager/Dockerfile
@@ -0,0 +1,12 @@
+FROM golang:alpine AS build-env
+# The GOPATH in the image is /go.
+ADD . /go/src/github.com/kubeflow/katib
+WORKDIR /go/src/github.com/kubeflow/katib/cmd/manager
+RUN go build -o vizier-manager
+
+FROM alpine:3.7
+WORKDIR /app
+COPY --from=build-env /go/src/github.com/kubeflow/katib/cmd/manager/vizier-manager /app/
+COPY --from=build-env /go/src/github.com/kubeflow/katib/pkg/manager/visualise /
+ENTRYPOINT ["./vizier-manager"]
+CMD ["-w", "dlk"]
17 changes: 9 additions & 8 deletions manager/main.go → cmd/manager/main.go
@@ -9,14 +9,14 @@ import (
"strconv"
"time"

-pb "github.com/kubeflow/katib/api"
-kdb "github.com/kubeflow/katib/db"
-"github.com/kubeflow/katib/manager/modelstore"
-tbif "github.com/kubeflow/katib/manager/visualise/tensorboard"
-"github.com/kubeflow/katib/manager/worker_interface"
-dlkwif "github.com/kubeflow/katib/manager/worker_interface/dlk"
-k8swif "github.com/kubeflow/katib/manager/worker_interface/kubernetes"
-nvdwif "github.com/kubeflow/katib/manager/worker_interface/nvdocker"
+pb "github.com/kubeflow/katib/pkg/api"
+kdb "github.com/kubeflow/katib/pkg/db"
+"github.com/kubeflow/katib/pkg/manager/modelstore"
+tbif "github.com/kubeflow/katib/pkg/manager/visualise/tensorboard"
+"github.com/kubeflow/katib/pkg/manager/worker_interface"
+dlkwif "github.com/kubeflow/katib/pkg/manager/worker_interface/dlk"
+k8swif "github.com/kubeflow/katib/pkg/manager/worker_interface/kubernetes"
+nvdwif "github.com/kubeflow/katib/pkg/manager/worker_interface/nvdocker"

"google.golang.org/grpc"
"google.golang.org/grpc/reflection"
@@ -394,6 +394,7 @@ func main() {
switch *workerType {
case "kubernetes":
log.Printf("Worker: kubernetes\n")
+// Notice: Missing in the repo.
kc, err := clientcmd.BuildConfigFromFlags("", "/conf/kubeconfig")
if err != nil {
log.Fatal(err)
8 changes: 4 additions & 4 deletions manager/main_test.go → cmd/manager/main_test.go
@@ -6,10 +6,10 @@ import (

"github.com/golang/mock/gomock"

-api "github.com/kubeflow/katib/api"
-mockdb "github.com/kubeflow/katib/mock/db"
-mockmodelstore "github.com/kubeflow/katib/mock/modelstore"
-mockworker "github.com/kubeflow/katib/mock/worker"
+api "github.com/kubeflow/katib/pkg/api"
+mockdb "github.com/kubeflow/katib/pkg/mock/db"
+mockmodelstore "github.com/kubeflow/katib/pkg/mock/modelstore"
+mockworker "github.com/kubeflow/katib/pkg/mock/worker"
)

func TestCreateStudy(t *testing.T) {
File renamed without changes.
@@ -1,7 +1,7 @@
FROM python:3

ADD . /usr/src/app/github.com/kubeflow/katib
-WORKDIR /usr/src/app/github.com/kubeflow/katib/suggestion/bayesianoptimization
+WORKDIR /usr/src/app/github.com/kubeflow/katib/cmd/suggestion/bayesianoptimization
RUN pip install --no-cache-dir -r requirements.txt
ENV PYTHONPATH /usr/src/app/github.com/kubeflow/katib

File renamed without changes.
File renamed without changes.
@@ -3,9 +3,9 @@

import time

-from api.python import api_pb2_grpc
-from suggestion.bayesian_service import BayesianService
-from suggestion.types import DEFAULT_PORT
+from pkg.api.python import api_pb2_grpc
+from pkg.suggestion.bayesian_service import BayesianService
+from pkg.suggestion.types import DEFAULT_PORT

_ONE_DAY_IN_SECONDS = 60 * 60 * 24

@@ -14,6 +14,7 @@ def serve():
server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
api_pb2_grpc.add_SuggestionServicer_to_server(BayesianService(), server)
server.add_insecure_port(DEFAULT_PORT)
+print("Listening...")
server.start()
try:
while True:
4 changes: 2 additions & 2 deletions suggestion/grid/Dockerfile → cmd/suggestion/grid/Dockerfile
@@ -1,10 +1,10 @@
FROM golang:alpine AS build-env
# The GOPATH in the image is /go.
ADD . /go/src/github.com/kubeflow/katib
-WORKDIR /go/src/github.com/kubeflow/katib/suggestion/grid
+WORKDIR /go/src/github.com/kubeflow/katib/cmd/suggestion/grid
RUN go build -o grid

FROM alpine:3.7
WORKDIR /app
-COPY --from=build-env /go/src/github.com/kubeflow/katib/suggestion/grid /app/
+COPY --from=build-env /go/src/github.com/kubeflow/katib/cmd/suggestion/grid /app/
ENTRYPOINT ["./grid"]
4 changes: 2 additions & 2 deletions suggestion/grid/main.go → cmd/suggestion/grid/main.go
@@ -7,8 +7,8 @@ import (
"google.golang.org/grpc"
"google.golang.org/grpc/reflection"

-pb "github.com/kubeflow/katib/api"
-"github.com/kubeflow/katib/suggestion"
+pb "github.com/kubeflow/katib/pkg/api"
+"github.com/kubeflow/katib/pkg/suggestion"
)

func main() {
@@ -1,10 +1,10 @@
FROM golang:alpine AS build-env
# The GOPATH in the image is /go.
ADD . /go/src/github.com/kubeflow/katib
-WORKDIR /go/src/github.com/kubeflow/katib/suggestion/hyperband
+WORKDIR /go/src/github.com/kubeflow/katib/cmd/suggestion/hyperband
RUN go build -o hyperband

FROM alpine:3.7
WORKDIR /app
-COPY --from=build-env /go/src/github.com/kubeflow/katib/suggestion/hyperband /app/
+COPY --from=build-env /go/src/github.com/kubeflow/katib/cmd/suggestion/hyperband /app/
ENTRYPOINT ["./hyperband"]
@@ -7,8 +7,8 @@ import (
"google.golang.org/grpc"
"google.golang.org/grpc/reflection"

-pb "github.com/kubeflow/katib/api"
-"github.com/kubeflow/katib/suggestion"
+pb "github.com/kubeflow/katib/pkg/api"
+"github.com/kubeflow/katib/pkg/suggestion"
)

func main() {
@@ -1,10 +1,10 @@
FROM golang:alpine AS build-env
# The GOPATH in the image is /go.
ADD . /go/src/github.com/kubeflow/katib
-WORKDIR /go/src/github.com/kubeflow/katib/suggestion/random
+WORKDIR /go/src/github.com/kubeflow/katib/cmd/suggestion/random
RUN go build -o random

FROM alpine:3.7
WORKDIR /app
-COPY --from=build-env /go/src/github.com/kubeflow/katib/suggestion/random /app/
+COPY --from=build-env /go/src/github.com/kubeflow/katib/cmd/suggestion/random /app/
ENTRYPOINT ["./random"]
4 changes: 2 additions & 2 deletions suggestion/random/main.go → cmd/suggestion/random/main.go
@@ -7,8 +7,8 @@ import (
"google.golang.org/grpc"
"google.golang.org/grpc/reflection"

-pb "github.com/kubeflow/katib/api"
-"github.com/kubeflow/katib/suggestion"
+pb "github.com/kubeflow/katib/pkg/api"
+"github.com/kubeflow/katib/pkg/suggestion"
)

func main() {
11 changes: 0 additions & 11 deletions deploy.sh

This file was deleted.

2 changes: 1 addition & 1 deletion docs/developer-guide.md
@@ -9,7 +9,7 @@
You can build all images from source.

```bash
-./build.sh
+./scripts/build.sh
```

## Implement new suggestion algorithm
6 changes: 3 additions & 3 deletions docs/getting-start.md
@@ -21,7 +21,7 @@ Kubernetes manifests are in `manifests` directory.
Set the environment of your cluster(Ingress, Persistent Volumes).

```bash
-$ ./deploy.sh
+$ ./scripts/deploy.sh
```

## Use CLI
@@ -50,7 +50,7 @@ StudyID Name Owner RunningTrial CompletedTrial
Try Createstudy. Study will be created and start hyperparameter search.

```bash
-$ katib-cli -s gpu-node2:30678 -f ../conf/random.yml Createstudy
+$ katib-cli -s gpu-node2:30678 -f ../examples/random.yml Createstudy
2018/04/03 05:16:37 connecting gpu-node2:30678
2018/04/03 05:16:37 study conf{cifer10 root MAXIMIZE 0 configs:<name:"--lr" parameter_type:DOUBLE feasible:<max:"0.07" min:"0.03" > > configs:<name:"--lr-factor" parameter_type:DOUBLE feasible:<max:"0.2" min:"0.05" > > configs:<name:"--max-random-h" parameter_type:INT feasible:<max:"46" min:"26" > > configs:<name:"--max-random-l" parameter_type:INT feasible:<max:"75" min:"25" > > configs:<name:"--num-epochs" parameter_type:INT feasible:<max:"3" min:"3" > > [] random median [name:"SuggestionNum" value:"2" name:"MaxParallel" value:"2" ] [] Validation-accuracy [accuracy] mxnet/python:gpu [python /mxnet/example/image-classification/train_cifar10.py --batch-size=512 --gpus=0,1] 2 default-scheduler <nil> }
2018/04/03 05:16:37 req Createstudy
@@ -215,7 +215,7 @@ parameterconfigs:
```

```bash
-$ katib-cli -s gpu-node2:30678 -f ../conf/random-pv.yml Createstudy
+$ katib-cli -s gpu-node2:30678 -f ../examples/random-pv.yml Createstudy
2018/04/03 05:49:47 connecting gpu-node2:30678
2018/04/03 05:49:47 study conf{cifer10-pv-test root MAXIMIZE 0 configs:<name:"--lr" parameter_type:DOUBLE feasible:<max:"0.07" min:"0.03" > > configs:<name:"--lr-factor" parameter_type:DOUBLE feasible:<max:"0.2" min:"0.05" > > configs:<name:"--max-random-h" parameter_type:INT feasible:<max:"46" min:"26" > > configs:<name:"--max-random-l" parameter_type:INT feasible:<max:"75" min:"25" > > configs:<name:"--num-epochs" parameter_type:INT feasible:<max:"3" min:"3" > > [] random median [name:"SuggestionNum" value:"2" name:"MaxParallel" value:"2" ] [] Validation-accuracy [accuracy] mxnet/python:gpu [python /mxnet/example/image-classification/train_cifar10.py --batch-size=512 --gpus=0,1] 2 default-scheduler pvc:"nfs" path:"/nfs-mnt" }
2018/04/03 05:49:47 req Createstudy
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
12 changes: 0 additions & 12 deletions manager/Dockerfile

This file was deleted.

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -1,7 +1,7 @@
# Generated by the gRPC Python protocol compiler plugin. DO NOT EDIT!
import grpc

-import api.python.api_pb2 as api__pb2
+import pkg.api.python.api_pb2 as api__pb2


class ManagerStub(object):
File renamed without changes.
2 changes: 1 addition & 1 deletion db/interface.go → pkg/db/interface.go
@@ -12,7 +12,7 @@ import (
"strings"
"time"

-api "github.com/kubeflow/katib/api"
+api "github.com/kubeflow/katib/pkg/api"

_ "github.com/go-sql-driver/mysql"
)
2 changes: 1 addition & 1 deletion db/interface_test.go → pkg/db/interface_test.go
@@ -14,7 +14,7 @@ import (
"github.com/golang/protobuf/jsonpb"
"gopkg.in/DATA-DOG/go-sqlmock.v1"

-api "github.com/kubeflow/katib/api"
+api "github.com/kubeflow/katib/pkg/api"
)

var db_interface VizierDBInterface
2 changes: 1 addition & 1 deletion db/test/test.go → pkg/db/test/test.go
@@ -2,7 +2,7 @@ package main

import (
"fmt"
-"github.com/kubeflow/katib/db"
+"github.com/kubeflow/katib/pkg/db"
"os"
)

@@ -3,8 +3,8 @@ package earlystopping
import (
"context"
"errors"
-"github.com/kubeflow/katib/api"
-vdb "github.com/kubeflow/katib/db"
+"github.com/kubeflow/katib/pkg/api"
+vdb "github.com/kubeflow/katib/pkg/db"
"log"
"sort"
"strconv"
2 changes: 1 addition & 1 deletion earlystopping/types.go → pkg/earlystopping/types.go
@@ -3,7 +3,7 @@ package earlystopping
import (
"context"

-"github.com/kubeflow/katib/api"
+"github.com/kubeflow/katib/pkg/api"
)

const (
@@ -4,8 +4,8 @@ import (
"context"
"fmt"
"git.apache.org/thrift.git/lib/go/thrift"
-"github.com/kubeflow/katib/api"
-"github.com/kubeflow/katib/manager/modelstore/modeldb"
+"github.com/kubeflow/katib/pkg/api"
+"github.com/kubeflow/katib/pkg/manager/modelstore/modeldb"
"log"
"net"
"strconv"
File renamed without changes.

@@ -1,7 +1,7 @@
package modelstore

import (
-"github.com/kubeflow/katib/api"
+"github.com/kubeflow/katib/pkg/api"
)

type ModelStore interface {
@@ -2,7 +2,7 @@ package tensorboard

import (
"bytes"
-"github.com/kubeflow/katib/api"
+"github.com/kubeflow/katib/pkg/api"
"io/ioutil"
apiv1 "k8s.io/api/core/v1"
exbeatav1 "k8s.io/api/extensions/v1beta1"