Commit

Merge branch 'master' of https://github.com/kubeflow/website into docsy
sarahmaddox committed Nov 23, 2018
2 parents 72de12d + aecdccb commit d9412c1
Showing 6 changed files with 57 additions and 31 deletions.
8 changes: 7 additions & 1 deletion content/docs/about/events.md
@@ -16,6 +16,11 @@ Please raise a [GitHub issue](https://github.com/kubeflow/website/issues/new) if
* [Data Day Texas, Austin](https://datadaytexas.com/), 26 January, 2019
- Kubeflow: Portable Machine Learning on Kubernetes: Michelle Casbon
* [KubeCon, Seattle](https://events.linuxfoundation.org/events/kubecon-cloudnativecon-north-america-2018/), 11-13 December, 2018
- [Workshop: Kubeflow End-to-End: GitHub Issue Summarization](https://sched.co/GrWE): Amy Unruh, Michelle Casbon
- [Natural Language Code Search for GitHub Using Kubeflow](https://sched.co/GrVn): Jeremy Lewi, Hamel Husain
- [Eco-Friendly ML: How the Kubeflow Ecosystem Bootstrapped Itself](https://sched.co/GrTc): Peter McKinnon
- [Deep Dive: Kubeflow BoF](https://sched.co/Ha1X): Jeremy Lewi, David Aronchick
- [Machine Learning as Code](https://sched.co/GrVh): Jay Smith
* Women in ML & Data Science, Melbourne, 5 December, 2018
- Panel: Juliet Hougland, Michelle Casbon
* [YOW!, Melbourne](https://melbourne.yowconference.com.au/), 4-7 December, 2018
@@ -33,7 +38,8 @@ Please raise a [GitHub issue](https://github.com/kubeflow/website/issues/new) if
- Kubeflow End to End: Amy Unruh
* [Data@Scale, Boston](https://dataatscale2018.splashthat.com/), 25 October, 2018
- [Women in Engineering Panel](https://datascalewomensbreakfast.splashthat.com/): Michelle Casbon
- Kubeflow: Portable Machine Learning on Kubernetes: Michelle Casbon
- [Kubeflow: Portable Machine Learning on Kubernetes](https://code.fb.com/core-data/data-scale-boston/): Michelle Casbon
- [Video](https://www.facebook.com/atscaleevents/videos/114311602829170/)
* [Kafka Summit, San Francisco](https://kafka-summit.org/), 16-17 October, 2018
* [O’Reilly AI Conference, London](https://conferences.oreilly.com/artificial-intelligence/ai-eu), 08-11 October, 2018
- [Machine Learning at Scale with Kubernetes](https://conferences.oreilly.com/artificial-intelligence/ai-eu/public/schedule/detail/69194): Chris Cho
35 changes: 11 additions & 24 deletions content/docs/guides/components/hyperparameter.md
@@ -9,30 +9,17 @@ toc = true
weight = 5
+++

## Deploying Katib

[Katib](https://github.com/kubeflow/katib) is a hyperparameter tuning framework, inspired by
[Google Vizier](https://static.googleusercontent.com/media/research.google.com/ja//pubs/archive/bcb15507f4b52991a0783013df4222240e942381.pdf).

To deploy katib,
```shell
ks pkg install kubeflow/katib@master
ks generate katib katib
ks apply ${ENV} -c katib
```

## Using Katib

Create namespace `katib` as the service launches jobs in this namespace.
Currently we are using port-forwarding to access the katib services.
kubernetes version 1.9~
```
kubectl create namespace katib
kubectl -n kubeflow port-forward svc/katib-ui 8000:80
```

Currently we are using port-forwarding to access the katib services.
~1.8
```
kubectl get pod -n kubeflow # Find your vizier-core and modedb-frontend pods
kubectl port-forward -n kubeflow [vizier-core pod] 6789:6789 &
kubectl port-forward -n kubeflow [modeldb-frontend pod] 3000:3000 &
kubectl get pod -n kubeflow # Find your katib-ui pods
kubectl port-forward -n kubeflow [katib-ui pod] 8000:80 &
```
## Creating a Study Job
You can create Study Job for Katib by defining a StudyJob config file.
@@ -55,15 +42,15 @@ In this demo, 3 hyper parameters
are randomly generated.

```
$ kubectl -n katib get studyjob
$ kubectl -n kubeflow get studyjob
```

Check the study status.

```
$ kubectl -n katib describe studyjobs random-example
$ kubectl -n kubeflow describe studyjobs random-example
Name: random-example
Namespace: katib
Namespace: kubeflow
Labels: controller-tools.k8s.io=1.0
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"kubeflow.org/v1alpha1","kind":"StudyJob","metadata":{"annotations":{},"labels":{"controller-tools.k8s.io":"1.0"},"name":"random-example"...
API Version: kubeflow.org/v1alpha1
@@ -73,7 +60,7 @@ Metadata:
Creation Timestamp: 2018-08-15T01:29:13Z
Generation: 0
Resource Version: 173289
Self Link: /apis/kubeflow.org/v1alpha1/namespaces/katib/studyjobs/random-example
Self Link: /apis/kubeflow.org/v1alpha1/namespaces/kubeflow/studyjobs/random-example
UID: 9e136400-a02a-11e8-b88c-42010af0008b
Spec:
Study Spec:
@@ -140,4 +127,4 @@ Events: <none>

It should start a study and run two jobs with different parameters.

Go to http://localhost:3000/katib to see the result.
Go to http://localhost:8000/katib to see the result.
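
To illustrate the address above, here is a minimal sketch that builds the local Katib UI URL from the forwarded port. The function name `katib_ui_url` is hypothetical and not part of Kubeflow; the default of 8000 is taken from the port-forward command in this guide.

```shell
# Hypothetical helper (not part of Kubeflow): print the local Katib UI URL.
# $1 = local port used in the port-forward; defaults to 8000 as in this guide.
katib_ui_url() {
  printf 'http://localhost:%s/katib\n' "${1:-8000}"
}

# Print the URL for the default port.
katib_ui_url

# With the port-forward running in another terminal, a reachability check
# could look like this (assumes curl is installed):
#   curl -fsS -o /dev/null "$(katib_ui_url)" && echo "Katib UI reachable"
```

If you forward to a different local port, pass it as the first argument, for example `katib_ui_url 9000`.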
32 changes: 29 additions & 3 deletions content/docs/guides/components/tftraining.md
@@ -1,5 +1,6 @@
+++
title = "TensorFlow Training"
title = "TensorFlow Training (TFJob)"
linkTitle = "TensorFlow Training"
description = ""
weight = 10
toc = true
@@ -147,7 +148,7 @@ the [`TFJob` custom resource](https://github.com/kubeflow/tf-operator) is availa

We treat each TensorFlow job as a [component](https://ksonnet.io/docs/tutorial#2-generate-and-deploy-an-app-component) in your APP.

### Run the TfCnn example
### Running the TfCnn example

Kubeflow ships with a [ksonnet prototype](https://ksonnet.io/docs/concepts#prototype) suitable for running the [TensorFlow CNN Benchmarks](https://github.com/tensorflow/benchmarks/tree/master/scripts/tf_cnn_benchmarks).

@@ -203,8 +204,33 @@ Typically you will want to change the following values
* For example, you might need to configure various environment variables to talk to datastores like GCS or S3
1. Attach PV's if you want to use PVs for storage.
1. Attach PVs if you want to use PVs for storage.
### Accessing the TFJob dashboard
The TFJob dashboard is available at `<path>/tfjobs/ui/`. Specifically:
* If you're using the central Kubeflow UI, you can access the TFJob dashboard
by clicking **TFJOB DASHBOARD**:
![Central UI](/docs/images/central-ui.png)
* If you followed the
[guide for GKE](/docs/started/getting-started-gke), you can
access the TFJob dashboard at the following URL:
```
https://<deployment-name>.endpoints.<project>.cloud.goog/tfjobs/ui/
```
* If you're using portforwarding, you can access the TFJob dashboard at the
following URL:
```
http://localhost:8080/tfjobs/ui/
```
See more details about [accessing the Kubeflow UIs](/docs/guides/accessing-uis).
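
As a rough sketch of the GKE URL pattern above, the following shell function fills in the two placeholders. The function name `tfjob_dashboard_url` and the sample values `my-kubeflow` and `my-project` are hypothetical; substitute your own deployment name and GCP project ID.

```shell
# Hypothetical helper (not part of Kubeflow): print the TFJob dashboard URL
# for a GKE deployment, following the pattern
# https://<deployment-name>.endpoints.<project>.cloud.goog/tfjobs/ui/
# $1 = deployment name, $2 = GCP project ID
tfjob_dashboard_url() {
  printf 'https://%s.endpoints.%s.cloud.goog/tfjobs/ui/\n' "$1" "$2"
}

# Example with placeholder values.
tfjob_dashboard_url my-kubeflow my-project
```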
## Using GPUs
9 changes: 8 additions & 1 deletion content/docs/started/getting-started-gke.md
@@ -73,7 +73,14 @@ Create an OAuth client ID to be used to identify Cloud IAP when requesting acces
export CLIENT_SECRET=<CLIENT_SECRET from OAuth page>
```
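
Before deploying, it can help to confirm that the OAuth variables are actually exported. The sketch below is one possible check. The helper name `require_env` is hypothetical, and `CLIENT_ID` is assumed to be exported alongside `CLIENT_SECRET` in the part of the file this hunk does not show.

```shell
# Hypothetical helper (not part of Kubeflow): fail if a variable is unset or empty.
# $1 = name of the environment variable to check
require_env() {
  # Read the variable named by $1 without failing when it is unset.
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
}

# Usage (CLIENT_ID is an assumption, see above):
#   require_env CLIENT_ID && require_env CLIENT_SECRET && echo "OAuth variables set"
```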
## Deploy Kubeflow on Kubernetes Engine
## Deploy Kubeflow on GKE using the UI
1. Open [https://deploy.kubeflow.cloud/](https://deploy.kubeflow.cloud/#/deploy) in your web browser.
1. Sign in using a GCP account with admin privileges for your GCP project.
1. Complete the form.
1. Click **Create Deployment**.
## Deploy Kubeflow on GKE using the command line
Run the following steps to deploy Kubeflow:
2 changes: 1 addition & 1 deletion themes/kf/layouts/index.html
@@ -42,7 +42,7 @@ <h4>Notebooks</h4>
</div>
<div class="text">
<h4>TensorFlow model training</h4>
<p>A TensorFlow Training Controller that can be configured to use either CPU’s or GPUs and be dynamically adjusted to the size of a cluster with a single setting. We also provide a TensorFlow job operator. </p>
<p>A TensorFlow Training Controller that can be configured to use either CPUs or GPUs and be dynamically adjusted to the size of a cluster with a single setting. We also provide a TensorFlow job operator. </p>
</div>
</div>

2 changes: 1 addition & 1 deletion themes/kf/sass/src/index.html
@@ -149,7 +149,7 @@ <h4>Notebooks</h4>
</div>
<div class="text">
<h4>TensorFlow model training</h4>
<p>A TensorFlow Training Controller that can be configured to use either CPU’s or GPUs and be dynamically adjusted to the size of a cluster with a single setting. We also provide a TensorFlow job operator. </p>
<p>A TensorFlow Training Controller that can be configured to use either CPUs or GPUs and be dynamically adjusted to the size of a cluster with a single setting. We also provide a TensorFlow job operator. </p>
</div>
</div>
