diff --git a/content/docs/about/events.md b/content/docs/about/events.md
index 7885726e4e..580a47c2f4 100644
--- a/content/docs/about/events.md
+++ b/content/docs/about/events.md
@@ -16,6 +16,11 @@ Please raise a [GitHub issue](https://github.com/kubeflow/website/issues/new) if
 * [Data Day Texas, Austin](https://datadaytexas.com/), 26 January, 2019
   - Kubeflow: Portable Machine Learning on Kubernetes: Michelle Casbon
 * [KubeCon, Seattle](https://events.linuxfoundation.org/events/kubecon-cloudnativecon-north-america-2018/), 11-13 December, 2018
+  - [Workshop: Kubeflow End-to-End: GitHub Issue Summarization](https://sched.co/GrWE): Amy Unruh, Michelle Casbon
+  - [Natural Language Code Search for GitHub Using Kubeflow](https://sched.co/GrVn): Jeremy Lewi, Hamel Husain
+  - [Eco-Friendly ML: How the Kubeflow Ecosystem Bootstrapped Itself](https://sched.co/GrTc): Peter McKinnon
+  - [Deep Dive: Kubeflow BoF](https://sched.co/Ha1X): Jeremy Lewi, David Aronchick
+  - [Machine Learning as Code](https://sched.co/GrVh): Jay Smith
 * Women in ML & Data Science, Melbourne, 5 December, 2018
   - Panel: Juliet Hougland, Michelle Casbon
 * [YOW!, Melbourne](https://melbourne.yowconference.com.au/), 4-7 December, 2018
@@ -33,7 +38,8 @@ Please raise a [GitHub issue](https://github.com/kubeflow/website/issues/new) if
   - Kubeflow End to End: Amy Unruh
 * [Data@Scale, Boston](https://dataatscale2018.splashthat.com/), 25 October, 2018
   - [Women in Engineering Panel](https://datascalewomensbreakfast.splashthat.com/): Michelle Casbon
-  - Kubeflow: Portable Machine Learning on Kubernetes: Michelle Casbon
+  - [Kubeflow: Portable Machine Learning on Kubernetes](https://code.fb.com/core-data/data-scale-boston/): Michelle Casbon
+    - [Video](https://www.facebook.com/atscaleevents/videos/114311602829170/)
 * [Kafka Summit, San Francisco](https://kafka-summit.org/), 16-17 October, 2018
 * [O’Reilly AI Conference, London](https://conferences.oreilly.com/artificial-intelligence/ai-eu), 08-11 October, 2018
   - [Machine Learning at Scale with Kubernetes](https://conferences.oreilly.com/artificial-intelligence/ai-eu/public/schedule/detail/69194): Chris Cho
diff --git a/content/docs/guides/components/hyperparameter.md b/content/docs/guides/components/hyperparameter.md
index 92e291cf3e..58171d433d 100644
--- a/content/docs/guides/components/hyperparameter.md
+++ b/content/docs/guides/components/hyperparameter.md
@@ -9,30 +9,17 @@ toc = true
 weight = 5
 +++
 
-## Deploying Katib
-
-[Katib](https://github.com/kubeflow/katib) is a hyperparameter tuning framework, inspired by
-[Google Vizier](https://static.googleusercontent.com/media/research.google.com/ja//pubs/archive/bcb15507f4b52991a0783013df4222240e942381.pdf).
-
-To deploy katib,
-```shell
-ks pkg install kubeflow/katib@master
-ks generate katib katib
-ks apply ${ENV} -c katib
-```
-
 ## Using Katib
 
-Create namespace `katib` as the service launches jobs in this namespace.
+Currently we are using port-forwarding to access the katib services.
+kubernetes version 1.9~
 ```
-kubectl create namespace katib
+kubectl -n kubeflow port-forward svc/katib-ui 8000:80
 ```
-
-Currently we are using port-forwarding to access the katib services.
+~1.8
 ```
-kubectl get pod -n kubeflow # Find your vizier-core and modedb-frontend pods
-kubectl port-forward -n kubeflow [vizier-core pod] 6789:6789 &
-kubectl port-forward -n kubeflow [modeldb-frontend pod] 3000:3000 &
+kubectl get pod -n kubeflow # Find your katib-ui pods
+kubectl port-forward -n kubeflow [katib-ui pod] 8000:80 &
 ```
 ## Creating a Study Job
 You can create Study Job for Katib by defining a StudyJob config file.
@@ -55,15 +42,15 @@ In this demo, 3 hyper parameters are randomly generated.
 ```
-$ kubectl -n katib get studyjob
+$ kubectl -n kubeflow get studyjob
 ```
 
 Check the study status.
 
 ```
-$ kubectl -n katib describe studyjobs random-example
+$ kubectl -n kubeflow describe studyjobs random-example
 Name: random-example
-Namespace: katib
+Namespace: kubeflow
 Labels: controller-tools.k8s.io=1.0
 Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"kubeflow.org/v1alpha1","kind":"StudyJob","metadata":{"annotations":{},"labels":{"controller-tools.k8s.io":"1.0"},"name":"random-example"...
 API Version: kubeflow.org/v1alpha1
@@ -73,7 +60,7 @@ Metadata:
   Creation Timestamp: 2018-08-15T01:29:13Z
   Generation: 0
   Resource Version: 173289
-  Self Link: /apis/kubeflow.org/v1alpha1/namespaces/katib/studyjobs/random-example
+  Self Link: /apis/kubeflow.org/v1alpha1/namespaces/kubeflow/studyjobs/random-example
   UID: 9e136400-a02a-11e8-b88c-42010af0008b
 Spec:
   Study Spec:
@@ -140,4 +127,4 @@ Events:
 It should start a study and run two jobs with different parameters.
-Go to http://localhost:3000/katib to see the result.
+Go to http://localhost:8000/katib to see the result.
diff --git a/content/docs/guides/components/tftraining.md b/content/docs/guides/components/tftraining.md
index 783f86a964..9635f77acf 100644
--- a/content/docs/guides/components/tftraining.md
+++ b/content/docs/guides/components/tftraining.md
@@ -1,5 +1,6 @@
 +++
-title = "TensorFlow Training"
+title = "TensorFlow Training (TFJob)"
+linkTitle = "TensorFlow Training"
 description = ""
 weight = 10
 toc = true
@@ -147,7 +148,7 @@ the [`TFJob` custom resource](https://github.com/kubeflow/tf-operator) is availa
 We treat each TensorFlow job as a [component](https://ksonnet.io/docs/tutorial#2-generate-and-deploy-an-app-component) in your APP.
 
-### Run the TfCnn example
+### Running the TfCnn example
 
 Kubeflow ships with a [ksonnet prototype](https://ksonnet.io/docs/concepts#prototype) suitable for running the [TensorFlow CNN Benchmarks](https://github.com/tensorflow/benchmarks/tree/master/scripts/tf_cnn_benchmarks).
@@ -203,8 +204,33 @@ Typically you will want to change the following values
   * For example, you might need to configure various environment variables to talk to datastores like GCS or S3
 
-1. Attach PV's if you want to use PVs for storage.
+1. Attach PVs if you want to use PVs for storage.
 
+### Accessing the TFJob dashboard
+
+The TFJob dashboard is available at `/tfjobs/ui/`. Specifically:
+
+* If you're using the central Kubeflow UI, you can access the TFJob dashboard
+  by clicking **TFJOB DASHBOARD**:
+
+  ![Central UI](/docs/images/central-ui.png)
+
+* If you followed the
+  [guide for GKE](/docs/started/getting-started-gke), you can
+  access the TFJob dashboard at the following URL:
+
+  ```
+  https://.endpoints..cloud.goog/tfjobs/ui/
+  ```
+
+* If you're using portforwarding, you can access the TFJob dashboard at the
+  following URL:
+
+  ```
+  http://localhost:8080/tfjobs/ui/
+  ```
+
+See more details about [accessing the Kubeflow UIs](/docs/guides/accessing-uis).
 
 ## Using GPUs
 
diff --git a/content/docs/started/getting-started-gke.md b/content/docs/started/getting-started-gke.md
index e6926000db..ba3519f5c6 100644
--- a/content/docs/started/getting-started-gke.md
+++ b/content/docs/started/getting-started-gke.md
@@ -73,7 +73,14 @@ Create an OAuth client ID to be used to identify Cloud IAP when requesting acces
 export CLIENT_SECRET=
 ```
 
-## Deploy Kubeflow on Kubernetes Engine
+## Deploy Kubeflow on GKE using the UI
+
+1. Open [https://deploy.kubeflow.cloud/](https://deploy.kubeflow.cloud/#/deploy) in your web browser.
+1. Sign in using a GCP account with admin privileges for your GCP project.
+1. Complete the form.
+1. Click **Create Deployment**.
+
+## Deploy Kubeflow on GKE using the command line
 
 Run the following steps to deploy Kubeflow:
 
diff --git a/themes/kf/layouts/index.html b/themes/kf/layouts/index.html
index 05f9409756..bf8dd83dfa 100644
--- a/themes/kf/layouts/index.html
+++ b/themes/kf/layouts/index.html
@@ -42,7 +42,7 @@

 Notebooks
 TensorFlow model training
-A TensorFlow Training Controller that can be configured to use either CPU’s or GPUs and be dynamically adjusted to the size of a cluster with a single setting. We also provide a TensorFlow job operator.
+A TensorFlow Training Controller that can be configured to use either CPUs or GPUs and be dynamically adjusted to the size of a cluster with a single setting. We also provide a TensorFlow job operator.

diff --git a/themes/kf/sass/src/index.html b/themes/kf/sass/src/index.html
index fdacd80c86..6d9f2d400c 100755
--- a/themes/kf/sass/src/index.html
+++ b/themes/kf/sass/src/index.html
@@ -149,7 +149,7 @@
 Notebooks
 TensorFlow model training
-A TensorFlow Training Controller that can be configured to use either CPU’s or GPUs and be dynamically adjusted to the size of a cluster with a single setting. We also provide a TensorFlow job operator.
+A TensorFlow Training Controller that can be configured to use either CPUs or GPUs and be dynamically adjusted to the size of a cluster with a single setting. We also provide a TensorFlow job operator.