add doc for submit jupyter notebook and cluster serving to k8s #2221

Merged 4 commits on Apr 15, 2020
docker/hyperzoo/README.md (60 additions, 0 deletions)

Analytics Zoo hyperzoo image has been built to easily run applications on Kubernetes cluster.

- [Launch pre-built hyperzoo image](#launch-pre-built-hyperzoo-image)
- [Run Analytics Zoo examples on k8s](#Run-analytics-zoo-examples-on-k8s)
- [Run Analytics Zoo Jupyter Notebooks on remote Spark cluster or k8s](#Run-Analytics-Zoo-Jupyter-Notebooks-on-remote-Spark-cluster-or-k8s)
- [Launch Analytics Zoo cluster serving](#Launch-Analytics-Zoo-cluster-serving)

## Launch pre-built hyperzoo image
Or clean up the entire Spark application by pod label:
```bash
$ kubectl delete pod -l <pod label>
```

## Run Analytics Zoo Jupyter Notebooks on remote Spark cluster or k8s

When you start the Docker container with RUNTIME_SPARK_MASTER=`k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>` or RUNTIME_SPARK_MASTER=`spark://<spark-master-host>:<spark-master-port>`, and use $RUNTIME_SPARK_MASTER as the Spark master URL, the container will submit jobs to the k8s cluster or the Spark standalone cluster, respectively.
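
For instance, the k8s master URL can be assembled from the apiserver host and port. A minimal sketch, with hypothetical endpoint values:

```shell
# Hypothetical values; substitute your own cluster's apiserver address and port.
K8S_APISERVER_HOST=master.example.com
K8S_APISERVER_PORT=6443
RUNTIME_SPARK_MASTER="k8s://https://${K8S_APISERVER_HOST}:${K8S_APISERVER_PORT}"
echo "${RUNTIME_SPARK_MASTER}"
```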

You may also need to specify NotebookPort=`<your-port>` and NotebookToken=`<your-token>` so that the Jupyter Notebook starts on the specified port and binds to 0.0.0.0.

To start the Jupyter notebook against a remote Spark cluster, use RUNTIME_SPARK_MASTER=`spark://<spark-master-host>:<spark-master-port>`. Attach to the client container with `docker exec -it <container-id> bash`, then run the shell script `/opt/start-notebook-spark.sh`. This starts a Jupyter notebook instance in the local container, and each tutorial in it is submitted to the specified Spark cluster. Access the notebook at `http://<local-ip>:<your-port>` in a browser, entering `<your-token>` as the token to browse and run the Analytics Zoo tutorials. Each tutorial runs its driver code in the local container and its executor code on the Spark cluster.

To start the Jupyter notebook on a Kubernetes cluster, use RUNTIME_SPARK_MASTER=`k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>`. Attach to the client container with `docker exec -it <container-id> bash`, then run the shell script `/opt/start-notebook-k8s.sh`. This starts a Jupyter notebook instance in the local container, and each tutorial in it is submitted to the specified Kubernetes cluster. Access the notebook at `http://<local-ip>:<your-port>` in a browser, entering `<your-token>` as the token to browse and run the Analytics Zoo tutorials. Each tutorial runs its driver code in the local container and its executor code in dynamically allocated Spark executor pods on the k8s cluster.
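
The access URL and token above combine into Jupyter's standard token-authenticated URL. A small sketch with hypothetical port and token values:

```shell
# Hypothetical values; use the NotebookPort and NotebookToken the container was started with.
NOTEBOOK_PORT=12345
NOTEBOOK_TOKEN=1234abcd
NOTEBOOK_URL="http://localhost:${NOTEBOOK_PORT}/?token=${NOTEBOOK_TOKEN}"
echo "${NOTEBOOK_URL}"
```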

## Launch Analytics Zoo cluster serving

To run Analytics Zoo cluster serving in the hyper-zoo client container and submit the streaming job to a K8s cluster, specify RUNTIME_SPARK_MASTER=`k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>`. You may also need to mount a volume from the host into the container to load model and data files.

You can leverage an existing Redis instance/cluster, or you can start one in the client container:
```bash
${REDIS_HOME}/src/redis-server ${REDIS_HOME}/redis.conf > ${REDIS_HOME}/redis.log &
```
And you can check the Redis logs:
```bash
cat ${REDIS_HOME}/redis.log
```
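
A quick way to confirm Redis came up is to grep the log for its standard "Ready to accept connections" startup line. The sketch below simulates a log file so it runs anywhere; with a real server, point `REDIS_LOG` at `${REDIS_HOME}/redis.log` instead:

```shell
# Simulated log for illustration; a real redis-server writes this line on successful startup.
REDIS_LOG=$(mktemp)
echo "* Ready to accept connections" >> "${REDIS_LOG}"
if grep -q "Ready to accept connections" "${REDIS_LOG}"; then
  echo "redis is up"
fi
```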

Before starting the cluster serving job, also edit config.yaml to set the correct model path, Redis host URL, and other options.
```bash
nano /opt/cluster-serving/config.yaml
```
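
The exact keys depend on your Analytics Zoo version; as a rough sketch (hypothetical values and key names, verify against your `/opt/cluster-serving/config.yaml`), the model path and Redis address typically look like this:

```shell
# Write a hypothetical config fragment to a temp path for illustration.
cat > /tmp/config-example.yaml <<'EOF'
model:
  # path to the directory containing the served model
  path: /zoo/model
data:
  # host:port of the Redis instance started above
  src: localhost:6379
EOF
grep -q "src: localhost:6379" /tmp/config-example.yaml && echo "redis address set"
```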

After that, you can start the cluster-serving job and submit the streaming job to the K8s cluster:
```bash
${SPARK_HOME}/bin/spark-submit \
  --master ${RUNTIME_SPARK_MASTER} \
  --deploy-mode cluster \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=${RUNTIME_K8S_SERVICE_ACCOUNT} \
  --name analytics-zoo \
  --conf spark.kubernetes.container.image=${RUNTIME_K8S_SPARK_IMAGE} \
  --conf spark.executor.instances=${RUNTIME_EXECUTOR_INSTANCES} \
  --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.options.claimName=${RUNTIME_PERSISTENT_VOLUME_CLAIM} \
  --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.mount.path=/zoo \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.options.claimName=${RUNTIME_PERSISTENT_VOLUME_CLAIM} \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.mount.path=/zoo \
  --conf spark.kubernetes.driver.label.<your-label>=true \
  --conf spark.kubernetes.executor.label.<your-label>=true \
  --executor-cores ${RUNTIME_EXECUTOR_CORES} \
  --executor-memory ${RUNTIME_EXECUTOR_MEMORY} \
  --total-executor-cores ${RUNTIME_TOTAL_EXECUTOR_CORES} \
  --driver-cores ${RUNTIME_DRIVER_CORES} \
  --driver-memory ${RUNTIME_DRIVER_MEMORY} \
  --properties-file ${ANALYTICS_ZOO_HOME}/conf/spark-analytics-zoo.conf \
  --py-files ${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-python-api.zip,/opt/analytics-zoo-examples/python/anomalydetection/anomaly_detection.py \
  --conf spark.sql.catalogImplementation='in-memory' \
  --conf spark.driver.extraClassPath=${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-jar-with-dependencies.jar:/opt/cluster-serving/spark-redis-2.4.0-jar-with-dependencies.jar \
  --conf spark.executor.extraClassPath=${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-jar-with-dependencies.jar:/opt/cluster-serving/spark-redis-2.4.0-jar-with-dependencies.jar \
  --conf "spark.executor.extraJavaOptions=-Dbigdl.engineType=mklblas" \
  --conf "spark.driver.extraJavaOptions=-Dderby.stream.error.file=/tmp -Dbigdl.engineType=mklblas" \
  --class com.intel.analytics.zoo.serving.ClusterServing \
  local:/opt/analytics-zoo-0.8.0-SNAPSHOT/lib/analytics-zoo-bigdl_0.10.0-spark_2.4.3-0.8.0-SNAPSHOT-jar-with-dependencies.jar
```
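
Because the command above relies on many environment variables, a small preflight check can catch empty ones before submission. A sketch, not part of the official image; the sample values below are hypothetical (in the container these come from the `docker run` environment):

```shell
# Hypothetical sample values; one deliberately left empty to show the check firing.
RUNTIME_SPARK_MASTER="k8s://https://master.example.com:6443"
RUNTIME_K8S_SPARK_IMAGE=""
RUNTIME_PERSISTENT_VOLUME_CLAIM="nfsvolumeclaim"

missing=""
for name in RUNTIME_SPARK_MASTER RUNTIME_K8S_SPARK_IMAGE RUNTIME_PERSISTENT_VOLUME_CLAIM; do
  # Indirectly expand each variable name and flag it when empty.
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    missing="$missing $name"
    echo "missing: $name"
  fi
done
```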