From 69dc196f2d634049060d72c2a3691bdcd394b9d2 Mon Sep 17 00:00:00 2001
From: Alejandro Saucedo
Date: Mon, 23 Mar 2020 20:15:22 +0000
Subject: [PATCH] Added documentation page on local testing (#1586)

* Added initial explanation to serving component

* Added updates on docs to specify local installation2
---
 doc/source/index.rst               |   1 +
 doc/source/python/python_module.md |  16 +---
 doc/source/workflow/quickstart.md  |  37 ++++-----
 doc/source/workflow/serving.md     | 118 +++++++++++++++++++++++++----
 4 files changed, 126 insertions(+), 46 deletions(-)

diff --git a/doc/source/index.rst b/doc/source/index.rst
index 4b2324b3d2..f2e446a5c3 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -47,6 +47,7 @@ Documentation Index
    Create your Inference Graph
    Deploy your Model
    Testing your Model Endpoints
+   Python Module and Client
    Troubleshooting guide
    Usage reporting
    Upgrading
diff --git a/doc/source/python/python_module.md b/doc/source/python/python_module.md
index 5b734e0503..860a03fee7 100644
--- a/doc/source/python/python_module.md
+++ b/doc/source/python/python_module.md
@@ -3,8 +3,6 @@ Seldon Core has a python package `seldon-core` available on PyPI. The package makes it easier to work with Seldon Core if you are using python and is the basis of the Python S2I wrapper. The module provides:

  * `seldon-core-microservice` executable to serve microservice components in Seldon Core. This is used by the Python Wrapper for Seldon Core.
- * `seldon-core-microservice-tester` executable to test running Seldon Core microservices over REST or gRPC.
- * `seldon-core-api-tester` executable to test the external API for running Seldon Deployment inference graphs over REST or gRPC.
  * `seldon_core.seldon_client` library. Core reference API module to call Seldon Core services (internal microservices or the external API). This is used by the testing executable and can be used by users to build their own clients to Seldon Core in Python.

 ## Install

 ```bash
 $ pip install seldon-core
 ```

 ### Tensorflow support

-Seldon Core adds optional support to send a `TFTensor` as your prediction
-input.
-However, most users will prefer to send a `numpy` array, string, binary or JSON
-input instead.
-Therefore, in order to avoid including the `tensorflow` dependency on
-installations where the `TFTensor` support won't be necessary, it isn't
-installed it by default.
+Seldon Core adds optional support to send a `TFTensor` as your prediction input.
+However, most users will prefer to send a `numpy` array, string, binary or JSON input instead.
+Therefore, in order to avoid including the `tensorflow` dependency on installations where the `TFTensor` support won't be necessary, it isn't installed by default.

 To include the optional `TFTensor` support, you can install `seldon-core` as:

@@ -69,10 +63,6 @@ Seldon allows you to easily take your runtime inference code and create a Docker

 You can also create your own image and utilise the `seldon-core-microservice` executable to run your model code.

-## Testing Seldon Core Microservices
-
-To test your microservice standalone or your running Seldon Deployment inside Kubernetes you can follow the [API testing docs](../workflow/api-testing.md).
-

 ## Seldon Core Python API Client

diff --git a/doc/source/workflow/quickstart.md b/doc/source/workflow/quickstart.md
index eeca49629f..66a25c65f9 100644
--- a/doc/source/workflow/quickstart.md
+++ b/doc/source/workflow/quickstart.md
@@ -171,28 +171,19 @@ class Model:
         return output
 ```

-**3. 
Use the Seldon tools to containerise your model** +**3. Test model locally** -Now we can use the Seldon Core utilities to convert our python class into a fully fledged Seldon Core microservice. In this case we are also containerising the model binaries. - -The result below is a container with the name `sklearn_iris` and the tag `0.1` which we will be able to deploy using Seldon Core. - -```console -s2i build . seldonio/seldon-core-s2i-python3:0.18 sklearn_iris:0.1 -``` - -**4. Test model locally** - -Before we deploy our model to production, we can actually run our model locally using Docker, and send it a prediction request. +Before we deploy our model to production, we can actually run our model locally using the [Python seldon-core Module](../python/python_module) microservice CLI functionality. ```console -$ docker run -p 8000:8000 --rm sklearn_iris:0.1 +$ seldon-core-microservice Model REST --service-type MODEL -Listening on port 8080... +2020-03-23 16:59:17,366 - werkzeug:_log:122 - INFO: * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit) -$ curl -X POST localhost:8080/api/v1.0/predictions \ +$ curl -X POST localhost:5000/api/v1.0/predictions \ -H 'Content-Type: application/json' \ - -d '{ "data": { "ndarray": [1,2,3,4] } }' | json_pp + -d '{ "data": { "ndarray": [1,2,3,4] } }' \ + | json_pp { "meta" : {}, @@ -213,7 +204,17 @@ $ curl -X POST localhost:8080/api/v1.0/predictions \ } ``` -**4. Deploy to Kubernetes** +**4. Use the Seldon tools to containerise your model** + +Now we can use the Seldon Core utilities to convert our python class into a fully fledged Seldon Core microservice. In this case we are also containerising the model binaries. + +The result below is a container with the name `sklearn_iris` and the tag `0.1` which we will be able to deploy using Seldon Core. + +```console +s2i build . seldonio/seldon-core-s2i-python3:0.18 sklearn_iris:0.1 +``` + +**5. Deploy to Kubernetes** Similar to what we did with the pre-packaged model server, we define here our deployment structure however we also have to specify the container that we just built, together with any further containerSpec options we may want to add. @@ -239,7 +240,7 @@ spec: END ``` -**5. Send a request to your deployed model in Kubernetes** +**6. Send a request to your deployed model in Kubernetes** Finally we can just send a request to the model and see the reply by the server. diff --git a/doc/source/workflow/serving.md b/doc/source/workflow/serving.md index 6d9053b771..9120da7ad5 100644 --- a/doc/source/workflow/serving.md +++ b/doc/source/workflow/serving.md @@ -2,22 +2,110 @@ In order to test your components you are able to send the requests directly using CURL/grpCURL or a similar utility, as well as by using our Python SeldonClient SDK. -## Pre-requisites +## Testing options -First you need to make sure you've deployed your model, and the model is available through one of the supported [Ingress (as outlined in installation docs)](../workflow/install.md) you are able +There are several options for testing your model before deploying it. 
+
+* Running your model directly with the Python Client
+* Running your model as a Docker container
+  * This can be used for all Language Wrappers (but not prepackaged inference servers)
+* Running your SeldonDeployment in a local Kubernetes development cluster such as KIND
+  * This can be used for any model
+
+### Running your model directly with the Python Client
+
+* This can be used for Python Language Wrapped Models only
+
+When you create your Python model, for example in a file called `MyModel.py` with the following contents:
+
+```python
+class MyModel:
+    def __init__(self):
+        pass
+
+    def predict(self, *args, **kwargs):
+        return ["hello", "world"]
+```
+
+You are able to test your model by running the microservice CLI that is provided by the [Python module](../python/python_module.md).
+
+Once you install the Python `seldon-core` module you will be able to run the model above with the following command:
+
+```console
+> seldon-core-microservice MyModel REST --service-type MODEL
+
+2020-03-23 16:59:17,320 - seldon_core.microservice:main:190 - INFO: Starting microservice.py:main
+2020-03-23 16:59:17,322 - seldon_core.microservice:main:246 - INFO: Parse JAEGER_EXTRA_TAGS []
+2020-03-23 16:59:17,322 - seldon_core.microservice:main:257 - INFO: Annotations: {}
+2020-03-23 16:59:17,322 - seldon_core.microservice:main:261 - INFO: Importing Model
+hello world
+2020-03-23 16:59:17,323 - seldon_core.microservice:main:325 - INFO: REST microservice running on port 5000
+2020-03-23 16:59:17,323 - seldon_core.microservice:main:369 - INFO: Starting servers
+ * Serving Flask app "seldon_core.wrapper" (lazy loading)
+ * Environment: production
+   WARNING: This is a development server. Do not use it in a production deployment.
+   Use a production WSGI server instead.
+ * Debug mode: off
+2020-03-23 16:59:17,366 - werkzeug:_log:122 - INFO: * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
+```
+
+Now that our model microservice is running, we can send a request using curl:
+
+```console
+> curl -X POST \
+> -H 'Content-Type: application/json' \
+> -d '{"data": { "ndarray": [[1,2,3,4]]}}' \
+> http://localhost:5000/api/v1.0/predictions
+
+{"data":{"names":[],"ndarray":["hello","world"]},"meta":{}}
+```
+
+We can see that the output of the model is returned through the API.
+
+You can also send requests using the [Python Client](../python/seldon_client.md).
+
+### Running your model as a Docker container
+
+If you are building your model with one of the other language wrappers, you can run the containers [you build using S2I](../wrappers/language_wrappers.md) locally with Docker.
+
+For this you just have to run the container with the following command:
+
+```console
+docker run --rm --name mymodel -p 5000:5000 mymodel:0.1
+```
+
+This will run the model and expose it on port 5000, so we can now send a request using curl:
+
+```console
+> curl -X POST \
+> -H 'Content-Type: application/json' \
+> -d '{"data": { "ndarray": [[1,2,3,4]]}}' \
+> http://localhost:5000/api/v1.0/predictions
+
+{"data":{"names":[],"ndarray":["hello","world"]},"meta":{}}
+```
+
+You can also send requests using the [Python Client](../python/seldon_client.md).
+
+## Testing your model on Kubernetes
+
+For Kubernetes you can set up a cluster as described in the install section of the documentation.
+
+However, you can also run Seldon Core in a local development cluster provisioned with a tool such as KIND (we use KIND for our development and e2e tests).
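+
+As an illustration only, a minimal local setup with KIND and Helm 3 might look like the sketch below; the cluster name, namespace and chart options shown here are assumptions, so follow the [installation docs](../workflow/install.md) for the authoritative steps and the ingress configuration.
+
+```console
+# Create a local Kubernetes cluster with KIND
+kind create cluster --name seldon
+
+# Install the Seldon Core operator with Helm 3
+kubectl create namespace seldon-system
+helm install seldon-core seldon-core-operator \
+    --repo https://storage.googleapis.com/seldon-charts \
+    --namespace seldon-system
+```
+
+You will still need to install and configure one of the supported ingress options (Ambassador or Istio) before sending requests to your deployments.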
+
+Once you have set up KIND or the Kubernetes cluster of your choice, and configured it with one of the supported [Ingress options (as outlined in the installation docs)](../workflow/install.md), you can send requests to your models.
+
 Depending on whether you deployed Seldon Core with Ambassador or the API Gateway you can access your models as discussed below:

-## Ambassador
+### Ambassador

-### Ambassador REST
+#### Ambassador REST

 Assuming Ambassador is exposed at ```<ambassadorEndpoint>``` and with a Seldon deployment name ```<deploymentName>``` in namespace ```<namespace>```::

 * A REST endpoint will be exposed at : ```http://<ambassadorEndpoint>/seldon/<namespace>/<deploymentName>/api/v1.0/predictions```
-
-### Ambassador gRPC
+#### Ambassador gRPC

 Assuming Ambassador is exposed at ```<ambassadorEndpoint>``` and with a Seldon deployment name ```<deploymentName>```:

@@ -25,16 +113,16 @@ Assuming Ambassador is exposed at ```<ambassadorEndpoint>``` and with a Seldon d
 * A gRPC endpoint will be exposed at ```<ambassadorEndpoint>``` and you should send header metadata in your request with:
   * key ```seldon``` and value ```<deploymentName>```.
   * key ```namespace``` and value ```<namespace>```.

-## Istio
+### Istio

-### Istio REST
+#### Istio REST

 Assuming the istio gateway is at ```<istioGateway>``` and with a Seldon deployment name ```<deploymentName>``` in namespace ```<namespace>```:

 * A REST endpoint will be exposed at : ```http://<istioGateway>/seldon/<namespace>/<deploymentName>/api/v1.0/predictions```

-### Istio gRPC
+#### Istio gRPC

 Assuming the istio gateway is at ```<istioGateway>``` and with a Seldon deployment name ```<deploymentName>``` in namespace ```<namespace>```:

@@ -43,11 +131,11 @@ Assuming the istio gateway is at ```<istioGateway>``` and with a Seldon deployme
   * key ```namespace``` and value ```<namespace>```.


-## Client Implementations
+### Client Implementations

-### Curl Examples
+#### Curl Examples

-#### Ambassador REST
+##### Ambassador REST

 Assuming a SeldonDeployment ```mymodel``` with Ambassador exposed on 0.0.0.0:8003:

@@ -55,15 +143,15 @@ Assuming a SeldonDeployment ```mymodel``` with Ambassador exposed on 0.0.0.0:800
 curl -v 0.0.0.0:8003/seldon/mymodel/api/v1.0/predictions -d '{"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}}' -H "Content-Type: application/json"
 ```

-### OpenAPI REST
+#### OpenAPI REST

 Use Swagger to generate a client for you from the [OpenAPI specifications](../reference/apis/openapi.html).

-### gRPC
+#### gRPC

 Use [gRPC](https://grpc.io/) tools in your desired language from the [proto buffer specifications](../reference/apis/prediction.md).

-#### Reference Python Client
+##### Reference Python Client

 Use our [reference python client](../python/python_module.md) which is part of the `seldon-core` module.
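+
+For example, a minimal REST call with the Python client might look like the sketch below. The deployment name `mymodel` and the Ambassador endpoint `localhost:8003` are assumptions chosen to match the curl example above (add `namespace=...` if your deployment lives in a specific namespace); see the [Python module docs](../python/python_module.md) for the full client API.
+
+```python
+import numpy as np
+from seldon_core.seldon_client import SeldonClient
+
+# Point the client at the deployment exposed through Ambassador
+sc = SeldonClient(
+    deployment_name="mymodel",
+    gateway="ambassador",
+    gateway_endpoint="localhost:8003",
+)
+
+# Send a REST prediction request with a numpy payload
+response = sc.predict(data=np.array([[0, 0, 1, 1]]), transport="rest")
+print(response.success)
+print(response.response)
+```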