Add PyTorch Serving documentation in website (kubeflow#316)

* Add David Sabater as approver and reviewer
* Add PyTorch Serving instructions using Seldon

This is to implement [kubeflow#1117](kubeflow/kubeflow#1117).
+++
title = "PyTorch Serving"
description = "Instructions for serving a PyTorch model with Seldon"
weight = 10
toc = true
bref = "This guide will walk you through serving a PyTorch trained model in Kubeflow"

[menu]
  [menu.docs]
    parent = "components"
    weight = 35
+++
## Serving a model

We use the [seldon-core](https://github.com/SeldonIO/seldon-core) component, deployed by following [these instructions](/docs/guides/components/seldon/), to serve the model.
See also this [example module](https://github.com/kubeflow/examples/blob/master/pytorch_mnist/serving/seldon-wrapper/mnistddpserving.py), which contains the code to wrap the model with Seldon.

We wrap this class in a seldon-core microservice, which we can then deploy as a REST or gRPC API server.
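For orientation, here is a minimal sketch of the kind of class the Seldon Python wrapper expects; the model path, tensor shapes, and loading logic are illustrative assumptions rather than the exact contents of `mnistddpserving.py` (by the wrapper's convention, the class is named after its module):

```python
import numpy as np
import torch


class mnistddpserving(object):
    """Sketch of a Seldon-wrappable model class: seldon-core looks for a
    class (named after its module) exposing predict(X, feature_names)."""

    def __init__(self):
        # Assumed location; the example image loads its model from the
        # /mnt/kubeflow-gcfs mount point described below.
        self.model = torch.load("/mnt/kubeflow-gcfs/model.pt")
        self.model.eval()

    def predict(self, X, feature_names):
        # X arrives as a numpy array; reshape the flat 1x784 vector into
        # a 1x1x28x28 MNIST batch before running inference.
        batch = torch.from_numpy(X.astype(np.float32)).reshape(-1, 1, 28, 28)
        with torch.no_grad():
            return self.model(batch).numpy()
```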
## Building a model server

We use the public model server image `gcr.io/kubeflow-examples/mnistddpserving` as an example.

* This server loads the model from the mount point `/mnt/kubeflow-gcfs`; the supporting assets are baked into the container image.
* You can therefore run this image as-is to serve a pre-trained model from the shared persistent disk.
* You can also serve your own model with this server, exposing the predict service as a gRPC API.
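Before baking a wrapper class into an image, you can sanity-check it locally with plain Python; a hypothetical run, assuming a class like the sketch above saved as `mnistddpserving.py`:

```python
import numpy as np

# Hypothetical local check: import the wrapper class and push a dummy
# 1x784 vector through predict(), much as seldon-core would at runtime.
from mnistddpserving import mnistddpserving

model = mnistddpserving()
dummy = np.zeros((1, 784), dtype=np.float32)
print(model.predict(dummy, feature_names=[]))
```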
## Building your own model server

You can use the command below to build your own image that wraps your model. Also see [this example script](https://github.com/kubeflow/examples/blob/master/pytorch_mnist/serving/seldon-wrapper/build_image.sh), which calls the Seldon Docker wrapper to build the server image, exposing the predict service as a gRPC API.
```bash
docker run -v $(pwd):/my_model seldonio/core-python-wrapper:0.7 /my_model mnistddpserving 0.1 gcr.io --image-name=kubeflow-examples/mnistddpserving --grpc
```
You can then push the image by running `gcloud docker -- push gcr.io/kubeflow-examples/mnistddpserving:0.1`.

You can find more details about wrapping a model with seldon-core [here](https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python.md).
## Deploying the model to your Kubeflow cluster

With the Seldon component deployed, you can deploy the trained model using a pre-defined ksonnet component, similar to [this example](https://github.com/kubeflow/examples/blob/master/pytorch_mnist/ks_app/components/serving_model.jsonnet).
Set up your own environment `${KF_ENV}` (for example, `default`) and modify the ksonnet component [parameters](https://github.com/kubeflow/examples/blob/master/pytorch_mnist/ks_app/components/params.libsonnet) to use your specific image.
```bash
cd ks_app
ks env add ${KF_ENV}
ks apply ${KF_ENV} -c serving_model
```
## Testing the model server

The Seldon Core component uses Ambassador to route its requests to our model server. To send requests to the model, you can port-forward the Ambassador container locally:
```bash
kubectl port-forward $(kubectl get pods -n ${NAMESPACE} -l service=ambassador -o jsonpath='{.items[0].metadata.name}') -n ${NAMESPACE} 8080:80
```
Then send a request. In this example we deliberately send a payload that we know is not a torch MNIST image, so the server will return a 500 error:
```bash
curl -X POST -H 'Content-Type: application/json' -d '{"data":{"int":"8"}}' http://localhost:8080/seldon/mnist-classifier/api/v0.1/predictions
```
We should receive an error response, because the model server expects a 1x784 vector representing a torch MNIST image; this is enough to confirm that the model server is up and running.
(This check avoids having to send a 784-pixel vector by hand; you can interact properly with the model through a web interface if you follow all the
[instructions](https://github.com/kubeflow/examples/tree/master/pytorch_mnist) in the example.)
```json
{
  "timestamp": 1540899355053,
  "status": 500,
  "error": "Internal Server Error",
  "exception": "io.grpc.StatusRuntimeException",
  "message": "UNKNOWN: Exception calling application: tensor is not a torch image.",
  "path": "/api/v0.1/predictions"
}
```
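To go one step further than the deliberately malformed request above, here is a sketch of a well-formed REST call; it sends an all-zeros 784-value vector using Seldon's `ndarray` payload form (the endpoint mirrors the curl example, but the exact payload layout your model expects is an assumption):

```python
import json
import urllib.request

# Hypothetical well-formed request: a flat 784-value vector standing in
# for a 28x28 MNIST image (all zeros here; a real client sends pixels).
payload = {"data": {"ndarray": [[0.0] * 784]}}

req = urllib.request.Request(
    "http://localhost:8080/seldon/mnist-classifier/api/v0.1/predictions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    # A healthy server should now return a prediction instead of a 500.
    print(json.loads(resp.read()))
```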