diff --git a/docs/ambassador.md b/docs/ambassador.md deleted file mode 100644 index 4cd5d722bf..0000000000 --- a/docs/ambassador.md +++ /dev/null @@ -1,80 +0,0 @@ -# Deployment Options running Seldon Core with Ambassador - -Seldon Core works well with [Ambassador](https://www.getambassador.io/) handling ingress to your running machine learning deployments. In this doc we will discuss how your Seldon Deployments are exposed via Ambassador and how you can use both to do various production rollout strategies. - -## Ambassador REST - -Assuming Ambassador is exposed at `````` and with a Seldon deployment name `````` running in a namespace ```namespace```: - -For Seldon Core restricted to a namespace, `singleNamespace=true`, the endpoints exposed are: - - * ```http:///seldon//api/v0.1/predictions``` - * ```http:///seldon///api/v0.1/predictions``` - -For Seldon Core running cluster wide, `singleNamespace=false`, the endpoints exposed are all namespaced: - - * ```http:///seldon///api/v0.1/predictions``` - - -## Example Curl - -### Ambassador REST - -Assuming a Seldon Deployment ```mymodel``` with Ambassador exposed on 0.0.0.0:8003: - -``` -curl -v 0.0.0.0:8003/seldon/mymodel/api/v0.1/predictions -d '{"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}}' -H "Content-Type: application/json" -``` - -## Canary Deployments - -Canary rollouts are available where you wish to push a certain percentage of traffic to a new model to test whether it works ok in production. You simply need to add some annotations to your Seldon Deployment resource for your canary deployment. - - * `seldon.io/ambassador-weight`:`` : The weight (a value between 0 and 100) to be applied to this deployment. - * Example: `"seldon.io/ambassador-weight":"25"` - * `seldon.io/ambassador-service-name`:`` : The name of the existing Seldon Deployment you want to attach to as a canary. - * Example: "seldon.io/ambassador-service-name":"example" - -A worked example notebook can be found [here](https://github.com/SeldonIO/seldon-core/blob/master/examples/ambassador/canary/ambassador_canary.ipynb) - -To understand more about the Ambassador configuration for this see [their docs](https://www.getambassador.io/reference/canary/). - -## Shadow Deployments - -Shadow deployments allow you to send duplicate requests to a parallel deployment but throw away the response. This allows you to test machine learning models under load and compare the results to the live deployment. - -You simply need to add some annotations to your Seldon Deployment resource for your shadow deployment. - - * `seldon.io/ambassador-shadow`:`true` : Flag to mark this deployment as a Shadow deployment in Ambassador. - * `seldon.io/ambassador-service-name`:`` : The name of the existing Seldon Deployment you want to attach to as a shadow. - * Example: "seldon.io/ambassador-service-name":"example" - -A worked example notebook can be found [here](https://github.com/SeldonIO/seldon-core/blob/master/examples/ambassador/shadow/ambassador_shadow.ipynb) - -To understand more about the Ambassador configuration for this see [their docs](https://www.getambassador.io/reference/shadowing/). - -## Header based Routing - -Header based routing allows you to route requests to particular Seldon Deployments based on headers in the incoming requests. - -You simply need to add some annotations to your Seldon Deployment resource. - - * `seldon.io/ambassador-header`:`
` : The header to add to the Ambassador configuration - * Example: "seldon.io/ambassador-header":"location: london" - * `seldon.io/ambassador-service-name`:`<existing deployment name>` : The name of the existing Seldon Deployment you want to attach to as an alternative mapping for requests. - * Example: "seldon.io/ambassador-service-name":"example" - -A worked example notebook can be found [here](https://github.com/SeldonIO/seldon-core/blob/master/examples/ambassador/headers/ambassador_headers.ipynb) - -To understand more about the Ambassador configuration for this see [their docs](https://www.getambassador.io/reference/headers). - - -## Custom Ambassador configuration - -The configurations discussed above should cover most cases, but there may be a case where you want a very particular Ambassador configuration under your control. You can achieve this by adding your configuration as an annotation to your Seldon Deployment resource. - - * `seldon.io/ambassador-config`:`<configuration>` : The custom Ambassador configuration - * Example: `"seldon.io/ambassador-config":"apiVersion: ambassador/v0\nkind: Mapping\nname: seldon_example_rest_mapping\nprefix: /mycompany/ml/\nservice: production-model-example.seldon:8000\ntimeout_ms: 3000"` - -A worked example notebook can be found [here](https://github.com/SeldonIO/seldon-core/blob/master/examples/ambassador/custom/ambassador_custom.ipynb) - diff --git a/docs/analytics.md b/docs/analytics.md deleted file mode 100644 index 9135840c4d..0000000000 --- a/docs/analytics.md +++ /dev/null @@ -1,54 +0,0 @@ -# Seldon Core Analytics - -Seldon Core exposes metrics that can be scraped by Prometheus. The core metrics are exposed by the service orchestrator (```engine```) and API gateway (```server_ingress```). - -The metrics are: - -Prediction Requests - - * ```seldon_api_engine_server_requests_duration_seconds_(bucket,count,sum) ``` : Requests to the service orchestrator from an ingress, e.g. API gateway or Ambassador - * ```seldon_api_engine_client_requests_duration_seconds_(bucket,count,sum) ``` : Requests from the service orchestrator to a component, e.g., a model - * ```seldon_api_server_ingress_requests_duration_seconds_(bucket,count,sum) ``` : Requests to the API Gateway from an external client - -Feedback Requests - - * ```seldon_api_model_feedback_reward_total``` : Reward sent via Feedback API - * ```seldon_api_model_feedback_total``` : Total feedback requests - -Each metric has the following key-value pairs for further filtering, which will be taken from the SeldonDeployment custom resource that is running: - - * deployment_name - * predictor_name - * predictor_version - * This will be derived from the predictor metadata labels - * model_name - * model_image - * model_version - - -# Helm Analytics Chart - -Seldon Core provides an example Helm analytics chart that displays the above Prometheus metrics in Grafana. You can install it with: - -``` -helm install seldon-core-analytics --name seldon-core-analytics \ - --repo https://storage.googleapis.com/seldon-charts \ - --set grafana_prom_admin_password=password \ - --set persistence.enabled=false -``` - -The available parameters are: - - * ```grafana_prom_admin_password``` : The admin user Grafana password to use. - * ```persistence.enabled``` : Whether Prometheus persistence is enabled. 
- -Once running you can expose the Grafana dashboard with: - -``` -kubectl port-forward $(kubectl get pods -l app=grafana-prom-server -o jsonpath='{.items[0].metadata.name}') 3000:3000 -``` - -You can then view the dashboard at http://localhost:3000/dashboard/db/prediction-analytics?refresh=5s&orgId=1 - -![dashboard](./dashboard.png) - diff --git a/docs/annotations.md b/docs/annotations.md deleted file mode 100644 index 1c0fc7f0bc..0000000000 --- a/docs/annotations.md +++ /dev/null @@ -1,60 +0,0 @@ -# Annotation Based Configuration - -You can configure aspects of Seldon Core via annotations in the SeldonDeployment resource and also the optional API OAuth Gateway. Please create an issue if you would like some configuration added. - -## SeldonDeployment Annotations - -### gRPC API Control - - * ```seldon.io/grpc-max-message-size``` : Maximum gRPC message size - * Locations : SeldonDeployment.spec.annotations - * [Example](../notebooks/resources/model_grpc_size.json) - * ```seldon.io/grpc-read-timeout``` : gRPC read timeout - * Locations : SeldonDeployment.spec.annotations - * [Example](../notebooks/resources/model_long_timeouts.json) - - -### REST API Control - - * ```seldon.io/rest-read-timeout``` : REST read timeout - * Locations : SeldonDeployment.spec.annotations - * [Example](../notebooks/resources/model_long_timeouts.json) - * ```seldon.io/rest-connection-timeout``` : REST connection timeout - * Locations : SeldonDeployment.spec.annotations - * [Example](../notebooks/resources/model_long_timeouts.json) - -### Service Orchestrator - - * ```seldon.io/engine-java-opts``` : Java Opts for Service Orchestrator - * Locations : SeldonDeployment.spec.predictors.annotations - * [Example](../notebooks/resources/model_engine_java_opts.json) - * ```seldon.io/engine-separate-pod``` : Use a separate pod for the service orchestrator - * Locations : SeldonDeployment.spec.annotations - * [Example](../notebooks/resources/model_svcorch_sep.json) - * ```seldon.io/headless-svc``` : Run main endpoint as headless kubernetes service. This is required for gRPC load balancing via Ambassador. - * Locations : SeldonDeployment.spec.annotations - * [Example](../notebooks/resources/grpc_load_balancing_ambassador.json) - -## API OAuth Gateway Annotations -The API OAuth Gateway, if used, can also have the following annotations: - -### gRPC API Control - - * ```seldon.io/grpc-max-message-size``` : Maximum gRPC message size - * ```seldon.io/grpc-read-timeout``` : gRPC read timeout - - -### REST API Control - - * ```seldon.io/rest-read-timeout``` : REST read timeout - * ```seldon.io/rest-connection-timeout``` : REST connection timeout - - -### Control via Helm -The API OAuth Gateway annotations can be set via Helm via the seldon-core values file, for example: - -``` -apife: - annotations: - seldon.io/grpc-max-message-size: "10485760" -``` diff --git a/docs/api-testing.md b/docs/api-testing.md deleted file mode 100644 index 584ebb994b..0000000000 --- a/docs/api-testing.md +++ /dev/null @@ -1,190 +0,0 @@ -# Testing Your Seldon Components - -Whether you have wrapped your component using [our S2I wrappers](./wrappers/readme.md) or created your own wrapper you will want to test the Docker container standalone and also quickly within a running cluster. We have provided two python console scripts within the [seldon-core Python package](../python) to allow you to easily do this: - - * ```seldon-core-microservice-tester``` - * Allows you to test a docker component to check it respects the Seldon internal microservice API. 
- * ```seldon-core-api-tester``` - * Allows you to test the external endpoints for a running Seldon Deployment graph. - -To use these, install the seldon-core package with ```pip install seldon-core```. - -## Run your Wrapped Model - -To test your model microservice you need to run it. If you have wrapped your model into a Docker container then you should run it and expose the ports. There are many examples in the notebooks in the [examples folders](https://github.com/SeldonIO/seldon-core/tree/master/examples/models) but essential if your model is wrapped in an image `myimage:0.1` then run: - -``` -docker run --name "my_model" -d --rm -p 5000:5000 myimage:0.1 -``` - -Alternatively, if your component is a Python module you can run it directly from python using the core tool ```seldon-core-microservice``` (installed as part of the pip package `seldon-core`). This tool takes the name of the Python module as first argument and the API type REST or GRPC as second argument, for example if you have a file IrisClassifier.py in the current folder you could run: - -``` -seldon-core-microservice IrisClassifier REST -``` - -To get full details about this tool run `seldon-core-microservice --help`. - -Next either use the [Microservce API tester](#microservice-api-tester) or testdirectly via [curl](#microservice-api-test-via-curl). - -## Microservice API Tester - -Use the ```seldon-core-microservice-tester``` script to test a packaged Docker microservice Seldon component. - -``` -usage: seldon-core-microservice-tester [-h] [--endpoint {predict,send-feedback}] - [-b BATCH_SIZE] [-n N_REQUESTS] [--grpc] [--fbs] - [-t] [-p] - contract host port - -positional arguments: - contract File that contains the data contract - host - port - -optional arguments: - -h, --help show this help message and exit - --endpoint {predict,send-feedback} - -b BATCH_SIZE, --batch-size BATCH_SIZE - -n N_REQUESTS, --n-requests N_REQUESTS - --grpc - --fbs - -t, --tensor - -p, --prnt Prints requests and responses -``` - -Example: - -``` -seldon-core-microservice-tester contract.json 0.0.0.0 5000 -p --grpc -``` - -The above sends a predict call to a gRPC component exposed at 0.0.0.0:5000 using the contract.json to create a random request. - -You can find more examples in the [example models folder notebooks](../examples/models). - -To understand the format of the contract.json see details [below](#api-contract). - - -## Microservice API Test via Curl -You can also test your component if run via Docker or from the command line via curl. An example for [Iris Classifier](http://localhost:8888/notebooks/sklearn_iris.ipynb) might be: - -``` -curl -g http://localhost:5000/predict --data-urlencode 'json={"data": {"names": ["sepal_length", "sepal_width", "petal_length", "petal_width"], "ndarray": [[7.233, 4.652, 7.39, 0.324]]}}' -``` - - - -## Seldon-Core API Tester for the External API - -Use the ```seldon-core-api-tester``` script to test a Seldon graph deployed to a kubernetes cluster. 
- -``` -usage: seldon-core-api-tester [-h] [--endpoint {predict,send-feedback}] - [-b BATCH_SIZE] [-n N_REQUESTS] [--grpc] [-t] - [-p] [--log-level {DEBUG,INFO,ERROR}] - [--namespace NAMESPACE] - [--oauth-port OAUTH_PORT] - [--oauth-key OAUTH_KEY] - [--oauth-secret OAUTH_SECRET] - contract host port [deployment] - -positional arguments: - contract File that contains the data contract - host - port - deployment - -optional arguments: - -h, --help show this help message and exit - --endpoint {predict,send-feedback} - -b BATCH_SIZE, --batch-size BATCH_SIZE - -n N_REQUESTS, --n-requests N_REQUESTS - --grpc - -t, --tensor - -p, --prnt Prints requests and responses - --log-level {DEBUG,INFO,ERROR} - --namespace NAMESPACE - --oauth-port OAUTH_PORT - --oauth-key OAUTH_KEY - --oauth-secret OAUTH_SECRET - -``` - -Example: - -``` -seldon-core-api-tester contract.json 0.0.0.0 8003 --oauth-key oauth-key --oauth-secret oauth-secret -p --grpc --oauth-port 8002 --endpoint send-feedback -``` - - The above sends a gRPC send-feedback request to 0.0.0.0:8003 using the given oauth key/secret (assumes you are using the Seldon API Gateway) with the REST oauth-port at 8002 and use the contract.json file to create a random request. In this example you would have port-forwarded the Seldon api-server to local ports. - -You can find more exampes in the [example models folder notebooks](../examples/models). - -To understand the format of the contract.json see details [below](#api-contract). - -## API Contract - -Both tester scripts require you to provide a contract.json file defining the data you intend to send in a request and the response you expect back. - -An example for the example Iris classification model is shown below: - -``` -{ - "features":[ - { - "name":"sepal_length", - "dtype":"FLOAT", - "ftype":"continuous", - "range":[4,8] - }, - { - "name":"sepal_width", - "dtype":"FLOAT", - "ftype":"continuous", - "range":[2,5] - }, - { - "name":"petal_length", - "dtype":"FLOAT", - "ftype":"continuous", - "range":[1,10] - }, - { - "name":"petal_width", - "dtype":"FLOAT", - "ftype":"continuous", - "range":[0,3] - } - ], - "targets":[ - { - "name":"class", - "dtype":"FLOAT", - "ftype":"continuous", - "range":[0,1], - "repeat":3 - } - ] -} -``` - -Here we have 4 input features each of which is continuous in certain ranges. The response targets will be a repeated set of floats in the 0-1 range. - -### Definition - -There are two sections: - - * ```features``` : The types of the feature array that will be in the request - * ```targets``` : The types of the feature array that will be in the response - -Each section has a list of definitions. 
Each definition consists of: - - * ```name``` : String : The name of the feature - * ```ftype``` : one of CONTINUOUS, CATEGORICAL : the type of the feature - * ```dtype``` : One of FLOAT, INT : Required for ftype CONTINUOUS : What type of feature to create - * ```values``` : list of Strings : Required for ftype CATEGORICAL : The possible categorical values - * ```range``` : list of two numbers : Optional for ftype CONTINUOUS : The range of values (inclusive) that a continuous value can take - * ```repeat``` : integer : Optional value for how many times to repeat this value - * ```shape``` : array of integers : Optional value for the shape of array to coerce the values - diff --git a/docs/articles/dist-tracing.png b/docs/articles/dist-tracing.png deleted file mode 100644 index 4a28e50fbb..0000000000 Binary files a/docs/articles/dist-tracing.png and /dev/null differ diff --git a/docs/articles/graphs.png b/docs/articles/graphs.png deleted file mode 100644 index c31800c6d3..0000000000 Binary files a/docs/articles/graphs.png and /dev/null differ diff --git a/docs/articles/mab-dashboard.png b/docs/articles/mab-dashboard.png deleted file mode 100644 index 19a29d7d8a..0000000000 Binary files a/docs/articles/mab-dashboard.png and /dev/null differ diff --git a/docs/articles/openshift_s2i.md b/docs/articles/openshift_s2i.md deleted file mode 100644 index 3a4c91f15d..0000000000 --- a/docs/articles/openshift_s2i.md +++ /dev/null @@ -1,178 +0,0 @@ -# Using Openshift Source-to-Image to facilitate Machine Learning Deployment - -Seldon aims to help organisations put their data science projects into production so they can decrease the time to get return on investment. By helping data scientists take their data science models and place them into production, scale them, get analytics and modify them Seldon allows data scientists to bridge the gap from development to production and use current dev-ops best practices in machine learning. Our core products run on top of Kubernetes and can be deployed on-cloud on on-premise. Integrating with enterprise ready Kubernetes distributions such as Openshift allows us to provide a solid foundation in which to supply our products for use in demanding verticals such as the FinTech sector. - -[Seldon-Core](https://github.com/SeldonIO/seldon-core) is an open source project that provides scalable machine learning deployment running on [Kubernetes](https://kubernetes.io/). One of Seldon-Core’s goals is to allow data scientists to continue to construct their training and inference components using any of the many available machine learning toolkits, be that python based (e.g., TensorFlow, sklearn), R or Java (e.g., Spark, H2O) amongst many popular options. Seldon-Core will then allow them easily to package and run their runtime prediction modules on Kubernetes. To achieve this goal we need to make it easy for data scientists to take their source code and package it as a Docker-formatted container in the correct form such that it can be managed as part of a runtime microservice graph on Kubernetes by Seldon-Core. For this we utilize Openshift’s Source-to-Image open source library to allow any code to be packaged in the correct format with minimal requirements from the data scientist. - -# Seldon-Core Overview -Seldon-core provides scalable machine learning deployments running on Kubernetes. To deploy their models data scientists follow the steps as shown below: - -![API](../deploy.png) - - 1. Package their runtime model as a Docker-formatted image - 1. 
Describe their runtime graph as a Kubernetes resource - 1. Deploy to Kubernetes using standard tools such as kubectl, Helm, ksonnet. - -Once running their deployment can be updated as new image releases are created for the runtime model as well as updates to the runtime graph. - -The components of the runtime graph can be of various types. The most typical is a model which will provide predictions given some input features. Typically, the data scientist will have trained a model and saved the model parameters for use by a runtime component that will be provide new predictions at runtime. However, Seldon-Core allows a range of components to be created that can be joined together as building blocks to create more complex runtime graphs as show below: - -![graphs](graphs.png) - -The types of component you can create can include: - - * Models - e.g., TensorFlow, sklearn models - * Routers - e.g., A-B Tests, Multi-Armed Bandits - * Combiners - e.g., Model ensemblers - * Transformers - e.g., Feature normalization, Outlier detection, concept drift - -As the above diagram shows these need to be fitted into the microservice API of seldon-core either as REST or gRPC services. - -# Source-to-Image integration -To integrate a component into seldon-core the data scientist needs to accomplish two things: - - 1. Create a Docker-formatted image from your source code - 1. Wrap your component as a service that exposes REST or gRPC endpoints that follow the seldon-core miroserice API. - -![wrap](wrap.png) - -To accomplish this we use Openshift's [source-to-image (s2i)](https://github.com/openshift/source-to-image) open source tool. S2i allows data scientists to wrap their code using a single command line call that can easily be embedded into a continuous integration pipeline. Seldon provides s2i builder images that contain middleware code to wrap the data scientist's component within a REST or gRPC server that respects the seldon-core microservice API. All that is needed is for the data scientist to follow a few conventions when creating their component in various languages as will be illustrated below. The growing set of source-to-image builder images can be found [here](https://github.com/SeldonIO/seldon-core/tree/master/wrappers/s2i). - -## Python -There are many popular machine learning libraries in python including Tensorflow, keras, sklearn, pyTorch and Statsmodels amongst many others. To use the Seldon-Core s2i builder image to package a python model the data scientist simply needs to provide: - - * A python file with a class that runs your model - * optional requirements.txt or setup.py - * .s2i/environment - model definitions used by the s2i builder to correctly wrap your model - -The data scientist's source code should contain a python file which defines a class of the same name as the file. For example: - -```python -class MyModel(object): - """ - Model template. You can load your model parameters in __init__ from a location accessible at runtime - """ - - def __init__(self): - """ - Add any initialization parameters. These will be passed at runtime from the graph definition parameters defined in your seldondeployment Kubernetes resource manifest. - """ - print("Initializing") - - def predict(self,X,features_names): - """ - Return a prediction. 
- - Parameters - ---------- - X : array-like - feature_names : array of feature names (optional) - """ - print("Predict called - will run identity function") - return X -``` - - * The file is called MyModel.py and it defines a class MyModel - * The class contains a predict method that takes an array (numpy) X and feature_names and returns an array of predictions. - * Any required initialization can be put inside the class init method. - -An optional requirements.txt can detail any software dependencies the code requires. - -To allow the s2i builder image to correctly package the component the data scientist needs to provide a few environment variables either in an .s2i/environment file in the source code folder or on the command line. An example is: - -```bash -MODEL_NAME=MyModel -API_TYPE=REST -SERVICE_TYPE=MODEL -``` - -Finally we Use ```s2i build``` to create the Docker-formatted image from source code. Examples for python2 code are: - -```bash -s2i build seldonio/seldon-core-s2i-python2 -s2i build seldonio/seldon-core-s2i-python2 -``` - -## R -R is a popular statistical language which provides many machine learning related packages. - -To use the seldon s2i builder image to package an R model the requirements are: - - * An R file which provides an S3 class for your model via an ```initialise_seldon``` function and that has appropriate generics for the component, e.g. predict for a model. - * An optional install.R to be run to install any libraries needed - * .s2i/environment - model definitions used by the s2i builder to correctly wrap your model - -The data scientist's source code should contain an R file which defines an S3 class for their model. For example, - -```R -library(methods) - -predict.mymodel <- function(mymodel,newdata=list()) { - write("MyModel predict called", stdout()) - newdata -} - -new_mymodel <- function() { - structure(list(), class = "mymodel") -} - -initialise_seldon <- function(params) { - new_mymodel() -} -``` - -The above contains: - - * A ```seldon_initialise``` function that creates an S3 class for the model via a constructor ```new_mymodel```. This will be called on startup and you can run any configuration the model needs. - * A generic ```predict``` function is created for my model class. This will be called with a ```newdata``` field with the ```data.frame``` to be predicted. - -An ```install.R``` with any software dependencies required. For example: - -```R -install.packages('rpart') -``` - -Finally, as with all cases the builder image needs a few environment variables to be set to correctly package the R model. An example is: - -```bash -MODEL_NAME=MyModel -API_TYPE=REST -SERVICE_TYPE=MODEL -``` - -These values can also be provided in an .s2i/environment file with the source code or overridden on the command line when building the image. - -Once these steps are done we can use ```s2i build``` to create the Docker-formatted image from the source code. - -```bash -s2i build seldonio/seldon-core-s2i-r -s2i build seldonio/seldon-core-s2i-r -``` - -An example invocation using the test template model inside seldon-core: - -```bash -s2i build https://github.com/seldonio/seldon-core.git --context-dir=wrappers/s2i/R/test/model-template-app seldonio/seldon-core-s2i-r seldon-core-template-model -``` - -## Java -There are several popular machine learning libraries in Java including Spark, H2O and DL4J. Seldon-core also provides builder images for Java. 
To accomplish this we provide a Java library seldon-core-wrappers that can be included in a Maven Spring project to allow a Java component to be easily wrapped. - -To use the Seldon-Core s2i builder image to package a Java model the data scientist will need: - - * A Maven project that depends on the ```io.seldon.wrapper``` library - * A Spring Boot configuration class - * A class that implements ```io.seldon.wrapper.SeldonPredictionService``` for the type of component you are creating - * An optional .s2i/environment - model definitions used by the s2i builder to correctly wrap your model - -More details can be found in the seldon-core docs. - -# Summary - -By utilizing Openshift's source-to-image tool data scientists can easily build Docker-formatted images for their runtime components to be deployed at scale using seldon-core. This allows data science teams to use the best machine learning tool for the task and deploy the resulting model in a consistent manner. The seldon-core project is working on providing full Openshift integration in the near future so that Enterprise customers can easily utilize machine learning models within their organisation. - -Seldon will be joining Openshift Commons and will be present at [Kubecon Europe 2018](https://events.linuxfoundation.org/events/kubecon-cloudnativecon-europe-2018/) and the OpenShift Kubecon Europe event on Tues 1st May. Feel free to contact us to discuss Seldon-Core and Openshift and how they can work together to help data scientists put machine learning into production. - - - - diff --git a/docs/articles/outlier-detection-dashboard.png b/docs/articles/outlier-detection-dashboard.png deleted file mode 100644 index 4c01f577fe..0000000000 Binary files a/docs/articles/outlier-detection-dashboard.png and /dev/null differ diff --git a/docs/articles/release-0.2.3.md b/docs/articles/release-0.2.3.md deleted file mode 100644 index d198eb2e37..0000000000 --- a/docs/articles/release-0.2.3.md +++ /dev/null @@ -1,97 +0,0 @@ -# Seldon Core Release 0.2.3 - -[Seldon Core version 0.2.3](https://github.com/SeldonIO/seldon-core/releases/tag/v0.2.3) has several exciting additions to help users deploy machine learning models. The main additions are discussed below: - -## ONNX Support via Intel nGraph - -[Open Neural Network Exchange Format (ONNX)](https://onnx.ai/) is an initiative started by Facebook and Microsoft to allow machine learning estimators to output their trained models in an open format which will allow sharing of models between machine learning tools and organisations. We have integrated [Intel's nGraph](https://ai.intel.com/intel-ngraph/) inference engine inside a Seldon S2I wrapper to allow users who have exported their models in the ONNX format to easily include them in a Seldon Core inference graph. Our [python S2I wrappers](https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python.md) allow users to take their ONNX models and wrap them as a Docker container to be managed by Seldon Core. We have provided an [end-to-end example Jupyter notebook](https://github.com/SeldonIO/seldon-core/blob/master/examples/models/onnx_resnet50/onnx_resnet50.ipynb) showing an ONNX ResNet image classification model being run inside Seldon Core. - -![resnet-onnx](./resnet-test.png) - -## Annotation based configuration - -You can now configure aspects of Seldon Core via annotations in the SeldonDeployment resource and also the optional API OAuth Gateway. 
The current release allow configuration of REST and gRPC timeouts and also max message sizes for gRPC. - -An example Seldon Deployment YAML resource with a gRPC max message size annotation is shown below. - -``` -apiVersion: machinelearning.seldon.io/v1alpha2 -kind: SeldonDeployment -metadata: - name: seldon-model -spec: - annotations: - seldon.io/grpc-max-message-size: '10000000' - name: test-deployment - predictors: - - componentSpecs: - - spec: - containers: - - image: seldonio/mock_classifier_grpc:1.0 - name: classifier - graph: - children: [] - endpoint: - type: GRPC - name: classifier - type: MODEL - name: grpc-size - replicas: 1 - -``` - -## Initial NodeJS Wrapper Support - -We have an external contribution by @SachinVarghese that provides an initial Seldon Core wrapper for NodeJS allowing you to take advantage of the emerging machine learning tools within the Javascript ecosystem. Thanks Sachin! - -An example notebook for an [MNIST Tensorflow model](https://github.com/SeldonIO/seldon-core/blob/master/examples/models/nodejs_tensorflow/nodejs_tensorflow.ipynb) is provided which has the following Javascript inference code: - -``` -const tf = require("@tensorflow/tfjs"); -require("@tensorflow/tfjs-node"); -const path = require("path"); -const model_path = "/model.json"; - -class MnistClassifier { - async init() { - this.model = await tf.loadModel( - "file://" + path.join(__dirname, model_path) - ); - const optimizer = "rmsprop"; - this.model.compile({ - optimizer: optimizer, - loss: "categoricalCrossentropy", - metrics: ["accuracy"] - }); - } - - predict(X, feature_names) { - console.log("Predicting ..."); - try { - X = tf.tensor(X); - } catch (msg) { - console.log("Predict input may be a Tensor already"); - } - const result = this.model.predict(X); - let obj = result.dataSync(); - let values = Object.keys(obj).map(key => obj[key]); - var newValues = []; - while (values.length) newValues.push(values.splice(0, 10)); - return newValues; - } -} - -module.exports = MnistClassifier; -``` - -## Stability updates - -There have also been a few stability updates that should provide improvements for those moving towards production with Seldon Core: - - * [Lifecycle status updates](https://github.com/SeldonIO/seldon-core/pull/223) - * The ```Status``` field of the SeldonDeployment is now updated as it transitions from CREATING, to AVAILABLE or FAILED. When FAILED there will be a more descriptive error message in the ```Status``` section. - * Isolation of predictors on SeldonDeployment updates - * A [bug fix](https://github.com/SeldonIO/seldon-core/issues/199) ensures that when you update one predictor in a Seldon Deployment any others are not affected by the change. This is important for cases where you want to do canary and other advanced roll out techniques. - - -For the full release details see [here](https://github.com/SeldonIO/seldon-core/releases/tag/v0.2.3). We welcome feedback and suggestions on your machine learning deployment needs on our [Slack channel](https://join.slack.com/t/seldondev/shared_invite/enQtMzA2Mzk1Mzg0NjczLWQzMGFkNmRjN2UxZmFmMWJmNWIzMTM5Y2UxNGY1ODE5ZmI2NDdkMmNiMmUxYjZhZGYxOTllMDQwM2NkNDQ1MGI). diff --git a/docs/articles/release-0.2.5.md b/docs/articles/release-0.2.5.md deleted file mode 100644 index 1b24ba1ae8..0000000000 --- a/docs/articles/release-0.2.5.md +++ /dev/null @@ -1,60 +0,0 @@ -# Seldon Core Release 0.2.5 - -A summary of the main contributions to the [Seldon Core release 0.2.5](https://github.com/SeldonIO/seldon-core/releases/tag/v0.2.5). 
- -## PyPI Python Module - -When packaging components to run under Seldon Core we provide easy integration via [S2I](https://github.com/openshift/source-to-image) builder images. The core functionality for our Python S2I image has now been packaged as a Python module which can be easily installed via pip with: - -``` -pip install seldon-core -``` - -The module contains: - - * The top level REST and gRPC wrapper code which can be tested with your component via the executable ```seldon-core-microservice```. - * An internal tester for the Seldon microservice REST and gRPC API accessible via the executable ```seldon-core-tester```. See the [documentation](https://github.com/SeldonIO/seldon-core/blob/master/docs/api-testing.md#microservice-api-tester) for further information. - * An external tester for the external Seldon REST and gRPC API accessible via the executable ```seldon-core-api-tester```. See the [documentation](https://github.com/SeldonIO/seldon-core/blob/master/docs/api-testing.md#seldon-core-api-tester) for further information. - -## Inference Graph Components -One of the aims of Seldon Core is to allow machine learning models to be deployed in production with the appropriate metrics and optimisation to give the required compliance and observability guarantees needed. We have recently extended Seldon Core with Outlier Detection and Multi-Armed Bandit components as discussed below. - -### Outlier Detection -The ability to identify unexpected input feature payloads to a machine learning model is an important feature for production deployments. As part of this release we have added outlier detection modules as a plug-and-play component in Seldon Core. The training and deployment of the implemented deep learning and tree based algorithms ([Variational Auto-Encoders](https://github.com/SeldonIO/seldon-core/tree/master/components/outlier-detection/vae) and [Isolation Forests](https://github.com/SeldonIO/seldon-core/tree/master/components/outlier-detection/isolation-forest)) are illustrated by detecting computer network intrusions in real time. - -

-![Outlier detection dashboard](./outlier-detection-dashboard.png)
- -### Multi-Armed Bandits - -The problem of deciding how to route requests to competing machine learning models, so that the best model is identified in the shortest amount of time, can be treated as a [Multi-Armed Bandit Problem](https://en.wikipedia.org/wiki/Multi-armed_bandit). Seldon Core has extended the available components you can use with [Thompson Sampling](https://github.com/SeldonIO/seldon-core/tree/master/components/routers/thompson-sampling) and a [case study](https://github.com/SeldonIO/seldon-core/blob/master/components/routers/case_study/credit_card_default.ipynb) comparing it to the more basic [Epsilon Greedy](https://github.com/SeldonIO/seldon-core/tree/master/components/routers/epsilon-greedy) strategy. -
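To make the router contract concrete, here is a minimal epsilon-greedy sketch. It assumes the Python wrapper's router convention, where `route` returns the index of the child component to call and `send_feedback` receives the reward for an earlier routing decision; the class name and parameters are illustrative only and this is not the packaged Thompson Sampling or Epsilon Greedy component linked above.

```python
import random


class EpsilonGreedyRouter(object):
    """Illustrative epsilon-greedy router: exploit the best branch so far,
    explore a random branch with probability epsilon."""

    def __init__(self, n_branches=2, epsilon=0.1):
        self.n_branches = int(n_branches)
        self.epsilon = float(epsilon)
        self.reward_sum = [0.0] * self.n_branches
        self.pulls = [0] * self.n_branches

    def route(self, features, feature_names):
        # With probability epsilon pick a random child, otherwise the branch
        # with the best observed mean reward.
        if random.random() < self.epsilon:
            return random.randrange(self.n_branches)
        means = [self.reward_sum[i] / self.pulls[i] if self.pulls[i] else 0.0
                 for i in range(self.n_branches)]
        return means.index(max(means))

    def send_feedback(self, features, feature_names, reward, truth, routing=None):
        # Update the reward statistics for the branch that served the request.
        if routing is not None:
            self.pulls[routing] += 1
            self.reward_sum[routing] += reward
```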

-![Multi-armed bandit dashboard](./mab-dashboard.png)
- -## Cluster Wide Operations -Seldon Core can now be installed in two ways: - - * Single-Namespace (default) : Manages and controls only Seldon Deployments created in that namespace. This only requires RBAC roles local to that namespace. - * Cluster-wide : Manages and controls Seldon Deployments in any namespace. This requires RBAC Cluster Roles. - -Cluster-wide operations have the advantage that only a single Seldon Core Operator needs to be installed and thus saves resources. This could be set up by the Kubernetes cluster manager. Single Namespace operations are best for multi-tenant use cases where you want to ensure everything is local to one namespace. See the [Install Docs](https://github.com/SeldonIO/seldon-core/blob/master/docs/install.md). - -## Extended API Payloads - -We have extended the Seldon Core API payload to include Tensorflow's [TensorProto](https://github.com/SeldonIO/seldon-core/blob/4149c6aeb11be518ec8589fd91599242c907e681/proto/prediction.proto#L29). This will allow utilisation of the fine grained type information available to most compactly send tensor based payloads as well as allow the use of [Tensorflow's library](https://www.tensorflow.org/api_docs/python/tf/make_tensor_proto) to construct the payload needed. - -We have ensured its now possible to use the ```binData``` and ```strData``` payload types to send arbitrary binary or string payloads via the Python wrapper. The python wrappers also allow more easy access to the low level payloads by allowing users to provide a ```predict_rest``` or ```predict_grpc``` methods to gain access directly to the underlying ```SeldonMessage```. - -## Custom Metrics - -The Seldon Core Service Orchestrator component that manages the request/response flow through a user's deployment graph already exposes Prometheus metrics for each of the API calls to the underling components in the graph (e.g. Models, Transformers etc). However, users can now pass back their own custom metrics in the returned ```SeldonMessage``` response from their components. Presently available are Counters, Gauges and Timers. Full documentation can be found [here](https://github.com/SeldonIO/seldon-core/blob/master/docs/custom_metrics.md). - -## Distributed Tracing -We have integrated distributed tracing via [Jaeger](https://www.jaegertracing.io/) into the Service Orchestrator and Python wrappers. This will allow you to get tracing information as shown below for REST and gRPC requests through your Seldon Deployment. For more details see the [full documentation](../distributed-tracing.md). - -
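As a quick sketch of how tracing is switched on, the environment variables below are added to a predictor's `svcOrchSpec` (with `TRACING` and the `JAEGER_*` variables also added to each wrapped component); the agent host, port and sampler settings shown are example values taken from the distributed tracing guide elsewhere in this changeset.

```
"svcOrchSpec": {
  "env": [
    { "name": "TRACING", "value": "1" },
    { "name": "JAEGER_AGENT_HOST", "value": "jaeger-agent" },
    { "name": "JAEGER_AGENT_PORT", "value": "5775" },
    { "name": "JAEGER_SAMPLER_TYPE", "value": "const" },
    { "name": "JAEGER_SAMPLER_PARAM", "value": "1" }
  ]
}
```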

-![Distributed tracing with Jaeger](./dist-tracing.png)
- diff --git a/docs/articles/release-0.2.6.md b/docs/articles/release-0.2.6.md deleted file mode 100644 index 3d7b89203b..0000000000 --- a/docs/articles/release-0.2.6.md +++ /dev/null @@ -1,41 +0,0 @@ -# Seldon Core Release 0.2.6 - -A summary of the main contributions to the [Seldon Core release 0.2.6](https://github.com/SeldonIO/seldon-core/releases/tag/v0.2.6). - -## Production Deployment Strategies -It is important to have a variety of options when deploying new versions of machine learning models to production. With the 0.2.6 release of Seldon Core we provide a range of deployment strategies that can be applied when using the Ambassador reverse proxy for ingress. These include: - - * [Canary deployments](https://github.com/SeldonIO/seldon-core/blob/master/docs/ambassador.md#canary-deployments): Send a small percentage of traffic to the new Seldon Deployment graph while keeping most traffic going to the existing Seldon Deployment graph. - * [Shadow deployments](https://github.com/SeldonIO/seldon-core/blob/master/docs/ambassador.md#shadow-deployments): Split traffic so all traffic goes to the existing Seldon Deployment as well as a "shadow" deployment whose responses are ignored. This allows for full production traffic loads to be tested on the new Seldon Deployment graph. - * [Header based routing](https://github.com/SeldonIO/seldon-core/blob/master/docs/ambassador.md#header-based-routing): This allows traffic with certain headers to be routed to a new Seldon Deployment. It can be used both for deployment strategies and for production use cases where you want to split traffic based on header content. - -We also allow advanced users to provide a complete custom Ambassador config to be used rather than allowing Seldon to generate it. - -## Intel OpenVINO ImageNet Example -An advanced machine learning deployment for image classification is provided that highlights the power of Intel OpenVINO optimised models used in conjunction with Seldon. An ImageNet ensemble is created with two model architectures, ResNet and DenseNet. The results of the two predictions are combined and the output is transformed into a human readable format. On the input request stream, the raw input is compressed JPEG which is then decompressed and converted to a TFTensor proto payload. The pipeline of the Seldon deployment is illustrated below. -

-*Figure: the OpenVINO ImageNet ensemble inference pipeline*
- -The inference graph shows the ability to combine multiple transformation steps and model serving toolkits together in a single Seldon Deployment. See the [Jupyter Notebook](https://github.com/SeldonIO/seldon-core/blob/master/examples/models/openvino_imagenet_ensemble/openvino_imagenet_ensemble.ipynb) for further details. - -## AWS SageMaker Integration -An example is provided in this release for using [AWS SageMaker](https://aws.amazon.com/sagemaker/) to train a model which is then deployed locally on a Kubernetes cluster with Seldon Core. This example illustrates the flexibility to use various tools to train models and then deploy them all in a consistent manner on Kubernetes with Seldon. - -## MLflow Integration - -We provide a [Jupyter Notebook](https://github.com/SeldonIO/seldon-core/blob/master/examples/models/mlflow_model/mlflow.ipynb) that shows how you can train a model with [MLflow](https://mlflow.org/) and then deploy it onto Seldon Core. - -## CI/CD Integration - -Seldon Core aims to easily fit into the Continuous Integration and Deployment (CI/CD) pipeline of your organisation. In this release we provide an example CI/CD pipeline following the "GitOps" paradigm. Using standard tools such as [Argo](https://github.com/argoproj/argo), [Argo CD](https://github.com/argoproj/argo-cd), and [Jenkins](https://jenkins.io/) we show how changes by data scientists to inference source code can easily trigger builds that create new images and push the resulting updated Seldon Deployment graphs into production to serve the new models. The pipeline schematic is shown below. -

-![CI/CD pipeline schematic](../cicd.png)
- - - -The full demo with instructions on how to apply it can be found [here](https://github.com/SeldonIO/seldon-core/tree/master/examples/cicd-argocd). - diff --git a/docs/articles/resnet-test.png b/docs/articles/resnet-test.png deleted file mode 100644 index f1d5d555cf..0000000000 Binary files a/docs/articles/resnet-test.png and /dev/null differ diff --git a/docs/articles/wrap.png b/docs/articles/wrap.png deleted file mode 100644 index 1a33875db7..0000000000 Binary files a/docs/articles/wrap.png and /dev/null differ diff --git a/docs/benchmarking.md b/docs/benchmarking.md deleted file mode 100644 index b713c95221..0000000000 --- a/docs/benchmarking.md +++ /dev/null @@ -1,64 +0,0 @@ -# Seldon-core Benchmarking - -This page is a work in progress to provide benchmarking stats for seldon-core. Please add further ideas and suggestions as an issue. - -## Goals - - * Load test REST and gRPC endpoints - * Provide stability tests under load - * Comparison to alternatives. - -## Components - - * We use [locust](https://locust.io/) as our benchmarking tool. - * We use Google Cloud Platform for the infrastructure to run Kubernetes. - - -# Tests - -## Maximum Throughput -To gauge the maximum throughput we will: - - * Call the seldon engine component directly thereby ignoring the additional latency that would be introduced by an external reverse proxy (Ambassador) or using the built in seldon API Front-End Oauth2 component. - * Utilize a "stub" model that does nothing but return a hard-wired result from inside the engine. - -This test will illustrate the maximum number of requests that can be pushed through seldon-core engine (which controls the request-response flow) as well as the added latency for the processing of REST and gRPC requests, e.g. serialization/deserialization. - -We will use cordoned off Kubernetes nodes running locust so the latency from node to node prediction calls on GCP will also be part of the returned statistics. - -A [notebook](https://github.com/SeldonIO/seldon-core/blob/master/notebooks/benchmark_simple_model.ipynb) provides the end to end test for reproducibility. - -We use: - - * 1 replica of the stub model running on 1 n1-standard-16 GCP node - * We use 3 nodes to run 64 locust slaves with a total of 256 clients calling as fast as they can. - -See [notebook](https://github.com/SeldonIO/seldon-core/blob/master/notebooks/benchmark_simple_model.ipynb) for details. - -### REST Results - -A throughput of 12,000 request per second with average response time of 9ms is obtained. - -|Method|Name|# requests|Requests/s|# failures|Median response time|Average response time|Min response time|Max response time|Average Content Size| -|--|--|--|--|--|--|--|--|--|--| -|POST|predictions|2363484|12088.95|0|4|9|1|5071|335| - -With percentiles: - -|Name|# requests|50%|66%|75%|80%|90%|95%|98%|99%|100%| -|--|--|--|--|--|--|--|--|--|--|--| -|POST predictions|2363484|4|5|7|9|28|43|60|69|5100| - -### gRPC Results - -A throughput of 28,000 requests per second with average response time of 1ms is obtained. 
- -|Method|Name|# requests|Requests/s|# failures|Median response time|Average response time|Min response time|Max response time|Average Content Size| -|--|--|--|--|--|--|--|--|--|--| -|grpc|loadtest:5001|4622728|28256.39|0|1|1|0|5020|0| - -With percentiles: - -|Name|# requests|50%|66%|75%|80%|90%|95%|98%|99%|100%| -|--|--|--|--|--|--|--|--|--|--|--| -|grpc loadtest:5001|4622728|1|2|3|3|4|5|6|6|5000| diff --git a/docs/challenges.md b/docs/challenges.md deleted file mode 100644 index 9e8d87ef85..0000000000 --- a/docs/challenges.md +++ /dev/null @@ -1,35 +0,0 @@ -# Challenges - -Machine Learning deployment has many challenges which Seldon Core's goals are to solve, these include: - - * Allow a wide range of ML modelling tools to be easily deployed, e.g. Python, R, Spark, and proprietary models - * Launch ML runtime graphs, scale up/down, perform rolling updates - * Run health checks and ensure recovery of failed components - * Infrastructure optimization for ML - * Latency optimization - * Connect to business apps via various APIs, e.g. REST, gRPC - * Allow construction of Complex runtime microservice graphs - * Route requests - * Transform data - * Ensembles results - * Allow various deployment modalities - * Synchronous - * Asynchronous - * Batch - * Allow Auditing and clear versioning - * Integrate into Continuous Integration (CI) - * Allow Continuous Deployment (CD) - * Provide Monitoring - * Base metrics: Accuracy, request latency and throughput - * Complex metrics: - * Concept drift - * Bias detection - * Outlier detection - * Allow for Optimization - * AB Tests - * Multi-Armed Bandits - -If you see further challenges please add an [Issue](https://github.com/SeldonIO/seldon-core/issues). - - - diff --git a/docs/cicd.png b/docs/cicd.png deleted file mode 100644 index 5f6bfd73f3..0000000000 Binary files a/docs/cicd.png and /dev/null differ diff --git a/docs/crd/readme.md b/docs/crd/readme.md deleted file mode 100644 index c47f692fdf..0000000000 --- a/docs/crd/readme.md +++ /dev/null @@ -1,91 +0,0 @@ -# Custom Resource Definitions - -## Seldon Deployment - -The runtime inference graph for a machine learning deployment is described as a SeldonDeployment Kubernetes resource. The structure of this manifest is defined as a [proto buffer](../reference/seldon-deployment.md). This doc will describe the SeldonDeployment resource in general and how to create one for your runtime inference graph. - -## Creating your resource definition - -The full specification can be found [here](../reference/seldon-deployment.md). Below we highlight various parts and describe their intent. - -The core goal is to describe your runtime inference graph(s) and deploy it with appropriate resources and scale. Example illustrative graphs are shown below: - -![graph](../reference/graph.png) - -The top level SeldonDeployment has standard Kubernetes meta data and consists of a spec which is defined by the user and a status which will be set by the system to represent the current state of the SeldonDeployment. - -```proto -message SeldonDeployment { - required string apiVersion = 1; - required string kind = 2; - optional k8s.io.apimachinery.pkg.apis.meta.v1.ObjectMeta metadata = 3; - required DeploymentSpec spec = 4; - optional DeploymentStatus status = 5; -} -``` - -The core deployment spec consists of a set of ```predictors```. Each predictor represents a separate runtime serving graph. The set of predictors will serve request as controlled by a load balancer. 
At present the share of traffic will be in relation to the number of replicas each predictor has. A use case for two predictors would be a main deployment and a canary, with the main deployment having 9 replicas and the canary 1, so the canary receives 10% of the overall traffic. Each predictor will be a separately set of managed deployments with Kubernetes so it is safe to add and remove predictors without affecting existing predictors. - -To allow an OAuth API to be provisioned you should specify an OAuth key and secret. If you are using Ambassador you will not need this as you can plug in your own external authentication using Ambassador. - -```proto - -message DeploymentSpec { - optional string name = 1; // A unique name within the namespace. - repeated PredictorSpec predictors = 2; // A list of 1 or more predictors describing runtime machine learning deployment graphs. - optional string oauth_key = 6; // The oauth key for external users to use this deployment via an API. - optional string oauth_secret = 7; // The oauth secret for external users to use this deployment via an API. - map annotations = 8; // Arbitrary annotations. -} - -``` - -For each predictor you should at a minimum specify: - - * A unique name - * A PredictiveUnit graph that presents the tree of components to deploy. - * One or more componentSpecs which describes the set of images for parts of your container graph that will be instigated as microservice containers. These containers will have been wrapped to work within the [internal API](../reference/internal-api.md). This component spec is a standard [PodTemplateSpec](https://kubernetes.io/docs/api-reference/extensions/v1beta1/definitions/#_v1_podtemplatespec). For complex graphs you can decide to use several componentSpecs so as to separate your components into separate Pods each with their own resource requirements. - * If you leave the ports empty for each container they will be added automatically and matched to the ports in the graph specification. If you decide to specify the ports manually they should match the port specified for the matching component in the graph specification. - * the number of replicas of this predictor to deploy - -```proto -message PredictorSpec { - required string name = 1; // A unique name not used by any other predictor in the deployment. - required PredictiveUnit graph = 2; // A graph describing how the predictive units are connected together. - repeated k8s.io.api.core.v1.PodTemplateSpec componentSpecs = 3; // A description of the set of containers used by the graph. One for each microservice defined in the graph. Can be split over 1 or more PodTemplateSpecs. - optional int32 replicas = 4; // The number of replicas of the predictor to create. - map annotations = 5; // Arbitrary annotations. - optional k8s.io.api.core.v1.ResourceRequirements engineResources = 6; // Optional set of resources for the Seldon engine which is added to each Predictor graph to manage the request/response flow - map labels = 7; // labels to be attached to entry deployment for this predictor -} - -``` - -The predictive unit graph is a tree. Each node is of a particular type. If the implementation is not specified then a microservice is assumed and you must define a matching named container within the componentSpec above. Each type of PredictiveUnit has a standard set of methods it is expected to manage, see [here](../reference/seldon-deployment.md). - -For each node in the graph: - - * A unique name. 
If the node describes a microservice then it must match a named container with the componentSpec. - * The children nodes. - * The type of the predictive unit : MODEL, ROUTER, COMBINER, TRANSFORMER or OUTPUT_TRANSFORMER. - * The implementation. This can be left blank if it will be a microservice as this is the default otherwise choose from the available appropriate implementations provided internally. - * Methods. This can be left blank if you wish to follow the standard methods for your PredictiveNode type : see [here](../reference/seldon-deployment.md). - * Endpoint. In here you should minimally if this a microservice specify whether the PredictiveUnit will use REST or gRPC. Ports will be defined automatically if not specified. - * Parameters. Specify any parameters you wish to pass to the PredictiveUnit. These will be passed in an environment variable called PREDICTIVE_UNIT_PARAMETERS as a JSON list. - -```proto - -message PredictiveUnit { - - - required string name = 1; //must match container name of component if no implementation - repeated PredictiveUnit children = 2; // The child predictive units. - optional PredictiveUnitType type = 3; - optional PredictiveUnitImplementation implementation = 4; - repeated PredictiveUnitMethod methods = 5; - optional Endpoint endpoint = 6; // The exposed endpoint for this unit. - repeated Parameter parameters = 7; // Customer parameter to pass to the unit. -} - - -``` diff --git a/docs/custom_metrics.md b/docs/custom_metrics.md deleted file mode 100644 index 09187a969f..0000000000 --- a/docs/custom_metrics.md +++ /dev/null @@ -1,70 +0,0 @@ -# Custom Metrics - -Seldon Core exposes basic metrics via Prometheus endpoints on its service orchestrator that include request count, request time percentiles and rolling accuracy for each running model. However, you may wish to expose custom metrics from your components which are automaticlaly added to Prometheus. For this purpose you can supply extra fields in the returned meta data of the response object in the API calls to your components as illustrated below: - -``` -{ - "meta": { - "metrics": [ - { - "type": "COUNTER", - "key": "mycounter", - "value": 1.0 - "tags": {"mytag":"mytagvalue"} - }, - { - "type": "GAUGE", - "key": "mygauge", - "value": 22.0 - }, - { - "type": "TIMER", - "key": "mytimer", - "value": 1.0 - } - ] - }, - "data": { - "ndarray": [ - [ - 1, - 2 - ] - ] - } -} -``` - -We provide three types of metric that can be returned in the meta.metrics list: - - * COUNTER : a monotonically increasing value. It will be added to any existing value from the metric key. - * GAUGE : an absolute value showing a level, it will overwrite any existing value. - * TIMER : a time value (in msecs) - -Each metric apart from the type takes a key and a value. The proto buffer definition is shown below: - -``` -message Metric { - enum MetricType { - COUNTER = 0; - GAUGE = 1; - TIMER = 2; - } - string key = 1; - MetricType type = 2; - float value = 3; - map tags = 4; -} -``` - - -As we expose the metrics via Prometheus, if ```tags``` are added they must appear in every metric response and always have the same set of keys as Prometheus does not allow metrics to have varying numbers of tags. This condition is enforced by the [micrometer](https://micrometer.io/) library we use to expose the metrics and exceptions will happen if violated. 
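For illustration, a minimal sketch of a component returning such metrics through the Python wrapper is shown below. It assumes the wrapper's optional `metrics()` hook (see the Python wrapper link below), which is collected after each API call and merged into the `meta.metrics` field of the response; the keys and values are placeholders.

```python
class ModelWithMetrics(object):
    """Identity model that also emits custom metrics (illustrative only)."""

    def __init__(self):
        self.requests_seen = 0

    def predict(self, X, feature_names):
        self.requests_seen += 1
        return X

    def metrics(self):
        # Collected by the wrapper after each call and exposed via meta.metrics / Prometheus.
        return [
            {"type": "COUNTER", "key": "mycounter", "value": 1},                           # monotonically increasing
            {"type": "GAUGE", "key": "mygauge", "value": float(self.requests_seen)},       # absolute level
            {"type": "TIMER", "key": "mytimer", "value": 20.2},                            # milliseconds
        ]
```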
- -At present the following Seldon Core wrappers provide integrations with custom metrics: - - * [Python Wrapper](./wrappers/python.md#custom-metrics) - - -# Example - -There is an [example notebook illustrating a model with custom metrics in python](../examples/models/template_model_with_metrics/modelWithMetrics.ipynb). \ No newline at end of file diff --git a/docs/dashboard.png b/docs/dashboard.png deleted file mode 100644 index 378a9d5a0e..0000000000 Binary files a/docs/dashboard.png and /dev/null differ diff --git a/docs/deploy.png b/docs/deploy.png deleted file mode 100644 index e29709cc10..0000000000 Binary files a/docs/deploy.png and /dev/null differ diff --git a/docs/deploying.md b/docs/deploying.md deleted file mode 100644 index 857f89b7fc..0000000000 --- a/docs/deploying.md +++ /dev/null @@ -1,46 +0,0 @@ -# Deployment - - - 1. Deploy your machine learning model inference graph - 1. Validate successful deployment - - -## Deploy - -You can manage your deployment resource via the standard Kuberntes tools: - -### Kubectl - -You can manage your deployments via the standard Kubernetes CLI kubectl, e.g. - -```bash -kubectl apply -f my_ml_deployment.yaml -``` - -### Helm - -You can use Helm to manage your deployment as illustrated in the [Helm examples notebook](../notebooks/helm_examples.ipynb). - -We have a selection of [templated helm charts](../helm-charts/README.md#seldon-core-inference-graph-templates) you can use as a basis for your deployments. - -### Ksonnet - -You can use Ksonnet to manage your deployments as illustrated in the [Ksonnet examples notebook](../notebooks/ksonnet_examples.ipynb). - -We have a selection of [Ksonnet prototypes](../seldon-core/seldon-core/README.md) you can use as a basis for your deployments. - - -## Validate - -You can check the status of the running deployments using kubectl - -For example: - -``` -kubectl get sdep -o jsonpath='{.items[].status}' -``` - - - - - diff --git a/docs/developer/build-using-private-repo.md b/docs/developer/build-using-private-repo.md deleted file mode 100644 index 4eb3ca422c..0000000000 --- a/docs/developer/build-using-private-repo.md +++ /dev/null @@ -1,61 +0,0 @@ -# Build using private local repository - -## Prerequisites - -* Local Docker -* Kubernetes cluster access -* Private local repository setup in the cluster with local access - * use the project [k8s-local-docker-registry](https://github.com/SeldonIO/k8s-local-docker-registry) - * "127.0.0.1:5000" will be used as the repo host url - -## Prerequisite check - -Ensure the prerequisites are in place and the correct ports available. - -``` -# Check that the private local registry works -(set -x && curl -X GET http://127.0.0.1:5000/v2/_catalog && \ - docker pull busybox && docker tag busybox 127.0.0.1:5000/busybox && \ - docker push 127.0.0.1:5000/busybox) -``` - -## Updating components and redeploying into cluster - -Basic process of how to test code changes in cluster. - -1. Stop seldon core if its running. -1. Build and push the component that was updated or all components if necessary. -1. Start seldon core. -1. Deploy models. - -Below are details to achieve this. - -### Building all components - -Build all images and push to private local repository. 
- -``` -./build-all-private-repo -./push-all-private-repo -``` - -### start/stop Seldon Core - -``` -./start-seldon-core-private-repo -./stop-seldon-core-private-repo -``` - -### Building individual components - -``` -./cluster-manager/build-private-repo -./cluster-manager/push-private-repo - -./api-frontend/build-private-repo -./api-frontend/push-private-repo - -./engine/build-private-repo -./engine/push-private-repo -``` - diff --git a/docs/developer/readme.md b/docs/developer/readme.md deleted file mode 100644 index 45dd733a2a..0000000000 --- a/docs/developer/readme.md +++ /dev/null @@ -1,17 +0,0 @@ -# Developer - -We welcome new contributors. Please read the [code of conduct](../../CODE_OF_CONDUCT.md) and [contributing guidelines](../../CONTRIBUTING.md) - -## Release process - -To be completed. - -## Tools we use - - - [github-changelog-generator](https://github.com/skywinder/github-changelog-generator) - - [Grip - Local Markdown viewer](https://github.com/joeyespo/grip) - -## Building Seldon Core - -* [Build using private repository](build-using-private-repo.md) - diff --git a/docs/distributed-tracing.md b/docs/distributed-tracing.md deleted file mode 100644 index 86ec9797b5..0000000000 --- a/docs/distributed-tracing.md +++ /dev/null @@ -1,146 +0,0 @@ -# Distributed Tracing (Alpha Feature) - -You can use Jaeger Open Tracing to trace your API calls to Seldon Core. - -This feature is available from versions >=0.2.5-SNAPSHOT of the core images and presently in: - - * Python wrappers >=0.5-SNAPSHOT - -## Install Jaeger - -You will need to install Jaeger on your Kubernetes cluster. Follow their [documentation](https://github.com/jaegertracing/jaeger-kubernetes). - -## Configuration - -You will need to annotate your Seldon Deployment resource with environment variables to make tracing active and set the appropriate Jaeger configuration variables. - - * For the Seldon Service Orchestrator you will need to set the environment variables in the ```spec.predictors[].svcOrchSpec.env``` section. See the [Jaeger Java docs](https://github.com/jaegertracing/jaeger-client-java/tree/master/jaeger-core#configuration-via-environment) for available configuration variables. - * For each Seldon component you run (e.g., model transformer etc.) you will need to add environment variables to the container section. - - -### Python Wrapper Configuration - -Add an environment variable: TRACING with value 1 to activate tracing. - -You can utilize the default configuration by simply providing the name of the Jaeger agent service by providing JAEGER_AGENT_HOST environment variable. Override default Jaeger agent port `5775` by setting JAEGER_AGENT_PORT environment variable. - -To provide a custom configuration following the Jarger Python configuration yaml defined [here](https://github.com/jaegertracing/jaeger-client-python) you can provide a configmap and the path to the YAML file in JAEGER_CONFIG_PATH environment variable. 
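For reference, the ConfigMap mounted at that path might hold a YAML file along the following lines; the keys mirror the jaeger-client-python `Config` dictionary and should be treated as an illustrative assumption rather than a fixed schema:

```yaml
# Illustrative tracing.yml; consult the jaeger-client-python docs for the full schema
sampler:
  type: const
  param: 1
local_agent:
  reporting_host: jaeger-agent
  reporting_port: 5775
logging: true
```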
- -An example is show below: - -``` -{ - "apiVersion": "machinelearning.seldon.io/v1alpha2", - "kind": "SeldonDeployment", - "metadata": { - "labels": { - "app": "seldon" - }, - "name": "tracing-example", - "namespace": "seldon" - }, - "spec": { - "name": "tracing-example", - "oauth_key": "oauth-key", - "oauth_secret": "oauth-secret", - "predictors": [ - { - "componentSpecs": [{ - "spec": { - "containers": [ - { - "name": "model1", - "image": "seldonio/mock_classifier_rest:1.1", - "env": [ - { - "name": "TRACING", - "value": "1" - }, - { - "name": "JAEGER_CONFIG_PATH", - "value": "/etc/tracing/config/tracing.yml" - } - ], - "volumeMounts": [ - { - "mountPath": "/etc/tracing/config", - "name": "tracing-config" - } - ] - } - ], - "terminationGracePeriodSeconds": 1, - "volumes": [ - { - "name": "tracing-config", - "volumeSource" : { - "configMap": { - "localObjectReference" : - { - "name": "tracing-config" - }, - "items": [ - { - "key": "tracing.yml", - "path": "tracing.yml" - } - ] - } - } - } - ] - } - }], - "graph": { - "name": "model1", - "endpoint": { "type" : "REST" }, - "type": "MODEL", - "children": [ - ] - }, - "name": "tracing", - "replicas": 1, - "svcOrchSpec" : { - "env": [ - { - "name": "TRACING", - "value": "1" - }, - { - "name": "JAEGER_AGENT_HOST", - "value": "jaeger-agent" - }, - { - "name": "JAEGER_AGENT_PORT", - "value": "5775" - }, - { - "name": "JAEGER_SAMPLER_TYPE", - "value": "const" - }, - { - "name": "JAEGER_SAMPLER_PARAM", - "value": "1" - } - ] - } - } - ] - } -} -``` - - - -## REST Example - -![jaeger-ui-rest](./jaeger-ui-rest-example.png) - -## gRPC Example - -![jaeger-ui-rest](./jaeger-ui-grpc-example.png) - - -## Worked Example - -A full worked template example can be found [here](../examples/models/template_model_tracing/tracing.ipynb) diff --git a/docs/getting_started/graph.png b/docs/getting_started/graph.png deleted file mode 100644 index fc9c2c4291..0000000000 Binary files a/docs/getting_started/graph.png and /dev/null differ diff --git a/docs/getting_started/readme.md b/docs/getting_started/readme.md deleted file mode 100644 index 0617dccb5a..0000000000 --- a/docs/getting_started/readme.md +++ /dev/null @@ -1,35 +0,0 @@ - -# Getting started Seldon Core - -There are 3 steps to using seldon-core. - - 1. Install seldon-core onto a Kubernetes cluster - 1. Wrap your components (usually runtime model servers) as Docker containers that respect the internal Seldon microservice API. - 1. Define your runtime service graph as a SeldonDeployment resource and deploy your model and serve predictions - -![steps](./steps.png) - -## Install Seldon Core - -To install seldon-core follow the [installation guide](../install.md). - -## Wrap Your Model - -The components you want to run in production need to be wrapped as Docker containers that respect the [Seldon microservice API](../reference/internal-api.md). You can create models that serve predictions, routers that decide on where requests go, such as A-B Tests, Combiners that combine responses and transformers that provide generic components that can transform requests and/or responses. - -To allow users to easily wrap machine learning components built using different languages and toolkits we provide wrappers that allow you easily to build a docker container from your code that can be run inside seldon-core. Our current recommended tool is RedHat's Source-to-Image. Wrapping your models is discussed [here](../wrappers/readme.md). 
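As an illustration, wrapping a Python model with s2i typically amounts to a single build command; the builder image tag and output image name below are placeholders:

```bash
s2i build . seldonio/seldon-core-s2i-python3:0.2 my-model-image:0.1
```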
- -## Define Runtime Service Graph - -To run your machine learning graph on Kubernetes you need to define how the components you created in the last step fit together to represent a service graph. This is defined inside a [SeldonDeployment Kubernetes Custom resource](../reference/seldon-deployment.md). A [guide to constructing this custom resource service graph is provided](../inference-graph.md). - -![graph](./graph.png) - -## Deploy and Serve Predictions - -You can use ```kubectl``` to deploy your ML service like any other Kubernetes resource. This is discussed [here](../deploying.md). - -## Next Steps - - * [Jupyter notebooks showing worked examples](../../readme.md#quick-start) - diff --git a/docs/getting_started/steps.png b/docs/getting_started/steps.png deleted file mode 100644 index 65c14f9daf..0000000000 Binary files a/docs/getting_started/steps.png and /dev/null differ diff --git a/docs/grpc_max_message_size.md b/docs/grpc_max_message_size.md deleted file mode 100644 index b6b7e6400d..0000000000 --- a/docs/grpc_max_message_size.md +++ /dev/null @@ -1,85 +0,0 @@ -# GRPC Max Message Size - -GRPC has a default max message size of 4MB. If you need to run models whose input features or output predictions are larger than this you can configure Seldon Core to run with gRPC server/clients that handle a larger message size with annotations. - -Add the annotation ```seldon.io/grpc-max-message-size``` with the number of bytes of the largest expected message. For example the SeldonDeployment resource below sets this to 10MB: - -``` -{ - "apiVersion": "machinelearning.seldon.io/v1alpha2", - "kind": "SeldonDeployment", - "metadata": { - "labels": { - "app": "seldon" - }, - "name": "seldon-deployment-example" - }, - "spec": { - "annotations": { - "project_name": "FX Market Prediction", - "deployment_version": "v1", - "seldon.io/grpc-max-message-size":"10485760" - }, - "name": "test-deployment", - "oauth_key": "oauth-key", - "oauth_secret": "oauth-secret", - "predictors": [ - { - "componentSpecs": [{ - "spec": { - "containers": [ - { - "image": "seldonio/mock_classifier_grpc:1.0", - "imagePullPolicy": "IfNotPresent", - "name": "classifier", - "resources": { - "requests": { - "memory": "1Mi" - } - } - } - ], - "terminationGracePeriodSeconds": 20 - } - }], - "graph": { - "children": [], - "name": "classifier", - "endpoint": { - "type" : "GRPC" - }, - "type": "MODEL" - }, - "name": "fx-market-predictor", - "replicas": 1, - "annotations": { - "predictor_version" : "v1" - } - } - ] - } -} - -``` - -## API OAUTH Gateway - -If you are using the default API OAUTH Gateway you will also need to update your Helm or Ksonnet install: - -For Helm add to your Helm values, for example: - -``` -apife: - annotations: - seldon.io/grpc-max-message-size: "10485760" -``` - -For Ksonnet set the parameters grpcMaxMessageSize: - -``` -ks param set seldon-core grpcMaxMessageSize '10485760' --as-string -``` - -## Example - -To see a worked example, run the Jupyter notebook [here](../notebooks/max_grpc_msg_size.ipynb). diff --git a/docs/helm.md b/docs/helm.md deleted file mode 100644 index d626dd8950..0000000000 --- a/docs/helm.md +++ /dev/null @@ -1,64 +0,0 @@ -# Helm Chart Configuration - -The core choice in using the helm chart is to decide if you want to use Ambassador or the internal API OAuth gateway for ingress. 
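For example, to let Ambassador handle ingress and disable the built-in OAuth API gateway, the chart can be installed with:

```bash
helm install seldon-core --name seldon-core \
    --repo https://storage.googleapis.com/seldon-charts \
    --set apife.enabled=false \
    --set ambassador.enabled=true
```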
- -## Seldon Core Chart Configuration - -### Seldon Core API OAuth Gateway (apife) - -|Parameter | Description | Default | -|----------|-------------|---------| -|apife.enabled| Whether to enable the default Oauth API gateway | true | -|apife.image.name | The image to use for the API gateway | `````` | -|apife.image.pull_policy | The pull policy for apife image | IfNotPresent | -|apife.service_type | The expose service type, e.g. NodePort, LoadBalancer | NodePort | -|apife.annotations | Configuration annotations | empty | - -### Seldon Core Operator (ClusterManager) - -|Parameter | Description | Default | -|----------|-------------|---------| -| cluster_manager.image.name | Image to use for Operator | ``````| -| cluster_manager.image.pull_policy | Pull policy for image | IfNotPresent | -| cluster_manager.java_opts | Extra Java Opts to pass to app | empty | -| cluster_manager.spring_opts | Spring specific opts to pass to app | empty | - -### Service Orchestrator (engine) - -|Parameter | Description | Default | -|----------|-------------|---------| -| engine.image.name | Image to use for service orchestrator | `````` | - -### Ambassador Reverse Proxy - -|Parameter | Description | Default | -|----------|-------------|---------| -| ambassador.enabled | Whether to enable the ambbassador reverse proxy | false | -| ambassador.annotations | Configuration for Ambassador | default | -| ambassador.image.name | Image to use for ambassador | `````` | -| ambassador.resources | resource limits and requests | default | -| ambassador.service_type | How to expose the ambassador service, e.g. NodePort, LoadBalancer | NodePort | -| ambassador.statsd.image.name | Image to use for statsd | default | - -### General Role Based Access Control Settings - -These settings should generally be left untouched from their defaults. Use of non-rbac clusters will not be supported in the future. Most of these settings are used by the Google marketplace one click installation for Seldon Core as in this setting Google will create the service account and role bindings. - -|Parameter | Description | Default | -|----------|-------------|---------| -| rbac.enabed | Whether to enabled RBAC | true | -| rbac.rolebinding.create | Whether to include role binding k8s settings | true | -| rbac.service_account.create | Whether to create the service account to use | true | -| rbac.service_account.name | The name of the service account to use | seldon | - - -### Redis - -Redis is used by Seldon Core for : - * Holding OAuth tokens for the API gateway - * Saving state of some components - -|Parameter | Description | Default | -|----------|-------------|---------| -| redis.image.name | Image to use for Redis | ``````| - diff --git a/docs/inf-graph.png b/docs/inf-graph.png deleted file mode 100644 index a225981d9d..0000000000 Binary files a/docs/inf-graph.png and /dev/null differ diff --git a/docs/inference-graph.md b/docs/inference-graph.md deleted file mode 100644 index ffadb2e18a..0000000000 --- a/docs/inference-graph.md +++ /dev/null @@ -1,46 +0,0 @@ -# Inference Graph - -Seldon Core extends Kubernetes with its own custom resource SeldonDeployment where you can define your runtime inference graph made up of models and other components that Seldon will manage. - -A SeldonDeployment is a JSON or YAML file that allows you to define your graph of component images and the resources each of those images will need to run (using a Kubernetes PodTemplateSpec). 
The parts of a SeldonDeployment are shown below: - -![inference-graph](./inf-graph.png) - -A minimal example for a single model, this time in YAML, is shown below: -``` -apiVersion: machinelearning.seldon.io/v1alpha2 -kind: SeldonDeployment -metadata: - name: seldon-model -spec: - name: test-deployment - predictors: - - componentSpecs: - - spec: - containers: - - image: seldonio/mock_classifier:1.0 - graph: - children: [] - endpoint: - type: REST - name: classifier - type: MODEL - name: example - replicas: 1 -``` - -The key components are: - - * A list of Predictors, each with a specification for the number of replicas. - * Each defines a graph and its set of deployments. Multiple predictors is useful when you want to split traffic between a main graph and a canary or for other production rollout scenarios. - * For each predictor a list of componentSpecs. Each componentSpec is a Kubernetes PodTemplateSpec which Seldon will build into a Kubernetes Deployment. Place here the images from your graph and their requirements, e.g. Volumes, ImagePullSecrets, Resources Requests etc. - * A graph specification that describes how your components are joined together. - -To understand the inference graph definition in detail see [here](crd/readme.md) - -## Next Steps - - * [Jupyter notebooks showing worked examples](../readme.md#quick-start) - * [Templated Helm Charts](../helm-charts/README.md#seldon-core-inference-graph-templates) - * [Integration with other machine learning frameworks](../readme.md#integrations) - diff --git a/docs/install.md b/docs/install.md deleted file mode 100644 index 82dd4accee..0000000000 --- a/docs/install.md +++ /dev/null @@ -1,93 +0,0 @@ -# Install Seldon-Core - -To install seldon-core on a Kubernetes cluster you have several choices: - - * If you have a Google Cloud Platform account you can install via the [GCP Marketplace](https://console.cloud.google.com/marketplace/details/seldon-portal/seldon-core). - -For CLI installs: - - * Decide on which package manager to use, we support: - * Helm - * Ksonnet - * Decide on how you wish APIs to be exposed, we support: - * Ambassador reverse proxy - * Seldon's built-in OAuth API Gateway - * Decide on whether you wish to contribute anonymous usage metrics. We encourage you to allow anonymous usage metrics to help us improve the project by understanding the deployment environments. More details can be found [here](/docs/developer/readme.md#usage-reporting) - * Does your Kubernetes cluster have RBAC enabled? - * If not then disable Seldon RBAC setup - -Follow one of the methods below: - -## With Helm - - * [Install Helm](https://docs.helm.sh) - * Install Seldon CRD. Set: - * ```usage_metrics.enabled``` as appropriate. - -``` -helm install seldon-core-crd --name seldon-core-crd --repo https://storage.googleapis.com/seldon-charts \ - --set usage_metrics.enabled=true -``` - * Install seldon-core components. Set - * ```apife.enabled``` : (default true) set to ```false``` if you have installed Ambassador. - * ```rbac.enabled``` : (default true) set to ```false``` if running an old Kubernetes cluster without RBAC. - * ```ambassador.enabled``` : (default false) set to ```true``` if you want to run with an Ambassador reverse proxy. - * ```single_namespace``` : (default true) if set to ```true``` then Seldon Core's permissions are restricted to the single namespace it is created within. If set to ```false``` then RBAC cluster roles will be created to allow a single Seldon Core installation to control all namespaces. 
The installer must have permissions to create the appropriate RBAC roles. (>=0.2.5) -``` -helm install seldon-core --name seldon-core --repo https://storage.googleapis.com/seldon-charts \ - --set apife.enabled= \ - --set rbac.enabled= \ - --set ambassador.enabled= - --set single_namespace= -``` - -Notes - - * You can use ```--namespace``` to install seldon-core to a particular namespace - * For full configuration options see [here](helm.md) - -## With Ksonnet - - * [install Ksonnet](https://ksonnet.io/) - * Create a seldon ksonnet app - ``` - ks init my-ml-deployment --api-spec=version:v1.8.0 - ``` - * Install seldon-core. Set: - * ```withApife``` set to ```false``` if you are using Ambassador - * ```withAmbassador``` set to ```true``` if you want to use Ambassador reverse proxy - * ```withRbac``` set to ```true``` if your cluster has RBAC enabled - * ```singleNamespace``` (default true) if set to ```true``` then Seldon Core's permissions are restricted to the single namespace it is created within. If set to ```false``` then RBAC cluster roles will be created to allow a single Seldon Core installation to control all namespaces. The installer must have permissions to create the appropriate RBAC roles. (>=0.2.5) - -``` -cd my-ml-deployment && \ - ks registry add seldon-core github.com/SeldonIO/seldon-core/tree/master/seldon-core && \ - ks pkg install seldon-core/seldon-core@master && \ - ks generate seldon-core seldon-core \ - --withApife= \ - --withAmbassador= \ - --withRbac= \ - --singleNamespace= -``` - * Launch components onto cluster - ``` - ks apply default - ``` -Notes - - * You can use ```--namespace``` to install seldon-core to a particular namespace - -## Other Options - -### Install with Kubeflow - - * [Install Seldon as part of Kubeflow.](https://www.kubeflow.org/docs/guides/components/seldon/#seldon-serving) - * Kubeflow presently runs 0.1 version of seldon-core. This will be updated to 0.2 in the near future. - - -## Next Steps - - * [Jupyter notebooks showing worked examples](../readme.md#quick-start) - * Seldon Core Analytics (example Prometheus and Grafana) - * [Helm Chart](../helm-charts/seldon-core-analytics) - * [Ksonnet Package](../seldon-core/seldon-core-analytics) \ No newline at end of file diff --git a/docs/istio.md b/docs/istio.md deleted file mode 100644 index 7f5986d92a..0000000000 --- a/docs/istio.md +++ /dev/null @@ -1,18 +0,0 @@ -# Istio and Seldon - -[Istio](https://istio.io/) provides service mesh functionality and can be a useful addition to Seldon to provide extra traffic management, end-to-end security and policy enforcement in your runtime machine learning deployment graph. Seldon-core can be seen as providing a service graph for machine learning deployments. As part of that it provides an Operator which takes your ML deployment graph definition described as a SeldonDeployment Kubernetes resource and deploys and manages it on a Kubernetes cluster so you can connect your business applications that need to access machine learning services. Data scientists can focus on building pluggable docker containers for parts of their runtime machine learning graph, such as runtime inference, transformations, outlier detection, ensemblers etc. These can be composed together as needed to satisfy your runtime ML functionality. 
To allow modules to be built without knowing what service graph they will exist in means Seldon also deploys a Service Orchestrator as part of each deployment which manages the request/response flow to satisfy the defined ML service graph for multi-component graphs. - -The components are illustrated below. A user's graph resource definition (graph.yaml) is sent over the Kubernetes API and the Seldon Operator manages the creation and update of the underlying components including the Seldon service orchestrator which manages the request/response flow logic through the deployed graph. - -![svc-graph](./svc-graph.png) - -Out of the box Seldon provides rolling updates to SeldonDeployment service graphs provided by the underlying Kubernetes functionality. However, there are cases where you want to manage updates to your ML deployments in a more controlled way with fine grained traffic management including canary updates, blue-green deployments and shadowing. This is where Istio can help in combination with Seldon. - -The addition of Istio is complementary to Seldon and is illustrated below where Envoy sidecars are injected into the defined Kubernetes Deployments and the user can manage the service mesh using the Istio control plane. - -![svc-graph-istio](./svc-graph-istio.png) - -# Worked Examples - -[An example step-by-step guide to canary deployments using Istio and Seldon is provided](../examples/istio/canary_update/canary.ipynb). - diff --git a/docs/jaeger-ui-grpc-example.png b/docs/jaeger-ui-grpc-example.png deleted file mode 100644 index 7bc6c0f3e5..0000000000 Binary files a/docs/jaeger-ui-grpc-example.png and /dev/null differ diff --git a/docs/jaeger-ui-rest-example.png b/docs/jaeger-ui-rest-example.png deleted file mode 100644 index 7b4b2823fe..0000000000 Binary files a/docs/jaeger-ui-rest-example.png and /dev/null differ diff --git a/docs/ksonnet.md b/docs/ksonnet.md deleted file mode 100644 index 7d410343ad..0000000000 --- a/docs/ksonnet.md +++ /dev/null @@ -1,17 +0,0 @@ -# Ksonnet Configuration - -|Parameter|Description|Default| -|---------|-----------|-------| -| namespace | Namespace to use for the components. It is automatically inherited from the environment if not set. | default or from env | -| withRbac | Whether to include RBAC setup | true | -| withApife | Whether to include builtin API OAuth gateway server for ingress | true | -| withAmbassador | Whether to include Ambassador reverse proxy | false | -| apifeImage | Default image for API Front End | `````` | -| apifeServiceType | API Front End Service Type | NodePort | -| operatorImage | Seldon cluster manager image version | `````` | -| operatorSpringOpts | Operator spring opts | empty | -| operatorJavaOpts | Operator | java opts | empty | -| engineImage | Seldon engine image version | `````` | -| grpcMaxMessageSize | Max gRPC message size | 4MB | - - diff --git a/docs/private_registries.md b/docs/private_registries.md deleted file mode 100644 index 6144f3282d..0000000000 --- a/docs/private_registries.md +++ /dev/null @@ -1,47 +0,0 @@ -# Pulling from Private Docker Registries - -To pull images from private Docker registries simply add imagePullSecrets to the podTemplateSpecs for your SeldonDeployment resources. For example, show below is a simple model which uses a private image ```private-docker-repo/my-image```. You will need to have created the Kubernetes docker registry secret ```myreposecret``` before applying the resource to your cluster. 
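For reference, such a secret is typically created with kubectl; the credential values below are placeholders:

```bash
kubectl create secret docker-registry myreposecret \
    --docker-server=<your-registry-server> \
    --docker-username=<your-username> \
    --docker-password=<your-password> \
    --docker-email=<your-email>
```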
- -``` -{ - apiVersion: "machinelearning.seldon.io/v1alpha2", - kind: "SeldonDeployment", - metadata: { - name: private-model, - }, - spec: { - name: private-model-example, - predictors: [ - { - componentSpecs: [{ - spec: { - containers: [ - { - image: private-docker-repo/my-image, - name: private-model, - }, - ], - imagePullSecrets: [ - { - name: myreposecret - }, - ], - }, - }], - graph: { - children: [], - endpoint: { - type: REST, - }, - name: private-model, - type: "MODEL", - }, - name: private-model, - replicas: 1, - }, - ], - }, -} -``` - -To create the docker registry secret see the [Kubernetes docs](https://kubernetes.io/docs/concepts/containers/images/#creating-a-secret-with-a-docker-config). diff --git a/docs/production.md b/docs/production.md deleted file mode 100644 index 282f4064ef..0000000000 --- a/docs/production.md +++ /dev/null @@ -1,10 +0,0 @@ - -# Running in Production - -This page will discuss the various added functionality you might need for running Seldon Core in a production environment. - - * [Pulling from Private Docker Registries](private_registries.md) - * [gRPC max message size](grpc_max_message_size.md) - * [Custom Metrics](custom_metrics.md) - - diff --git a/docs/proposals/custom_metrics.md b/docs/proposals/custom_metrics.md deleted file mode 100644 index 5052bd54d0..0000000000 --- a/docs/proposals/custom_metrics.md +++ /dev/null @@ -1,77 +0,0 @@ -# Custom Metrics - -## Summary - -Allow users to easily add custom metrics to their Seldon Core components. For example, pass back extra metrics from a wrapped python model that can be collected by Prometheus and displayed on Grafana dashboards. - -## Proposal - -Extend the SeldonMessage proto buffer to have a "metrics" element in the meta data part, e.g., - -```JSON -{ -"meta" : { - "metrics" : [ - { "key" : "my_metric_1", "type" : "COUNTER", "value" : 1 }, - { "key" : "my_metric_2", "type" : "GAUGE", "value" : 223 } - ] -} -} -``` - -These metrics would be automaticaly exposed to prometheus from the Seldon Orchestrator Engine. - -The wrappers would need to be updated to allow users to not just return a prediction but also optionally provide metrics to return. - -## Metrics definition - -The extended meta data section would be: - -``` -message Meta { - string puid = 1; - map tags = 2; - map routing = 3; - map requestPath = 4; - repeated Metric metrics = 5; -} - -message Metric { - enum MetricType { - COUNTER = 0; - GAUGE = 1; - TIMER = 2; - } - string key = 1; - MetricType type = 2; - float value = 3; -} -``` - - -## Metric Types - Histogram Complexities -We use [Micrometer](https://micrometer.io) for exposing metrics. Counter and gauge are pretty standard but Prometheus has Histogram and Summary. Histogram seems most advantagous as you can summarize the data on prometheus later. However, you need to set the number of buckets you want to collect statistics. For micrometer the default is to set a min and max range and it will create a set of buckets for you. For Micrometer Timers there are defaults in [Micrometer](https://micrometer.io/docs/concepts#_histograms_and_percentiles) set for the range 1ms - 1minute. The trouble with general histograms is you would need to expose this setting which is probably too complex. - - * Suggest we support just "TIMER" which is essentially a Prometheus histogram for durations with a range 1ms-1minute and a default set of buckets - - -## Engine Implementation - - 1. For each component if there is a metrics section parse and expose via prometheus each metric of the appropriate type. 
- 2. Merge all metrics into final set for returning externally - -## Wrapper Implementations - -### Python Wrapper - -Add optional new function in class user defines - -``` -def metrics(self): - return [ - { "key" : "my_metric_1", "type" : "COUNTER", "value" : self.counter1 }, - { "key" : "my_metric_2", "type" : "GAUGE", "value" : self.guage1 } - ] -``` - - diff --git a/docs/proposals/roadmap.md b/docs/proposals/roadmap.md deleted file mode 100644 index aff51b1076..0000000000 --- a/docs/proposals/roadmap.md +++ /dev/null @@ -1,58 +0,0 @@ -# Seldon Core Roadmap - -The high level roadmap for Seldon Core. Feedback and additions very welcome. - -☑ : Done or In Progress - -## Core (required for 1.0) - - - SeldonDeployment K8S CRD stability - - API stability - - External API (predict, feedback endpoints) - - Internal microservice API (predict, transform, route etc.) - - Wrapper stability for core wrappers - - Python reference wrapper - - Metrics and logging stability - - Benchmarks for ongoing optimisation - -## Reference Examples (desired for 1.0) - - - ☑ ML Examples for various toolkits - - Core components - - Outlier Detection - - Concept drift - - Model Explanation - - E2E pipeline examples - - Kubeflow/Kubeflow Pipelines - - ☑ CI/CD - - GPU/TPU - - ☑ Deployment - - Rolling, Canary, Red-Green - -## Developer (required for 1.0) - - - ☑ Unit Test Coverage - - ☑ E2E tests - - ☑ CI pipelines (Prow) - - Automated release process - - Core, Python PyPI, AWS Marketplace, Google Marketplace, Kubeflow, helm charts, ksonnet - - New Documentation site - -## Specific Functionality - -Please provide feedback and additions for helping us decide priorities. - - - ☑ Autoscaling - - Kubeflow Pipelines integration - - No service orchestrator for single model deployments - - Wider range of image build integrations: e.g., Kaniko, img, GCB - - NVIDIA Rapids, Dali integration - - Julia, C++ wrappers - - ☑ Shadow deployments - - Edge deployments - - Use of Apache Arrow for zero-copy, zero bandwidth RPC - - Further Istio integration - - Nginx-ingress - - Openshift integration - - Serverless/KNative - \ No newline at end of file diff --git a/docs/python/python_module.md b/docs/python/python_module.md deleted file mode 100644 index 85053a568e..0000000000 --- a/docs/python/python_module.md +++ /dev/null @@ -1,61 +0,0 @@ -# Seldon Core Python Package - -Seldon Core has a python package `seldon_core` available on PyPI. The package makes it easier to work with Seldon Core if you are using python and is the basis of the Python S2I wrapper. The module provides: - - * `seldon-core-microservice` executable to serve microservice components in Seldon Core. This is used by the Python Wrapper for Seldon Core. - * `seldon-core-microservice-tester` executable to test running Seldon Core microservices over REST or gRPC. - * `seldon-core-api-tester` executable to test the external API for running Seldon Deployment inference graphs over REST or gRPC. - * `seldon_core.seldon_client` library. Core reference API module to call Seldon Core services (internal microservices or the external API). This is used by the testing executable and can be used by users which to build their own clients to Seldon Core in Python. - -## Install - -Install from PyPI with: - -``` -pip install seldon-core -``` - -## Seldon Core Microservices - -Seldon allows you to easily take your runtime inference code and create a Docker container that can be managed by Seldon Core. 
Follow the instructions [here](../wrappers/python.md) to wrap your code using the S2I tool. - -You can also create your own image and utilise the `seldon-core-microservice` executable to run your model code. - -## Testing Seldon Core Microservices - -To test your microservice standalone or your running Seldon Deployment inside Kubernetes you can follow the [API testing docs](../api-testing.md) - - -## Seldon Core Python API Client - -The python package contains a module that provides a reference python client for the internal Seldon Core microservice API and the external APIs. More specifically it provides: - - * Internal microservice API - * Make REST or gRPC calls - * Test all methods: `predict`, `transform-input`, `transform-output`, `route`, `aggregate` - * Provide a numpy array, binary data or string data as payload or get random data generated as payload for given shape - * Send data as tensor, TFTensor or ndarray - * External API - * Make REST or gRPC calls - * Call the API via Ambassador or Seldon's OAUTH API gateway. - * Test `predict` or `feedback` endpoints - * Provide a numpy array, binary data or string data as payload or get random data generated as payload for given shape - * Send data as tensor, TFTensor or ndarray - -Basic usage of the client is to create a `SeldonClient` object first. For example for a Seldon Deployment called "mymodel` running in the namespace `seldon` with Ambassador endpoint at "localhost:8003" (i.e., via port-forwarding): - -```python -from seldon_core.seldon_client import SeldonClient -sc = SeldonClient(deployment_name="mymodel",namespace="seldon", ambassador_endpoint="localhost:8003") -``` - -Then make calls of various types. For example, to make a random prediction via the Ambassador gateway using REST: - -```python -r = sc.predict(gateway="ambassador",transport="rest") -print(r) -``` - -Examples of using the `seldon_client` module can be found in the [example notebook](https://github.com/SeldonIO/seldon-core/blob/master/notebooks/helm_examples.ipynb). - -Full python docs will be available shortly. diff --git a/docs/readme.md b/docs/readme.md deleted file mode 100644 index 68fbf430c6..0000000000 --- a/docs/readme.md +++ /dev/null @@ -1,16 +0,0 @@ -# Seldon Documentation - -## Deployment Guide - -![API](./deploy.png) - -Three steps: - - 1. [Wrap your runtime prediction model](./wrappers/readme.md). - 1. [Define your runtime inference graph in a seldon deployment custom resource](./crd/readme.md). - 1. [Deploy the graph](./deploying.md). - -## Reference - - - [Prediction API](./reference/prediction.md) - - [Seldon Deployment Custom Resource](./reference/seldon-deployment.md) diff --git a/docs/reference/api.png b/docs/reference/api.png deleted file mode 100644 index 0f3f7d9109..0000000000 Binary files a/docs/reference/api.png and /dev/null differ diff --git a/docs/reference/external-prediction.md b/docs/reference/external-prediction.md deleted file mode 100644 index 1f70388a92..0000000000 --- a/docs/reference/external-prediction.md +++ /dev/null @@ -1,32 +0,0 @@ -# External Prediction API - -![API](./api.png) - -The Seldon Core exposes a generic external API to connect your ML runtime prediction to external business applications. 
- -## REST API - -### Prediction - - - endpoint : POST /api/v0.1/predictions - - payload : JSON representation of ```SeldonMessage``` - see [proto definition](./prediction.md/#proto-buffer-and-grpc-definition) - - example payload : - ```json - {"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}} - ``` -### Feedback - - - endpoint : POST /api/v0.1/feedback - - payload : JSON representation of ```Feedback``` - see [proto definition](./prediction.md/#proto-buffer-and-grpc-definition) - -## gRPC - -``` -service Seldon { - rpc Predict(SeldonMessage) returns (SeldonMessage) {}; - rpc SendFeedback(Feedback) returns (SeldonMessage) {}; - } -``` - -see full [proto definition](./prediction.md/#proto-buffer-and-grpc-definition) - diff --git a/docs/reference/graph.png b/docs/reference/graph.png deleted file mode 100644 index 35e21117c6..0000000000 Binary files a/docs/reference/graph.png and /dev/null differ diff --git a/docs/reference/internal-api.md b/docs/reference/internal-api.md deleted file mode 100644 index 473734ceac..0000000000 --- a/docs/reference/internal-api.md +++ /dev/null @@ -1,197 +0,0 @@ -# Internal Microservice API - -![graph](./graph.png) - -To add microservice components to a runtime prediction graph users need to create service that respects the internal API. The API provides a default service for each type of component within the system: - - - [Model](#model) - - [Router](#router) - - [Combiner](#combiner) - - [Transformer](#transformer) - - [Output_Transformer](#output_transformer) - - -## Model - -A service to return predictions. - -### REST API - -#### Predict - - | | | - | - |- | - | Endpoint | POST /predict | - | Request | JSON representation of [```SeldonMessage```](./prediction.md/#proto-buffer-and-grpc-definition) | - | Response | JSON representation of [```SeldonMessage```](./prediction.md/#proto-buffer-and-grpc-definition) | - -Example request payload: - -```json -{"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}} -``` - -### gRPC - -```protobuf -service Model { - rpc Predict(SeldonMessage) returns (SeldonMessage) {}; - } -``` - -See full [proto definition](./prediction.md/#proto-buffer-and-grpc-definition). - -## Router - -A service to route requests to one of its children and receive feedback rewards for them. 
- -### REST API - -#### Route - - | | | - | - |- | - | Endpoint | POST /route | - | Request | JSON representation of [```SeldonMessage```](./prediction.md/#proto-buffer-and-grpc-definition) | - | Response | JSON representation of [```SeldonMessage```](./prediction.md/#proto-buffer-and-grpc-definition) | - -Example request payload: - -```json -{"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}} -``` - -#### Send Feedback - - | | | - | - |- | - | Endpoint | POST /send-feedback | - | Request | JSON representation of [```Feedback```](./prediction.md/#proto-buffer-and-grpc-definition) | - | Response | JSON representation of [```SeldonMessage```](./prediction.md/#proto-buffer-and-grpc-definition) | - -Example request payload: - -```json - "request": { - "data": { - "names": ["a", "b"], - "tensor": { - "shape": [1, 2], - "values": [0, 1] - } - } - }, - "response": { - "data": { - "names": ["a", "b"], - "tensor": { - "shape": [1, 1], - "values": [0.9] - } - } - }, - "reward": 1.0 -} -``` - - -### gRPC - -```protobuf -service Router { - rpc Route(SeldonMessage) returns (SeldonMessage) {}; - rpc SendFeedback(Feedback) returns (SeldonMessage) {}; - } -``` - -See full [proto definition](./prediction.md/#proto-buffer-and-grpc-definition). - -## Combiner - -A service to combine responses from its children into a single response. - -### REST API - -#### Combine - - | | | - | - |- | - | Endpoint | POST /combine | - | Request | JSON representation of [```SeldonMessageList```](./prediction.md/#proto-buffer-and-grpc-definition) | - | Response | JSON representation of [```SeldonMessage```](./prediction.md/#proto-buffer-and-grpc-definition) | - - -### gRPC - -```protobuf -service Combiner { - rpc Aggregate(SeldonMessageList) returns (SeldonMessage) {}; -} -``` - -See full [proto definition](./prediction.md/#proto-buffer-and-grpc-definition). - - - -## Transformer - -A service to transform its input. - -### REST API - -#### Transform - - | | | - | - |- | - | Endpoint | POST /transform-input | - | Request | JSON representation of [```SeldonMessage```](./prediction.md/#proto-buffer-and-grpc-definition) | - | Response | JSON representation of [```SeldonMessage```](./prediction.md/#proto-buffer-and-grpc-definition) | - -Example request payload: - -```json -{"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}} -``` - -### gRPC - -```protobuf -service Transformer { - rpc TransformInput(SeldonMessage) returns (SeldonMessage) {}; -} -``` - -See full [proto definition](./prediction.md/#proto-buffer-and-grpc-definition). - - - -## Output_Transformer - -A service to transform the response from its child. - -### REST API - -#### Transform - - | | | - | - |- | - | Endpoint | POST /transform-output | - | Request | JSON representation of [```SeldonMessage```](./prediction.md/#proto-buffer-and-grpc-definition) | - | Response | JSON representation of [```SeldonMessage```](./prediction.md/#proto-buffer-and-grpc-definition) | - -Example request payload: - -```json -{"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}} -``` - -### gRPC - -```protobuf -service OutputTransformer { - rpc TransformOutput(SeldonMessage) returns (SeldonMessage) {}; -} -``` - -See full [proto definition](./prediction.md/#proto-buffer-and-grpc-definition). 
- diff --git a/docs/reference/prediction.md b/docs/reference/prediction.md deleted file mode 100644 index 30cad78d5d..0000000000 --- a/docs/reference/prediction.md +++ /dev/null @@ -1,143 +0,0 @@ -# Prediction API - -Seldon Core uses REST and gRPC APIs exposed externally for business applications to connect to and also internally for microservices to implement models, routers, combiners and transformers. - - - [External Prediction API](external-prediction.md) - - Read this if you want to connect external business applications - - [Internal Prediction API](internal-api.md) - - Read this if you want to build a microservice to wrap a model or build another type of component such as a router, combiner or transformer - - -## Proto Buffer and gRPC Definition - -```proto -syntax = "proto3"; - -import "google/protobuf/struct.proto"; -import "tensorflow/core/framework/tensor.proto"; - -package seldon.protos; - -option java_package = "io.seldon.protos"; -option java_outer_classname = "PredictionProtos"; - -// [START Messages] - -message SeldonMessage { - - Status status = 1; - Meta meta = 2; - oneof data_oneof { - DefaultData data = 3; - bytes binData = 4; - string strData = 5; - } -} - -message DefaultData { - repeated string names = 1; - oneof data_oneof { - Tensor tensor = 2; - google.protobuf.ListValue ndarray = 3; - tensorflow.TensorProto tftensor = 4; - } -} - -message Tensor { - repeated int32 shape = 1 [packed=true]; - repeated double values = 2 [packed=true]; -} - -message Meta { - string puid = 1; - map tags = 2; - map routing = 3; - map requestPath = 4; - repeated Metric metrics = 5; -} - -message Metric { - enum MetricType { - COUNTER = 0; - GAUGE = 1; - TIMER = 2; - } - string key = 1; - MetricType type = 2; - float value = 3; - map tags = 4; -} - -message SeldonMessageList { - repeated SeldonMessage seldonMessages = 1; -} - -message Status { - - enum StatusFlag { - SUCCESS = 0; - FAILURE = 1; - } - - int32 code = 1; - string info = 2; - string reason = 3; - StatusFlag status = 4; -} - -message Feedback { - SeldonMessage request = 1; - SeldonMessage response = 2; - float reward = 3; - SeldonMessage truth = 4; -} - -message RequestResponse { - SeldonMessage request = 1; - SeldonMessage response = 2; -} - -// [END Messages] - - -// [START Services] - -service Generic { - rpc TransformInput(SeldonMessage) returns (SeldonMessage) {}; - rpc TransformOutput(SeldonMessage) returns (SeldonMessage) {}; - rpc Route(SeldonMessage) returns (SeldonMessage) {}; - rpc Aggregate(SeldonMessageList) returns (SeldonMessage) {}; - rpc SendFeedback(Feedback) returns (SeldonMessage) {}; -} - -service Model { - rpc Predict(SeldonMessage) returns (SeldonMessage) {}; - rpc SendFeedback(Feedback) returns (SeldonMessage) {}; - } - -service Router { - rpc Route(SeldonMessage) returns (SeldonMessage) {}; - rpc SendFeedback(Feedback) returns (SeldonMessage) {}; - } - -service Transformer { - rpc TransformInput(SeldonMessage) returns (SeldonMessage) {}; -} - -service OutputTransformer { - rpc TransformOutput(SeldonMessage) returns (SeldonMessage) {}; -} - -service Combiner { - rpc Aggregate(SeldonMessageList) returns (SeldonMessage) {}; -} - - -service Seldon { - rpc Predict(SeldonMessage) returns (SeldonMessage) {}; - rpc SendFeedback(Feedback) returns (SeldonMessage) {}; - } - -// [END Services] -``` - diff --git a/docs/reference/readme.md b/docs/reference/readme.md deleted file mode 100644 index c9adf6dd23..0000000000 --- a/docs/reference/readme.md +++ /dev/null @@ -1,7 +0,0 @@ -# Reference - - - [Prediction 
API](./prediction.md) - - [External API](external-prediction.md) - - [Internal Microservice API](internal-api.md) - - [Seldon Deployment](./seldon-deployment.md) custom resource. - \ No newline at end of file diff --git a/docs/reference/seldon-deployment.md b/docs/reference/seldon-deployment.md deleted file mode 100644 index 38e500898f..0000000000 --- a/docs/reference/seldon-deployment.md +++ /dev/null @@ -1,211 +0,0 @@ -# Seldon Deployment - -A SeldonDeployment is defined as a custom resource definition within Kubernetes. - - * [Proto Buffer Definiton](#definition) - * [Examples](#examples) - -## Proto Buffer Definition -The Seldon Deployment Custom Resource is defined using Proto Buffers. - -```proto -syntax = "proto2"; -package seldon.protos; - -import "k8s.io/apimachinery/pkg/apis/meta/v1/generated.proto"; -import "v1.proto"; - -option java_package = "io.seldon.protos"; -option java_outer_classname = "DeploymentProtos"; - -message SeldonDeployment { - required string apiVersion = 1; - required string kind = 2; - optional k8s.io.apimachinery.pkg.apis.meta.v1.ObjectMeta metadata = 3; - required DeploymentSpec spec = 4; - optional DeploymentStatus status = 5; -} - -/** - * Status for seldon deployment - */ -message DeploymentStatus { - optional string state = 1; // A short status value for the deployment. - optional string description = 2; // A longer description describing the current state. - repeated PredictorStatus predictorStatus = 3; // A list of individual statuses for each running predictor. -} - -message PredictorStatus { - required string name = 1; // The name of the predictor. - optional string status = 2; // A short status value. - optional string description = 3; // A longer description of the current status. - optional int32 replicas = 4; // The number of replicas requested. - optional int32 replicasAvailable = 5; // The number of replicas available. -} - - -message DeploymentSpec { - optional string name = 1; // A unique name within the namespace. - repeated PredictorSpec predictors = 2; // A list of 1 or more predictors describing runtime machine learning deployment graphs. - optional string oauth_key = 3; // The oauth key for external users to use this deployment via an API. - optional string oauth_secret = 4; // The oauth secret for external users to use this deployment via an API. - map annotations = 5; // Arbitrary annotations. -} - -message PredictorSpec { - required string name = 1; // A unique name not used by any other predictor in the deployment. - required PredictiveUnit graph = 2; // A graph describing how the predictive units are connected together. - repeated k8s.io.api.core.v1.PodTemplateSpec componentSpecs = 3; // A description of the set of containers used by the graph. One for each microservice defined in the graph. Can be split over 1 or more PodTemplateSpecs. - optional int32 replicas = 4; // The number of replicas of the predictor to create. - map annotations = 5; // Arbitrary annotations. 
- optional k8s.io.api.core.v1.ResourceRequirements engineResources = 6 [deprecated=true]; // Optional set of resources for the Seldon engine which is added to each Predictor graph to manage the request/response flow - map labels = 7; // labels to be attached to entry deplyment for this predictor - optional SvcOrchSpec svcOrchSpec = 8; // Service Orchestrator configuration -} - -message SvcOrchSpec { - optional k8s.io.api.core.v1.ResourceRequirements resources = 1; - repeated k8s.io.api.core.v1.EnvVar env = 2; -} - - -/** - * Represents a unit in a runtime prediction graph that performs a piece of functionality within the prediction request/response calls. - */ -message PredictiveUnit { - - /** - * The main type of the predictive unit. Routers decide where requests are sent, e.g. AB Tests and Multi-Armed Bandits. Combiners ensemble responses from their children. Models are leaft nodes in the predictive tree and provide request/reponse functionality encapsulating a machine learning model. Transformers alter the request features. - */ - enum PredictiveUnitType { - // Each one of these defines a default combination of Predictive Unit Methods - UNKNOWN_TYPE = 0; - ROUTER = 1; // Route + send feedback - COMBINER = 2; // Aggregate - MODEL = 3; // Transform input - TRANSFORMER = 4; // Transform input (alias) - OUTPUT_TRANSFORMER = 5; // Transform output - } - - enum PredictiveUnitImplementation { - // Each one of these are hardcoded in the engine, no microservice is used - UNKNOWN_IMPLEMENTATION = 0; // No implementation (microservice used) - SIMPLE_MODEL = 1; // An internal model stub for testing. - SIMPLE_ROUTER = 2; // An internal router for testing. - RANDOM_ABTEST = 3; // A A-B test that sends traffic 50% to each child randomly. - AVERAGE_COMBINER = 4; // A default combiner that returns the average of its children responses. - } - - enum PredictiveUnitMethod { - TRANSFORM_INPUT = 0; - TRANSFORM_OUTPUT = 1; - ROUTE = 2; - AGGREGATE = 3; - SEND_FEEDBACK = 4; - } - - required string name = 1; //must match container name of component if no implementation - repeated PredictiveUnit children = 2; // The child predictive units. - optional PredictiveUnitType type = 3; - optional PredictiveUnitImplementation implementation = 4; - repeated PredictiveUnitMethod methods = 5; - optional Endpoint endpoint = 6; // The exposed endpoint for this unit. - repeated Parameter parameters = 7; // Customer parameter to pass to the unit. -} - -message Endpoint { - - enum EndpointType { - REST = 0; // REST endpoints with JSON payloads - GRPC = 1; // gRPC endpoints - } - - optional string service_host = 1; // Hostname for endpoint. - optional int32 service_port = 2; // The port to connect to the service. - optional EndpointType type = 3; // The protocol handled by the endpoint. -} - -message Parameter { - - enum ParameterType { - INT = 0; - FLOAT = 1; - DOUBLE = 2; - STRING = 3; - BOOL = 4; - } - - required string name = 1; - required string value = 2; - required ParameterType type = 3; - -} - - -``` - - -## Examples - -## Single Model - - * The model is contained in the image ```seldonio/mock_classifier:1.0``` - * The model requests 1 MB of memory - * The model defines oauth key and secret for use with seldon-core's built in API gateway. 
- * The model supports a REST API - -```json -{ - "apiVersion": "machinelearning.seldon.io/v1alpha2", - "kind": "SeldonDeployment", - "metadata": { - "labels": { - "app": "seldon" - }, - "name": "seldon-deployment-example" - }, - "spec": { - "annotations": { - "project_name": "FX Market Prediction", - "deployment_version": "v1" - }, - "name": "test-deployment", - "oauth_key": "oauth-key", - "oauth_secret": "oauth-secret", - "predictors": [ - { - "componentSpecs": [{ - "spec": { - "containers": [ - { - "image": "seldonio/mock_classifier:1.0", - "imagePullPolicy": "IfNotPresent", - "name": "classifier", - "resources": { - "requests": { - "memory": "1Mi" - } - } - } - ], - "terminationGracePeriodSeconds": 20 - } - }], - "graph": { - "children": [], - "name": "classifier", - "endpoint": { - "type" : "REST" - }, - "type": "MODEL" - }, - "name": "fx-market-predictor", - "replicas": 1, - "annotations": { - "predictor_version" : "v1" - } - } - ] - } -} -``` diff --git a/docs/seldon.png b/docs/seldon.png deleted file mode 100644 index 9df9c959c0..0000000000 Binary files a/docs/seldon.png and /dev/null differ diff --git a/docs/serving.md b/docs/serving.md deleted file mode 100644 index 8cfaffffd6..0000000000 --- a/docs/serving.md +++ /dev/null @@ -1,79 +0,0 @@ -# Serving Predictions - -Depending on whether you deployed Seldon Core with Ambassador or the API Gateway you can access your models as discussed below: - -## Ambassador - -### Ambassador REST - -Assuming Ambassador is exposed at `````` and with a Seldon deployment name ``````: - - * A REST endpoint will be exposed at : ```http:///seldon//api/v0.1/predictions``` - - -### Ambassador gRPC - -Assuming Ambassador is exposed at `````` and with a Seldon deployment name ``````: - - * A gRPC endpoint will be exposed at `````` and you should send metadata in your request with key ```seldon``` and value ``````. - - -## API OAuth Gateway - -The HTTP and OAuth endpoints will be on separate ports, default is 8080 (HTTP) and 5000 (gRPC). - -### OAuth REST - -Assuming the API Gateway is exposed at `````` - - 1. You should get an OAuth token from ```/oauth/token``` - 1. You should make prediction requests to ```/api/v0.1/predictions``` with the OAuth token in the header as ```Authorization: Bearer ``` - -### OAuth gRPC - -Assuming the API gRPC Gateway is exposed at `````` - - 1. You should get an OAuth token from ```/oauth/token``` - 1. Send gRPC requests to `````` with the OAuth token in the meta data as ```oauth_token: ``` - -## Client Implementations - -### Curl Examples - -#### Ambassador REST - -Assuming a SeldonDeplotment ```mymodel``` with Ambassador exposed on 0.0.0.0:8003: - -``` -curl -v 0.0.0.0:8003/seldon/mymodel/api/v0.1/predictions -d '{"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}}' -H "Content-Type: application/json" -``` - - -#### API OAuth Gateway REST - -Assume server is accessible at 0.0.0.0:8002. - -Get a token. 
Assuming the OAuth key is ```oauth-key``` and OAuth secret is ```oauth-secret``` as specified in the SeldonDeployment graph you created: - -``` -TOKENJSON=$(curl -XPOST -u oauth-key:oauth-secret 0.0.0.0:8002/oauth/token -d 'grant_type=client_credentials') -TOKEN=$(echo $TOKENJSON | jq ".access_token" -r) -``` - -Get predictions -``` -curl -w "%{http_code}\n" --header "Authorization: Bearer $TOKEN" 0.0.0.0:8002/api/v0.1/predictions -d '{"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}}' -H "Content-Type: application/json" -``` - -### OpenAPI REST - -Use Swagger to generate a client for you from the [OpenAPI specifications](../openapi/README.md). - -### gRPC - -Use [gRPC](https://grpc.io/) tools in your desired language from the [proto buffer specifications](../proto/prediction.proto). - -#### Example Python - -See [example python code](../notebooks/seldon_utils.py). - diff --git a/docs/svc-graph-istio.png b/docs/svc-graph-istio.png deleted file mode 100644 index caf8a197dd..0000000000 Binary files a/docs/svc-graph-istio.png and /dev/null differ diff --git a/docs/svc-graph.png b/docs/svc-graph.png deleted file mode 100644 index 19a77fc4b6..0000000000 Binary files a/docs/svc-graph.png and /dev/null differ diff --git a/docs/usage-reporting.md b/docs/usage-reporting.md deleted file mode 100644 index 195af839cf..0000000000 --- a/docs/usage-reporting.md +++ /dev/null @@ -1,68 +0,0 @@ -## Usage Reporting with Spartakus - -An important part of the development process is to better understand the real user environment that the application will run in. - -We provide an option to use an anonymous metrics collection tool provided by the Kubernetes project called [Spartakus](https://github.com/kubernetes-incubator/spartakus). - -### Enable Usage Reporting - -To help support the development of seldon-core, the voluntary reporting of usage data can be enabled whenever the "seldon-core-crd" helm chart is used by setting the "--set usage_metrics.enabled=true" option. - -```bash -helm install seldon-core-crd --name seldon-core-crd \ - --repo https://storage.googleapis.com/seldon-charts --set usage_metrics.enabled=true -``` - -The information that is reported is anonymous and only contains some information about each node in the cluster, including OS version, kubelet version, docker version, and CPU and memory capacity. - -An example of what's reported: -```json -{ - "clusterID": "846db7e9-c861-43d7-8d08-31578af59878", - "extensions": [ - { - "name": "seldon-core-version", - "value": "0.1.5" - } - ], - "masterVersion": "v1.9.3-gke.0", - "nodes": [ - { - "architecture": "amd64", - "capacity": [ - { - "resource": "cpu", - "value": "4" - }, - { - "resource": "memory", - "value": "15405960Ki" - }, - { - "resource": "pods", - "value": "110" - } - ], - "cloudProvider": "gce", - "containerRuntimeVersion": "docker://17.3.2", - "id": "33082e677f61a199c195553e52bbd65a", - "kernelVersion": "4.4.111+", - "kubeletVersion": "v1.9.3-gke.0", - "operatingSystem": "linux", - "osImage": "Container-Optimized OS from Google" - } - ], - "timestamp": "1522059083", - "version": "v1.0.0-5d3377f6946c3ce9159cc9e7589cfbf1de26e0df" -} -``` - -### Disable Usage Reporting - -Reporting of usage data is disabled by default, just use "seldon-core-crd" as normal. 
- -```bash -helm install seldon-core-crd --name seldon-core-crd \ - --repo https://storage.googleapis.com/seldon-charts -``` - diff --git a/docs/v1alpha2_update.md b/docs/v1alpha2_update.md deleted file mode 100644 index 7f94e12b20..0000000000 --- a/docs/v1alpha2_update.md +++ /dev/null @@ -1,123 +0,0 @@ -# V1Alpha2 Update - - * The ```PredictorSpec componentSpec``` is now ```componentSpecs``` and takes a list of ```PodTemplateSpecs``` allowing you to split your runtime graph into separate Kubernetes Deployments as needed. See the new [proto definition](./proto/seldon_deployment.proto). To update existing resources: - * Change ``` "apiVersion": "machinelearning.seldon.io/v1alpha1"``` to ``` "apiVersion": "machinelearning.seldon.io/v1alpha2"``` - * Change ```componentSpec``` -> ```componentSpecs``` and enclose the existing ```PodTemplateSpec``` in a single element list - -For example change: - -``` -{ - "apiVersion": "machinelearning.seldon.io/v1alpha1", - "kind": "SeldonDeployment", - "metadata": { - "labels": { - "app": "seldon" - }, - "name": "seldon-deployment-example" - }, - "spec": { - "annotations": { - "project_name": "FX Market Prediction", - "deployment_version": "v1" - }, - "name": "test-deployment", - "oauth_key": "oauth-key", - "oauth_secret": "oauth-secret", - "predictors": [ - { - "componentSpec": { - "spec": { - "containers": [ - { - "image": "seldonio/mock_classifier:1.0", - "imagePullPolicy": "IfNotPresent", - "name": "classifier", - "resources": { - "requests": { - "memory": "1Mi" - } - } - } - ], - "terminationGracePeriodSeconds": 20 - } - }, - "graph": { - "children": [], - "name": "classifier", - "endpoint": { - "type" : "REST" - }, - "type": "MODEL" - }, - "name": "fx-market-predictor", - "replicas": 1, - "annotations": { - "predictor_version" : "v1" - } - } - ] - } -} - -``` - -to - -``` -{ - "apiVersion": "machinelearning.seldon.io/v1alpha2", - "kind": "SeldonDeployment", - "metadata": { - "labels": { - "app": "seldon" - }, - "name": "seldon-deployment-example" - }, - "spec": { - "annotations": { - "project_name": "FX Market Prediction", - "deployment_version": "v1" - }, - "name": "test-deployment", - "oauth_key": "oauth-key", - "oauth_secret": "oauth-secret", - "predictors": [ - { - "componentSpecs": [{ - "spec": { - "containers": [ - { - "image": "seldonio/mock_classifier:1.0", - "imagePullPolicy": "IfNotPresent", - "name": "classifier", - "resources": { - "requests": { - "memory": "1Mi" - } - } - } - ], - "terminationGracePeriodSeconds": 20 - } - }], - "graph": { - "children": [], - "name": "classifier", - "endpoint": { - "type" : "REST" - }, - "type": "MODEL" - }, - "name": "fx-market-predictor", - "replicas": 1, - "annotations": { - "predictor_version" : "v1" - } - } - ] - } -} - -``` diff --git a/docs/wrappers/java.md b/docs/wrappers/java.md deleted file mode 100644 index ffb4e72cca..0000000000 --- a/docs/wrappers/java.md +++ /dev/null @@ -1,216 +0,0 @@ -# Packaging a Java model for Seldon Core using s2i - - -In this guide, we illustrate the steps needed to wrap your own Java model in a docker image ready for deployment with Seldon Core using [source-to-image app s2i](https://github.com/openshift/source-to-image). - -If you are not familiar with s2i you can read [general instructions on using s2i](./s2i.md) and then follow the steps below. 
## Step 1 - Install s2i

[Download and install s2i](https://github.com/openshift/source-to-image#installation)

 * Prerequisites for using s2i are:
   * Docker
   * Git (if building from a remote git repo)

To check everything is working you can run

```bash
s2i usage seldonio/seldon-core-s2i-java-build:0.1
```

## Step 2 - Create your source code

To use our s2i builder image to package your Java model you will need:

 * A Maven project that depends on the ```io.seldon.wrapper``` library
 * A Spring Boot configuration class
 * A class that implements ```io.seldon.wrapper.SeldonPredictionService``` for the type of component you are creating
 * .s2i/environment - model definitions used by the s2i builder to correctly wrap your model

We will go into detail for each of these steps:

### Maven Project
Create a Spring Boot Maven project and include the dependency:

```XML
<dependency>
  <groupId>io.seldon.wrapper</groupId>
  <artifactId>seldon-core-wrapper</artifactId>
  <version>0.1.3</version>
</dependency>
```

A full example can be found at ```wrappers/s2i/java/test/model-template-app/pom.xml```.

### Spring Boot Initialization

Create a main App class:
 * Add the @EnableAsync annotation (to allow the embedded gRPC server to start at Spring Boot startup)
 * Include the ```io.seldon.wrapper``` package in the scan base packages list along with your App's package; in the example below the App's package is ```io.seldon.example```.
 * Import the config class at ```io.seldon.wrapper.config.AppConfig.class```

For example:

```java
@EnableAsync
@SpringBootApplication(scanBasePackages = {"io.seldon.wrapper","io.seldon.example"})
@Import({ io.seldon.wrapper.config.AppConfig.class })
public class App {
    public static void main(String[] args) throws Exception {
        SpringApplication.run(App.class, args);
    }
}
```

### Prediction Class
To handle requests to your model or other component you need to implement one or more of the methods in ```io.seldon.wrapper.SeldonPredictionService```, in particular:

```java
default public SeldonMessage predict(SeldonMessage request);
default public SeldonMessage route(SeldonMessage request);
default public SeldonMessage sendFeedback(Feedback request);
default public SeldonMessage transformInput(SeldonMessage request);
default public SeldonMessage transformOutput(SeldonMessage request);
default public SeldonMessage aggregate(SeldonMessageList request);
```

Your implementing class should be created as a Spring Component so that it will be managed by Spring.
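Before the full H2O example that follows, it may help to see a minimal sketch of such a component. The class below is illustrative only (the name `EchoModelHandler` is made up, and imports are omitted as in the other examples in this doc); it simply echoes the request data back, which is enough for the wrapper to serve it over REST or gRPC:

```java
@Component
public class EchoModelHandler implements SeldonPredictionService {

    @Override
    public SeldonMessage predict(SeldonMessage request) {
        // Return the incoming data unchanged as the "prediction".
        return SeldonMessage.newBuilder().setData(request.getData()).build();
    }
}
```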
There is a full H2O example in ```examples/models/h2o_mojo/src/main/java/io/seldon/example/h2o/model```, whose implementation is shown below:

```java
@Component
public class H2OModelHandler implements SeldonPredictionService {
    private static Logger logger = LoggerFactory.getLogger(H2OModelHandler.class.getName());
    EasyPredictModelWrapper model;

    public H2OModelHandler() throws IOException {
        MojoReaderBackend reader =
                MojoReaderBackendFactory.createReaderBackend(
                        getClass().getClassLoader().getResourceAsStream("model.zip"),
                        MojoReaderBackendFactory.CachingStrategy.MEMORY);
        MojoModel modelMojo = ModelMojoReader.readFrom(reader);
        model = new EasyPredictModelWrapper(modelMojo);
        logger.info("Loaded model");
    }

    @Override
    public SeldonMessage predict(SeldonMessage payload) {
        List<RowData> rows = H2OUtils.convertSeldonMessage(payload.getData());
        List<AbstractPrediction> predictions = new ArrayList<>();
        for (RowData row : rows) {
            try {
                BinomialModelPrediction p = model.predictBinomial(row);
                predictions.add(p);
            } catch (PredictException e) {
                logger.info("Error in prediction ", e);
            }
        }
        DefaultData res = H2OUtils.convertH2OPrediction(predictions, payload.getData());

        return SeldonMessage.newBuilder().setData(res).build();
    }

}
```

The above code:

 * loads a model from the local resources folder on startup
 * converts the proto buffer message into H2O RowData using the provided utility classes
 * runs a binomial model prediction and converts the result back into a ```SeldonMessage``` for return

#### H2O Helper Classes

We provide the H2O utility class ```io.seldon.wrapper.utils.H2OUtils``` in seldon-core-wrapper to convert to and from the seldon-core proto buffer message types.

#### DL4J Helper Classes

We provide the DL4J utility class ```io.seldon.wrapper.utils.DL4JUtils``` in seldon-core-wrapper to convert to and from the seldon-core proto buffer message types.

### .s2i/environment

Define the core parameters needed by our Java builder image to wrap your model. An example is:

```bash
API_TYPE=REST
SERVICE_TYPE=MODEL
```

These values can also be provided or overridden on the command line when building the image.

## Step 3 - Build your image
Use ```s2i build``` to create your Docker image from source code. You will need Docker installed on the machine and optionally git if your source code is in a public git repo.

Using s2i you can build directly from a git repo or from a local source folder. See the [s2i docs](https://github.com/openshift/source-to-image/blob/master/docs/cli.md#s2i-build) for further details. The general format is:

```bash
s2i build <git-url> seldonio/seldon-core-s2i-java-build:0.1 <my-image-name> --runtime-image seldonio/seldon-core-s2i-java-runtime:0.1
s2i build <src-folder> seldonio/seldon-core-s2i-java-build:0.1 <my-image-name> --runtime-image seldonio/seldon-core-s2i-java-runtime:0.1
```

An example invocation using the test template model inside seldon-core:

```bash
s2i build https://github.com/seldonio/seldon-core.git --context-dir=wrappers/s2i/java/test/model-template-app seldonio/seldon-core-s2i-java-build:0.1 h2o-test:0.1 --runtime-image seldonio/seldon-core-s2i-java-runtime:0.1
```

The above s2i build invocation:

 * uses the GitHub repo: https://github.com/seldonio/seldon-core.git and the directory ```wrappers/s2i/java/test/model-template-app``` inside that repo.
 * uses the builder image ```seldonio/seldon-core-s2i-java-build```
 * uses the runtime image ```seldonio/seldon-core-s2i-java-runtime```
 * creates a docker image ```h2o-test:0.1```


For building from a local source folder, an example where we clone the seldon-core repo:

```bash
git clone https://github.com/seldonio/seldon-core.git
cd seldon-core
s2i build wrappers/s2i/java/test/model-template-app seldonio/seldon-core-s2i-java-build:0.1 h2o-test:0.1 --runtime-image seldonio/seldon-core-s2i-java-runtime:0.1
```

For more help see:

```
s2i usage seldonio/seldon-core-s2i-java-build:0.1
s2i build --help
```

## Reference

### Environment Variables
The required environment variables understood by the builder image are explained below. You can provide them in the ```.s2i/environment``` file or on the ```s2i build``` command line.

#### API_TYPE

API type to create. Can be REST or GRPC.

#### SERVICE_TYPE

The service type being created. Available options are:

 * MODEL
 * ROUTER
 * TRANSFORMER
 * COMBINER

### Creating different service types

#### MODEL

 * [A minimal skeleton for model source code](https://github.com/cliveseldon/seldon-core/tree/s2i/wrappers/s2i/java/test/model-template-app)
 * [Example H2O MOJO](https://github.com/SeldonIO/seldon-core/tree/master/examples/models/h2o-mojo/README.md)

diff --git a/docs/wrappers/nodejs.md b/docs/wrappers/nodejs.md
deleted file mode 100644
index 514c8f923c..0000000000
--- a/docs/wrappers/nodejs.md
+++ /dev/null
@@ -1,156 +0,0 @@
# Packaging a NodeJS model for Seldon Core using s2i

In this guide, we illustrate the steps needed to wrap your own JS model running on a node engine in a docker image ready for deployment with Seldon Core using [source-to-image app s2i](https://github.com/openshift/source-to-image).

If you are not familiar with s2i you can read [general instructions on using s2i](./s2i.md) and then follow the steps below.

## Step 1 - Install s2i

[Download and install s2i](https://github.com/openshift/source-to-image#installation)

- Prerequisites for using s2i are:
  - Docker
  - Git (if building from a remote git repo)

To check everything is working you can run

```bash
s2i usage seldonio/seldon-core-s2i-nodejs:0.1
```

## Step 2 - Create your source code

To use our s2i builder image to package your NodeJS model you will need:

- A JS file which provides an ES5 Function object or an ES6 class for your model and that has the appropriate generic methods for your component, i.e. an `init` and a `predict` for a model.
- A package.json that contains all the dependencies and metadata for the model
- .s2i/environment - model definitions used by the s2i builder to correctly wrap your model

We will go into detail for each of these steps:

### NodeJS Runtime Model file

Your source code should contain a JS file which provides an ES5 Function object or an ES6 class for your model.
For example, looking at our skeleton JS structure: - -```js -let MyModel = function() {}; - -MyModel.prototype.init = async function() { - // A mandatory init method for the class to load run-time dependencies - this.model = "My Awesome model"; -}; - -MyModel.prototype.predict = function(newdata, feature_names) { - //A mandatory predict function for the model predictions - console.log("Predicting ..."); - return newdata; -}; - -module.exports = MyModel; -``` - -Also the model could be an ES6 class as follows - -```js -class MyModel { - async init() { - // A mandatory init method for the class to load run-time dependencies - this.model = "My Awesome ES6 model"; - } - predict(newdata, feature_names) { - //A mandatory predict function for the model predictions - console.log("ES6 Predicting ..."); - return newdata; - } -} -module.exports = MyModel; -``` - -- A `init` method for the model object. This will be called on startup and you can use this to load any parameters your model needs. This function may also be an async,for example in case if it has to load the model weights from a remote location. -- A generic `predict` method is created for my model class. This will be called with a `newdata` field with the data object to be predicted. - -### package.json - -Populate an `package.json` with any software dependencies your code requires using an `npm init` command and save your dependencies to the file. - -### .s2i/environment - -Define the core parameters needed by our node JS builder image to wrap your model. An example is: - -```bash -MODEL_NAME=MyModel.js -API_TYPE=REST -SERVICE_TYPE=MODEL -PERSISTENCE=0 -``` - -These values can also be provided or overridden on the command line when building the image. - -## Step 3 - Build your image - -Use `s2i build` to create your Docker image from source code. You will need Docker installed on the machine and optionally git if your source code is in a public git repo. - -Using s2i you can build directly from a git repo or from a local source folder. See the [s2i docs](https://github.com/openshift/source-to-image/blob/master/docs/cli.md#s2i-build) for further details. The general format is: - -```bash -s2i build seldonio/seldon-core-s2i-nodejs:0.1 -s2i build seldonio/seldon-core-s2i-nodejs:0.1 -``` - -An example invocation using the test template model inside seldon-core: - -```bash -s2i build https://github.com/seldonio/seldon-core.git --context-dir=wrappers/s2i/nodejs/test/model-template-app seldonio/seldon-core-s2i-nodejs:0.1 seldon-core-template-model -``` - -The above s2i build invocation: - -- uses the GitHub repo: https://github.com/seldonio/seldon-core.git and the directory `wrappers/s2i/nodejs/test/model-template-app` inside that repo. -- uses the builder image `seldonio/seldon-core-s2i-nodejs` -- creates a docker image `seldon-core-template-model` - -For building from a local source folder, an example where we clone the seldon-core repo: - -```bash -git clone https://github.com/seldonio/seldon-core.git -cd seldon-core -s2i build wrappers/s2i/nodejs/test/model-template-app seldonio/seldon-core-s2i-nodejs:0.1 seldon-core-template-model -``` - -For more help see: - -``` -s2i usage seldonio/seldon-core-s2i-nodejs:0.1 -s2i build --help -``` - -## Reference - -### Environment Variables - -The required environment variables understood by the builder image are explained below. You can provide them in the `.s2i/environment` file or on the `s2i build` command line. - -#### MODEL_NAME - -The name of the JS file containing the model. 
- -#### API_TYPE - -API type to create. Can be REST or GRPC. - -#### SERVICE_TYPE - -The service type being created. Available options are: - -- MODEL -- TRANSFORMER - -#### PERSISTENCE - -Can only by 0 at present. - -### Creating different service types - -#### MODEL - -- [Example model](https://github.com/SeldonIO/seldon-core/tree/master/examples/models/nodejs_tensorflow) diff --git a/docs/wrappers/python.md b/docs/wrappers/python.md deleted file mode 100644 index 3ad5b2d17e..0000000000 --- a/docs/wrappers/python.md +++ /dev/null @@ -1,278 +0,0 @@ -# Packaging a Python model for Seldon Core using s2i - - -In this guide, we illustrate the steps needed to wrap your own python model in a docker image ready for deployment with Seldon Core using [source-to-image app s2i](https://github.com/openshift/source-to-image). - -If you are not familiar with s2i you can read [general instructions on using s2i](./s2i.md) and then follow the steps below. - - -## Step 1 - Install s2i - - [Download and install s2i](https://github.com/openshift/source-to-image#installation) - - * Prerequisites for using s2i are: - * Docker - * Git (if building from a remote git repo) - -To check everything is working you can run - -```bash -s2i usage seldonio/seldon-core-s2i-python3:0.5.1 -``` - - -## Step 2 - Create your source code - -To use our s2i builder image to package your python model you will need: - - * A python file with a class that runs your model - * requirements.txt or setup.py - * .s2i/environment - model definitions used by the s2i builder to correctly wrap your model - -We will go into detail for each of these steps: - -### Python file -Your source code should contain a python file which defines a class of the same name as the file. For example, looking at our skeleton python model file at ```wrappers/s2i/python/test/model-template-app/MyModel.py```: - -```python -class MyModel(object): - """ - Model template. You can load your model parameters in __init__ from a location accessible at runtime - """ - - def __init__(self): - """ - Add any initialization parameters. These will be passed at runtime from the graph definition parameters defined in your seldondeployment kubernetes resource manifest. - """ - print("Initializing") - - def predict(self,X,features_names): - """ - Return a prediction. - - Parameters - ---------- - X : array-like - feature_names : array of feature names (optional) - """ - print("Predict called - will run identity function") - return X -``` - - * The file is called MyModel.py and it defines a class MyModel - * The class contains a predict method that takes an array (numpy) X and feature_names and returns an array of predictions. - * You can add any required initialization inside the class init method. - * Your return array should be at least 2-dimensional. - -### requirements.txt -Populate a requirements.txt with any software dependencies your code requires. These will be installed via pip when creating the image. You can instead provide a setup.py if you prefer. - -### .s2i/environment - -Define the core parameters needed by our python builder image to wrap your model. An example is: - -```bash -MODEL_NAME=MyModel -API_TYPE=REST -SERVICE_TYPE=MODEL -PERSISTENCE=0 -``` - -These values can also be provided or overridden on the command line when building the image. - -## Step 3 - Build your image -Use ```s2i build``` to create your Docker image from source code. You will need Docker installed on the machine and optionally git if your source code is in a public git repo. 
You can choose from three python builder images - - * Python 3.6 : seldonio/seldon-core-s2i-python36:0.5.1, seldonio/seldon-core-s2i-python3:0.5.1 - * Note there are [issues running TensorFlow under Python 3.7](https://github.com/tensorflow/tensorflow/issues/20444) (Nov 2018) and Python 3.7 is not officially supported by TensorFlow (Dec 2018). - * Python 3.6 plus ONNX support via [Intel nGraph](https://github.com/NervanaSystems/ngraph) : seldonio/seldon-core-s2i-python3-ngraph-onnx:0.1 - -Using s2i you can build directly from a git repo or from a local source folder. See the [s2i docs](https://github.com/openshift/source-to-image/blob/master/docs/cli.md#s2i-build) for further details. The general format is: - -```bash -s2i build seldonio/seldon-core-s2i-python3:0.5.1 -``` - -Change to seldonio/seldon-core-s2i-python3 if using python 3. - -An example invocation using the test template model inside seldon-core: - -```bash -s2i build https://github.com/seldonio/seldon-core.git --context-dir=wrappers/s2i/python/test/model-template-app seldonio/seldon-core-s2i-python3:0.5.1 seldon-core-template-model -``` - -The above s2i build invocation: - - * uses the GitHub repo: https://github.com/seldonio/seldon-core.git and the directory ```wrappers/s2i/python/test/model-template-app``` inside that repo. - * uses the builder image ```seldonio/seldon-core-s2i-python3``` - * creates a docker image ```seldon-core-template-model``` - - -For building from a local source folder, an example where we clone the seldon-core repo: - -```bash -git clone https://github.com/seldonio/seldon-core.git -cd seldon-core -s2i build wrappers/s2i/python/test/model-template-app seldonio/seldon-core-s2i-python3:0.5.1 seldon-core-template-model -``` - -For more help see: - -``` -s2i usage seldonio/seldon-core-s2i-python3:0.5.1 -s2i build --help -``` - -## Using with Keras/Tensorflow Models - -To ensure Keras models with the Tensorflow backend work correctly you may need to call `_make_predict_function()` on your model after it is loaded. This is because Flask may call the prediction request in a separate thread from the one that initialised your model. See [here](https://github.com/keras-team/keras/issues/6462) for further discussion. - -## Reference - -### Environment Variables -The required environment variables understood by the builder image are explained below. You can provide them in the ```.s2i/environment``` file or on the ```s2i build``` command line. - - -#### MODEL_NAME -The name of the class containing the model. Also the name of the python file which will be imported. - -#### API_TYPE - -API type to create. Can be REST or GRPC - -#### SERVICE_TYPE - -The service type being created. Available options are: - - * MODEL - * ROUTER - * TRANSFORMER - * COMBINER - * OUTLIER_DETECTOR - -#### PERSISTENCE - -Set either to 0 or 1. Default is 0. If set to 1 then your model will be saved periodically to redis and loaded from redis (if exists) or created fresh if not. 
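The reference section above notes that these variables can also be passed on the ```s2i build``` command line instead of via a ```.s2i/environment``` file. As a sketch (the local folder ```./my-model``` and image name ```my-model-image:0.1``` are made up for illustration, and this assumes your s2i version supports the ```-e``` flag for environment variables), that looks like:

```bash
# Hypothetical source folder and image name; adjust to your own project.
# Each -e flag sets one of the variables described above.
s2i build ./my-model seldonio/seldon-core-s2i-python3:0.5.1 my-model-image:0.1 \
    -e MODEL_NAME=MyModel \
    -e API_TYPE=REST \
    -e SERVICE_TYPE=MODEL \
    -e PERSISTENCE=0
```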
- - -### Creating different service types - -#### MODEL - - * [A minimal skeleton for model source code](https://github.com/cliveseldon/seldon-core/tree/s2i/wrappers/s2i/python/test/model-template-app) - * [Example models](https://github.com/SeldonIO/seldon-core/tree/master/examples/models) - -#### ROUTER - * [Description of routers in Seldon Core](../../components/routers/README.md) - * [A minimal skeleton for router source code](https://github.com/cliveseldon/seldon-core/tree/s2i/wrappers/s2i/python/test/router-template-app) - * [Example routers](https://github.com/SeldonIO/seldon-core/tree/master/examples/routers) - -#### TRANSFORMER - - * [A minimal skeleton for transformer source code](https://github.com/cliveseldon/seldon-core/tree/s2i/wrappers/s2i/python/test/transformer-template-app) - * [Example transformers](https://github.com/SeldonIO/seldon-core/tree/master/examples/transformers) - - -## Advanced Usage - -### Model Class Arguments -You can add arguments to your component which will be populated from the ```parameters``` defined in the SeldonDeloyment when you deploy your image on Kubernetes. For example, our [Python TFServing proxy](https://github.com/SeldonIO/seldon-core/tree/master/integrations/tfserving) has the class init method signature defined as below: - -``` -class TfServingProxy(object): - -def __init__(self,rest_endpoint=None,grpc_endpoint=None,model_name=None,signature_name=None,model_input=None,model_output=None): -``` - -These arguments can be set when deploying in a Seldon Deployment. An example can be found in the [MNIST TFServing example](https://github.com/SeldonIO/seldon-core/blob/master/examples/models/tfserving-mnist/tfserving-mnist.ipynb) where the arguments are defined in the [SeldonDeployment](https://github.com/SeldonIO/seldon-core/blob/master/examples/models/tfserving-mnist/mnist_tfserving_deployment.json.template) which is partly show below: - -``` - "graph": { - "name": "tfserving-proxy", - "endpoint": { "type" : "REST" }, - "type": "MODEL", - "children": [], - "parameters": - [ - { - "name":"grpc_endpoint", - "type":"STRING", - "value":"localhost:8000" - }, - { - "name":"model_name", - "type":"STRING", - "value":"mnist-model" - }, - { - "name":"model_output", - "type":"STRING", - "value":"scores" - }, - { - "name":"model_input", - "type":"STRING", - "value":"images" - }, - { - "name":"signature_name", - "type":"STRING", - "value":"predict_images" - } - ] -}, -``` - - -The allowable ```type``` values for the parameters are defined in the [proto buffer definition](https://github.com/SeldonIO/seldon-core/blob/44f7048efd0f6be80a857875058d23efc4221205/proto/seldon_deployment.proto#L117-L131). - - -### Local Python Dependencies -```from version 0.5-SNAPSHOT``` - -To use a private repository for installing Python dependencies use the following build command: - -```bash -s2i build -i :/whl seldonio/seldon-core-s2i-python3:0.6-SNAPSHOT -``` - -This command will look for local Python wheels in the `````` and use these before searching PyPI. - -### Custom Metrics -```from version 0.3``` - -To add custom metrics to your response you can define an optional method ```metrics``` in your class that returns a list of metric dicts. An example is shown below: - -``` -class MyModel(object): - - def predict(self,X,features_names): - return X - - def metrics(self): - return [{"type":"COUNTER","key":"mycounter","value":1}] -``` - -For more details on custom metrics and the format of the metric dict see [here](../custom_metrics.md). 
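As an illustrative sketch only (it assumes GAUGE and TIMER are also accepted metric types; check the custom metrics doc linked above to confirm), a model can return several metrics from one request:

```python
class MyModelWithMetrics(object):
    """Hypothetical model returning several custom metrics per request."""

    def predict(self, X, features_names):
        return X

    def metrics(self):
        # COUNTER is incremented, GAUGE is set to a value, TIMER records a duration.
        return [
            {"type": "COUNTER", "key": "mycounter", "value": 1},
            {"type": "GAUGE", "key": "mygauge", "value": 100},
            {"type": "TIMER", "key": "mytimer", "value": 20.2},
        ]
```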
- -There is an [example notebook illustrating a model with custom metrics in python](../../examples/models/template_model_with_metrics/modelWithMetrics.ipynb). - -### Custom Meta Data -```from version 0.3``` - -To add custom meta data you can add an optional method ```tags``` which can return a dict of custom meta tags as shown in the example below: - -``` -class UserObject(object): - - def predict(self,X,features_names): - return X - - def tags(self): - return {"mytag":1} -``` - - - - - diff --git a/docs/wrappers/r.md b/docs/wrappers/r.md deleted file mode 100644 index 8775b476ff..0000000000 --- a/docs/wrappers/r.md +++ /dev/null @@ -1,157 +0,0 @@ -# Packaging an R model for Seldon Core using s2i - -In this guide, we illustrate the steps needed to wrap your own R model in a docker image ready for deployment with Seldon Core using [source-to-image app s2i](https://github.com/openshift/source-to-image). - -If you are not familiar with s2i you can read [general instructions on using s2i](./s2i.md) and then follow the steps below. - -## Step 1 - Install s2i - -[Download and install s2i](https://github.com/openshift/source-to-image#installation) - -- Prerequisites for using s2i are: - - Docker - - Git (if building from a remote git repo) - -To check everything is working you can run - -```bash -s2i usage seldonio/seldon-core-s2i-r:0.1 -``` - -## Step 2 - Create your source code - -To use our s2i builder image to package your R model you will need: - -- An R file which provides an S3 class for your model via an `initialise_seldon` function and that has appropriate generics for your component, e.g. predict for a model. -- An optional install.R to be run to install any libraries needed -- .s2i/environment - model definitions used by the s2i builder to correctly wrap your model - -We will go into detail for each of these steps: - -### R Runtime Model file - -Your source code should contain an R file which defines an S3 class for your model. For example, looking at our skeleton R model file at `wrappers/s2i/R/test/model-template-app/MyModel.R`: - -```R -library(methods) - -predict.mymodel <- function(mymodel,newdata=list()) { - write("MyModel predict called", stdout()) - newdata -} - - -new_mymodel <- function() { - structure(list(), class = "mymodel") -} - - -initialise_seldon <- function(params) { - new_mymodel() -} -``` - -- A `seldon_initialise` function creates an S3 class for my model via a constructor `new_mymodel`. This will be called on startup and you can use this to load any parameters your model needs. -- A generic `predict` function is created for my model class. This will be called with a `newdata` field with the `data.frame` to be predicted. - -There are similar templates for ROUTERS and TRANSFORMERS. - -### install.R - -Populate an `install.R` with any software dependencies your code requires. For example: - -```R -install.packages('rpart') -``` - -### .s2i/environment - -Define the core parameters needed by our R builder image to wrap your model. An example is: - -```bash -MODEL_NAME=MyModel.R -API_TYPE=REST -SERVICE_TYPE=MODEL -PERSISTENCE=0 -``` - -These values can also be provided or overridden on the command line when building the image. - -## Step 3 - Build your image - -Use `s2i build` to create your Docker image from source code. You will need Docker installed on the machine and optionally git if your source code is in a public git repo. - -Using s2i you can build directly from a git repo or from a local source folder. 
See the [s2i docs](https://github.com/openshift/source-to-image/blob/master/docs/cli.md#s2i-build) for further details. The general format is: - -```bash -s2i build seldonio/seldon-core-s2i-r:0.1 -s2i build seldonio/seldon-core-s2i-r:0.1 -``` - -An example invocation using the test template model inside seldon-core: - -```bash -s2i build https://github.com/seldonio/seldon-core.git --context-dir=wrappers/s2i/R/test/model-template-app seldonio/seldon-core-s2i-r:0.1 seldon-core-template-model -``` - -The above s2i build invocation: - -- uses the GitHub repo: https://github.com/seldonio/seldon-core.git and the directory `wrappers/s2i/R/test/model-template-app` inside that repo. -- uses the builder image `seldonio/seldon-core-s2i-r` -- creates a docker image `seldon-core-template-model` - -For building from a local source folder, an example where we clone the seldon-core repo: - -```bash -git clone https://github.com/seldonio/seldon-core.git -cd seldon-core -s2i build wrappers/s2i/R/test/model-template-app seldonio/seldon-core-s2i-r:0.1 seldon-core-template-model -``` - -For more help see: - -``` -s2i usage seldonio/seldon-core-s2i-r:0.1 -s2i build --help -``` - -## Reference - -### Environment Variables - -The required environment variables understood by the builder image are explained below. You can provide them in the `.s2i/environment` file or on the `s2i build` command line. - -#### MODEL_NAME - -The name of the R file containing the model. - -#### API_TYPE - -API type to create. Can be REST only at present. - -#### SERVICE_TYPE - -The service type being created. Available options are: - -- MODEL -- ROUTER -- TRANSFORMER - -#### PERSISTENCE - -Can only by 0 at present. In future, will allow the state of the component to be saved periodically. - -### Creating different service types - -#### MODEL - -- [A minimal skeleton for model source code](https://github.com/cliveseldon/seldon-core/tree/s2i/wrappers/s2i/R/test/model-template-app) -- [Example models](https://github.com/SeldonIO/seldon-core/tree/master/examples/models) - -#### ROUTER -- [Description of routers in Seldon Core](../../components/routers/README.md) -- [A minimal skeleton for router source code](https://github.com/cliveseldon/seldon-core/tree/s2i/wrappers/s2i/R/test/router-template-app) - -#### TRANSFORMER - -- [A minimal skeleton for transformer source code](https://github.com/cliveseldon/seldon-core/tree/s2i/wrappers/s2i/R/test/transformer-template-app) diff --git a/docs/wrappers/readme.md b/docs/wrappers/readme.md deleted file mode 100644 index b93117d30b..0000000000 --- a/docs/wrappers/readme.md +++ /dev/null @@ -1,36 +0,0 @@ -# Wrapping Your Model - -To allow your component (model, router etc.) to be managed by seldon-core it needs - -1. To be built into a Docker container -1. To expose the appropriate [service APIs over REST or gRPC](../reference/internal-api.md). - -To wrap your model follow the instructions for your chosen language or toolkit. - -To test a wrapped components you can use one of our [testing scripts](../api-testing.md). - -## Python - -Python based models, including [TensorFlow](https://www.tensorflow.org/), [Keras](https://keras.io/), [pyTorch](http://pytorch.org/), [StatsModels](http://www.statsmodels.org/stable/index.html), [XGBoost](https://github.com/dmlc/xgboost) and [Scikit-learn](http://scikit-learn.org/stable/) based models. 
- [Python models wrapped using source-to-image](./python.md)

## R

- [R models wrapped using source-to-image](r.md)

## Java

Java based models, including [H2O](https://www.h2o.ai/), [Deep Learning 4J](https://deeplearning4j.org/) and Spark (standalone exported models).

- [Java models wrapped using source-to-image](java.md)

## NodeJS

- [Javascript models wrapped using source-to-image](nodejs.md)

## Go (Alpha)

- [Example Go integration](../../examples/wrappers/go/README.md)

diff --git a/docs/wrappers/s2i.md b/docs/wrappers/s2i.md
deleted file mode 100644
index 79a7aa9023..0000000000
--- a/docs/wrappers/s2i.md
+++ /dev/null
@@ -1,23 +0,0 @@
# Source to Image (s2i)

[Source to image](https://github.com/openshift/source-to-image) is a Red Hat supported tool to create docker images from source code. We provide builder images to allow you to easily wrap your data science models so they can be managed by seldon-core.

The general workflow is:

 1. [Download and install s2i](https://github.com/openshift/source-to-image#installation)
 1. Choose the builder image that is most appropriate for your code and get usage instructions, for example:
    ```bash
    s2i usage seldonio/seldon-core-s2i-python3
    ```
 1. Create a source code repo in the form acceptable for the builder image and build your docker container from it. Below we show an example using our seldon-core git repo which has some template examples for python models.
    ```
    s2i build https://github.com/seldonio/seldon-core.git --context-dir=wrappers/s2i/python/test/model-template-app seldonio/seldon-core-s2i-python seldon-core-template-model
    ```

At present we have s2i builder images for:

 * [Python (Python2 or Python3)](./python.md) : use this for Tensorflow, Keras, PyTorch or sklearn models.
 * [R](r.md)
 * [Java](java.md)
 * [NodeJS](nodejs.md)
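As a final, purely illustrative smoke test of an image produced by one of these builders, you can run it locally with Docker before deploying it with seldon-core. The sketch below uses the ```seldon-core-template-model``` image from the example above and assumes the Python wrapper's default REST port of 5000 and its ```/predict``` endpoint; other wrappers may expose a different port, so treat the values here as assumptions to verify:

```bash
# Run the image built above and expose the wrapper's REST port (assumed to be 5000).
docker run --rm -d -p 5000:5000 --name my-model seldon-core-template-model

# Send a single prediction request to the assumed /predict endpoint.
curl -g http://localhost:5000/predict \
    --data-urlencode 'json={"data":{"names":["a","b"],"ndarray":[[1.0,2.0]]}}'

# Clean up the test container.
docker stop my-model
```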