Batch Processor Component #1915

Merged

71 commits
1c2ff45
Added initial argo example
axsaucedo Apr 17, 2020
b3a68e0
Updated file to ensure name of seldon deployment is unique
axsaucedo Apr 18, 2020
8a02187
Updated minio example
axsaucedo Apr 18, 2020
c848fe8
Updated minio example
axsaucedo Apr 18, 2020
4bbc40b
Updated minio example and added diagrams
axsaucedo Apr 18, 2020
0699b61
Added readme
axsaucedo Apr 18, 2020
400dda1
Added python gang scheduler
axsaucedo Apr 20, 2020
522e45e
Added example batch
axsaucedo Apr 21, 2020
9a0d9a2
Merge remote-tracking branch 'upstream/master' into 1391_batch_proces…
axsaucedo Apr 27, 2020
c82e28b
Updated cmd
axsaucedo Apr 27, 2020
aca71f4
Added functionality for gitignore
axsaucedo Apr 28, 2020
af822ee
Added geventhttpclient functionality
axsaucedo Apr 28, 2020
115a67b
Added changes to gang scheduler
axsaucedo Apr 30, 2020
8eb4bd1
Removed gang scheduler folder from separate to add as part of SC package
axsaucedo Apr 30, 2020
02df23b
Removed input data file
axsaucedo Apr 30, 2020
30e76a8
Added functioning parallel processing with scaled workers
axsaucedo Apr 30, 2020
c6bd502
Added example seldon batch store
axsaucedo Apr 30, 2020
53109dd
Added command line to setup.py
axsaucedo Apr 30, 2020
ec8383e
Added batch processing capabilities
axsaucedo Apr 30, 2020
d029abb
Added readme for 2nd option
axsaucedo Apr 30, 2020
1191305
Added host instead of endpoint
axsaucedo May 2, 2020
0d520a0
Added helm chart
axsaucedo May 2, 2020
2ebafa3
Added batch_processor in python
axsaucedo May 2, 2020
845937f
Added gitignore in s2i python folder
axsaucedo May 2, 2020
3bc2be3
Updated and documented helm charts
axsaucedo May 2, 2020
5480795
Added updated examples with simple insights
axsaucedo May 2, 2020
7bf1235
Updated readme
axsaucedo May 2, 2020
c2dd321
Added further work
axsaucedo May 12, 2020
6967a24
Added changes to storagepy
axsaucedo May 18, 2020
c43d250
Updated batch process and storage to upload file
axsaucedo May 18, 2020
8a47df5
Added update to use pvc instead of param
axsaucedo Jun 3, 2020
834373e
Updated readme to use argo workflows
axsaucedo Jun 4, 2020
1f6300e
Merge remote-tracking branch 'upstream/master' into 1391_batch_proces…
axsaucedo Jun 4, 2020
97c0f30
Updated python installer
axsaucedo Jun 4, 2020
513d575
Updated chart to latest
axsaucedo Jun 4, 2020
a73d5af
Updated processor to keep uid
axsaucedo Jun 4, 2020
2f5fa56
Added changes to batch scheduler
axsaucedo Jun 5, 2020
2e0c64a
Added refactored changes for batch processor component
axsaucedo Jun 5, 2020
f924862
Added further documentation
axsaucedo Jun 5, 2020
3fd9d7b
Updated helm chart to include benchmarking parameters
axsaucedo Jun 5, 2020
7ba99bd
Added e2e test for batch processor
axsaucedo Jun 5, 2020
fa517f6
Added testing gitignore
axsaucedo Jun 5, 2020
9915fdf
Removed sample files that are no longer used
axsaucedo Jun 5, 2020
9bd5707
Updated storagepy to reflect expected changes
axsaucedo Jun 5, 2020
5711352
Updated the location of the files
axsaucedo Jun 5, 2020
03df116
Added kubeflow pipelines example
axsaucedo Jun 5, 2020
bcda39f
Updated integration test which ensures all tests are successful!
axsaucedo Jun 5, 2020
94c2721
Updated kubeflow notebook
axsaucedo Jun 5, 2020
8f81c64
Updated testing log
axsaucedo Jun 5, 2020
d4771c7
Updated the pipeline to create seldondeployment with workflow id name
axsaucedo Jun 5, 2020
6588963
Added batch-wide ID to identify a specific batch and made it paramete…
axsaucedo Jun 6, 2020
797d0e9
Updated argo workflows parameters to align with patch params2
axsaucedo Jun 6, 2020
10e62c1
Updated kubeflow pipelines notebook to current final example
axsaucedo Jun 6, 2020
9fde697
Renamed folders for argo and kubeflow to reflect batch in name
axsaucedo Jun 6, 2020
359adfd
Added nblink for batch processing in notebooks
axsaucedo Jun 6, 2020
0c403e2
Added parameters to click function and documentation so it's created …
axsaucedo Jun 6, 2020
9e62a85
Added documentation for batch processor component
axsaucedo Jun 6, 2020
4848cb2
Added image for the knative streaming
axsaucedo Jun 6, 2020
babcf09
Added reference for stream processing knative eventing
axsaucedo Jun 6, 2020
74f4b02
Updated gitignore to point to the scripts testing folder
axsaucedo Jun 6, 2020
92a9ba9
Added further performance information on batch docs
axsaucedo Jun 6, 2020
6d30a5d
Updated to use minio notebook defaults
axsaucedo Jun 8, 2020
dae4216
Added mkdir for assets notebook
axsaucedo Jun 8, 2020
bbdfaf2
Updated index for individual batch processing
axsaucedo Jun 8, 2020
e96cc20
Added newline for index
axsaucedo Jun 8, 2020
0b0e95b
Removed top level header
axsaucedo Jun 8, 2020
cb00200
Fixed typos
axsaucedo Jun 8, 2020
e9e4527
Added images in the docs folder so they appear in docs
axsaucedo Jun 8, 2020
cab0ee2
Updated to add argo workflows installation and service rolebinding
axsaucedo Jun 8, 2020
98c2d5d
Added line that outlines issue with argo on kind
axsaucedo Jun 9, 2020
eac9649
Added note for fix for argo, and added command to support later argo …
axsaucedo Jun 9, 2020
3 changes: 3 additions & 0 deletions doc/source/examples/argo_workflows_batch.nblink
@@ -0,0 +1,3 @@
{
"path": "../../../examples/batch/argo-workflows-batch/README.ipynb"
}
Binary file added doc/source/examples/assets/kubeflow-pipeline.jpg
3 changes: 3 additions & 0 deletions doc/source/examples/kubeflow_pipelines_batch.nblink
@@ -0,0 +1,3 @@
{
"path": "../../../examples/batch/kubeflow-pipelines-batch/README.ipynb"
}
9 changes: 9 additions & 0 deletions doc/source/examples/notebooks.rst
@@ -71,6 +71,15 @@ Advanced Machine Learning Insights
Tabular, Text and Image Model Explainers <explainer_examples>
Outlier Detection on CIFAR10 <outlier_cifar10>

Batch Processing with Seldon Core
-----

.. toctree::
:titlesonly:

Batch Processing with Argo Workflows <argo_workflows_batch>
Batch Processing with Kubeflow Pipelines <kubeflow_pipelines_batch>


MLOps: Scaling and Monitoring and Observability
-----
Binary file added doc/source/images/batch-processor.jpg
Binary file added doc/source/images/batch-workflow-managers.jpg
Binary file added doc/source/images/stream-processing-knative.jpg
38 changes: 22 additions & 16 deletions doc/source/index.rst
@@ -64,20 +64,38 @@ Documentation Index

.. toctree::
:maxdepth: 1
:caption: Language Wrappers (Production)
:caption: Production

Supported API Protocols <graph/protocols.md>
CI/CD MLOps at Scale <analytics/cicd-mlops.md>
Metrics with Prometheus <analytics/analytics.md>
Payload Logging with ELK <analytics/logging.md>
Distributed Tracing with Jaeger <graph/distributed-tracing.md>
Replica Scaling <graph/scaling.md>
Custom Inference Servers <servers/custom.md>

.. toctree::
:maxdepth: 1
:caption: Batch Processing with Seldon

Overview of Batch Processing <servers/batch.md>

.. toctree::
:maxdepth: 1
:caption: Language Wrappers

Python Language Wrapper [Production] <python/index.rst>
Python Language Wrapper <python/index.rst>

.. toctree::
:maxdepth: 1
:caption: Incubating Projects

Java Language Wrapper [Incubating] <java/README.md>
Java Language Wrapper <java/README.md>
Metadata <reference/apis/metadata.md>
R Language Wrapper [ALPHA] <R/README.md>
NodeJS Language Wrapper [ALPHA] <nodejs/README.md>
Go Language Wrapper [ALPHA] <go/go_wrapper_link.rst>
Stream Processing with KNative <streaming/knative_eventing.md>
Metadata [Incubating] <reference/apis/metadata.md>

.. toctree::
:maxdepth: 1
@@ -86,18 +104,6 @@ Documentation Index
Ambassador Ingress <ingress/ambassador.md>
Istio Ingress <ingress/istio.md>

.. toctree::
:maxdepth: 1
:caption: Production

Supported API Protocols <graph/protocols.md>
CI/CD MLOps at Scale <analytics/cicd-mlops.md>
Metrics with Prometheus <analytics/analytics.md>
Payload Logging with ELK <analytics/logging.md>
Distributed Tracing with Jaeger <graph/distributed-tracing.md>
Replica Scaling <graph/scaling.md>
Custom Inference Servers <servers/custom.md>

.. toctree::
:maxdepth: 1
:caption: Advanced Inference
8 changes: 8 additions & 0 deletions doc/source/python/api/seldon_core.rst
@@ -24,6 +24,14 @@ seldon\_core.api\_tester module
:undoc-members:
:show-inheritance:

seldon\_core.batch\_processor module
------------------------------------

.. automodule:: seldon_core.batch_processor
:members:
:undoc-members:
:show-inheritance:

seldon\_core.flask\_utils module
--------------------------------

149 changes: 149 additions & 0 deletions doc/source/servers/batch.md
@@ -0,0 +1,149 @@
# Batch Processing with Seldon Core

Seldon Core provides a command line component that enables highly parallelizable batch processing against horizontally scalable Seldon Core model deployments running on Kubernetes.

For stream processing with Seldon Core please see [Stream Processing with KNative Eventing](../streaming/knative_eventing.md).

![](../images/batch-processor.jpg)

## Horizontally Scalable Workers and Replicas

The parallelizable batch processor workers enable high throughput by leveraging the horizontally scaled replicas and autoscaling of the Seldon Core deployment, giving users the flexibility to optimize their configuration as required.

The diagram below shows a standard workflow where data is downloaded from and uploaded to an object store, and where the Seldon model is created for the job and deleted once it finishes successfully.

![](../images/batch-workflow-manager-integration.jpg)

## Integration with ETL & Workflow Managers

The Seldon Batch component has been built to be modular and flexible so that it can be integrated with any workflow manager.

This allows you to use Seldon for a large number of batch applications, including jobs that run on a schedule (e.g. once a day or once a month) and jobs that are triggered programmatically.

![](../images/batch-workflow-managers.jpg)
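
As a rough sketch, a single workflow step (or scheduled job) that wraps the component typically boils down to three stages; the object-store commands (MinIO client `mc`), bucket names and deployment details below are illustrative assumptions rather than required conventions:

```bash
# Hypothetical batch job body as it might run inside one workflow step.
# 1. Pull the input data from an object store (MinIO client shown as an example).
mc cp minio/batch-data/input-data.txt assets/input-data.txt

# 2. Run the batch processor against an already deployed (and scaled) model.
seldon-batch-processor \
    --deployment-name sklearn \
    --namespace seldon \
    --workers 100 \
    --input-data-path assets/input-data.txt \
    --output-data-path assets/output-data.txt

# 3. Push the results back to the object store for downstream consumption.
mc cp assets/output-data.txt minio/batch-data/output-data.txt
```

The creation and deletion of the SeldonDeployment itself is usually handled by separate steps of the workflow manager, as in the Argo Workflows and Kubeflow Pipelines examples linked below.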

## Hands on Examples

We have provided a set of examples that show you how you can use the Seldon batch processing component:

* [Batch Processing with Argo Workflows](../examples/argo_workflows_batch.html)
* [Batch Processing with Kubeflow Pipelines Example](../examples/kubeflow_pipelines_batch.html)

## High Level Implementation Details

### CLI Parameters

To see each of the options available you can interact with the batch processor component as follows:

```bash
$ seldon-batch-processor --help

Usage: seldon-batch-processor [OPTIONS]

Command line interface for Seldon Batch Processor, which can be used to
send requests through configurable parallel workers to Seldon Core models.
It is recommended that the respective Seldon Core model is also optimized
with number of replicas to distribute and scale out the batch processing
work. The processor is able to process data from local filestore input
file in various formats supported by the SeldonClient module. It is also
suggested to use the batch processor component integrated with an ETL
Workflow Manager such as Kubeflow, Argo Pipelines, Airflow, etc. which
would allow for extra setup / teardown steps such as downloading the data
from object store or starting a seldon core model with replicas. See the
Seldon Core examples folder for implementations of this batch module with
Seldon Core.

Options:
-d, --deployment-name TEXT The name of the SeldonDeployment to send the
requests to [required]

-g, --gateway-type [ambassador|istio|seldon]
The gateway type for the seldon model, which
can be through the ingress provider
(istio/ambassador) or directly through the
service (seldon)

-n, --namespace TEXT The Kubernetes namespace where the
SeldonDeployment is deployed in

-h, --host TEXT The hostname for the seldon model to send
the request to, which can be the ingress of
the Seldon model or the service itself

-t, --transport [rest|grpc] The transport type of the SeldonDeployment
model which can be REST or GRPC

-a, --data-type [data|json|str]
Whether to use json, strData or Seldon Data
type for the payload to send to the
SeldonDeployment which aligns with the
SeldonClient format

-p, --payload-type [ndarray|tensor|tftensor]
The payload type expected by the
SeldonDeployment and hence the expected
format for the data in the input file which
can be an array

-w, --workers INTEGER The number of parallel request processor
workers to run for parallel processing

-r, --retries INTEGER The number of retries for each request
before marking an error

-i, --input-data-path PATH The local filestore path where the input
file with the data to process is located

-o, --output-data-path PATH The local filestore path where the output
file should be written with the outputs of
the batch processing

-m, --method [predict] The method of the SeldonDeployment to send
the request to which currently only supports
the predict method

-l, --log-level [debug|info|warning|error]
The log level for the batch processor
-b, --benchmark If true the batch processor will print the
elapsed time taken to run the process

-u, --batch-id TEXT Unique batch ID to identify all datapoints
processed in this batch, if not provided is
auto generated

--help Show this message and exit.

```
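
As an illustration, a minimal invocation might look like the following; the deployment name, namespace, host and file paths are placeholders that should be replaced with values from your own cluster, and each line of the input file is assumed to hold one payload in the chosen data/payload format:

```bash
# Hypothetical example: process a local file of ndarray payloads against a
# SeldonDeployment named "sklearn" exposed through Istio in the "seldon" namespace.
seldon-batch-processor \
    --deployment-name sklearn \
    --namespace seldon \
    --gateway-type istio \
    --host istio-ingressgateway.istio-system.svc.cluster.local:80 \
    --transport rest \
    --data-type data \
    --payload-type ndarray \
    --workers 100 \
    --retries 3 \
    --input-data-path assets/input-data.txt \
    --output-data-path assets/output-data.txt \
    --benchmark
```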

### Identifiers

Each data point that is sent to the Seldon Core model contains the following identifiers in the request metadata:
* Batch ID - A unique identifier which can be provided through CLI or is automatically generated
* Batch Instance ID - A generated unique identifier for each datapoint processed
* Batch Index - The local ordered descending index for the datapoint relative to the input file location

These identifiers are added to each request as follows:

```
seldon_request = {
    <data>: <current_batch_instance>,
    "meta": {
        "tags": {
            "batch_id": <BATCH_ID>,
            "batch_instance_id": <BATCH_INSTANCE_ID>,
            "batch_index": <BATCH_INDEX>
        }
    }
}
```

This allows each response to be identified and matched back to its original request in the input data.
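
For example, assuming each line of the output file is the JSON response for one datapoint and that the response echoes these tags back in its `meta` (both assumptions for illustration; the exact output depends on your deployment), the batch index could be used to pair outputs with their input lines:

```bash
# Hypothetical pairing of responses with input lines via the batch_index tag.
# Assumes batch_index is the zero-based line offset within the input file.
while read -r response; do
  index=$(echo "$response" | jq -r '.meta.tags.batch_index')
  input=$(sed -n "$((index + 1))p" assets/input-data.txt)
  echo "input[$index]: $input"
done < assets/output-data.txt
```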

### Performance

The module is implemented using Python's threading system.

Benchmarking was carried out with the vanilla Python requests module to compare the performance of threading, Twisted and asyncio. The results showed better raw performance with asyncio; however, given that the worker logic is minimal (i.e. sending a request) and most of the time is spent waiting for the response, the implementation based on Python's native threading performs efficiently enough to scale easily to thousands of workers.

However, the current implementation uses the Seldon Client, which does not yet apply several optimizations that would further increase processing performance, such as re-using a requests session. Even without these optimisations the workers still reach highly concurrent performance, and these optimizations will be introduced as adoption of (and feedback on) this component grows.
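
To get a feel for throughput against a particular deployment, one simple approach is to rerun the processor with the `--benchmark` flag and different worker counts and compare the elapsed times it prints; the deployment name and file paths below are again placeholders:

```bash
# Hypothetical sweep over worker counts; --benchmark prints the elapsed time of each run.
for workers in 10 50 100 200; do
  echo "workers=${workers}"
  seldon-batch-processor \
      --deployment-name sklearn \
      --namespace seldon \
      --workers "${workers}" \
      --input-data-path assets/input-data.txt \
      --output-data-path "assets/output-data-${workers}.txt" \
      --benchmark
done
```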

2 changes: 2 additions & 0 deletions doc/source/streaming/knative_eventing.md
@@ -4,6 +4,8 @@ Seldon has an integration with KNative eventing that allows for real time proces

This allows Seldon Core users to connect SeldonDeployments through triggers that will receive any relevant Cloudevents.

![](../images/stream-processing-knative.jpg)

## Triggers

The way that KNative Eventing works is by creating triggers that send any relevant Cloudevents that match a specific setup into the relevant addressable location.
2 changes: 2 additions & 0 deletions examples/batch/argo-workflows-batch/.gitignore
@@ -0,0 +1,2 @@
assets/input-data.txt
assets/output-data.txt