diff --git a/docs/assets/feast_model_inference_architecture.png b/docs/assets/feast_model_inference_architecture.png
new file mode 100644
index 0000000000..3ea4fba4d0
Binary files /dev/null and b/docs/assets/feast_model_inference_architecture.png differ
diff --git a/docs/getting-started/architecture/model-inference.md b/docs/getting-started/architecture/model-inference.md
index 3a061603c1..582657dbc4 100644
--- a/docs/getting-started/architecture/model-inference.md
+++ b/docs/getting-started/architecture/model-inference.md
@@ -1,5 +1,13 @@
 # Feature Serving and Model Inference
 
+![](../../assets/feast_model_inference_architecture.png)
+
+
+{% hint style="info" %}
+**Note:** This ML infrastructure diagram highlights an orchestration pattern driven by a client application.
+This is not the only possible approach, and different patterns result in different trade-offs.
+{% endhint %}
+
 Production machine learning systems can choose from four approaches to serving machine learning predictions (the output of model inference):
 
 1. Online model inference with online features
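As a companion to the hint added in this hunk, here is a minimal sketch (not part of the diff itself) of the client-driven orchestration pattern it describes: the client application fetches online features from Feast and forwards them to a separately hosted model server. The feature view `driver_hourly_stats`, the entity key `driver_id`, and the `MODEL_ENDPOINT` URL are illustrative assumptions, not names defined in the Feast docs.

```python
# Sketch of the client-driven orchestration pattern: the client fetches online
# features from Feast, then calls a separate model server for the prediction.
# The feature view, entity key, and endpoint below are hypothetical.
import requests

from feast import FeatureStore

MODEL_ENDPOINT = "http://model-server.internal/predict"  # hypothetical model server

store = FeatureStore(repo_path=".")  # points at a local Feast feature repository


def predict(driver_id: int) -> float:
    # 1. The client retrieves fresh feature values from the Feast online store.
    feature_vector = store.get_online_features(
        features=[
            "driver_hourly_stats:conv_rate",
            "driver_hourly_stats:acc_rate",
        ],
        entity_rows=[{"driver_id": driver_id}],
    ).to_dict()

    # 2. The client forwards the feature vector to the model server and returns
    #    the prediction, keeping orchestration logic in the client application.
    response = requests.post(MODEL_ENDPOINT, json=feature_vector)
    response.raise_for_status()
    return response.json()["prediction"]
```

Under this pattern the model server stays feature-agnostic, while feature-retrieval latency and failure handling live in the calling application; the other serving approaches listed on the updated page shift those responsibilities elsewhere.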