From 45b0ff3893f3a2fd30093128a7bda5ba512ff568 Mon Sep 17 00:00:00 2001
From: Luke Kim <80174+lukekim@users.noreply.github.com>
Date: Tue, 4 Jun 2024 07:02:41 +0900
Subject: [PATCH] Machine Learning section improvements (#263)

---
 spiceaidocs/docs/machine-learning/index.md    | 13 ++--
 .../docs/machine-learning/inference/index.md  | 78 +++++++++++--------
 .../model-deployment/filesystem.md            | 17 ++++
 .../model-deployment/huggingface.md           | 18 +++--
 .../model-deployment/index.md                 | 30 +++----
 .../model-deployment/local.md                 | 16 ----
 .../model-deployment/spiceai.md               | 29 ++++---
 7 files changed, 118 insertions(+), 83 deletions(-)
 create mode 100644 spiceaidocs/docs/machine-learning/model-deployment/filesystem.md
 delete mode 100644 spiceaidocs/docs/machine-learning/model-deployment/local.md

diff --git a/spiceaidocs/docs/machine-learning/index.md b/spiceaidocs/docs/machine-learning/index.md
index 7e8c2dcd..4e901f8a 100644
--- a/spiceaidocs/docs/machine-learning/index.md
+++ b/spiceaidocs/docs/machine-learning/index.md
@@ -8,11 +8,14 @@ pagination_prev: null
 
 :::warning[Early Preview]
 
-The Spice ML runtime is in its early preview phase and is subject to modifications.
+Machine Learning (ML) is in preview and is subject to modifications.
 
 :::
 
-Machine learning models can be added to the Spice runtime similarly to datasets. The Spice runtime will load it, just like a dataset.
+ML models can be defined similarly to [Datasets](../reference/spicepod/datasets.md). The runtime will load the model for inference.
+
+Example:
+
 ```yaml
 name: my_spicepod
 version: v1beta1
 kind: Spicepod
@@ -33,6 +36,6 @@ datasets:
 - from: spice.ai/eth.recent_blocks
   name: eth_recent_blocks
   acceleration:
-    enabled: true
-    refresh_mode: append
-```
\ No newline at end of file
+    enabled: true
+    refresh_mode: append
+```
diff --git a/spiceaidocs/docs/machine-learning/inference/index.md b/spiceaidocs/docs/machine-learning/inference/index.md
index 3a695373..52d61892 100644
--- a/spiceaidocs/docs/machine-learning/inference/index.md
+++ b/spiceaidocs/docs/machine-learning/inference/index.md
@@ -1,6 +1,6 @@
 ---
-title: 'Machine Learning Inference'
-sidebar_label: 'Machine Learning Inference'
+title: 'Machine Learning Predictions'
+sidebar_label: 'Machine Learning Predictions'
 description: ''
 sidebar_position: 2
 pagination_prev: 'machine-learning/model-deployment/index'
@@ -10,17 +10,24 @@ pagination_next: null
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 
-The Spice ML runtime currently supports prediction via an API in the Spice runtime.
+Spice includes dedicated prediction APIs.
+
+## GET `/v1/models/:name/predict`
+
+Make a prediction using a specific [deployed model](../model-deployment/index.md).
+
+Example:
 
-### GET `/v1/models/:name/predict`
 ```shell
 curl "http://localhost:3000/v1/models/my_model_name/predict"
 ```
-Where:
- - `name`: References the name provided in the `spicepod.yaml`.
+Parameters:
+
+- `name`: References the model name defined in the `spicepod.yaml`.
+
+### Response
 
-#### Response
 
 ```json
 
@@ -58,8 +65,12 @@ Where:
 
-### POST `/v1/predict`
-It's also possible to run multiple prediction models in parallel, useful for ensembling or A/B testing.
+## POST `/v1/predict`
+
+Make predictions using all loaded forecasting models in parallel, useful for ensembling or A/B testing.
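For programmatic use, the request body for `POST /v1/predict` can be built and the response unpacked with a few helpers. A minimal Python sketch; the field names (`predictions`, `model_name`, `status`, `prediction`) are taken from the curl example and sample response in this section, and the helper names are illustrative:

```python
import json

def build_predict_request(model_names):
    # Request body shape for POST /v1/predict, per the curl example.
    return {"predictions": [{"model_name": n} for n in model_names]}

def successful_predictions(response):
    # Map model_name -> prediction for entries whose status is "Success".
    return {
        p["model_name"]: p["prediction"]
        for p in response.get("predictions", [])
        if p.get("status") == "Success"
    }

payload = json.dumps(build_predict_request(["drive_stats_a", "drive_stats_b"]))
```

`payload` can then be sent to `http://localhost:3000/v1/predict` with any HTTP client.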
+
+Example:
+
 ```shell
 curl --request POST \
   --url http://localhost:3000/v1/predict \
   --data '{
     "predictions": [
       {
         "model_name": "drive_stats_a"
       },
       {
         "model_name": "drive_stats_b"
       }
     ]
   }'
 ```
-Where:
- - Each `model_name` provided references a model `name` in the Spicepod.
-####
+Parameters:
+
+- `model_name`: References a model name defined in the `spicepod.yaml`.
+
 ```json
 {
-  "duration_ms": 81,
-  "predictions": [{
-    "status": "Success",
-    "model_name": "drive_stats_a",
-    "model_version": "1.0",
-    "lookback": 30,
-    "prediction": [0.45, 0.5, 0.55],
-    "duration_ms": 42
-  }, {
-    "status": "Success",
-    "model_name": "drive_stats_b",
-    "model_version": "1.0",
-    "lookback": 30,
-    "prediction": [0.43, 0.51, 0.53],
-    "duration_ms": 42
-  }]
+  "duration_ms": 81,
+  "predictions": [
+    {
+      "status": "Success",
+      "model_name": "drive_stats_a",
+      "model_version": "1.0",
+      "lookback": 30,
+      "prediction": [0.45, 0.5, 0.55],
+      "duration_ms": 42
+    },
+    {
+      "status": "Success",
+      "model_name": "drive_stats_b",
+      "model_version": "1.0",
+      "lookback": 30,
+      "prediction": [0.43, 0.51, 0.53],
+      "duration_ms": 42
+    }
+  ]
 }
 ```
 
 :::warning[Limitations]
-- Univariate predictions only
-- Multiple covariates
+
+- Univariate predictions only.
+- Multiple covariates.
 - Covariate and output variate must have a fixed time frequency.
 - No support for discrete or exogenous variables.
-:::
\ No newline at end of file
+:::
diff --git a/spiceaidocs/docs/machine-learning/model-deployment/filesystem.md b/spiceaidocs/docs/machine-learning/model-deployment/filesystem.md
new file mode 100644
index 00000000..98474d2a
--- /dev/null
+++ b/spiceaidocs/docs/machine-learning/model-deployment/filesystem.md
@@ -0,0 +1,17 @@
+---
+title: 'Filesystem'
+sidebar_label: 'Filesystem'
+sidebar_position: 3
+---
+
+To use a model hosted on a filesystem, specify the file path in `from`.
+
+Example:
+
+```yaml
+models:
+  - from: file://absolute/path/to/my/model.onnx
+    name: local_fs_model
+    datasets:
+      - taxi_trips
+```
diff --git a/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md b/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md
index 6d83a818..d82ce690 100644
--- a/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md
+++ b/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md
@@ -1,12 +1,13 @@
 ---
-title: "Huggingface"
-sidebar_label: "Huggingface"
+title: 'HuggingFace'
+sidebar_label: 'HuggingFace'
 sidebar_position: 1
 ---
 
-To define a model component from HuggingFace, specify it in the `from` key.
+To use a model hosted on HuggingFace, specify the `huggingface.co` path in the `from` key.
 
 ### Example
+
 ```yaml
 models:
   - from: huggingface:huggingface.co/spiceai/darts:latest
@@ -18,22 +19,27 @@ models:
 ```
 
 ### `from` Format
+
 The `from` key follows the following regex format:
+
 ```regex
 \A(huggingface:)(huggingface\.co\/)?(?<org>[\w\-]+)\/(?<model>[\w\-]+)(:(?<revision>[\w\d\-\.]+))?\z
 ```
+
 #### Examples
+
 - `huggingface:username/modelname`: Implies the latest version of `modelname` hosted by `username`.
 - `huggingface:huggingface.co/username/modelname:revision`: Specifies a particular `revision` of `modelname` by `username`, including the optional domain.
 
 #### Specification
+
 1. **Prefix:** The value must start with `huggingface:`.
-2. **Domain (Optional):** Optionally includes `huggingface.co/` immediately after the prefix. Currently no other Huggingface compatible services are supported.
+2. **Domain (Optional):** Optionally includes `huggingface.co/` immediately after the prefix. Currently no other HuggingFace-compatible services are supported.
 3. **Organization/User:** The HuggingFace organisation (`org`).
 4. **Model Name:** After a `/`, the model name (`model`).
 5. **Revision (Optional):** A colon (`:`) followed by the git-like revision identifier (`revision`).
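The `from` format above can be checked programmatically. A sketch in Python; the docs' `\A…\z` anchors and `(?<name>…)` groups are translated here to `re.fullmatch` and `(?P<name>…)`, and defaulting a missing revision to `latest` follows the "implies the latest version" rule above:

```python
import re

# Python translation of the HuggingFace `from` regex.
HF_FROM = re.compile(
    r"(huggingface:)(huggingface\.co/)?"
    r"(?P<org>[\w\-]+)/(?P<model>[\w\-]+)(:(?P<revision>[\w\d\-\.]+))?"
)

def parse_hf_from(value):
    # Return (org, model, revision) or None if the value is malformed.
    m = HF_FROM.fullmatch(value)
    if m is None:
        return None
    return m["org"], m["model"], m["revision"] or "latest"
```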
 
 :::warning[Limitations]
+
 - ONNX format support only
-:::
\ No newline at end of file
+:::
diff --git a/spiceaidocs/docs/machine-learning/model-deployment/index.md b/spiceaidocs/docs/machine-learning/model-deployment/index.md
index d4d9a3b5..256f28cd 100644
--- a/spiceaidocs/docs/machine-learning/model-deployment/index.md
+++ b/spiceaidocs/docs/machine-learning/model-deployment/index.md
@@ -1,28 +1,30 @@
 ---
-title: 'ML Model Deployment'
-sidebar_label: 'ML Model Deployment'
+title: 'Model Deployment'
+sidebar_label: 'Model Deployment'
 description: ''
 sidebar_position: 1
 pagination_next: 'machine-learning/inference/index'
 ---
 
-Models can be loaded from a variety of sources:
-- Local filesystem: Local ONNX files.
-- HuggingFace: Models Hosted on HuggingFace.
-- SpiceAI: Models trained on the Spice.AI Cloud Platform
-
-A model component, within a Spicepod, has the following format.
+Models can be loaded from:
+
+- **Filesystem**: [ONNX](https://onnx.ai) models.
+- **HuggingFace**: ONNX and [GGUF](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md) models hosted on [HuggingFace](https://huggingface.co).
+- **Spice Cloud Platform**: Models hosted on the [Spice Cloud Platform](https://docs.spice.ai).
+
+Defined in the `spicepod.yml`, a `model` component has the following format.
 
 | field             | Description                                                         |
-| ----------------- | ------------------------------------------------------------------- |
-| `name`            | Unique, readable name for the model within the Spicepod.            |
-| `from`            | Source-specific address to uniquely identify a model                |
-| `datasets`        | Datasets that the model depends on for inference                    |
-| `files` (HF only) | Specify an individual file within the HuggingFace repository to use |
-
+| ----------------- | ------------------------------------------------------------------- |
+| `name`            | Unique, readable name for the model within the Spicepod.            |
+| `from`            | Source-specific address to uniquely identify a model                |
+| `datasets`        | Datasets that the model depends on for inference                    |
+| `files` (HF only) | Specify an individual file within the HuggingFace repository to use |
+
+For more detail, refer to the `model` [reference specification](../../reference/spicepod/models.md).
+
 ## Model Source Docs
 
 import DocCardList from '@theme/DocCardList';
 
-<DocCardList />
\ No newline at end of file
+<DocCardList />
diff --git a/spiceaidocs/docs/machine-learning/model-deployment/local.md b/spiceaidocs/docs/machine-learning/model-deployment/local.md
deleted file mode 100644
index ddc46faa..00000000
--- a/spiceaidocs/docs/machine-learning/model-deployment/local.md
+++ /dev/null
@@ -1,16 +0,0 @@
----
-title: "Local"
-sidebar_label: "Local"
-sidebar_position: 3
----
-
-Local models can be used by specifying the file's path in `from` key.
-
-### Example
-```yaml
-models:
-  - from: file:/absolute/path/to/my/model.onnx
-    name: local_model
-    datasets:
-      - taxi_trips
-```
\ No newline at end of file
diff --git a/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md b/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md
index a4dcc5b0..790976c7 100644
--- a/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md
+++ b/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md
@@ -1,11 +1,13 @@
 ---
-title: "SpiceAI"
-sidebar_label: "SpiceAI"
+title: 'Spice Cloud Platform'
+sidebar_label: 'Spice Cloud Platform'
 sidebar_position: 2
 ---
 
-### Example
-To run a model trained on the Spice.AI platform, specify it in the `from` key.
+To use a model hosted on the Spice Cloud Platform, specify the `spice.ai` path in the `from` key.
+
+Example:
+
 ```yaml
 models:
   - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats
     name: drive_stats
@@ -14,33 +16,38 @@ models:
     - drive_stats_inferencing
 ```
 
-This configuration allows for specifying models hosted by Spice AI, including their versions or specific training run IDs.
+Specific versions can be used by referencing a version label or Training Run ID.
+
 ```yaml
 models:
-  - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:latest # Git-like tagging
+  - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:latest # Label
     name: drive_stats_a
     datasets:
       - drive_stats_inferencing
-  - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf # Specific training run ID
+  - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf # Training Run ID
     name: drive_stats_b
     datasets:
       - drive_stats_inferencing
 ```
 
 ### `from` Format
+
 The from key must conform to the following regex format:
+
 ```regex
 \A(?:spice\.ai\/)?(?<org>[\w\-]+)\/(?<app>[\w\-]+)(?:\/models)?\/(?<model>[\w\-]+):(?<version>[\w\d\-\.]+)\z
 ```
 
-#### Examples
+Examples:
+
 - `spice.ai/lukekim/smart/models/drive_stats:latest`: Refers to the latest version of the drive_stats model in the smart application by the user or organization lukekim.
 - `spice.ai/lukekim/smart/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf`: Specifies a model with a unique training run ID.
 
-#### Specification
+### Specification
+
 1. **Prefix (Optional):** The value must start with `spice.ai/`.
 1. **Organization/User:** The name of the organization or user (`org`) hosting the model.
 1. **Application Name**: The name of the application (`app`) which the model belongs to.
-4. **Model Name:** The name of the model (`model`).
-5. **Version (Optional):** A colon (`:`) followed by the version identifier (`version`), which could be a semantic version, `latest` for the most recent version, or a specific training run ID.
\ No newline at end of file
+1. **Model Name:** The name of the model (`model`).
+1. **Version (Optional):** A colon (`:`) followed by the version identifier (`version`), which could be a semantic version, `latest` for the most recent version, or a specific training run ID.
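The `spice.ai` `from` format can be validated the same way. A Python sketch, with `\A…\z` and `(?<name>…)` translated to Python's `re` syntax; note that, as written, the regex requires the `:version` suffix even though the specification lists the version as optional:

```python
import re

# Python translation of the Spice Cloud Platform `from` regex.
SPICE_FROM = re.compile(
    r"(?:spice\.ai/)?(?P<org>[\w\-]+)/(?P<app>[\w\-]+)"
    r"(?:/models)?/(?P<model>[\w\-]+):(?P<version>[\w\d\-\.]+)"
)

def parse_spice_from(value):
    # Return the org/app/model/version fields, or None if the value is malformed.
    m = SPICE_FROM.fullmatch(value)
    return m.groupdict() if m else None
```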