From 45b0ff3893f3a2fd30093128a7bda5ba512ff568 Mon Sep 17 00:00:00 2001
From: Luke Kim <80174+lukekim@users.noreply.github.com>
Date: Tue, 4 Jun 2024 07:02:41 +0900
Subject: [PATCH] Machine Learning section improvements (#263)
---
spiceaidocs/docs/machine-learning/index.md | 13 ++--
.../docs/machine-learning/inference/index.md | 78 +++++++++++--------
.../model-deployment/filesystem.md | 17 ++++
.../model-deployment/huggingface.md | 18 +++--
.../model-deployment/index.md | 30 +++----
.../model-deployment/local.md | 16 ----
.../model-deployment/spiceai.md | 29 ++++---
7 files changed, 118 insertions(+), 83 deletions(-)
create mode 100644 spiceaidocs/docs/machine-learning/model-deployment/filesystem.md
delete mode 100644 spiceaidocs/docs/machine-learning/model-deployment/local.md
diff --git a/spiceaidocs/docs/machine-learning/index.md b/spiceaidocs/docs/machine-learning/index.md
index 7e8c2dcd..4e901f8a 100644
--- a/spiceaidocs/docs/machine-learning/index.md
+++ b/spiceaidocs/docs/machine-learning/index.md
@@ -8,11 +8,14 @@ pagination_prev: null
:::warning[Early Preview]
-The Spice ML runtime is in its early preview phase and is subject to modifications.
+Machine Learning (ML) support is in preview and subject to change.
:::
-Machine learning models can be added to the Spice runtime similarly to datasets. The Spice runtime will load it, just like a dataset.
+ML models can be defined similarly to [Datasets](../reference/spicepod/datasets.md). The runtime will load the model for inference.
+
+Example:
+
```yaml
name: my_spicepod
version: v1beta1
@@ -33,6 +36,6 @@ datasets:
- from: spice.ai/eth.recent_blocks
name: eth_recent_blocks
acceleration:
- enabled: true
- refresh_mode: append
-```
\ No newline at end of file
+ enabled: true
+ refresh_mode: append
+```
diff --git a/spiceaidocs/docs/machine-learning/inference/index.md b/spiceaidocs/docs/machine-learning/inference/index.md
index 3a695373..52d61892 100644
--- a/spiceaidocs/docs/machine-learning/inference/index.md
+++ b/spiceaidocs/docs/machine-learning/inference/index.md
@@ -1,6 +1,6 @@
---
-title: 'Machine Learning Inference'
-sidebar_label: 'Machine Learning Inference'
+title: 'Machine Learning Predictions'
+sidebar_label: 'Machine Learning Predictions'
description: ''
sidebar_position: 2
pagination_prev: 'machine-learning/model-deployment/index'
@@ -10,17 +10,24 @@ pagination_next: null
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
-The Spice ML runtime currently supports prediction via an API in the Spice runtime.
+Spice includes dedicated prediction APIs.
+
+## GET `/v1/models/:name/predict`
+
+Make a prediction using a specific [deployed model](../model-deployment/index.md).
+
+Example:
-### GET `/v1/models/:name/predict`
```shell
curl "http://localhost:3000/v1/models/my_model_name/predict"
```
-Where:
- - `name`: References the name provided in the `spicepod.yaml`.
+Parameters:
+
+- `name`: References the model name defined in the `spicepod.yaml`.
+
+### Response
-#### Response
```json
@@ -58,8 +65,12 @@ Where:
-### POST `/v1/predict`
-It's also possible to run multiple prediction models in parallel, useful for ensembling or A/B testing.
+## POST `/v1/predict`
+
+Make predictions using all loaded forecasting models in parallel, useful for ensembling or A/B testing.
+
+Example:
+
```shell
curl --request POST \
--url http://localhost:3000/v1/predict \
@@ -74,34 +85,39 @@ curl --request POST \
]
}'
```
-Where:
- - Each `model_name` provided references a model `name` in the Spicepod.
-####
+Parameters:
+
+- `model_name`: References a model name defined in the `spicepod.yaml`.
+
+### Response
+
```json
{
- "duration_ms": 81,
- "predictions": [{
- "status": "Success",
- "model_name": "drive_stats_a",
- "model_version": "1.0",
- "lookback": 30,
- "prediction": [0.45, 0.5, 0.55],
- "duration_ms": 42
- }, {
- "status": "Success",
- "model_name": "drive_stats_b",
- "model_version": "1.0",
- "lookback": 30,
- "prediction": [0.43, 0.51, 0.53],
- "duration_ms": 42
- }]
+ "duration_ms": 81,
+ "predictions": [
+ {
+ "status": "Success",
+ "model_name": "drive_stats_a",
+ "model_version": "1.0",
+ "lookback": 30,
+ "prediction": [0.45, 0.5, 0.55],
+ "duration_ms": 42
+ },
+ {
+ "status": "Success",
+ "model_name": "drive_stats_b",
+ "model_version": "1.0",
+ "lookback": 30,
+ "prediction": [0.43, 0.51, 0.53],
+ "duration_ms": 42
+ }
+ ]
}
```
:::warning[Limitations]
-- Univariate predictions only
-- Multiple covariates
+
+- Univariate predictions only.
+- Multiple covariates.
- Covariate and output variate must have a fixed time frequency.
- No support for discrete or exogenous variables.
-:::
\ No newline at end of file
+:::
diff --git a/spiceaidocs/docs/machine-learning/model-deployment/filesystem.md b/spiceaidocs/docs/machine-learning/model-deployment/filesystem.md
new file mode 100644
index 00000000..98474d2a
--- /dev/null
+++ b/spiceaidocs/docs/machine-learning/model-deployment/filesystem.md
@@ -0,0 +1,17 @@
+---
+title: 'Filesystem'
+sidebar_label: 'Filesystem'
+sidebar_position: 3
+---
+
+To use a model hosted on a filesystem, specify the file path in the `from` key.
+
+Example:
+
+```yaml
+models:
+ - from: file://absolute/path/to/my/model.onnx
+ name: local_fs_model
+ datasets:
+ - taxi_trips
+```
diff --git a/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md b/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md
index 6d83a818..d82ce690 100644
--- a/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md
+++ b/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md
@@ -1,12 +1,13 @@
---
-title: "Huggingface"
-sidebar_label: "Huggingface"
+title: 'HuggingFace'
+sidebar_label: 'HuggingFace'
sidebar_position: 1
---
-To define a model component from HuggingFace, specify it in the `from` key.
+To use a model hosted on HuggingFace, specify the `huggingface.co` path in the `from` key.
### Example
+
```yaml
models:
- from: huggingface:huggingface.co/spiceai/darts:latest
@@ -18,22 +19,27 @@ models:
```
### `from` Format
+
The `from` key must conform to the following regex format:
+
```regex
\A(huggingface:)(huggingface\.co\/)?(?<org>[\w\-]+)\/(?<model>[\w\-]+)(:(?<revision>[\w\d\-\.]+))?\z
```
+
#### Examples
+
- `huggingface:username/modelname`: Implies the latest version of `modelname` hosted by `username`.
- `huggingface:huggingface.co/username/modelname:revision`: Specifies a particular `revision` of `modelname` by `username`, including the optional domain.
#### Specification
+
1. **Prefix:** The value must start with `huggingface:`.
-2. **Domain (Optional):** Optionally includes `huggingface.co/` immediately after the prefix. Currently no other Huggingface compatible services are supported.
+2. **Domain (Optional):** Optionally includes `huggingface.co/` immediately after the prefix. Currently no other HuggingFace-compatible services are supported.
3. **Organization/User:** The HuggingFace organisation (`org`).
4. **Model Name:** After a `/`, the model name (`model`).
5. **Revision (Optional):** A colon (`:`) followed by the git-like revision identifier (`revision`).
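As a sanity check, the format can be exercised with Python's `re` module. This is a sketch; the pattern mirrors the regex above with the `org`, `model`, and `revision` group names written in Python's `(?P<name>...)` syntax:

```python
import re

# The `from` format above, with named groups org/model/revision.
HF_FROM = re.compile(
    r"\A(huggingface:)(huggingface\.co/)?"
    r"(?P<org>[\w\-]+)/(?P<model>[\w\-]+)"
    r"(:(?P<revision>[\w\d\-.]+))?\Z"
)

m = HF_FROM.match("huggingface:huggingface.co/spiceai/darts:latest")
print(m.group("org"), m.group("model"), m.group("revision"))  # spiceai darts latest

# Both the domain and the revision are optional.
assert HF_FROM.match("huggingface:username/modelname")
```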
-
:::warning[Limitations]
+
- ONNX format support only
-:::
\ No newline at end of file
+:::
diff --git a/spiceaidocs/docs/machine-learning/model-deployment/index.md b/spiceaidocs/docs/machine-learning/model-deployment/index.md
index d4d9a3b5..256f28cd 100644
--- a/spiceaidocs/docs/machine-learning/model-deployment/index.md
+++ b/spiceaidocs/docs/machine-learning/model-deployment/index.md
@@ -1,28 +1,30 @@
---
-title: 'ML Model Deployment'
-sidebar_label: 'ML Model Deployment'
+title: 'Model Deployment'
+sidebar_label: 'Model Deployment'
description: ''
sidebar_position: 1
pagination_next: 'machine-learning/inference/index'
---
-Models can be loaded from a variety of sources:
-- Local filesystem: Local ONNX files.
-- HuggingFace: Models Hosted on HuggingFace.
-- SpiceAI: Models trained on the Spice.AI Cloud Platform
+Models can be loaded from:
-A model component, within a Spicepod, has the following format.
+- **Filesystem**: [ONNX](https://onnx.ai) models.
+- **HuggingFace**: ONNX and [GGUF](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md) models hosted on [HuggingFace](https://huggingface.co).
+- **Spice Cloud Platform**: Models hosted on the [Spice Cloud Platform](https://docs.spice.ai).
+
+Defined in the `spicepod.yaml`, a `model` component has the following format.
+
| field | Description |
-| ----------------- | ------------------------------------------------------------------- |
-| `name` | Unique, readable name for the model within the Spicepod. |
-| `from` | Source-specific address to uniquely identify a model |
-| `datasets` | Datasets that the model depends on for inference |
-| `files` (HF only) | Specify an individual file within the HuggingFace repository to use |
-
+| ----------------- | ------------------------------------------------------------------- |
+| `name` | Unique, readable name for the model within the Spicepod. |
+| `from` | Source-specific address to uniquely identify a model |
+| `datasets` | Datasets that the model depends on for inference |
+| `files` (HF only) | Specify an individual file within the HuggingFace repository to use |
+
+For more detail, refer to the `model` [reference specification](../../reference/spicepod/models.md).
+
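For example, a minimal `model` component using these fields (the HuggingFace model and dataset name are illustrative):

```yaml
models:
  - from: huggingface:huggingface.co/spiceai/darts:latest
    name: hf_model
    datasets:
      - taxi_trips
```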
## Model Source Docs
import DocCardList from '@theme/DocCardList';
-
\ No newline at end of file
+
+<DocCardList />
diff --git a/spiceaidocs/docs/machine-learning/model-deployment/local.md b/spiceaidocs/docs/machine-learning/model-deployment/local.md
deleted file mode 100644
index ddc46faa..00000000
--- a/spiceaidocs/docs/machine-learning/model-deployment/local.md
+++ /dev/null
@@ -1,16 +0,0 @@
----
-title: "Local"
-sidebar_label: "Local"
-sidebar_position: 3
----
-
-Local models can be used by specifying the file's path in `from` key.
-
-### Example
-```yaml
-models:
- - from: file:/absolute/path/to/my/model.onnx
- name: local_model
- datasets:
- - taxi_trips
-```
\ No newline at end of file
diff --git a/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md b/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md
index a4dcc5b0..790976c7 100644
--- a/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md
+++ b/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md
@@ -1,11 +1,13 @@
---
-title: "SpiceAI"
-sidebar_label: "SpiceAI"
+title: 'Spice Cloud Platform'
+sidebar_label: 'Spice Cloud Platform'
sidebar_position: 2
---
-### Example
-To run a model trained on the Spice.AI platform, specify it in the `from` key.
+To use a model hosted on the Spice Cloud Platform, specify the `spice.ai` path in the `from` key.
+
+Example:
+
```yaml
models:
- from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats
@@ -14,33 +16,38 @@ models:
- drive_stats_inferencing
```
-This configuration allows for specifying models hosted by Spice AI, including their versions or specific training run IDs.
+Specific versions can be used by referencing a version label or Training Run ID.
+
```yaml
models:
- - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:latest # Git-like tagging
+ - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:latest # Label
name: drive_stats_a
datasets:
- drive_stats_inferencing
- - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf # Specific training run ID
+ - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf # Training Run ID
name: drive_stats_b
datasets:
- drive_stats_inferencing
```
### `from` Format
+
The `from` key must conform to the following regex format:
+
```regex
\A(?:spice\.ai\/)?(?<org>[\w\-]+)\/(?<app>[\w\-]+)(?:\/models)?\/(?<model>[\w\-]+):(?<version>[\w\d\-\.]+)\z
```
-#### Examples
+Examples:
+
- `spice.ai/lukekim/smart/models/drive_stats:latest`: Refers to the latest version of the `drive_stats` model in the `smart` application by the user or organization `lukekim`.
- `spice.ai/lukekim/smart/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf`: Specifies a model with a unique training run ID.
-#### Specification
+### Specification
+
1. **Prefix (Optional):** The value may start with `spice.ai/`.
1. **Organization/User:** The name of the organization or user (`org`) hosting the model.
1. **Application Name**: The name of the application (`app`) which the model belongs to.
-4. **Model Name:** The name of the model (`model`).
-5. **Version (Optional):** A colon (`:`) followed by the version identifier (`version`), which could be a semantic version, `latest` for the most recent version, or a specific training run ID.
\ No newline at end of file
+1. **Model Name:** The name of the model (`model`).
+1. **Version (Optional):** A colon (`:`) followed by the version identifier (`version`), which could be a semantic version, `latest` for the most recent version, or a specific training run ID.
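As with the HuggingFace format, the pattern can be sanity-checked with Python's `re`; a sketch, with the `org`, `app`, `model`, and `version` group names written in Python's `(?P<name>...)` syntax:

```python
import re

# The `from` format above, with named groups org/app/model/version.
SPICE_FROM = re.compile(
    r"\A(?:spice\.ai/)?(?P<org>[\w\-]+)/(?P<app>[\w\-]+)"
    r"(?:/models)?/(?P<model>[\w\-]+):(?P<version>[\w\d\-.]+)\Z"
)

m = SPICE_FROM.match("spice.ai/lukekim/smart/models/drive_stats:latest")
print(m.group("org"), m.group("app"), m.group("model"), m.group("version"))
# lukekim smart drive_stats latest

# The `spice.ai/` prefix and the `/models` path segment are both optional.
assert SPICE_FROM.match("lukekim/smart/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf")
```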