From 8bbe314472f251bd64df16b2087d1d6ae62e6f89 Mon Sep 17 00:00:00 2001 From: jeadie Date: Thu, 28 Mar 2024 07:46:06 +1100 Subject: [PATCH 1/6] draft of ML docs --- spiceaidocs/docs/machine-learning/index.md | 25 ++++++++ .../docs/machine-learning/inference/index.md | 60 +++++++++++++++++++ .../model-deployment/huggingface.md | 35 ++++++++++- .../model-deployment/index.md | 16 +++++ .../model-deployment/local.md | 16 +++++ .../model-deployment/spiceai.md | 43 ++++++++++++- 6 files changed, 193 insertions(+), 2 deletions(-) create mode 100644 spiceaidocs/docs/machine-learning/model-deployment/local.md diff --git a/spiceaidocs/docs/machine-learning/index.md b/spiceaidocs/docs/machine-learning/index.md index 5273497bc..8af9cf643 100644 --- a/spiceaidocs/docs/machine-learning/index.md +++ b/spiceaidocs/docs/machine-learning/index.md @@ -10,3 +10,28 @@ sidebar_position: 8 The Spice ML runtime is in its early preview phase and is subject to modifications. ::: + +Machine learning models can be added to the Spice runtime similarily to datasets. The Spice runtime will load it, just like a dataset. 
+```yaml +name: my_spicepod +version: v1beta1 +kind: Spicepod + +models: + - from: file:/model_path.onnx + name: my_model_name + datasets: + - my_inference_view + +datasets: + - from: localhost + name: my_inference_view + sql_ref: inference.sql + + # All your other datasets + - from: spice.ai/eth.recent_blocks + name: eth_recent_blocks + acceleration: + enabled: true + refresh_mode: append +``` \ No newline at end of file diff --git a/spiceaidocs/docs/machine-learning/inference/index.md b/spiceaidocs/docs/machine-learning/inference/index.md index fd562726f..cb537e680 100644 --- a/spiceaidocs/docs/machine-learning/inference/index.md +++ b/spiceaidocs/docs/machine-learning/inference/index.md @@ -4,3 +4,63 @@ sidebar_label: 'Machine Learning Inference' description: '' sidebar_position: 2 --- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +The Spice ML runtime currently supports prediction via an API in the runtime. +```shell +GET /v1/models/:name/predict +``` +Where + - `name`: references the name provided in the `spicepod.yaml`. + + +### Examples +```shell +curl "http://localhost:3000/v1/models/my_model_name/predict" +``` + +#### Response + + + ```json + { + "status": "Success", + "model_name": "my_model_name", + "model_version": "1.0", + "lookback": 30, + "prediction": [0.45, 0.50, 0.55], + "duration_ms": 123 + } + ``` + + + ```json + { + "status": "BadRequest", + "error_message": "You have me a bad request :(", + "model_name": "my_model_name", + "lookback": 30, + "duration_ms": 12 + } + ``` + + + ```json + { + "status": "InternalError", + "error_message": "Oops, the server couldn't predict", + "model_name": "my_model_name", + "lookback": 30, + "duration_ms": 12 + } + ``` + + + +### Limitations +- Univariate predictions only +- Multiple covariates +- Covariate and output variate must have a fixed time frequency. +- No support for discrete or exogenous variables. 
\ No newline at end of file diff --git a/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md b/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md index bb7f4c9f7..c7ae54ae6 100644 --- a/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md +++ b/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md @@ -2,4 +2,37 @@ title: "Huggingface" sidebar_label: "Huggingface" sidebar_position: 1 ---- \ No newline at end of file +--- + +To define a model component from HuggingFace, specify it in the `from` key. + +### Example +```yaml +models: + - from: huggingface:huggingface.co/spiceai/darts:latest + name: hf_model + files: + - model.onnx + datasets: + - taxi_trips +``` + +### `from` Format +The `from` key follows the following regex format: +``` +\A(huggingface:)(huggingface\.co\/)?(?<org>[\w\-]+)\/(?<model>[\w\-]+)(:(?<revision>[\w\d\-\.]+))?\z +``` +#### Examples +- `huggingface:username/modelname`: Implies the latest version of `modelname` hosted by `username`. +- `huggingface:huggingface.co/username/modelname:revision`: Specifies a particular `revision` of `modelname` by `username`, including the optional domain. + +#### Specification +1. **Prefix:** The value must start with `huggingface:`. +2. **Domain (Optional):** Optionally includes `huggingface.co/` immediately after the prefix. Currently no other Huggingface compatible services are supported. +3. **Organization/User:** The HuggingFace organisation (`org`). +4. **Model Name:** After a `/`, the model name (`model`). +5. **Revision (Optional):** A colon (`:`) followed by the git-like revision identifier (`revision`). + + +### Limitations +- Supports only ONNX format files. 
\ No newline at end of file diff --git a/spiceaidocs/docs/machine-learning/model-deployment/index.md b/spiceaidocs/docs/machine-learning/model-deployment/index.md index 954698fbb..8027c4c9b 100644 --- a/spiceaidocs/docs/machine-learning/model-deployment/index.md +++ b/spiceaidocs/docs/machine-learning/model-deployment/index.md @@ -4,3 +4,19 @@ sidebar_label: 'ML Model Deployment' description: '' sidebar_position: 1 --- + +Models can be loaded from a variety of sources: +- Local filesystem: Local ONNX files. +- HuggingFace: Models Hosted on HuggingFace. +- SpiceAI: Models trained on the Spice.AI Cloud Platform + +A model component, within a spicepod, has the following format. + + +| field | Description | +| ----------------- | ------------------------------------------------------------------- | +| `name` | Unique, readable name for the model within the Spicepod. | +| `from` | Provider specific address to uniquely identify a model | +| `datasets` | Datasets that the model depends on for inference | +| `files` (HF only) | Specify an individual file within the HuggingFace repository to use | + \ No newline at end of file diff --git a/spiceaidocs/docs/machine-learning/model-deployment/local.md b/spiceaidocs/docs/machine-learning/model-deployment/local.md new file mode 100644 index 000000000..ddc46faad --- /dev/null +++ b/spiceaidocs/docs/machine-learning/model-deployment/local.md @@ -0,0 +1,16 @@ +--- +title: "Local" +sidebar_label: "Local" +sidebar_position: 3 +--- + +Local models can be used by specifying the file's path in `from` key. 
+ +### Example +```yaml +models: + - from: file:/absolute/path/to/my/model.onnx + name: local_model + datasets: + - taxi_trips +``` \ No newline at end of file diff --git a/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md b/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md index c878c3bf2..63d1cfb17 100644 --- a/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md +++ b/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md @@ -2,4 +2,45 @@ title: "SpiceAI" sidebar_label: "SpiceAI" sidebar_position: 2 ---- \ No newline at end of file +--- + +### Example +To run a model trained on the Spice.AI platform, specifiy it in the from key () from HuggingFace, specify it in the `from` key. +```yaml +models: + - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats + name: drive_stats + datasets: + - drive_stats_inferencing +``` + +This configuration allows for specifying models hosted by Spice AI, including their versions or specific training run IDs. +```yaml +models: + - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:latest # Git-like tagging + name: drive_stats_a + datasets: + - drive_stats_inferencing + + - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf # Specific training run ID + name: drive_stats_b + datasets: + - drive_stats_inferencing +``` + +### `from` Format +The from key must conform to the following regex format: +```regex +\A(?:spice\.ai\/)?(?<org>[\w\-]+)\/(?<app>[\w\-]+)(?:\/models)?\/(?<model>[\w\-]+):(?<version>[\w\d\-\.]+)\z +``` + +#### Examples +- `spice.ai/lukekim/smart/models/drive_stats:latest`: Refers to the latest version of the drive_stats model in the smart application by the user or organization lukekim. +- `spice.ai/lukekim/smart/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf`: Specifies a model with a unique training run ID, bypassing the /models path for conciseness. + +#### Specification +1. **Prefix (Optional):** The value must start with `spice.ai/`. +1. 
**Organization/User:** The name of the organization or user (`org`) hosting the model. +1. **Application Name**: The name of the application (`app`) which the model belongs to. +4. **Model Name:** The name of the model (`model`). +5. **Version (Optional):** A colon (`:`) followed by the version identifier (`version`), which could be a semantic version, `latest` for the most recent version, or a specific training run ID. \ No newline at end of file From 592e781975ae325fb769d9c10c43c2a35f9e4c8c Mon Sep 17 00:00:00 2001 From: jeadie Date: Thu, 28 Mar 2024 08:07:40 +1100 Subject: [PATCH 2/6] improvements and POST /predict --- .../docs/machine-learning/inference/index.md | 68 +++++++++++++++---- .../model-deployment/huggingface.md | 4 +- .../model-deployment/index.md | 4 +- .../model-deployment/spiceai.md | 2 +- 4 files changed, 58 insertions(+), 20 deletions(-) diff --git a/spiceaidocs/docs/machine-learning/inference/index.md b/spiceaidocs/docs/machine-learning/inference/index.md index cb537e680..97d45c543 100644 --- a/spiceaidocs/docs/machine-learning/inference/index.md +++ b/spiceaidocs/docs/machine-learning/inference/index.md @@ -8,30 +8,27 @@ sidebar_position: 2 import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; -The Spice ML runtime currently supports prediction via an API in the runtime. -```shell -GET /v1/models/:name/predict -``` -Where - - `name`: references the name provided in the `spicepod.yaml`. +The Spice ML runtime currently supports prediction via an API in the Spice runtime. - -### Examples +### GET `/v1/models/:name/predict` ```shell curl "http://localhost:3000/v1/models/my_model_name/predict" ``` +Where: + - `name`: References the name provided in the `spicepod.yaml`. 
+ #### Response ```json { - "status": "Success", - "model_name": "my_model_name", - "model_version": "1.0", - "lookback": 30, - "prediction": [0.45, 0.50, 0.55], - "duration_ms": 123 + "status": "Success", + "model_name": "my_model_name", + "model_version": "1.0", + "lookback": 30, + "prediction": [0.45, 0.50, 0.55], + "duration_ms": 123 } ``` @@ -59,7 +56,48 @@ curl "http://localhost:3000/v1/models/my_model_name/predict" -### Limitations +### POST `/v1/predict` +It's also possible to run multiple prediction models in parallel, useful for ensembling or A/B testing. +```shell +curl --request POST \ + --url http://localhost:3000/v1/predict \ + --data '{ + "predictions": [ + { + "model_name": "drive_stats_a" + }, + { + "model_name": "drive_stats_b" + } + ] +}' +``` +Where: + - Each `model_name` provided references a model `name` in the Spicepod. + +#### +```json +{ + "duration_ms": 81, + "predictions": [{ + "status": "Success", + "model_name": "drive_stats_a", + "model_version": "1.0", + "lookback": 30, + "prediction": [0.45, 0.5, 0.55], + "duration_ms": 42 + }, { + "status": "Success", + "model_name": "drive_stats_b", + "model_version": "1.0", + "lookback": 30, + "prediction": [0.43, 0.51, 0.53], + "duration_ms": 42 + }] +} +``` + +## Limitations - Univariate predictions only - Multiple covariates - Covariate and output variate must have a fixed time frequency. 
diff --git a/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md b/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md index c7ae54ae6..15078095f 100644 --- a/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md +++ b/spiceaidocs/docs/machine-learning/model-deployment/huggingface.md @@ -19,7 +19,7 @@ models: ### `from` Format The `from` key follows the following regex format: -``` +```regex \A(huggingface:)(huggingface\.co\/)?(?<org>[\w\-]+)\/(?<model>[\w\-]+)(:(?<revision>[\w\d\-\.]+))?\z ``` #### Examples @@ -35,4 +35,4 @@ The `from` key follows the following regex format: ### Limitations -- Supports only ONNX format files. \ No newline at end of file +- ONNX format support only \ No newline at end of file diff --git a/spiceaidocs/docs/machine-learning/model-deployment/index.md b/spiceaidocs/docs/machine-learning/model-deployment/index.md index 8027c4c9b..6fbd18952 100644 --- a/spiceaidocs/docs/machine-learning/model-deployment/index.md +++ b/spiceaidocs/docs/machine-learning/model-deployment/index.md @@ -10,13 +10,13 @@ Models can be loaded from a variety of sources: - HuggingFace: Models Hosted on HuggingFace. - SpiceAI: Models trained on the Spice.AI Cloud Platform -A model component, within a spicepod, has the following format. +A model component, within a Spicepod, has the following format. | field | Description | | ----------------- | ------------------------------------------------------------------- | | `name` | Unique, readable name for the model within the Spicepod. 
| -| `from` | Provider specific address to uniquely identify a model | +| `from` | Source-specific address to uniquely identify a model | | `datasets` | Datasets that the model depends on for inference | | `files` (HF only) | Specify an individual file within the HuggingFace repository to use | \ No newline at end of file diff --git a/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md b/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md index 63d1cfb17..c32109df4 100644 --- a/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md +++ b/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md @@ -5,7 +5,7 @@ sidebar_position: 2 --- ### Example -To run a model trained on the Spice.AI platform, specifiy it in the from key () from HuggingFace, specify it in the `from` key. +To run a model trained on the Spice.AI platform, specifiy it in the `from` key. ```yaml models: - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats From aed71da6b0529bfed61f68b11b35eb454118ca7e Mon Sep 17 00:00:00 2001 From: jeadie Date: Thu, 28 Mar 2024 08:14:20 +1100 Subject: [PATCH 3/6] code formatting for bash --- spiceaidocs/docusaurus.config.ts | 1 + 1 file changed, 1 insertion(+) diff --git a/spiceaidocs/docusaurus.config.ts b/spiceaidocs/docusaurus.config.ts index e8c31eb5e..3445f8be1 100644 --- a/spiceaidocs/docusaurus.config.ts +++ b/spiceaidocs/docusaurus.config.ts @@ -137,6 +137,7 @@ const config: Config = { prism: { theme: prismThemes.github, darkTheme: prismThemes.dracula, + additionalLanguages: ['bash'], }, algolia: { appId: '0SP8I8JTL8', From a2e0535dfff1722747c7380d7bb82d71e573cad0 Mon Sep 17 00:00:00 2001 From: jeadie Date: Thu, 28 Mar 2024 08:14:57 +1100 Subject: [PATCH 4/6] and JSON --- spiceaidocs/docusaurus.config.ts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spiceaidocs/docusaurus.config.ts b/spiceaidocs/docusaurus.config.ts index 3445f8be1..5fb78aae7 100644 --- a/spiceaidocs/docusaurus.config.ts +++ 
b/spiceaidocs/docusaurus.config.ts @@ -137,7 +137,7 @@ const config: Config = { prism: { theme: prismThemes.github, darkTheme: prismThemes.dracula, - additionalLanguages: ['bash'], + additionalLanguages: ['bash', 'json'], }, algolia: { appId: '0SP8I8JTL8', From a1292a34caf218e817ff2dd06195a76d041a42bf Mon Sep 17 00:00:00 2001 From: Phillip LeBlanc Date: Thu, 28 Mar 2024 07:54:35 +0900 Subject: [PATCH 5/6] Apply suggestions from code review --- spiceaidocs/docs/machine-learning/model-deployment/spiceai.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md b/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md index c32109df4..a4dcc5b0b 100644 --- a/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md +++ b/spiceaidocs/docs/machine-learning/model-deployment/spiceai.md @@ -5,7 +5,7 @@ sidebar_position: 2 --- ### Example -To run a model trained on the Spice.AI platform, specifiy it in the `from` key. +To run a model trained on the Spice.AI platform, specify it in the `from` key. ```yaml models: - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats @@ -36,7 +36,7 @@ The from key must conform to the following regex format: #### Examples - `spice.ai/lukekim/smart/models/drive_stats:latest`: Refers to the latest version of the drive_stats model in the smart application by the user or organization lukekim. -- `spice.ai/lukekim/smart/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf`: Specifies a model with a unique training run ID, bypassing the /models path for conciseness. +- `spice.ai/lukekim/smart/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf`: Specifies a model with a unique training run ID. #### Specification 1. **Prefix (Optional):** The value must start with `spice.ai/`. 
From 14c2184b7504210a00faf43cc6851f4cd44babc3 Mon Sep 17 00:00:00 2001 From: Phillip LeBlanc Date: Thu, 28 Mar 2024 07:54:59 +0900 Subject: [PATCH 6/6] Update spiceaidocs/docs/machine-learning/index.md --- spiceaidocs/docs/machine-learning/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spiceaidocs/docs/machine-learning/index.md b/spiceaidocs/docs/machine-learning/index.md index 8af9cf643..dc8e22e07 100644 --- a/spiceaidocs/docs/machine-learning/index.md +++ b/spiceaidocs/docs/machine-learning/index.md @@ -11,7 +11,7 @@ The Spice ML runtime is in its early preview phase and is subject to modificatio ::: -Machine learning models can be added to the Spice runtime similarily to datasets. The Spice runtime will load it, just like a dataset. +Machine learning models can be added to the Spice runtime similarly to datasets. The Spice runtime will load it, just like a dataset. ```yaml name: my_spicepod version: v1beta1