ML docs #159

Merged
merged 6 commits into from
Mar 27, 2024
25 changes: 25 additions & 0 deletions spiceaidocs/docs/machine-learning/index.md
@@ -10,3 +10,28 @@ sidebar_position: 8
The Spice ML runtime is in its early preview phase and is subject to modifications.

:::

Machine learning models can be added to the Spice runtime in the same way as datasets. The runtime loads each model defined in the Spicepod, just as it loads a dataset.
```yaml
name: my_spicepod
version: v1beta1
kind: Spicepod

models:
  - from: file:/model_path.onnx
    name: my_model_name
    datasets:
      - my_inference_view

datasets:
  - from: localhost
    name: my_inference_view
    sql_ref: inference.sql

  # All your other datasets
  - from: spice.ai/eth.recent_blocks
    name: eth_recent_blocks
    acceleration:
      enabled: true
      refresh_mode: append
```
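The Spicepod above links a model to its datasets by name. As a rough illustration of that relationship, the sketch below mirrors the Spicepod as a plain Python dictionary and reports any dataset a model lists that the Spicepod does not define. The function name `missing_model_datasets` is hypothetical, and the check itself is an assumption about how the references are meant to line up, not a description of what the Spice runtime does.

```python
# Hypothetical in-memory mirror of the Spicepod above (not loaded from disk).
spicepod = {
    "name": "my_spicepod",
    "version": "v1beta1",
    "kind": "Spicepod",
    "models": [
        {
            "from": "file:/model_path.onnx",
            "name": "my_model_name",
            "datasets": ["my_inference_view"],
        }
    ],
    "datasets": [
        {"from": "localhost", "name": "my_inference_view", "sql_ref": "inference.sql"},
        {
            "from": "spice.ai/eth.recent_blocks",
            "name": "eth_recent_blocks",
            "acceleration": {"enabled": True, "refresh_mode": "append"},
        },
    ],
}


def missing_model_datasets(pod: dict) -> dict[str, list[str]]:
    """Map each model name to the dataset names it lists but the pod does not define."""
    defined = {d["name"] for d in pod.get("datasets", [])}
    missing = {}
    for model in pod.get("models", []):
        absent = [n for n in model.get("datasets", []) if n not in defined]
        if absent:
            missing[model["name"]] = absent
    return missing


print(missing_model_datasets(spicepod))  # prints {}
```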
98 changes: 98 additions & 0 deletions spiceaidocs/docs/machine-learning/inference/index.md
@@ -4,3 +4,101 @@ sidebar_label: 'Machine Learning Inference'
description: ''
sidebar_position: 2
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

The Spice ML runtime currently supports prediction through an HTTP API exposed by the Spice runtime.

### GET `/v1/models/:name/predict`
```shell
curl "http://localhost:3000/v1/models/my_model_name/predict"
```
Where:
- `name`: The `name` of the model as defined in `spicepod.yaml`.


#### Response
<Tabs>
<TabItem value="Success" label="Success" default>
```json
{
  "status": "Success",
  "model_name": "my_model_name",
  "model_version": "1.0",
  "lookback": 30,
  "prediction": [0.45, 0.50, 0.55],
  "duration_ms": 123
}
```
</TabItem>
<TabItem value="Bad Request" label="Bad Request">
```json
{
  "status": "BadRequest",
  "error_message": "You gave me a bad request :(",
  "model_name": "my_model_name",
  "lookback": 30,
  "duration_ms": 12
}
```
</TabItem>
<TabItem value="Internal Error" label="Internal Error">
```json
{
  "status": "InternalError",
  "error_message": "Oops, the server couldn't predict",
  "model_name": "my_model_name",
  "lookback": 30,
  "duration_ms": 12
}
```
</TabItem>
</Tabs>
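
The request and the three response shapes above can be handled from a client in a few lines. The following is a minimal sketch using only the Python standard library. The helper names (`predict_url`, `parse_prediction`, `predict`) are hypothetical, and it assumes that error responses carry a JSON body shaped like the Bad Request and Internal Error tabs; the HTTP status codes the runtime returns are not documented here.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:3000"  # address used in the curl example above


def predict_url(model_name: str, base_url: str = BASE_URL) -> str:
    """Build the GET prediction URL for a model defined in the Spicepod."""
    return f"{base_url}/v1/models/{urllib.parse.quote(model_name, safe='')}/predict"


def parse_prediction(body: str) -> list[float]:
    """Return the prediction values, or raise if the status is not Success."""
    payload = json.loads(body)
    if payload.get("status") != "Success":
        raise RuntimeError(f"{payload.get('status')}: {payload.get('error_message')}")
    return payload["prediction"]


def predict(model_name: str) -> list[float]:
    """Call the endpoint; requires a running Spice runtime."""
    try:
        with urllib.request.urlopen(predict_url(model_name)) as resp:
            body = resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Assumption: error responses still carry a JSON body like the tabs above.
        body = err.read().decode("utf-8")
    return parse_prediction(body)
```

With a runtime listening on `localhost:3000`, `predict("my_model_name")` would return the `prediction` array or raise a `RuntimeError` carrying the `status` and `error_message`.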

### POST `/v1/predict`
Multiple models can also be run in parallel in a single request, which is useful for ensembling or A/B testing.
```shell
curl --request POST \
  --url http://localhost:3000/v1/predict \
  --data '{
    "predictions": [
      {
        "model_name": "drive_stats_a"
      },
      {
        "model_name": "drive_stats_b"
      }
    ]
  }'
```
Where:
- Each `model_name` provided references a model `name` in the Spicepod.

#### Response
```json
{
  "duration_ms": 81,
  "predictions": [{
    "status": "Success",
    "model_name": "drive_stats_a",
    "model_version": "1.0",
    "lookback": 30,
    "prediction": [0.45, 0.5, 0.55],
    "duration_ms": 42
  }, {
    "status": "Success",
    "model_name": "drive_stats_b",
    "model_version": "1.0",
    "lookback": 30,
    "prediction": [0.43, 0.51, 0.53],
    "duration_ms": 42
  }]
}
```
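
To illustrate the ensembling use case, the sketch below builds the request body from a list of model names and combines the successful predictions with an element-wise mean. Averaging is just one possible way to ensemble and is an assumption here, not something the runtime performs. The helper names are hypothetical, and the `Content-Type` header is also an assumption, since the curl example does not set one.

```python
import json
import urllib.request


def build_request(model_names: list[str]) -> bytes:
    """Encode the request body shown in the curl example."""
    body = {"predictions": [{"model_name": n} for n in model_names]}
    return json.dumps(body).encode("utf-8")


def ensemble_mean(response: dict) -> list[float]:
    """Element-wise mean over the predictions whose status is Success."""
    series = [
        p["prediction"]
        for p in response["predictions"]
        if p.get("status") == "Success"
    ]
    if not series:
        raise ValueError("no successful predictions to combine")
    if len({len(s) for s in series}) != 1:
        raise ValueError("predictions have different lengths")
    return [sum(vals) / len(vals) for vals in zip(*series)]


def predict_many(model_names: list[str], url: str = "http://localhost:3000/v1/predict") -> dict:
    """POST the request; requires a running Spice runtime."""
    req = urllib.request.Request(
        url,
        data=build_request(model_names),
        method="POST",
        headers={"Content-Type": "application/json"},  # assumption, see lead-in
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For A/B testing, the per-model entries in `predictions` can be compared directly instead of averaged.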

## Limitations
- Univariate predictions only
- Multiple covariates
- Covariate and output variate must have a fixed time frequency.
- No support for discrete or exogenous variables.
@@ -2,4 +2,37 @@
title: "Huggingface"
sidebar_label: "Huggingface"
sidebar_position: 1
---
---

To define a model component from HuggingFace, specify it in the `from` key.

### Example
```yaml
models:
  - from: huggingface:huggingface.co/spiceai/darts:latest
    name: hf_model
    files:
      - model.onnx
    datasets:
      - taxi_trips
```

### `from` Format
The `from` key must match the following regular expression:
```regex
\A(huggingface:)(huggingface\.co\/)?(?<org>[\w\-]+)\/(?<model>[\w\-]+)(:(?<revision>[\w\d\-\.]+))?\z
```
#### Examples
- `huggingface:username/modelname`: Implies the latest version of `modelname` hosted by `username`.
- `huggingface:huggingface.co/username/modelname:revision`: Specifies a particular `revision` of `modelname` by `username`, including the optional domain.

#### Specification
1. **Prefix:** The value must start with `huggingface:`.
2. **Domain (Optional):** May include `huggingface.co/` immediately after the prefix. No other HuggingFace-compatible services are currently supported.
3. **Organization/User:** The HuggingFace organization or user (`org`).
4. **Model Name:** After a `/`, the model name (`model`).
5. **Revision (Optional):** A colon (`:`) followed by the git-like revision identifier (`revision`).
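
The specification above can be tried out with a direct translation of the regular expression into Python syntax. This is only a sketch: Python writes named groups as `(?P<name>...)` and uses `\Z` for the end-of-string anchor that the original writes as `\z`, and the function name `parse_hf_from` is hypothetical.

```python
import re

# Python translation of the `from` regex above.
HF_FROM = re.compile(
    r"\A(huggingface:)(huggingface\.co/)?"
    r"(?P<org>[\w\-]+)/(?P<model>[\w\-]+)"
    r"(:(?P<revision>[\w\d\-.]+))?\Z"
)


def parse_hf_from(value: str) -> dict:
    """Split a HuggingFace `from` value into org, model and optional revision."""
    match = HF_FROM.match(value)
    if match is None:
        raise ValueError(f"not a valid HuggingFace model reference: {value!r}")
    return match.groupdict()  # revision is None when omitted


print(parse_hf_from("huggingface:huggingface.co/spiceai/darts:latest"))
# prints {'org': 'spiceai', 'model': 'darts', 'revision': 'latest'}
```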


### Limitations
- ONNX format support only
16 changes: 16 additions & 0 deletions spiceaidocs/docs/machine-learning/model-deployment/index.md
@@ -4,3 +4,19 @@ sidebar_label: 'ML Model Deployment'
description: ''
sidebar_position: 1
---

Models can be loaded from a variety of sources:
- Local filesystem: Local ONNX files.
- HuggingFace: Models hosted on HuggingFace.
- SpiceAI: Models trained on the Spice.AI Cloud Platform.

A model component within a Spicepod has the following fields:


| Field                      | Description                                                  |
| -------------------------- | ------------------------------------------------------------ |
| `name`                     | Unique, readable name for the model within the Spicepod.     |
| `from`                     | Source-specific address that uniquely identifies a model.    |
| `datasets`                 | Datasets that the model depends on for inference.            |
| `files` (HuggingFace only) | Individual file within the HuggingFace repository to use.    |
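
As a way to picture these fields together, here is a small sketch of a model component as a Python dataclass. The class, its method names, and the prefix-based source detection are all hypothetical. In particular, treating any value that does not start with `file:` or `huggingface:` as a Spice.AI model is an assumption made for illustration, not documented runtime behavior.

```python
from dataclasses import dataclass, field


@dataclass
class ModelComponent:
    """Hypothetical container for the four fields in the table above."""

    name: str
    from_: str  # `from` is a Python keyword, hence the trailing underscore
    datasets: list[str] = field(default_factory=list)
    files: list[str] = field(default_factory=list)

    def source(self) -> str:
        """Guess the source from the prefix of `from` (an assumption, see lead-in)."""
        if self.from_.startswith("file:"):
            return "local"
        if self.from_.startswith("huggingface:"):
            return "huggingface"
        return "spiceai"

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means none were found."""
        issues = []
        if not self.name:
            issues.append("name must not be empty")
        if self.files and self.source() != "huggingface":
            issues.append("files is only meaningful for HuggingFace models")
        return issues
```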

16 changes: 16 additions & 0 deletions spiceaidocs/docs/machine-learning/model-deployment/local.md
@@ -0,0 +1,16 @@
---
title: "Local"
sidebar_label: "Local"
sidebar_position: 3
---

Local models can be used by specifying the file's path in the `from` key.

### Example
```yaml
models:
  - from: file:/absolute/path/to/my/model.onnx
    name: local_model
    datasets:
      - taxi_trips
```
43 changes: 42 additions & 1 deletion spiceaidocs/docs/machine-learning/model-deployment/spiceai.md
@@ -2,4 +2,45 @@
title: "SpiceAI"
sidebar_label: "SpiceAI"
sidebar_position: 2
---
---

### Example
To run a model trained on the Spice.AI platform, specify it in the `from` key.
```yaml
models:
  - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats
    name: drive_stats
    datasets:
      - drive_stats_inferencing
```

The `from` value can also pin a model hosted by Spice AI to a particular version or to a specific training run ID.
```yaml
models:
  - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:latest # Git-like tagging
    name: drive_stats_a
    datasets:
      - drive_stats_inferencing

  - from: spice.ai/taxi_tech_co/taxi_drives/models/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf # Specific training run ID
    name: drive_stats_b
    datasets:
      - drive_stats_inferencing
```

### `from` Format
The `from` key must match the following regular expression:
```regex
\A(?:spice\.ai\/)?(?<org>[\w\-]+)\/(?<app>[\w\-]+)(?:\/models)?\/(?<model>[\w\-]+)(?::(?<version>[\w\d\-\.]+))?\z
```

#### Examples
- `spice.ai/lukekim/smart/models/drive_stats:latest`: Refers to the latest version of the `drive_stats` model in the `smart` application owned by the user or organization `lukekim`.
- `spice.ai/lukekim/smart/drive_stats:60cb80a2-d59b-45c4-9b68-0946303bdcaf`: Refers to the `drive_stats` model from a specific training run ID, omitting the optional `models/` segment.

#### Specification
1. **Prefix (Optional):** The value may start with `spice.ai/`.
2. **Organization/User:** The name of the organization or user (`org`) hosting the model.
3. **Application Name:** The name of the application (`app`) that the model belongs to.
4. **Model Name:** The name of the model (`model`), optionally preceded by a `models/` path segment.
5. **Version (Optional):** A colon (`:`) followed by the version identifier (`version`), which could be a semantic version, `latest` for the most recent version, or a specific training run ID.
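
The specification above can be exercised with a Python translation of the regular expression. This is a sketch under two stated assumptions: the function name `parse_spice_from` is hypothetical, and the version group is treated as optional, as item 5 of the specification and the first example on this page describe. Python writes named groups as `(?P<name>...)` and uses `\Z` where the original uses `\z`.

```python
import re

# Python translation of the `from` regex; the version group is optional here
# (an assumption based on the specification, see the lead-in).
SPICE_FROM = re.compile(
    r"\A(?:spice\.ai/)?"
    r"(?P<org>[\w\-]+)/(?P<app>[\w\-]+)"
    r"(?:/models)?/(?P<model>[\w\-]+)"
    r"(?::(?P<version>[\w\d\-.]+))?\Z"
)


def parse_spice_from(value: str) -> dict:
    """Split a Spice.AI `from` value into org, app, model and optional version."""
    match = SPICE_FROM.match(value)
    if match is None:
        raise ValueError(f"not a valid Spice.AI model reference: {value!r}")
    return match.groupdict()  # version is None when omitted


print(parse_spice_from("spice.ai/lukekim/smart/models/drive_stats:latest"))
# prints {'org': 'lukekim', 'app': 'smart', 'model': 'drive_stats', 'version': 'latest'}
```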
1 change: 1 addition & 0 deletions spiceaidocs/docusaurus.config.ts
@@ -137,6 +137,7 @@ const config: Config = {
prism: {
theme: prismThemes.github,
darkTheme: prismThemes.dracula,
additionalLanguages: ['bash', 'json'],
},
algolia: {
appId: '0SP8I8JTL8',