Add ML Model Serving feature #619

Merged 2 commits on Nov 5, 2024
Changes from 1 commit
2 changes: 1 addition & 1 deletion spiceaidocs/docs/features/data-ingestion/index.md
@@ -2,7 +2,7 @@
title: 'Data Ingestion'
sidebar_label: 'Data Ingestion'
description: 'Learn how to ingest data in Spice.'
-sidebar_position: 7
+sidebar_position: 8
pagination_prev: null
pagination_next: null
---
44 changes: 44 additions & 0 deletions spiceaidocs/docs/features/ml-model-serving/index.md
@@ -0,0 +1,44 @@
---
title: 'ML Model Serving'
sidebar_label: 'ML Model Serving'
description: 'Learn how to load and serve machine learning models for inference using Spice.'
sidebar_position: 6
pagination_prev: null
pagination_next: null
---

Spice loads and serves ONNX models and GGUF LLMs for embeddings and inference. Models can come from several sources, including the local filesystem, Hugging Face, and the Spice Cloud platform.

Example `spicepod.yml` loading a GGUF LLM from Hugging Face:

```yaml
models:
  - name: llama_3.2_1B
    from: huggingface:huggingface.co/meta-llama/Llama-3.2-1B
    params:
      hf_token: ${ secrets:HF_TOKEN }
```

Example `spicepod.yml` loading an ONNX model from Hugging Face:

```yaml
models:
  - from: huggingface:huggingface.co/spiceai/darts:latest
    name: hf_model
    files:
      - path: model.onnx
    datasets:
      - taxi_trips
```

## Filesystem

Models can be hosted on a local filesystem and referenced directly in the configuration. For more details, see the [Filesystem Model Component](/components/models/filesystem.md).

## Hugging Face

Spice integrates with Hugging Face, enabling you to use a wide range of pre-trained models. For more information, see the [Hugging Face Model Component](/components/models/huggingface.md).

## Spice Cloud Platform

The Spice Cloud platform provides a scalable environment for training, hosting, and managing your models. For further details, see the [Spice Cloud Platform Model Component](/components/models/spice-cloud.md).
28 changes: 17 additions & 11 deletions spiceaidocs/docs/features/search/index.md
@@ -2,7 +2,7 @@
title: 'Search Functionality'
sidebar_label: 'Search'
description: 'Learn how Spice can search across datasets using database-native and vector-search methods.'
-sidebar_position: 6
+sidebar_position: 7
pagination_prev: null
pagination_next: null
---
@@ -113,7 +113,7 @@ curl -XPOST http://localhost:8090/v1/search \

Response:

-```json
+````json
{
"matches": [
{
@@ -137,30 +137,35 @@ Response:
],
"duration_ms": 45
}
-```
+````

### Pre-Existing Embeddings

Datasets that already include embeddings can utilize the same functionalities (e.g., vector search) as those augmented with embeddings using Spice. To ensure compatibility, these table columns must adhere to the following constraints:

1. **Underlying Column Presence:**
- - The underlying column must exist in the table, and be of `string` [Arrow data type](reference/datatypes.md) .
+
+ - The underlying column must exist in the table, and be of `string` [Arrow data type](reference/datatypes.md) .

2. **Embeddings Column Naming Convention:**
- - For each underlying column, the corresponding embeddings column must be named as `<column_name>_embedding`. For example, a `customer_reviews` table with a `review` column must have a `review_embedding` column.
+
+ - For each underlying column, the corresponding embeddings column must be named as `<column_name>_embedding`. For example, a `customer_reviews` table with a `review` column must have a `review_embedding` column.

3. **Embeddings Column Data Type:**
- - The embeddings column must have the following [Arrow data type](reference/datatypes.md) when loaded into Spice:
- 1. `FixedSizeList[Float32 or Float64, N]`, where `N` is the dimension (size) of the embedding vector. `FixedSizeList` is used for efficient storage and processing of fixed-size vectors.
- 2. If the column is [**chunked**](#chunking-support), use `List[FixedSizeList[Float32 or Float64, N]]`.
+
+ - The embeddings column must have the following [Arrow data type](reference/datatypes.md) when loaded into Spice:
+ 1. `FixedSizeList[Float32 or Float64, N]`, where `N` is the dimension (size) of the embedding vector. `FixedSizeList` is used for efficient storage and processing of fixed-size vectors.
+ 2. If the column is [**chunked**](#chunking-support), use `List[FixedSizeList[Float32 or Float64, N]]`.

4. **Offset Column for Chunked Data:**
- - If the underlying column is chunked, there must be an additional offset column named `<column_name>_offsets` with the following Arrow data type:
- 1. `List[FixedSizeList[Int32, 2]]`, where each element is a pair of integers `[start, end]` representing the start and end indices of the chunk in the underlying text column. This offset column maps each chunk in the embeddings back to the corresponding segment in the underlying text column.
- - *For instance, `[[0, 100], [101, 200]]` indicates two chunks covering indices 0–100 and 101–200, respectively.*
+ - If the underlying column is chunked, there must be an additional offset column named `<column_name>_offsets` with the following Arrow data type:
+ 1. `List[FixedSizeList[Int32, 2]]`, where each element is a pair of integers `[start, end]` representing the start and end indices of the chunk in the underlying text column. This offset column maps each chunk in the embeddings back to the corresponding segment in the underlying text column.
+ - _For instance, `[[0, 100], [101, 200]]` indicates two chunks covering indices 0–100 and 101–200, respectively._

By following these guidelines, you can ensure that your dataset with pre-existing embeddings is fully compatible with the vector search and other embedding functionalities provided by Spice.
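
As a rough illustration of the naming rules above, the following Python sketch lists the companion columns a table would need for a given text column and reports any that are absent. The helpers `required_columns` and `missing_columns` are hypothetical names introduced here for illustration and are not part of Spice. The sketch checks column names only, not Arrow data types.

```python
def required_columns(column, chunked=False):
    """Return the companion column names the naming convention calls for.

    Hypothetical helper, not a Spice API.
    """
    names = [f"{column}_embedding"]
    if chunked:
        # Chunked columns also need an offsets column.
        names.append(f"{column}_offsets")
    return names


def missing_columns(table_columns, column, chunked=False):
    """Return required names (including the underlying column) that are absent."""
    present = set(table_columns)
    needed = [column] + required_columns(column, chunked)
    return [name for name in needed if name not in present]
```

For the `customer_reviews` example, `missing_columns(["review", "review_embedding"], "review")` returns an empty list. With `chunked=True`, the same call would report `review_offsets` as missing.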

#### Example

A table `sales` with an `address` column and corresponding embedding column(s).

```shell
@@ -187,6 +192,7 @@ sql> describe sales;
```

The same table if it was chunked:

```shell
sql> describe sales;
+-------------------+-----------------------------------------+-------------+