From db16b7e01d8c47301fd9d5262023c71ae99686a9 Mon Sep 17 00:00:00 2001
From: Luke Kim <80174+lukekim@users.noreply.github.com>
Date: Fri, 8 Nov 2024 16:09:51 -0800
Subject: [PATCH 1/7] Improvements to Models section

---
 .../docs/components/models/anthropic.md       | 34 +++++++--------
 .../docs/components/models/filesystem.md      | 33 +++++++++++++--
 .../docs/components/models/huggingface.md     | 11 +++--
 spiceaidocs/docs/components/models/index.md   | 31 ++++++--------
 spiceaidocs/docs/components/models/openai.md  | 42 +++++++++++++++----
 spiceaidocs/docs/components/models/spiceai.md |  1 +
 6 files changed, 98 insertions(+), 54 deletions(-)

diff --git a/spiceaidocs/docs/components/models/anthropic.md b/spiceaidocs/docs/components/models/anthropic.md
index 23320f80..05fe6955 100644
--- a/spiceaidocs/docs/components/models/anthropic.md
+++ b/spiceaidocs/docs/components/models/anthropic.md
@@ -1,36 +1,30 @@
 ---
-title: 'Anthropic Language Models'
+title: 'Anthropic Models'
+description: 'Instructions for using language models hosted on Anthropic with Spice.'
 sidebar_label: 'Anthropic'
 sidebar_position: 5
 ---
 
-To use a language model hosted on Anthropic, specify `anthropic` in `from`.
+To use a language model hosted on Anthropic, specify `anthropic` in the `from` field.
 
-For a specific model, include it as the model ID in `from` (see example below). Defaults to `"claude-3-5-sonnet-20240620"`.
-These parameters are specific to Anthropic models:
+To use a specific model, include its model ID in the `from` field (see example below). If not specified, the default model is `"claude-3-5-sonnet-20240620"`.
 
-| Param | Description | Default |
-| ----- | ----------- | ------- |
-| `anthropic_api_key` | The Anthropic API key. | - |
-| `anthropic_auth_token` | The Anthropic auth token. | - |
-| `endpoint` | The Anthropic API base endpoint. | `https://api.anthropic.com/v1` |
+The following parameters are specific to Anthropic models:
 
-Example:
+| Parameter              | Description                      | Default                        |
+| ---------------------- | -------------------------------- | ------------------------------ |
+| `anthropic_api_key`    | The Anthropic API key.           | -                              |
+| `anthropic_auth_token` | The Anthropic auth token.        | -                              |
+| `endpoint`             | The Anthropic API base endpoint. | `https://api.anthropic.com/v1` |
+
+Example `spicepod.yml` configuration:
 
 ```yaml
 models:
-  - from: anthropic:claude-3-5-sonnet-20240620
+  - from: anthropic:claude-3-5-sonnet
     name: claude_3_5_sonnet
     params:
       anthropic_api_key: ${ secrets:SPICE_ANTHROPIC_API_KEY }
 ```
 
-## Supported Models
-
-- `claude-3-5-sonnet-20240620`
-- `claude-3-opus-20240229`
-- `claude-3-sonnet-20240229`
-- `claude-3-haiku-20240307`
-- `claude-2.1`
-- `claude-2.0`
-- `claude-instant-1.2`
+See [Anthropic Model Names](https://docs.anthropic.com/en/docs/about-claude/models#model-names) for a list of supported model names.
diff --git a/spiceaidocs/docs/components/models/filesystem.md b/spiceaidocs/docs/components/models/filesystem.md
index 70d88b57..9550580a 100644
--- a/spiceaidocs/docs/components/models/filesystem.md
+++ b/spiceaidocs/docs/components/models/filesystem.md
@@ -1,17 +1,42 @@
 ---
 title: 'Filesystem'
+description: 'Instructions for using machine learning models hosted on a filesystem with Spice.'
 sidebar_label: 'Filesystem'
 sidebar_position: 3
 ---
 
-To use a ML model hosted on a filesystem, specify the file path in `from`.
+To use a model hosted on a filesystem, specify the path to the model file in `from`.
 
-Example:
+Supported formats include ONNX for traditional machine learning models and GGUF, GGML, and SafeTensor for large language models (LLMs).
+
+### Example: Loading an ONNX Model
 
 ```yaml
 models:
   - from: file://absolute/path/to/my/model.onnx
     name: local_fs_model
-    datasets:
-      - taxi_trips
 ```
+
+### Example: Loading a GGUF Model
+
+```yaml
+models:
+  - from: file://absolute/path/to/my/model.ggml
+    name: local_ggml_model
+```
+
+### Example: Loading a GGML Model
+
+```yaml
+models:
+  - from: file://absolute/path/to/my/model.ggml
+    name: local_ggml_model
+```
+
+### Example: Loading a SafeTensor Model
+
+```yaml
+models:
+  - from: file://absolute/path/to/my/model.safetensor
+    name: local_safetensor_model
+```
diff --git a/spiceaidocs/docs/components/models/huggingface.md b/spiceaidocs/docs/components/models/huggingface.md
index b99646f2..bfb6e446 100644
--- a/spiceaidocs/docs/components/models/huggingface.md
+++ b/spiceaidocs/docs/components/models/huggingface.md
@@ -1,12 +1,13 @@
 ---
 title: 'HuggingFace'
+description: 'Instructions for using machine learning models hosted on HuggingFace with Spice.'
 sidebar_label: 'HuggingFace'
 sidebar_position: 1
 ---
 
-To use a ML model hosted on HuggingFace, specify the `huggingface.co` path in `from` along with the files to include.
+To use a model hosted on HuggingFace, specify the `huggingface.co` path in the `from` field along with the files to include.
 
-Example:
+Example configuration:
 
 ```yaml
 models:
@@ -40,9 +41,11 @@ The `from` key follows the following regex format:
 5. **Revision (Optional):** A colon (`:`) followed by the git-like revision identifier (`revision`).
 
 ### Access Tokens
+
 Access tokens can be provided for Huggingface models in two ways:
- 1. In the Huggingface token cache (i.e. `~/.cache/huggingface/token`). Default.
- 1. Via model params (see below).
+
+1. In the Huggingface token cache (i.e. `~/.cache/huggingface/token`). Default.
+1. Via model params (see below).
 
 ```yaml
 models:
diff --git a/spiceaidocs/docs/components/models/index.md b/spiceaidocs/docs/components/models/index.md
index 150e017d..3c9fcadd 100644
--- a/spiceaidocs/docs/components/models/index.md
+++ b/spiceaidocs/docs/components/models/index.md
@@ -1,26 +1,21 @@
 ---
-title: 'AI/ML Models'
-sidebar_label: 'AI/ML Models'
-description: ''
+title: 'Model Providers'
+sidebar_label: 'Model Providers'
+description: 'Overview of supported model providers for ML and LLMs in Spice.'
 sidebar_position: 5
 ---
 
-Spice supports traditional machine learning (ML) models and language models (LLMs).
+Spice supports various model providers for traditional machine learning (ML) models and large language models (LLMs).
 
-- **Filesystem**: [ONNX](https://onnx.ai) models.
-- **HuggingFace**: ONNX models hosted on [HuggingFace](https://huggingface.co).
-- **Spice Cloud Platform**: Models hosted on the [Spice Cloud Platform](https://docs.spice.ai/building-blocks/spice-models).
-- **OpenAI**: OpenAI (or compatible) LLM endpoints.
+| Source | Description | ML Format(s) | LLM Format(s)\* |
+| ------------- | ----------------------------------------------------------------------------------------------- | ------------ | ---------------------- |
+| `file` | Local filesystem | ONNX | GGUF, GGML, SafeTensor |
+| `huggingface` | Models hosted on [HuggingFace](https://huggingface.co) | ONNX | GGUF, GGML, SafeTensor |
+| `spice.ai` | Models hosted on the [Spice Cloud Platform](https://docs.spice.ai/building-blocks/spice-models) | ONNX | - |
+| `openai` | OpenAI (or compatible) LLM endpoint | - | Remote HTTP endpoint |
+| `anthropic` | Models hosted on [Anthropic](https://www.anthropic.com) | - | Remote HTTP endpoint |
+| `grok` | Coming soon | - | Remote HTTP endpoint |
-
-### Model Sources
-
-| Name | Description | ML Format(s) | LLM Format(s)* |
-| ---------------------------- | ---------------- | ------------ | ----------------------- |
-| `file` | Local filesystem | ONNX | GGUF, GGML, SafeTensor |
-| `huggingface:huggingface.co` | Models hosted on [HuggingFace](https://huggingface.co) | ONNX | GGUF, GGML, SafeTensor |
-| `spice.ai` | Models hosted on the [Spice Cloud Platform](https://docs.spice.ai/building-blocks/spice-models) | ONNX | - |
-| `openai` | OpenAI (or compatible) LLM endpoint | - | Remote HTTP endpoint |
 
-* LLM Format(s) may require additional files (e.g. `tokenizer_config.json`).
+\* LLM Format(s) may require additional files (e.g. `tokenizer_config.json`).
 
 The model type is inferred based on the model source and files. For more detail, refer to the `model` [reference specification](/reference/spicepod/models.md).
diff --git a/spiceaidocs/docs/components/models/openai.md b/spiceaidocs/docs/components/models/openai.md
index 24bbf6ed..0ec53d19 100644
--- a/spiceaidocs/docs/components/models/openai.md
+++ b/spiceaidocs/docs/components/models/openai.md
@@ -1,21 +1,21 @@
 ---
 title: 'OpenAI (or Compatible) Language Models'
+description: 'Instructions for using language models hosted on OpenAI or compatible services with Spice.'
 sidebar_label: 'OpenAI'
 sidebar_position: 4
 ---
 
-To use a language model hosted on OpenAI (or compatible), specify the `openai` path in `from`.
+To use a language model hosted on OpenAI (or compatible), specify the `openai` path in `from`.
 
 For a specific model, include it as the model ID in `from` (see example below). Defaults to `"gpt-3.5-turbo"`.
 
 These parameters are specific to OpenAI models:
 
-| Param | Description | Default |
-| ----- | ----------- | ------- |
-| `openai_api_key` | The OpenAI API key. | - |
-| `openai_org_id` | The OpenAI organization id. | - |
-| `openai_project_id` | The OpenAI project id. | - |
-| `endpoint` | The OpenAI API base endpoint. | `https://api.openai.com/v1` |
-
+| Param               | Description                   | Default                     |
+| ------------------- | ----------------------------- | --------------------------- |
+| `openai_api_key`    | The OpenAI API key.           | -                           |
+| `openai_org_id`     | The OpenAI organization id.   | -                           |
+| `openai_project_id` | The OpenAI project id.        | -                           |
+| `endpoint`          | The OpenAI API base endpoint. | `https://api.openai.com/v1` |
 
 Example:
 
@@ -25,10 +25,36 @@ models:
     name: local_fs_model
     params:
       openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
+```
+
+## Supported OpenAI-Compatible Providers
+
+Spice supports several OpenAI-compatible providers. Specify the appropriate endpoint in the `params` section.
+
+### Groq
+
+Groq provides OpenAI-compatible endpoints. Use the following configuration:
+
+```yaml
+models:
   - from: openai:llama3-groq-70b-8192-tool-use-preview
     name: groq-llama
     params:
       endpoint: https://api.groq.com/openai/v1
       openai_api_key: ${ secrets:SPICE_GROQ_API_KEY }
 ```
+
+### Parasail
+
+Parasail also offers OpenAI-compatible endpoints. Use the following configuration:
+
+```yaml
+models:
+  - from: openai:parasail-model-id
+    name: parasail_model
+    params:
+      endpoint: https://api.parasail.com/v1
+      openai_api_key: ${ secrets:SPICE_PARASAIL_API_KEY }
+```
+
+Refer to the respective provider documentation for more details on available models and configurations.
diff --git a/spiceaidocs/docs/components/models/spiceai.md b/spiceaidocs/docs/components/models/spiceai.md
index 8dc16801..a64507dd 100644
--- a/spiceaidocs/docs/components/models/spiceai.md
+++ b/spiceaidocs/docs/components/models/spiceai.md
@@ -1,5 +1,6 @@
 ---
 title: 'Spice Cloud Platform'
+description: 'Instructions for using models hosted on the Spice Cloud Platform with Spice.'
 sidebar_label: 'Spice Cloud Platform'
 sidebar_position: 2
 ---

From f1bad00a9fae3f173d19a1227d5d818ebac15860 Mon Sep 17 00:00:00 2001
From: Luke Kim <80174+lukekim@users.noreply.github.com>
Date: Sat, 9 Nov 2024 12:25:11 -0800
Subject: [PATCH 2/7] Update spiceaidocs/docs/components/models/filesystem.md

---
 spiceaidocs/docs/components/models/filesystem.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spiceaidocs/docs/components/models/filesystem.md b/spiceaidocs/docs/components/models/filesystem.md
index 9550580a..aa3c4556 100644
--- a/spiceaidocs/docs/components/models/filesystem.md
+++ b/spiceaidocs/docs/components/models/filesystem.md
@@ -1,6 +1,6 @@
 ---
 title: 'Filesystem'
-description: 'Instructions for using machine learning models hosted on a filesystem with Spice.'
+description: 'Instructions for using models hosted on a filesystem with Spice.'
 sidebar_label: 'Filesystem'
 sidebar_position: 3
 ---

From e3b20a981e5fcba2b2877dfebbe55f9918915324 Mon Sep 17 00:00:00 2001
From: Luke Kim <80174+lukekim@users.noreply.github.com>
Date: Sat, 9 Nov 2024 12:25:18 -0800
Subject: [PATCH 3/7] Update spiceaidocs/docs/components/models/anthropic.md

---
 spiceaidocs/docs/components/models/anthropic.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spiceaidocs/docs/components/models/anthropic.md b/spiceaidocs/docs/components/models/anthropic.md
index 05fe6955..c26e4b29 100644
--- a/spiceaidocs/docs/components/models/anthropic.md
+++ b/spiceaidocs/docs/components/models/anthropic.md
@@ -7,7 +7,7 @@ sidebar_position: 5
 
 To use a language model hosted on Anthropic, specify `anthropic` in the `from` field.
 
-To use a specific model, include its model ID in the `from` field (see example below). If not specified, the default model is `"claude-3-5-sonnet-20240620"`.
+To use a specific model, include its model ID in the `from` field (see example below). If not specified, the default model is `"claude-3-5-sonnet-latest"`.
 
 The following parameters are specific to Anthropic models:
 

From 8f4c2223712560519c84a5c2168bbc385425e6b0 Mon Sep 17 00:00:00 2001
From: Luke Kim <80174+lukekim@users.noreply.github.com>
Date: Sat, 9 Nov 2024 12:25:25 -0800
Subject: [PATCH 4/7] Update spiceaidocs/docs/components/models/filesystem.md

Co-authored-by: Jack Eadie
---
 spiceaidocs/docs/components/models/filesystem.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spiceaidocs/docs/components/models/filesystem.md b/spiceaidocs/docs/components/models/filesystem.md
index aa3c4556..6752d1fb 100644
--- a/spiceaidocs/docs/components/models/filesystem.md
+++ b/spiceaidocs/docs/components/models/filesystem.md
@@ -21,7 +21,7 @@
 
 ```yaml
 models:
-  - from: file://absolute/path/to/my/model.ggml
+  - from: file://absolute/path/to/my/model.gguf
     name: local_ggml_model
 ```
 

From 55ab093154842ce3a1665904fae6d43a5410d310 Mon Sep 17 00:00:00 2001
From: Luke Kim <80174+lukekim@users.noreply.github.com>
Date: Sat, 9 Nov 2024 12:26:56 -0800
Subject: [PATCH 5/7] Update example

---
 spiceaidocs/docs/components/models/filesystem.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/spiceaidocs/docs/components/models/filesystem.md b/spiceaidocs/docs/components/models/filesystem.md
index 6752d1fb..ff6c115f 100644
--- a/spiceaidocs/docs/components/models/filesystem.md
+++ b/spiceaidocs/docs/components/models/filesystem.md
@@ -37,6 +37,12 @@
 
 ```yaml
 models:
-  - from: file://absolute/path/to/my/model.safetensor
-    name: local_safetensor_model
+  - name: safety
+    from: file:models/llms/llama3.2-1b-instruct/model.safetensors
+    params:
+      model_type: llama3
+    files:
+      - path: models/llms/llama3.2-1b-instruct/tokenizer.json
+      - path: models/llms/llama3.2-1b-instruct/tokenizer_config.json
+      - path: models/llms/llama3.2-1b-instruct/config.json
 ```

From cd573b50a61abfba1e1ebd14d4aa26f86d78bc2e Mon Sep 17 00:00:00 2001
From: Luke Kim <80174+lukekim@users.noreply.github.com>
Date: Sat, 9 Nov 2024 12:28:57 -0800
Subject: [PATCH 6/7] Updates

---
 spiceaidocs/docs/components/models/filesystem.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/spiceaidocs/docs/components/models/filesystem.md b/spiceaidocs/docs/components/models/filesystem.md
index ff6c115f..21a6951d 100644
--- a/spiceaidocs/docs/components/models/filesystem.md
+++ b/spiceaidocs/docs/components/models/filesystem.md
@@ -31,6 +31,10 @@
 models:
   - from: file://absolute/path/to/my/model.ggml
     name: local_ggml_model
+    files:
+      - path: models/llms/ggml/tokenizer.json
+      - path: models/llms/ggml/tokenizer_config.json
+      - path: models/llms/ggml/config.json
 ```
 
 ### Example: Loading a SafeTensor Model

From 5567764fc44ba716c229d7df547ca1302e582b38 Mon Sep 17 00:00:00 2001
From: Jack Eadie
Date: Mon, 11 Nov 2024 11:14:19 +1000
Subject: [PATCH 7/7] Update spiceaidocs/docs/components/models/anthropic.md

---
 spiceaidocs/docs/components/models/anthropic.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spiceaidocs/docs/components/models/anthropic.md b/spiceaidocs/docs/components/models/anthropic.md
index c26e4b29..dbed221a 100644
--- a/spiceaidocs/docs/components/models/anthropic.md
+++ b/spiceaidocs/docs/components/models/anthropic.md
@@ -21,7 +21,7 @@ Example `spicepod.yml` configuration:
 
 ```yaml
 models:
-  - from: anthropic:claude-3-5-sonnet
+  - from: anthropic:claude-3-5-sonnet-latest
     name: claude_3_5_sonnet
     params:
       anthropic_api_key: ${ secrets:SPICE_ANTHROPIC_API_KEY }
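
Taken together, the examples this series adds each configure one provider at a time; since `models` is a list, multiple providers can sit side by side in a single `spicepod.yml`. The sketch below is assembled only from configurations already shown in these patches (the model IDs, endpoints, and secret names are the ones the patches use), assuming the runtime accepts multiple entries in `models` as the list syntax suggests:

```yaml
models:
  # Anthropic-hosted model, per the anthropic.md example
  - from: anthropic:claude-3-5-sonnet-latest
    name: claude_3_5_sonnet
    params:
      anthropic_api_key: ${ secrets:SPICE_ANTHROPIC_API_KEY }
  # Groq's OpenAI-compatible endpoint, per the openai.md example
  - from: openai:llama3-groq-70b-8192-tool-use-preview
    name: groq-llama
    params:
      endpoint: https://api.groq.com/openai/v1
      openai_api_key: ${ secrets:SPICE_GROQ_API_KEY }
```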