Improve structure for v0.9 release (#122)
* Restructure for v0.9.1 release

* Additional content

* Tweak what is spice

* Fix links

* Update Twitter

* Fix links
lukekim authored Mar 20, 2024
1 parent 2704cb2 commit 4507c9e
Showing 23 changed files with 325 additions and 199 deletions.
2 changes: 1 addition & 1 deletion spiceaidocs/config.toml
@@ -125,7 +125,7 @@ footer_about_disable = false
# End user relevant links. These will show up on left side of footer and in the community page if you have one.
[[params.links.developer]]
name ="Twitter"
url = "https://twitter.com/SpiceAIHQ"
url = "https://twitter.com/spice_ai"
icon = "fab fa-twitter"
desc = "Follow us on Twitter to get the latest news!"
# Developer relevant links. These will show up on right side of footer and in the community page if you have one.
9 changes: 0 additions & 9 deletions spiceaidocs/content/en/Connectors/_index.md

This file was deleted.

3 changes: 2 additions & 1 deletion spiceaidocs/content/en/_index.md
@@ -4,9 +4,10 @@ no_list: true
---

# Spice

## What is Spice?

**Spice** is a small, portable runtime that provides developers with a unified SQL query interface to locally accelerate and query data tables sourced from any database, data warehouse, or data lake.
**Spice** is a small, portable runtime that provides developers with a unified SQL query interface to locally materialize, accelerate, and query data tables sourced from any database, data warehouse, or data lake.

Spice makes it easy to build data-driven and data-intensive applications by streamlining the use of data and machine learning (ML) in software.

168 changes: 84 additions & 84 deletions spiceaidocs/content/en/acknowledgements/_index.md

Large diffs are not rendered by default.

23 changes: 11 additions & 12 deletions spiceaidocs/content/en/cli/_index.md
@@ -1,9 +1,9 @@
---
type: docs
title: "Spice.ai CLI documentation"
linkTitle: "CLI"
weight: 60
description: "Detailed documentation on the Spice.ai CLI"
title: 'Spice.ai CLI documentation'
linkTitle: 'CLI'
weight: 100
description: 'Detailed documentation on the Spice.ai CLI'
---

The Spice.ai CLI is a set of commands to create and manage Spice.ai pods and interact with the Spice.ai runtime.
@@ -45,14 +45,13 @@ spice add spiceai/quickstart

Common commands are:

| Command | Description |
| ----------------- | ------------------------------------------------------------------- |
| spice add | Add Pod - adds a pod to the project |
| spice run | Run Spice - starts the Spice runtime, installing if necessary |
| spice version | Spice CLI version |
| spice help | Help about any command |
| spice upgrade | Upgrades the Spice CLI to the latest release |

| Command | Description |
| ------------- | ------------------------------------------------------------- |
| spice add | Add Pod - adds a pod to the project |
| spice run | Run Spice - starts the Spice runtime, installing if necessary |
| spice version | Spice CLI version |
| spice help | Help about any command |
| spice upgrade | Upgrades the Spice CLI to the latest release |

See [Spice CLI command reference]({{<ref "cli/reference">}}) for the full list of available commands.

7 changes: 7 additions & 0 deletions spiceaidocs/content/en/clients/_index.md
@@ -0,0 +1,7 @@
---
type: docs
title: 'Clients and Tools'
linkTitle: 'Clients and Tools'
weight: 110
description: 'Clients and tools'
---
File renamed without changes.
33 changes: 33 additions & 0 deletions spiceaidocs/content/en/data-accelerators/_index.md
@@ -0,0 +1,33 @@
---
type: docs
title: 'Data Accelerators'
linkTitle: 'Data Accelerators'
description: ''
weight: 80
---

Data sourced by Data Connectors can be locally materialized and accelerated using a Data Accelerator.

Acceleration is enabled on a dataset by setting the acceleration configuration. For example:

```yaml
datasets:
- name: accelerated_dataset
acceleration:
enabled: true
```

For the complete reference specification see [datasets]({{<ref "reference/spicepod/datasets">}}).

By default, datasets will be locally materialized using in-memory Arrow records.

Data Accelerators using DuckDB, SQLite, or PostgreSQL engines can be used to materialize data in files or attached databases.
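
For example, a minimal sketch of accelerating a dataset into a DuckDB file instead of the default in-memory Arrow records; the `mode` field is an assumption based on the Engine Modes column in the table below:

```yaml
datasets:
  - name: accelerated_dataset
    acceleration:
      enabled: true
      engine: duckdb # embedded DuckDB engine rather than the default Arrow records
      mode: file     # materialize to a file (assumed field name for the engine mode)
```
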
Currently supported Data Accelerators include:
| Engine Name | Description | Status | Engine Modes |
| ---------------------------------------------------- | ----------------------- | ------ | ---------------- |
| `arrow` | In-Memory Arrow Records | Alpha | `memory` |
| `duckdb` | Embedded DuckDB | Alpha | `memory`, `file` |
| `sqlite` | Embedded SQLite | Alpha | `memory`, `file` |
| [`postgres`]({{<ref "data-accelerators/postgres">}}) | Attached PostgreSQL | Alpha | |
66 changes: 66 additions & 0 deletions spiceaidocs/content/en/data-accelerators/postgres/_index.md
@@ -0,0 +1,66 @@
---
type: docs
title: 'PostgreSQL Data Accelerator'
linkTitle: 'PostgreSQL Data Accelerator'
description: 'PostgreSQL Data Accelerator Documentation'
---

To use PostgreSQL as a Data Accelerator, specify `postgres` as the `engine` for acceleration.

```yaml
datasets:
- from: spiceai:path.to.my_dataset
name: my_dataset
acceleration:
engine: postgres
```

## Configuration

The connection to PostgreSQL can be configured by providing the following `params`:

- `pg_host`: The hostname of the PostgreSQL server.
- `pg_port`: The port of the PostgreSQL server.
- `pg_db`: The name of the database to connect to.
- `pg_user`: The username to connect with.
- `pg_pass_key`: The secret key containing the password to connect with.
- `pg_pass`: The plain-text password to connect with, ignored if `pg_pass_key` is provided.

Configuration `params` are provided either at the top level of the `dataset` (for a dataset source or federated SQL query) or in the `acceleration` section (for a data store).

```yaml
datasets:
- from: spiceai:path.to.my_dataset
name: my_dataset
acceleration:
engine: postgres
params:
pg_host: localhost
pg_port: 5432
pg_db: my_database
pg_user: my_user
pg_pass_key: my_secret
```

Additionally, an `engine_secret` may be provided when configuring a PostgreSQL data store, allowing a different secret store to specify the password when PostgreSQL is used as both the data source and the data store.

```yaml
datasets:
- from: spiceai:path.to.my_dataset
name: my_dataset
params:
pg_host: localhost
pg_port: 5432
pg_db: data_store
pg_user: my_user
pg_pass_key: my_secret
acceleration:
engine: postgres
engine_secret: pg_backend
params:
pg_host: localhost
pg_port: 5433
pg_db: data_store
pg_user: my_user
pg_pass_key: my_secret
```
22 changes: 22 additions & 0 deletions spiceaidocs/content/en/data-connectors/_index.md
@@ -0,0 +1,22 @@
---
type: docs
title: 'Data Connectors'
linkTitle: 'Data Connectors'
description: ''
weight: 70
---

Data Connectors provide connections to databases, data warehouses, and data lakes for federated SQL queries and data replication.

Currently supported Data Connectors include:

| Name | Description | Status | Protocol/Format | Refresh Modes | Supports Inserts |
| ------------ | ----------- | ------------ | ---------------- | ---------------- | ---------------- |
| `databricks` | Databricks | Alpha | Delta Lake | `full` ||
| `postgres` | PostgreSQL | Alpha | | `full` ||
| `spiceai` | Spice.ai | Alpha | Arrow Flight | `append`, `full` ||
| `s3` | S3 | Alpha | Parquet | `full` ||
| `dremio` | Dremio | Alpha | Arrow Flight SQL | `full` ||
| `snowflake` | Snowflake | Coming soon! | Arrow Flight SQL | `full` ||
| `bigquery` | BigQuery | Coming soon! | Arrow Flight SQL | `full` ||
| `mysql` | MySQL | Coming soon! | | `full` ||
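
As a minimal sketch drawn from the connector examples elsewhere in this commit, the Data Connector for a dataset is selected by the prefix of its `from` value:

```yaml
datasets:
  # PostgreSQL Data Connector (configured via pg_* params)
  - from: postgres:path.to.my_dataset
    name: my_dataset
  # S3 Data Connector reading a Parquet file
  - from: s3://s3-bucket-name/path/to/parquet/cool_dataset.parquet
    name: cool_dataset
```
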
@@ -1,12 +1,10 @@
---
type: docs
title: "PostgreSQL"
linkTitle: "PostgreSQL"
description: 'PostgreSQL reference'
title: 'PostgreSQL Data Connector'
linkTitle: 'PostgreSQL Data Connector'
description: 'PostgreSQL Data Connector Documentation'
---

PostgreSQL can be used by the Spice runtime as a dataset source, a data store, or for federated SQL query.

## Dataset Source/Federated SQL Query

To use PostgreSQL as a dataset source or for federated SQL query, specify `postgres` as the selector in the `from` value for the dataset.
@@ -17,18 +15,6 @@ datasets:
name: my_dataset
```
## Data Store
To use PostgreSQL as a data store for dataset acceleration, specify `postgres` as the `engine` for the dataset.

```yaml
datasets:
- from: spiceai:path.to.my_dataset
name: my_dataset
acceleration:
engine: postgres
```

## Configuration
The connection to PostgreSQL can be configured by providing the following `params`:
@@ -42,34 +28,16 @@ The connection to PostgreSQL can be configured by providing the following `param

Configuration `params` are provided either in the top level `dataset` for a dataset source and federated SQL query, or in the `acceleration` section for a data store.

### Dataset Source/Federated SQL Query

```yaml
datasets:
- from: postgres:path.to.my_dataset
name: my_dataset
params:
pg_host: localhost
pg_port: 5432
pg_db: my_database
pg_user: my_user
pg_pass_key: my_secret
```

### Data Store

```yaml
datasets:
- from: spiceai:path.to.my_dataset
name: my_dataset
acceleration:
engine: postgres
params:
pg_host: localhost
pg_port: 5432
pg_db: my_database
pg_user: my_user
pg_pass_key: my_secret
pg_host: localhost
pg_port: 5432
pg_db: my_database
pg_user: my_user
pg_pass_key: my_secret
```

Additionally, an `engine_secret` may be provided when configuring a PostgreSQL data store to allow for using a different secret store to specify the password for a dataset using PostgreSQL as both the data source and data store.
Expand All @@ -79,18 +47,18 @@ datasets:
- from: spiceai:path.to.my_dataset
name: my_dataset
params:
pg_host: localhost
pg_port: 5432
pg_db: data_store
pg_user: my_user
pg_pass_key: my_secret
acceleration:
engine: postgres
engine_secret: pg_backend
params:
pg_host: localhost
pg_port: 5432
pg_port: 5433
pg_db: data_store
pg_user: my_user
pg_pass_key: my_secret
acceleration:
engine: postgres
engine_secret: pg_backend
params:
pg_host: localhost
pg_port: 5433
pg_db: data_store
pg_user: my_user
pg_pass_key: my_secret
```
```
@@ -1,41 +1,44 @@
---
type: docs
title: "S3 Data Connector"
linkTitle: "S3 Data Connector"
description: 'S3 Data Connector YAML reference'
title: 'S3 Data Connector'
linkTitle: 'S3 Data Connector'
description: 'S3 Data Connector Documentation'
---

The S3 Data Connector enables federated SQL query across Parquet files stored in S3, or S3-compatible storage solutions (e.g. Minio, Cloudflare R2).

## `params`

- `endpoint`: The S3 endpoint, or equivalent (e.g. Minio endpoint), for the S3-compatible storage.
- `region`: Region of the S3 bucket, if region specific.
- `endpoint`: The S3 endpoint, or equivalent (e.g. Minio endpoint), for the S3-compatible storage.
- `region`: Region of the S3 bucket, if region specific.

## `auth`

Check [Secrets]({{<ref "secrets">}}).
Check [Secrets Stores]({{<ref "secret-stores">}}).

Required attributes:

- `key`: The access key authorised to access the S3 data (e.g. `AWS_ACCESS_KEY_ID` for AWS)
- `secret`: The secret key authorised to access the S3 data (e.g. `AWS_SECRET_ACCESS_KEY` for AWS)


## Example

### Minio

```yaml
- from: s3://s3-bucket-name/path/to/parquet/cool_dataset.parquet
name: cool_dataset
params:
endpoint: https://my.minio.server
region: "us-east-1" # Best practice for Minio
region: 'us-east-1' # Best practice for Minio
```
### S3
```yaml
- from: s3://my-startups-data/path/to/parquet/cool_dataset.parquet
name: cool_dataset
params:
endpoint: http://my-startups-data.s3.amazonaws.com
region: "ap-southeast-2"
```
region: 'ap-southeast-2'
```
13 changes: 13 additions & 0 deletions spiceaidocs/content/en/data-ingestion/_index.md
@@ -0,0 +1,13 @@
---
type: docs
title: 'Data Ingestion'
linkTitle: 'Data Ingestion'
description: ''
weight: 40
---

Data can be ingested by the Spice runtime for replication to a Data Connector, like PostgreSQL or the Spice.ai Cloud platform.

By default, the runtime exposes an [OpenTelemetry](https://opentelemetry.io) (OTEL) endpoint at `grpc://127.0.0.1:50052` for data ingestion.

OTEL metrics will be inserted into datasets with matching names (metric name = dataset name) and optionally replicated to the dataset source.
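
A hypothetical sketch of a dataset that receives OTEL metrics and replicates them to its source; the `replication` block and the `from` path are illustrative assumptions, not taken from this commit:

```yaml
datasets:
  - from: spiceai:path.to.my_metric # dataset name must match the OTEL metric name
    name: my_metric
    replication:
      enabled: true # assumed field; replicate ingested data back to the dataset source
```
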
7 changes: 7 additions & 0 deletions spiceaidocs/content/en/federated-queries/_index.md
@@ -0,0 +1,7 @@
---
type: docs
title: 'Federated Queries'
linkTitle: 'Federated Queries'
description: ''
weight: 20
---