From e8383c2ef836c284243c29fc1c7dcf145bac72d7 Mon Sep 17 00:00:00 2001
From: Scott Lyons
Date: Fri, 1 Nov 2024 13:10:36 -0700
Subject: [PATCH 1/8] Standardizing Duckdb connector documentation

---
 .../docs/components/data-connectors/duckdb.md | 62 ++++++++++++++++---
 1 file changed, 55 insertions(+), 7 deletions(-)

diff --git a/spiceaidocs/docs/components/data-connectors/duckdb.md b/spiceaidocs/docs/components/data-connectors/duckdb.md
index 26f9a1b1..b653621f 100644
--- a/spiceaidocs/docs/components/data-connectors/duckdb.md
+++ b/spiceaidocs/docs/components/data-connectors/duckdb.md
@@ -4,9 +4,9 @@ sidebar_label: 'DuckDB Data Connector'
 description: 'DuckDB Data Connector Documentation'
 ---

-## Dataset Source
+DuckDB is an in-process SQL OLAP database management system designed for analytical query workloads. It is optimized for fast execution and can be embedded directly into applications, providing efficient data processing without the need for a separate database server.

-To connect to a DuckDB [persistent database](https://duckdb.org/docs/connect/overview#persistent-database) as a data source, specify `duckdb` as the selector in the `from` value for the dataset.
+This connector allows DuckDB [persistent database](https://duckdb.org/docs/connect/overview#persistent-database) to be used as a data source for federated SQL queries.

 ```yaml
 datasets:
@@ -18,13 +18,53 @@ datasets:

 ## Configuration

+### `from`
+
+The `from` field supports one of two forms:
+
+| `from` | Description |
+| ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `duckdb:database.schema.table` | Read data from a table named `database.schema.table` in the DuckDB file |
+| `duckdb:read_*()` | Read data using one of the common [data import](https://duckdb.org/docs/data/overview) DuckDB functions, e.g `read_json`, `read_parquet` or `read_csv`. |
+
+### `name`
+
+The dataset name. This will be used as the table name within Spice.
+
+Example:
+```yaml
+datasets:
+  - from: duckdb:database.schema.table
+    name: cool_dataset
+    params:
+      ...
+```
+
+```sql
+SELECT COUNT(*) FROM cool_dataset;
+```
+
+```shell
++----------+
+| count(*) |
++----------+
+| 6001215 |
++----------+
+```
+
+### `params`
+
 The DuckDB data connector can be configured by providing the following `params`:

-- `duckdb_open`: The name for the file to back the DuckDB database.
+| Parameter Name | Description |
+| -------------- | -------------------------------------------------- |
+| `duckdb_open` | The name for the file to back the DuckDB database. |

 Configuration `params` are provided either in the top level `dataset` for a dataset source, or in the `acceleration` section for a data store.

-A generic example of DuckDB data connector configuration.
+## Examples
+
+### Reading from a relative path

 ```yaml
 datasets:
@@ -34,7 +74,7 @@ datasets:
       duckdb_open: path/to/duckdb_file.duckdb
 ```

-This example uses a DuckDB database file that is at location /my/path/
+### Reading from an absolute path

 ```yaml
 datasets:
@@ -44,7 +84,7 @@ datasets:
       duckdb_open: /my/path/my_database.db
 ```

-## DuckDB Functions
+### DuckDB Functions

 Common [data import](https://duckdb.org/docs/data/overview) DuckDB functions can also define datasets. Instead of a fixed table reference (e.g. `database.schema.table`), a DuckDB function is provided in the `from:` key. For example

@@ -70,7 +110,7 @@ is equivalent to:

 ```sql
 -- from_function
-SELECT * FROM read_csv('test.csv', header = false)
+SELECT * FROM read_csv('test.csv', header = false);
 ```

 Many DuckDB data imports can be rewritten as DuckDB functions, making them usable as Spice datasets. For example:
@@ -81,3 +121,11 @@ SELECT * FROM 'todos.json';
 -- As a DuckDB function
 SELECT * FROM read_json('todos.json');
 ```
+
+## Using secrets
+
+There are currently three supported [secret stores](/components/secret-stores/index.md):
+
+* [Environment variables](/components/secret-stores/env)
+* [Kubernetes Secret Store](/components/secret-stores/kubernetes)
+* [Keyring Secret Store](/components/secret-stores/keyring)

From 69ea61d9bd5f1e1f837400e4ab541ee58660691d Mon Sep 17 00:00:00 2001
From: Scott Lyons
Date: Sun, 10 Nov 2024 17:52:12 -0800
Subject: [PATCH 2/8] Updating secrets section

---
 spiceaidocs/docs/components/data-connectors/duckdb.md | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/spiceaidocs/docs/components/data-connectors/duckdb.md b/spiceaidocs/docs/components/data-connectors/duckdb.md
index b653621f..3225af77 100644
--- a/spiceaidocs/docs/components/data-connectors/duckdb.md
+++ b/spiceaidocs/docs/components/data-connectors/duckdb.md
@@ -122,10 +122,6 @@ SELECT * FROM 'todos.json';
 SELECT * FROM read_json('todos.json');
 ```

-## Using secrets
+## Secrets

-There are currently three supported [secret stores](/components/secret-stores/index.md):
-
-* [Environment variables](/components/secret-stores/env)
-* [Kubernetes Secret Store](/components/secret-stores/kubernetes)
-* [Keyring Secret Store](/components/secret-stores/keyring)
+Spice integrates with multiple secret stores to help manage sensitive data securely. For detailed information on supported secret stores, refer to the [secret stores documentation](/components/secret-stores). Additionally, learn how to use referenced secrets in component parameters by visiting the [using referenced secrets guide](/components/secret-stores#using-secrets).
\ No newline at end of file

From d534e441b482f18b5dcc28b2332c71e58740d047 Mon Sep 17 00:00:00 2001
From: Scott Lyons
Date: Thu, 14 Nov 2024 09:44:05 -0800
Subject: [PATCH 3/8] Update spiceaidocs/docs/components/data-connectors/duckdb.md

Co-authored-by: Phillip LeBlanc
---
 spiceaidocs/docs/components/data-connectors/duckdb.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spiceaidocs/docs/components/data-connectors/duckdb.md b/spiceaidocs/docs/components/data-connectors/duckdb.md
index 3225af77..d39c4fc9 100644
--- a/spiceaidocs/docs/components/data-connectors/duckdb.md
+++ b/spiceaidocs/docs/components/data-connectors/duckdb.md
@@ -6,7 +6,7 @@ description: 'DuckDB Data Connector Documentation'

 DuckDB is an in-process SQL OLAP database management system designed for analytical query workloads. It is optimized for fast execution and can be embedded directly into applications, providing efficient data processing without the need for a separate database server.

-This connector allows DuckDB [persistent database](https://duckdb.org/docs/connect/overview#persistent-database) to be used as a data source for federated SQL queries.
+This connector allows DuckDB [persistent databases](https://duckdb.org/docs/connect/overview#persistent-database) to be used as a data source for federated/accelerated SQL queries.

 ```yaml
 datasets:

From c1bf274c77011b4c2778f23218b8168510ff23c0 Mon Sep 17 00:00:00 2001
From: Scott Lyons
Date: Thu, 14 Nov 2024 09:45:04 -0800
Subject: [PATCH 4/8] Removing secrets section as suggested

---
 spiceaidocs/docs/components/data-connectors/duckdb.md | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/spiceaidocs/docs/components/data-connectors/duckdb.md b/spiceaidocs/docs/components/data-connectors/duckdb.md
index d39c4fc9..766d5c54 100644
--- a/spiceaidocs/docs/components/data-connectors/duckdb.md
+++ b/spiceaidocs/docs/components/data-connectors/duckdb.md
@@ -121,7 +121,3 @@ SELECT * FROM 'todos.json';
 -- As a DuckDB function
 SELECT * FROM read_json('todos.json');
 ```
-
-## Secrets
-
-Spice integrates with multiple secret stores to help manage sensitive data securely. For detailed information on supported secret stores, refer to the [secret stores documentation](/components/secret-stores). Additionally, learn how to use referenced secrets in component parameters by visiting the [using referenced secrets guide](/components/secret-stores#using-secrets).
\ No newline at end of file

From 6091c46828a7c6fd1a1c9061e889ccae21ba9a67 Mon Sep 17 00:00:00 2001
From: Scott Lyons
Date: Sun, 17 Nov 2024 17:44:28 -0800
Subject: [PATCH 5/8] Correcting the supported functions for DuckDB sources

---
 spiceaidocs/docs/components/data-connectors/duckdb.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/spiceaidocs/docs/components/data-connectors/duckdb.md b/spiceaidocs/docs/components/data-connectors/duckdb.md
index 766d5c54..a82c8b21 100644
--- a/spiceaidocs/docs/components/data-connectors/duckdb.md
+++ b/spiceaidocs/docs/components/data-connectors/duckdb.md
@@ -22,10 +22,10 @@ datasets:

 The `from` field supports one of two forms:

-| `from` | Description |
-| ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `duckdb:database.schema.table` | Read data from a table named `database.schema.table` in the DuckDB file |
-| `duckdb:read_*()` | Read data using one of the common [data import](https://duckdb.org/docs/data/overview) DuckDB functions, e.g `read_json`, `read_parquet` or `read_csv`. |
+| `from` | Description |
+| ------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `duckdb:database.schema.table` | Read data from a table named `database.schema.table` in the DuckDB file |
+| `duckdb:*` | Read data using any DuckDB function that produces a table. For example one of the [data import](https://duckdb.org/docs/data/overview) functions such as `read_json`, `read_parquet` or `read_csv`. |

 ### `name`


From d0dfe6c5a1069a2c7ca735ae975088bd4d7e13a1 Mon Sep 17 00:00:00 2001
From: Qianqian <130200611+Sevenannn@users.noreply.github.com>
Date: Wed, 27 Nov 2024 12:46:49 -0800
Subject: [PATCH 6/8] Update spiceaidocs/docs/components/data-connectors/duckdb.md

Co-authored-by: peasee <98815791+peasee@users.noreply.github.com>
---
 spiceaidocs/docs/components/data-connectors/duckdb.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spiceaidocs/docs/components/data-connectors/duckdb.md b/spiceaidocs/docs/components/data-connectors/duckdb.md
index a82c8b21..96f4fda7 100644
--- a/spiceaidocs/docs/components/data-connectors/duckdb.md
+++ b/spiceaidocs/docs/components/data-connectors/duckdb.md
@@ -58,7 +58,7 @@ The DuckDB data connector can be configured by providing the following `params`:

 | Parameter Name | Description |
 | -------------- | -------------------------------------------------- |
-| `duckdb_open` | The name for the file to back the DuckDB database. |
+| `duckdb_open` | The name of the DuckDB database to open. |

 Configuration `params` are provided either in the top level `dataset` for a dataset source, or in the `acceleration` section for a data store.


From 178b8b9a1f584446c63a8aee91581802b1260cde Mon Sep 17 00:00:00 2001
From: Qianqian <130200611+Sevenannn@users.noreply.github.com>
Date: Wed, 27 Nov 2024 12:46:59 -0800
Subject: [PATCH 7/8] Update spiceaidocs/docs/components/data-connectors/duckdb.md

Co-authored-by: peasee <98815791+peasee@users.noreply.github.com>
---
 spiceaidocs/docs/components/data-connectors/duckdb.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spiceaidocs/docs/components/data-connectors/duckdb.md b/spiceaidocs/docs/components/data-connectors/duckdb.md
index 96f4fda7..af8dc3bf 100644
--- a/spiceaidocs/docs/components/data-connectors/duckdb.md
+++ b/spiceaidocs/docs/components/data-connectors/duckdb.md
@@ -6,7 +6,7 @@ description: 'DuckDB Data Connector Documentation'

 DuckDB is an in-process SQL OLAP database management system designed for analytical query workloads. It is optimized for fast execution and can be embedded directly into applications, providing efficient data processing without the need for a separate database server.

-This connector allows DuckDB [persistent databases](https://duckdb.org/docs/connect/overview#persistent-database) to be used as a data source for federated/accelerated SQL queries.
+This connector supports DuckDB [persistent databases](https://duckdb.org/docs/connect/overview#persistent-database) as a data source for federated SQL queries.

 ```yaml
 datasets:

From 0a91fb6b70ec24a65403895f52b5893fea09ed2d Mon Sep 17 00:00:00 2001
From: Qianqian <130200611+Sevenannn@users.noreply.github.com>
Date: Wed, 27 Nov 2024 12:47:57 -0800
Subject: [PATCH 8/8] Update spiceaidocs/docs/components/data-connectors/duckdb.md

---
 spiceaidocs/docs/components/data-connectors/duckdb.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spiceaidocs/docs/components/data-connectors/duckdb.md b/spiceaidocs/docs/components/data-connectors/duckdb.md
index af8dc3bf..54a3f472 100644
--- a/spiceaidocs/docs/components/data-connectors/duckdb.md
+++ b/spiceaidocs/docs/components/data-connectors/duckdb.md
@@ -4,7 +4,7 @@ sidebar_label: 'DuckDB Data Connector'
 description: 'DuckDB Data Connector Documentation'
 ---

-DuckDB is an in-process SQL OLAP database management system designed for analytical query workloads. It is optimized for fast execution and can be embedded directly into applications, providing efficient data processing without the need for a separate database server.
+DuckDB is an in-process SQL OLAP (Online Analytical Processing) database management system designed for analytical query workloads. It is optimized for fast execution and can be embedded directly into applications, providing efficient data processing without the need for a separate database server.

 This connector supports DuckDB [persistent databases](https://duckdb.org/docs/connect/overview#persistent-database) as a data source for federated SQL queries.
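
For review purposes, the two `from` forms documented by this series (as they stand after patch 5) can be sketched in one spicepod fragment. This is a minimal illustration assembled from snippets in the diff, not text taken from the patched file. The first dataset combines the `name` example with the relative-path `duckdb_open` value. The second dataset's `from` spelling is an assumption inferred from the `read_csv` equivalence in patch 1, and the series does not say whether the function form takes any `params`, so none are shown.

```yaml
datasets:
  # Form 1: a fixed table reference inside the DuckDB file named by duckdb_open.
  - from: duckdb:database.schema.table
    name: cool_dataset
    params:
      duckdb_open: path/to/duckdb_file.duckdb

  # Form 2 (assumed spelling): a DuckDB function that produces a table.
  # Per the series, this should behave like:
  #   SELECT * FROM read_csv('test.csv', header = false);
  - from: duckdb:read_csv('test.csv', header = false)
    name: from_function
```

With a fragment like this loaded, `SELECT COUNT(*) FROM cool_dataset;` queries the first dataset by its `name`, as in the example added in patch 1.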