docs: Release DuckDB RC (#629)
* docs: Release DuckDB RC

* docs: Include list of supported invalid_type_action connectors
peasee authored Nov 12, 2024
1 parent e102350 commit df23615
Showing 3 changed files with 25 additions and 3 deletions.
2 changes: 2 additions & 0 deletions spiceaidocs/docs/components/data-connectors/duckdb.md
@@ -24,6 +24,8 @@ The DuckDB data connector can be configured by providing the following `params`:

Configuration `params` are provided either in the top level `dataset` for a dataset source, or in the `acceleration` section for a data store.

The DuckDB data connector supports the [`invalid_type_action` dataset parameter](../../reference/spicepod/datasets.md#invalid_type_action), which modifies the Runtime's behavior when it encounters a data type the connector does not support.

A generic example of DuckDB data connector configuration.

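A minimal sketch, assuming a hypothetical `my_table` table and local database file; `invalid_type_action` is shown here as an optional dataset-level field, and the exact parameters are documented in the connector reference:

```yaml
datasets:
  - from: duckdb:main.my_table # hypothetical schema.table inside the DuckDB file
    name: my_dataset
    invalid_type_action: warn # optional: warn instead of erroring on unsupported types
    params:
      duckdb_open: ./my_database.db # hypothetical path to the DuckDB database file
```
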
2 changes: 1 addition & 1 deletion spiceaidocs/docs/components/data-connectors/index.md
@@ -13,6 +13,7 @@ Currently supported Data Connectors include:

| Name | Description | Status | Protocol/Format | Refresh Modes | Supports [Ingestion][ingestion] | Supports Documents |
| --------------- | ------------------------- | ----------------- | ----------------------------------- | --------------------------- | ------------------------------- | ------------------ |
| `duckdb` | DuckDB | Release Candidate | | `append`, `full` | | |
| `github` | GitHub | Release Candidate | GraphQL, REST | `append`, `full` | | |
| `mysql` | MySQL | Release Candidate | | `append`, `full` | Roadmap | |
| `postgres` | PostgreSQL | Release Candidate | | `append`, `full` | Roadmap | |
@@ -26,7 +27,6 @@ Currently supported Data Connectors include:
| `clickhouse` | Clickhouse | Alpha | | `append`, `full` | | |
| `debezium` | Debezium | Alpha | CDC, Kafka | `append`, `full`, `changes` | | |
| `dremio` | Dremio | Alpha | Arrow Flight SQL | `append`, `full` | | |
| `file` | File | Alpha | Parquet, CSV | `append`, `full` | Roadmap | |
| `ftp`, `sftp` | FTP/SFTP | Alpha | Parquet, CSV | `append`, `full` | | |
| `graphql` | GraphQL | Alpha | GraphQL | `append`, `full` | | |
24 changes: 22 additions & 2 deletions spiceaidocs/docs/reference/spicepod/datasets.md
@@ -144,6 +144,25 @@ Spice emits a warning if the `time_column` from the data source is incompatible
:::warning[Limitations]

- String-based columns are assumed to be ISO8601 format.

:::

## `invalid_type_action`

Optional. Specifies the action to take when a data type that is not supported by the data connector is encountered.

The following values are supported:

- `error` - Default. Return an error when an unsupported data type is encountered.
- `warn` - Log a warning and ignore the column containing the unsupported data type.
- `ignore` - Log nothing and ignore the column containing the unsupported data type.

:::warning[Limitations]

Not all connectors support specifying an `invalid_type_action`. When specified on a connector that does not support the option, the connector will fail to register. The following connectors support `invalid_type_action`:

- [DuckDB](../../components/data-connectors/duckdb.md)

:::
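
As an illustration, a dataset sketch (names and paths hypothetical) that logs a warning and drops unsupported columns instead of erroring:

```yaml
datasets:
  - from: duckdb:main.events # hypothetical DuckDB source table
    name: events
    invalid_type_action: warn # `error` (default), `warn`, or `ignore`
    params:
      duckdb_open: ./events.db # hypothetical database file
```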

## `acceleration`
@@ -196,6 +215,7 @@ Must be of the form `SELECT * FROM {name} WHERE {refresh_filter}`. `{name}` is the dataset name.
- The refresh SQL only supports filtering data from the current dataset - joining across other datasets is not supported.
- Selecting a subset of columns isn't supported - the refresh SQL needs to start with `SELECT * FROM {name}`.
- Queries for data that have been filtered out will not fall back to querying against the federated table.

:::
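
For instance, a sketch assuming a hypothetical `taxi_trips` dataset that accelerates only high-tip trips; the query keeps the required `SELECT * FROM {name}` form:

```yaml
datasets:
  - from: s3://my-bucket/taxi_trips/ # hypothetical source
    name: taxi_trips
    acceleration:
      enabled: true
      refresh_sql: SELECT * FROM taxi_trips WHERE tip_amount > 10 # hypothetical filter
```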

## `acceleration.refresh_data_window`
@@ -230,8 +250,8 @@ Optional. Defines the maximum number of retry attempts when refresh retries are enabled.

Supports one of two values:

- `on_registration`: Mark the dataset as ready immediately, and queries on this table will fall back to the underlying source directly until the initial acceleration is complete
- `on_load`: Mark the dataset as ready only after the initial acceleration. Queries against the dataset will return an error before the load has been completed.

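A minimal sketch, assuming these values belong to a dataset-level `ready_state` field (the section heading is not visible in this hunk) and a hypothetical dataset:

```yaml
datasets:
  - from: spice.ai/eth.recent_blocks # hypothetical source
    name: eth_blocks
    ready_state: on_registration # serve queries from the source until the first acceleration completes
    acceleration:
      enabled: true
```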
