diff --git a/spiceaidocs/docs/components/data-connectors/duckdb.md b/spiceaidocs/docs/components/data-connectors/duckdb.md
index 26f9a1b1..cfb6166e 100644
--- a/spiceaidocs/docs/components/data-connectors/duckdb.md
+++ b/spiceaidocs/docs/components/data-connectors/duckdb.md
@@ -24,6 +24,8 @@ The DuckDB data connector can be configured by providing the following `params`:
 
 Configuration `params` are provided either in the top level `dataset` for a dataset source, or in the `acceleration` section for a data store.
 
+The DuckDB data connector supports specifying an [`invalid_type_action` dataset parameter](../../reference/spicepod/datasets.md#invalid_type_action), which modifies the Runtime's behavior when the connector encounters an unsupported data type.
+
 A generic example of DuckDB data connector configuration.
 
 ```yaml
diff --git a/spiceaidocs/docs/components/data-connectors/index.md b/spiceaidocs/docs/components/data-connectors/index.md
index 64a8bdfc..b631482d 100644
--- a/spiceaidocs/docs/components/data-connectors/index.md
+++ b/spiceaidocs/docs/components/data-connectors/index.md
@@ -13,6 +13,7 @@ Currently supported Data Connectors include:
 
 | Name | Description | Status | Protocol/Format | Refresh Modes | Supports [Ingestion][ingestion] | Supports Documents |
 | --------------- | ------------------------- | ----------------- | ----------------------------------- | --------------------------- | ------------------------------- | ------------------ |
+| `duckdb` | DuckDB | Release Candidate | | `append`, `full` | ❌ | ❌ |
 | `github` | GitHub | Release Candidate | GraphQL, REST | `append`, `full` | ❌ | ❌ |
 | `mysql` | MySQL | Release Candidate | | `append`, `full` | Roadmap | ❌ |
 | `postgres` | PostgreSQL | Release Candidate | | `append`, `full` | Roadmap | ❌ |
@@ -26,7 +27,6 @@ Currently supported Data Connectors include:
 | `clickhouse` | Clickhouse | Alpha | | `append`, `full` | ❌ | ❌ |
 | `debezium` | Debezium | Alpha | CDC, Kafka | `append`, `full`, `changes` | ❌ | ❌ |
 | `dremio` | Dremio | Alpha | Arrow Flight SQL | `append`, `full` | ❌ | ❌ |
-| `duckdb` | DuckDB | Alpha | | `append`, `full` | ❌ | ❌ |
 | `file` | File | Alpha | Parquet, CSV | `append`, `full` | Roadmap | ✅ |
 | `ftp`, `sftp` | FTP/SFTP | Alpha | Parquet, CSV | `append`, `full` | ❌ | ✅ |
 | `graphql` | GraphQL | Alpha | GraphQL | `append`, `full` | ❌ | ❌ |
diff --git a/spiceaidocs/docs/reference/spicepod/datasets.md b/spiceaidocs/docs/reference/spicepod/datasets.md
index 3d61485a..8e18e6cb 100644
--- a/spiceaidocs/docs/reference/spicepod/datasets.md
+++ b/spiceaidocs/docs/reference/spicepod/datasets.md
@@ -144,6 +144,25 @@ Spice emits a warning if the `time_column` from the data source is incompatible
 :::warning[Limitations]
 
 - String-based columns are assumed to be ISO8601 format.
+
+:::
+
+## `invalid_type_action`
+
+Optional. Specifies the action to take when a data type that is not supported by the data connector is encountered.
+
+The following values are supported:
+
+- `error` - Default. Return an error when an unsupported data type is encountered.
+- `warn` - Log a warning and ignore the column containing the unsupported data type.
+- `ignore` - Log nothing and ignore the column containing the unsupported data type.
+
+:::warning[Limitations]
+
+Not all connectors support specifying an `invalid_type_action`. When specified on a connector that does not support the option, the connector will fail to register. The following connectors support `invalid_type_action`:
+
+- [DuckDB](../../components/data-connectors/duckdb.md)
+
 :::
 
 ## `acceleration`
@@ -196,6 +215,7 @@ Must be of the form `SELECT * FROM {name} WHERE {refresh_filter}`. `{name}` is t
 - The refresh SQL only supports filtering data from the current dataset - joining across other datasets is not supported.
 - Selecting a subset of columns isn't supported - the refresh SQL needs to start with `SELECT * FROM {name}`.
 - Queries for data that have been filtered out will not fall back to querying against the federated table.
+
 :::
 
 ## `acceleration.refresh_data_window`
@@ -230,8 +250,8 @@ Optional. Defines the maximum number of retry attempts when refresh retries are
 
 Supports one of two values:
 
-* `on_registration`: Mark the dataset as ready immediately, and queries on this table will fall back to the underlying source directly until the initial acceleration is complete
-* `on_load`: Mark the dataset as ready only after the initial acceleration. Queries against the dataset will return an error before the load has been completed.
+- `on_registration`: Mark the dataset as ready immediately, and queries on this table will fall back to the underlying source directly until the initial acceleration is complete
+- `on_load`: Mark the dataset as ready only after the initial acceleration. Queries against the dataset will return an error before the load has been completed.
 
 ```yaml
 datasets:
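For reviewers, a minimal `spicepod.yaml` sketch of how the `invalid_type_action` parameter documented in this diff could be used. The dataset name, table reference, and `duckdb_open` path are hypothetical placeholders for illustration; only `invalid_type_action` and its values (`error`, `warn`, `ignore`) come from the diff itself:

```yaml
datasets:
  - from: duckdb:my_schema.my_table # hypothetical table reference
    name: my_table
    params:
      duckdb_open: ./my_database.db # hypothetical DuckDB file path
    # Top-level dataset parameter documented by this change: controls what
    # the Runtime does when the connector encounters an unsupported data type.
    invalid_type_action: warn # error (default) | warn | ignore
```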
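Similarly, a hedged sketch of the two readiness values whose bullets are reformatted in the last hunk. The field name `ready_state`, its placement at the dataset level, and the acceleration block are assumptions based on the bullet text, since the surrounding heading and example are outside the diff context:

```yaml
datasets:
  - from: duckdb:my_schema.my_table # hypothetical table reference
    name: my_table
    acceleration:
      enabled: true
    # on_registration: dataset is ready immediately; queries fall back to the
    # underlying source until the initial acceleration completes.
    # on_load: dataset is ready only after the initial acceleration; queries
    # return an error until the load completes.
    ready_state: on_registration
```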