From 03a792770be3187bd91eb524f555140f71942835 Mon Sep 17 00:00:00 2001
From: sgrebnov
Date: Tue, 15 Oct 2024 15:30:31 -0700
Subject: [PATCH 1/2] Add Arrow to PostgreSQL Type Mapping

---
 .../data-accelerators/postgres/index.md       | 30 +++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/spiceaidocs/docs/components/data-accelerators/postgres/index.md b/spiceaidocs/docs/components/data-accelerators/postgres/index.md
index 8c85265a..238c1887 100644
--- a/spiceaidocs/docs/components/data-accelerators/postgres/index.md
+++ b/spiceaidocs/docs/components/data-accelerators/postgres/index.md
@@ -80,3 +80,33 @@ datasets:
 - The Postgres federated queries may result in unexpected result types due to the difference in DataFusion and Postgres size increase rules. Please explicitly specify the expected output type of aggregation functions when writing query involving Postgres table in Spice. For example, rewrite `SUM(int_col)` into `CAST (SUM(int_col) as BIGINT`.
 
 :::
+
+## Arrow to PostgreSQL Type Mapping
+
+The table below lists the supported [Apache Arrow data types](https://arrow.apache.org/rust/arrow/datatypes/enum.DataType.html) and their mappings to [PostgreSQL types](https://www.postgresql.org/docs/current/datatype.html) when stored
+
+| Arrow Type                     | sea_query ColumnType       | PostgreSQL Type              |
+| ------------------------------ | -------------------------- | ---------------------------- |
+| Int8                           | TinyInteger                | smallint                     |
+| Int16                          | SmallInteger               | smallint                     |
+| Int32                          | Integer                    | integer                      |
+| Int64                          | BigInteger                 | bigint                       |
+| UInt8                          | TinyUnsigned               | smallint                     |
+| UInt16                         | SmallUnsigned              | smallint                     |
+| UInt32                         | Unsigned                   | bigint                       |
+| UInt64                         | BigUnsigned                | numeric                      |
+| Decimal128 / Decimal256        | Decimal                    | decimal                      |
+| Float32                        | Float                      | real                         |
+| Float64                        | Double                     | double precision             |
+| Utf8 / LargeUtf8               | Text                       | text                         |
+| Boolean                        | Boolean                    | bool                         |
+| Binary / LargeBinary           | VarBinary                  | bytea                        |
+| FixedSizeBinary                | Binary                     | bytea                        |
+| Timestamp (no Timezone)        | Timestamp                  | timestamp without time zone  |
+| Timestamp (with Timezone)      | TimestampWithTimeZone      | timestamp with time zone     |
+| Date32 / Date64                | Date                       | date                         |
+| Time32 / Time64                | Time                       | time                         |
+| Interval                       | Interval                   | interval                     |
+| Duration                       | BigInteger                 | bigint                       |
+| List / LargeList / FixedSizeList | Array                      | array                        |
+| Struct                         | N/A                        | Composite (Custom type)      |

From cf4a74a8bf568172a796805eca194475033c2d62 Mon Sep 17 00:00:00 2001
From: Sergei Grebnov
Date: Thu, 17 Oct 2024 11:03:05 -0700
Subject: [PATCH 2/2] Update index.md

---
 .../data-accelerators/postgres/index.md       | 50 +++++++++----------
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/spiceaidocs/docs/components/data-accelerators/postgres/index.md b/spiceaidocs/docs/components/data-accelerators/postgres/index.md
index 238c1887..e1a33031 100644
--- a/spiceaidocs/docs/components/data-accelerators/postgres/index.md
+++ b/spiceaidocs/docs/components/data-accelerators/postgres/index.md
@@ -85,28 +85,28 @@ datasets:
 
 The table below lists the supported [Apache Arrow data types](https://arrow.apache.org/rust/arrow/datatypes/enum.DataType.html) and their mappings to [PostgreSQL types](https://www.postgresql.org/docs/current/datatype.html) when stored
 
-| Arrow Type                     | sea_query ColumnType       | PostgreSQL Type              |
-| ------------------------------ | -------------------------- | ---------------------------- |
-| Int8                           | TinyInteger                | smallint                     |
-| Int16                          | SmallInteger               | smallint                     |
-| Int32                          | Integer                    | integer                      |
-| Int64                          | BigInteger                 | bigint                       |
-| UInt8                          | TinyUnsigned               | smallint                     |
-| UInt16                         | SmallUnsigned              | smallint                     |
-| UInt32                         | Unsigned                   | bigint                       |
-| UInt64                         | BigUnsigned                | numeric                      |
-| Decimal128 / Decimal256        | Decimal                    | decimal                      |
-| Float32                        | Float                      | real                         |
-| Float64                        | Double                     | double precision             |
-| Utf8 / LargeUtf8               | Text                       | text                         |
-| Boolean                        | Boolean                    | bool                         |
-| Binary / LargeBinary           | VarBinary                  | bytea                        |
-| FixedSizeBinary                | Binary                     | bytea                        |
-| Timestamp (no Timezone)        | Timestamp                  | timestamp without time zone  |
-| Timestamp (with Timezone)      | TimestampWithTimeZone      | timestamp with time zone     |
-| Date32 / Date64                | Date                       | date                         |
-| Time32 / Time64                | Time                       | time                         |
-| Interval                       | Interval                   | interval                     |
-| Duration                       | BigInteger                 | bigint                       |
-| List / LargeList / FixedSizeList | Array                      | array                        |
-| Struct                         | N/A                        | Composite (Custom type)      |
+| Arrow Type                             | sea_query ColumnType    | PostgreSQL Type               |
+| -------------------------------------- | ----------------------- | ----------------------------- |
+| `Int8`                                 | `TinyInteger`           | `smallint`                    |
+| `Int16`                                | `SmallInteger`          | `smallint`                    |
+| `Int32`                                | `Integer`               | `integer`                     |
+| `Int64`                                | `BigInteger`            | `bigint`                      |
+| `UInt8`                                | `TinyUnsigned`          | `smallint`                    |
+| `UInt16`                               | `SmallUnsigned`         | `smallint`                    |
+| `UInt32`                               | `Unsigned`              | `bigint`                      |
+| `UInt64`                               | `BigUnsigned`           | `numeric`                     |
+| `Decimal128` / `Decimal256`            | `Decimal`               | `decimal`                     |
+| `Float32`                              | `Float`                 | `real`                        |
+| `Float64`                              | `Double`                | `double precision`            |
+| `Utf8 / LargeUtf8`                     | `Text`                  | `text`                        |
+| `Boolean`                              | `Boolean`               | `bool`                        |
+| `Binary / LargeBinary`                 | `VarBinary`             | `bytea`                       |
+| `FixedSizeBinary`                      | `Binary`                | `bytea`                       |
+| `Timestamp` (no Timezone)              | `Timestamp`             | `timestamp` without time zone |
+| `Timestamp` (with Timezone)            | `TimestampWithTimeZone` | `timestamp` with time zone    |
+| `Date32` / `Date64`                    | `Date`                  | `date`                        |
+| `Time32` / `Time64`                    | `Time`                  | `time`                        |
+| `Interval`                             | `Interval`              | `interval`                    |
+| `Duration`                             | `BigInteger`            | `bigint`                      |
+| `List` / `LargeList` / `FixedSizeList` | `Array`                 | `array`                       |
+| `Struct`                               | `N/A`                   | `Composite` (Custom type)     |
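
As a reading aid for the mapping table, the sketch below encodes the scalar rows as a plain Python lookup. It is only an illustration of how the table can be consulted, not how Spice performs the conversion: the names `ARROW_TO_POSTGRES` and `postgres_type_for` and the `has_timezone` parameter are hypothetical, and `Struct` is left out because the table maps it to a custom composite type rather than a fixed type name.

```python
# Hypothetical lookup built from the mapping table; not part of Spice or sea_query.
ARROW_TO_POSTGRES = {
    "Int8": "smallint",
    "Int16": "smallint",
    "Int32": "integer",
    "Int64": "bigint",
    "UInt8": "smallint",
    "UInt16": "smallint",
    "UInt32": "bigint",
    "UInt64": "numeric",
    "Decimal128": "decimal",
    "Decimal256": "decimal",
    "Float32": "real",
    "Float64": "double precision",
    "Utf8": "text",
    "LargeUtf8": "text",
    "Boolean": "bool",
    "Binary": "bytea",
    "LargeBinary": "bytea",
    "FixedSizeBinary": "bytea",
    "Date32": "date",
    "Date64": "date",
    "Time32": "time",
    "Time64": "time",
    "Interval": "interval",
    "Duration": "bigint",
    "List": "array",
    "LargeList": "array",
    "FixedSizeList": "array",
}


def postgres_type_for(arrow_type: str, has_timezone: bool = False) -> str:
    """Return the PostgreSQL type the table lists for an Arrow type name."""
    # Timestamp is the only row whose target depends on a type parameter.
    if arrow_type == "Timestamp":
        if has_timezone:
            return "timestamp with time zone"
        return "timestamp without time zone"
    try:
        return ARROW_TO_POSTGRES[arrow_type]
    except KeyError:
        raise ValueError(
            f"no fixed PostgreSQL type name listed for Arrow type {arrow_type!r}"
        ) from None


if __name__ == "__main__":
    for name in ("UInt32", "UInt64", "LargeUtf8"):
        print(name, "->", postgres_type_for(name))
    print("Timestamp (tz) ->", postgres_type_for("Timestamp", has_timezone=True))
```

Keeping `Timestamp` outside the dictionary makes the one parameter-dependent row explicit; every other row is a direct name-to-name lookup.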