The TileDB connector supports most Trino functionality. The features listed below are not currently supported.
- The connector does not currently support creating, writing, or reading encrypted arrays.
- The connector does not currently support the TileDB openAt functionality to open an array at a specific timestamp.
The TileDB Trino connector supports the following SQL datatypes:
- BOOLEAN
- TINYINT
- INTEGER
- BIGINT
- REAL
- DOUBLE
- DECIMAL (treated as doubles)
- STRING*
- VARCHAR*
- CHAR*
- VARBINARY
No other datatypes are supported.
The TileDB Trino connector does not have full support for unsigned values.
Trino and all connectors are written in Java, and Java does not have unsigned
values. As a result of this Java limitation, an unsigned 64-bit integer can
overflow if it is larger than 2^63 - 1. Unsigned integers that are 8, 16, or
32 bits are treated as larger integers. For instance, an unsigned 32-bit value
is read into a Java long.
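The widening and overflow behaviour described above can be illustrated with plain JDK methods. This is a minimal sketch, not the connector's code: the class and method names are hypothetical, and only `Integer.toUnsignedLong`, `Long.parseUnsignedLong`, and `Long.toUnsignedString` are standard library calls.

```java
// Illustration only: how unsigned values behave when held in Java's signed types.
class UnsignedWidening {
    // An unsigned 32-bit value fits in a signed 64-bit long without loss.
    static long widenUnsigned32(int rawBits) {
        return Integer.toUnsignedLong(rawBits);
    }

    // An unsigned 64-bit value has no wider primitive to move into, so any
    // value above 2^63 - 1 appears as a negative number in a signed long.
    static boolean overflowsSignedLong(long rawBits) {
        return rawBits < 0;
    }

    public static void main(String[] args) {
        int raw32 = 0xFFFFFFFF; // bit pattern of the unsigned value 4294967295
        System.out.println(widenUnsigned32(raw32));

        long raw64 = Long.parseUnsignedLong("9223372036854775808"); // 2^63
        System.out.println(raw64);                        // signed interpretation
        System.out.println(Long.toUnsignedString(raw64)); // unsigned interpretation
        System.out.println(overflowsSignedLong(raw64));
    }
}
```

The same 64 bits are preserved in both cases; only the signed interpretation changes, which is why comparisons and arithmetic on large unsigned 64-bit values can give surprising results.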
For the varchar and char datatypes, the special case of char(1) or varchar(1)
is stored on disk as a fixed-size attribute of size 1. Any char/varchar with a
length greater than 1 is stored as a variable-length attribute in TileDB.
TileDB does not enforce the length parameter, but Trino does for inserts.
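The storage rule for char and varchar can be summarised as a small model. This is a hedged sketch with hypothetical names (`CharStorageSketch`, `layoutFor`, `fitsDeclaredLength`), not the connector's implementation; counting length in Unicode code points is an assumption of this example.

```java
// Illustration only: a model of the char/varchar storage rule described above.
class CharStorageSketch {
    enum Layout { FIXED_SIZE_1, VARIABLE_LENGTH }

    // char(1) and varchar(1) map to a fixed-size attribute of size 1;
    // any longer declared length maps to a variable-length attribute.
    static Layout layoutFor(int declaredLength) {
        if (declaredLength < 1) {
            throw new IllegalArgumentException("declared length must be >= 1");
        }
        return declaredLength == 1 ? Layout.FIXED_SIZE_1 : Layout.VARIABLE_LENGTH;
    }

    // Models the length check applied on insert by Trino. TileDB itself does
    // not apply it. Code point counting is an assumption of this sketch.
    static boolean fitsDeclaredLength(String value, int declaredLength) {
        return value.codePointCount(0, value.length()) <= declaredLength;
    }

    public static void main(String[] args) {
        System.out.println(layoutFor(1));
        System.out.println(layoutFor(32));
        System.out.println(fitsDeclaredLength("abc", 2));
    }
}
```

Because the length check lives only on the Trino side, data written to the same array by other TileDB clients may exceed the declared length.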
Decimal types are currently treated as doubles. TileDB does not enforce the precision or scale of the decimal types.
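To see why treating decimals as doubles matters, the following sketch checks whether a decimal value survives a round trip through a Java `double`. The class and method names are hypothetical and this is not the connector's conversion code; it only uses `java.math.BigDecimal` from the JDK.

```java
import java.math.BigDecimal;

// Illustration only: which decimal values survive being stored as a double.
class DecimalAsDouble {
    // Returns true when converting to double and back yields the same number.
    static boolean survivesDouble(BigDecimal value) {
        BigDecimal roundTripped = new BigDecimal(value.doubleValue());
        return roundTripped.compareTo(value) == 0;
    }

    public static void main(String[] args) {
        // 0.5 is a power of two and is exactly representable as a double.
        System.out.println(survivesDouble(new BigDecimal("0.5")));
        // 0.1 has no exact binary representation, so the stored value differs.
        System.out.println(survivesDouble(new BigDecimal("0.1")));
    }
}
```

Values that need exact decimal semantics, such as currency amounts, may therefore come back slightly different from what was inserted.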
CREATE TABLE is supported, but only a limited subset of TileDB parameters can be set:
- No support for creating encrypted arrays
- No support for setting custom filters on attributes, coordinates or offsets
The current split implementation is naive: it splits domains evenly, taking
the ranges either from user-defined predicates (WHERE clause) or from the
non-empty domains. This even splitting will likely produce suboptimal splits
for sparse domains. Future work will move splitting into core TileDB, where
better heuristics will be used to produce even splits.
For now, if splits are highly uneven, consider increasing the number of splits
via the tiledb.splits session parameter or adding WHERE clauses to limit the
data set to non-empty regions of the array.
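Even splitting of a one-dimensional integer domain can be sketched as follows. This is an assumption-laden illustration rather than the connector's actual splitting code: the class `EvenSplitSketch` and the method `evenSplits` are hypothetical, the domain is a single inclusive range, and the range is assumed small enough that `hi - lo + 1` does not overflow.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: divide an inclusive range into near-equal sub-ranges.
class EvenSplitSketch {
    // Returns at most `splits` contiguous {start, end} pairs covering [lo, hi].
    static List<long[]> evenSplits(long lo, long hi, int splits) {
        if (splits < 1 || hi < lo) {
            throw new IllegalArgumentException("need splits >= 1 and lo <= hi");
        }
        long width = hi - lo + 1;               // number of cells in the range
        int n = (int) Math.min(splits, width);  // never more splits than cells
        long base = width / n;
        long remainder = width % n;             // first `remainder` splits get +1
        List<long[]> ranges = new ArrayList<>();
        long start = lo;
        for (int i = 0; i < n; i++) {
            long size = base + (i < remainder ? 1 : 0);
            ranges.add(new long[] {start, start + size - 1});
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] r : evenSplits(1, 100, 4)) {
            System.out.println(r[0] + ".." + r[1]);
        }
    }
}
```

The ranges are equal in extent, not in data volume. If most cells of a sparse array fall in one sub-range, that split does most of the work, which is why raising the split count or narrowing the range with a WHERE clause can help.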