johnkerl changed the title from "[r/python] Standardize mappings of native types, Arrow types, and TileDB types" to "[r/python] Standardize mappings of native, Arrow, and TileDB types" on Dec 20, 2024
As described in #2698, the R and Python APIs don't always write out to the same TileDB type, leading to cases where a `character` in R could become a `bytes` in Python instead of a `str`. This issue tracks the various types in the high-level languages and their mappings in Arrow and libtiledbsoma.

Examples showcasing string/character/`bytes` inputs: https://gist.github.com/mojaveazure/ce7b10447b85de1f360218b173f4516f

Python `str` and R `character`

- Target TileDB type: `TILEDB_STRING_UTF8`
- R wrote group-level string metadata as `TILEDB_STRING_ASCII`/`TILEDB_CHAR`, which mapped to Python `bytes` (resolved in [r] Write group-level string metadata as `TILEDB_STRING_UTF8` #3469)
- R allows writing `character` to sparse and dense arrays: [r] Disallow writing `character` and `raw` arrays #3503
- Python raises `TypeError: Invalid pyarrow type string`: [python] Cryptic error messages when creating arrays of disallowed types #3504
- With `str`, sparse arrays fail when trying to create a Tensor for writing
- Python `str`, R `character`, and Python `bytes` are cast to large variants #3507

Python `bytes` and R `raw`

- Target TileDB type: `TILEDB_STRING_ASCII` or `TILEDB_CHAR` (equivalent types)
- Python writes `bytes` meta data as `TILEDB_STRING_UTF8`, resulting in these always being read back as `str` ([python] Group and array-level meta data written as `bytes` should be preserved as `bytes` #3502)
- R does not allow writing `raw`/`arrow::binary()` as meta data: [r] Allow writing `raw` as metadata for groups and arrays #3505
- R allows writing `raw` to sparse and dense arrays, but the values are cast to `arrow::uint8()`: [r] Disallow writing `character` and `raw` arrays #3503
- Python raises `TypeError: Invalid pyarrow type string`: [python] Cryptic error messages when creating arrays of disallowed types #3504
- With `bytes`, sparse arrays fail when trying to create a Tensor for writing
- Python `str`, R `character`, and Python `bytes` are cast to large variants #3507
- Arrow-R maps `raw` to `arrow::uint8()`; this needs to be handled at the R level to map to `arrow::binary()` instead: [r] Map `raw` to `arrow::binary()` #3506
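The intended end state can be summarized as a small lookup table. The sketch below is illustrative only: `TARGET_TILEDB_TYPE`, `PYTHON_READBACK`, and `python_readback` are hypothetical names made up for this example and are not part of tiledbsoma or libtiledbsoma. It encodes only the mappings stated in this issue: `str`/`character` target `TILEDB_STRING_UTF8` and read back as `str` in Python, while `bytes`/`raw` target `TILEDB_STRING_ASCII` (or the equivalent `TILEDB_CHAR`) and read back as `bytes`.

```python
# Hypothetical summary of the target mappings tracked in this issue.
# None of these names exist in tiledbsoma; they are for illustration only.
TARGET_TILEDB_TYPE = {
    ("python", "str"): "TILEDB_STRING_UTF8",
    ("r", "character"): "TILEDB_STRING_UTF8",
    # The issue treats TILEDB_CHAR as equivalent to TILEDB_STRING_ASCII.
    ("python", "bytes"): "TILEDB_STRING_ASCII",
    ("r", "raw"): "TILEDB_STRING_ASCII",
}

# Python type expected when a value of each TileDB type is read back.
PYTHON_READBACK = {
    "TILEDB_STRING_UTF8": str,
    "TILEDB_STRING_ASCII": bytes,
    "TILEDB_CHAR": bytes,
}


def python_readback(language: str, native_type: str) -> type:
    """Return the Python type a value written from `language` should read back as."""
    return PYTHON_READBACK[TARGET_TILEDB_TYPE[(language, native_type)]]


# Text written from either language should read back as `str` in Python.
assert python_readback("r", "character") is str
assert python_readback("python", "str") is str

# Binary data written from either language should read back as `bytes`.
assert python_readback("r", "raw") is bytes
assert python_readback("python", "bytes") is bytes
```

Under this table, the case described at the top of the issue (an R `character` read back as Python `bytes`) would count as a mismatch, because both `character` and `str` resolve to the same TileDB type.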