Could you describe an actual error or failure you have encountered as a result of this? Or even a reproducible example?
The following is an MRE showing that, because the writer and the reader transform the partition value differently, the file_uris method fails to find the correct file URIs. I used a boolean to show the error because the writer calls str(value).lower() for it, but the same MRE would work for any other type whose encoding differs between the two. Note that integers are not affected, since they are cast with str(value) in both the reader and the writer. Also note that .to_pyarrow_table() does not suffer from this bug.
import deltalake
import pyarrow as pa

fp = "./resources/mytable"
ta = pa.Table.from_pydict(
    {
        "bool_col": [True, False, True, False],
        "int_col": [0, 1, 2, 3],
        "str_col": ["a", "b", "c", "d"],
    }
)
deltalake.write_deltalake(fp, ta, partition_by=["bool_col", "int_col"], mode="overwrite")
dt = deltalake.DeltaTable(fp)

assert dt.to_pyarrow_table(
    filters=[
        ("int_col", "=", 0),
        ("bool_col", "=", True),
    ]
).num_rows == 1

assert len(dt.file_uris(
    partition_filters=[
        ("int_col", "=", 0),
        ("bool_col", "=", "true"),
    ]
)) == 1  # finds the actual partition because it looks for "bool_col=true"

assert len(dt.file_uris(
    partition_filters=[
        ("int_col", "=", 0),
        ("bool_col", "=", True),
    ]
)) == 1  # does not find any partition because it looks for "bool_col=True"
Environment
Delta-rs version: python v0.10.2
Bug
The Delta reader and writer encode partition values that are not strings differently. This is visible in:
delta-rs/python/deltalake/table.py
Lines 614 to 627 in a74589b
delta-rs/python/deltalake/writer.py
Lines 494 to 509 in a74589b
An easy fix would be to make the second function accessible by removing the double underscore, and to use it inside the reader.
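To make the proposal concrete, here is a minimal sketch of a shared encoder that both the writer and the reader could call. The names encode_partition_value and stringify_partition_filters are hypothetical, not the actual delta-rs functions, and the sketch only covers None, bool, str, int and float with scalar filter values; the bool branch mirrors the str(value).lower() behaviour of the writer described above.

```python
from typing import Any, List, Optional, Tuple


def encode_partition_value(val: Any) -> Optional[str]:
    """Hypothetical shared encoder; name and type coverage are assumptions."""
    if val is None:
        return None
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(val, bool):
        return str(val).lower()  # True -> "true", matching the writer
    if isinstance(val, str):
        return val
    if isinstance(val, (int, float)):
        return str(val)
    raise ValueError(f"Could not encode partition value: {val!r}")


def stringify_partition_filters(
    filters: List[Tuple[str, str, Any]],
) -> List[Tuple[str, str, Optional[str]]]:
    """Hypothetical reader-side helper: encode each scalar filter value."""
    return [(col, op, encode_partition_value(val)) for col, op, val in filters]


# The plain str() cast and the writer-style encoding disagree for booleans.
print(str(True), encode_partition_value(True))
print(stringify_partition_filters([("int_col", "=", 0), ("bool_col", "=", True)]))
```

With a helper like this applied to the partition filters before matching, the third assertion in the MRE would look for "bool_col=true", the same string the writer produced.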