Describe the enhancement requested
Reading a parquet file whose struct fields are stored in a different physical order than the requested schema currently fails.
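The reproduction script (example.py) referenced in the traceback is not included in the report; the following is a minimal sketch of the kind of read that triggers the failure. The outer column name (col0) and the temporary file path are hypothetical; the struct field names are taken from the error message below.

import pyarrow as pa
import pyarrow.parquet as pq

# Write a file whose struct fields are stored as (sub_column0, sub_column1).
written_type = pa.struct([("sub_column0", pa.int32()), ("sub_column1", pa.int32())])
table = pa.table({
    "col0": pa.array([{"sub_column0": 1, "sub_column1": 2}], type=written_type)
})
pq.write_table(table, "/tmp/example.parquet")

# Ask for the struct fields in the opposite order, as a table's declared schema
# might. The top-level column order matches; only the nested field order differs.
requested_type = pa.struct([("sub_column1", pa.int32()), ("sub_column0", pa.int32())])
requested_schema = pa.schema([("col0", requested_type)])

table_read = pq.read_table("/tmp/example.parquet", schema=requested_schema)

A read along these lines fails with: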
Traceback (most recent call last):
  File "/home/tomnewton/arrow/cpp/src/arrow/compute/example.py", line 30, in <module>
    table_read = pq.read_table(
  File "/home/tomnewton/.local/lib/python3.8/site-packages/pyarrow/parquet/core.py", line 1843, in read_table
    return dataset.read(columns=columns, use_threads=use_threads,
  File "/home/tomnewton/.local/lib/python3.8/site-packages/pyarrow/parquet/core.py", line 1485, in read
    table = self._dataset.to_table(
  File "pyarrow/_dataset.pyx", line 562, in pyarrow._dataset.Dataset.to_table
  File "pyarrow/_dataset.pyx", line 3804, in pyarrow._dataset.Scanner.to_table
  File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: struct fields don't match or are in the wrong order: Input fields: struct<sub_column0: int32, sub_column1: int32> output fields: struct<sub_column1: int32, sub_column0: int32>
Arrow can already cast top-level columns into a different order, but it can't do the same for struct fields.
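To make the asymmetry concrete, here is a minimal sketch at the Python level. The column and field names are placeholders, and the behaviour shown is as of the pyarrow version in the traceback above.

import pyarrow as pa
import pyarrow.compute as pc

# Top-level columns can already be re-ordered, e.g. when selecting (and the
# dataset scanner similarly reconciles top-level column order by name).
table = pa.table({"a": [1], "b": [2]})
print(table.select(["b", "a"]).schema)  # columns come back as b, a

# Struct fields: casting to the same fields in a different order is rejected.
struct_type = pa.struct([("sub_column0", pa.int32()), ("sub_column1", pa.int32())])
arr = pa.array([{"sub_column0": 1, "sub_column1": 2}], type=struct_type)
reordered_type = pa.struct([("sub_column1", pa.int32()), ("sub_column0", pa.int32())])

# Raises pyarrow.lib.ArrowTypeError:
#   struct fields don't match or are in the wrong order
pc.cast(arr, reordered_type)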
I previously completed a related issue, #44555, where it was agreed (see #44555 (comment)).
Therefore, by the same logic, I think adding support for re-ordering struct fields is also reasonable.
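For completeness, a hedged sketch of what such a cast would have to do, expressed as the manual workaround that is possible today (same placeholder field names as above):

import pyarrow as pa

struct_type = pa.struct([("sub_column0", pa.int32()), ("sub_column1", pa.int32())])
arr = pa.array([{"sub_column0": 1, "sub_column1": 2}], type=struct_type)

# Manually re-order the children; a struct cast that supports re-ordered
# fields would do the equivalent of this internally (plus carrying over the
# top-level validity bitmap, which from_arrays does not preserve here).
reordered = pa.StructArray.from_arrays(
    [arr.field(1), arr.field(0)],          # sub_column1, sub_column0
    names=["sub_column1", "sub_column0"],
)
print(reordered.type)  # struct<sub_column1: int32, sub_column0: int32>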
Again, my personal motivation is related to https://github.com/delta-io/delta-rs/. We want to be able to read any table, and Spark sometimes writes parquet files whose physical schema field order differs from the order in the table's schema.
Component(s)
C++