Based on discussions in iceberg-python/#584, we found that the Java Iceberg library "sanitizes" column names that contain special characters before writing them to Parquet.
For example, an Iceberg column named TEST:A1B2.RAW.ABC-GG-1-A is transformed into TEST_x3AA1B2_x2ERAW_x2EABC_x2DGG_x2D1_x2DA, and the transformed name is what gets written to the Parquet files.
The same transformation is applied on both reads and writes. The behavior was introduced in #601.
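To make the transformation concrete, here is a minimal Python sketch that reproduces the example above. It is inferred only from that one example (each of ":", ".", and "-" becomes "_x" followed by the character's uppercase hex code point) and is not the Java library's actual implementation. The function name sanitize_column_name is hypothetical, and the real rules may differ for cases the example does not cover, such as names that start with a digit or contain non-ASCII letters.

```python
def sanitize_column_name(name: str) -> str:
    """Sketch of the name sanitization, inferred from the example in this issue.

    Assumption: ASCII letters, ASCII digits, and underscores pass through
    unchanged; every other character is replaced by "_x" plus its code
    point in uppercase hexadecimal. This is not the library's real code.
    """
    parts = []
    for ch in name:
        if ch.isascii() and (ch.isalnum() or ch == "_"):
            parts.append(ch)  # safe character, keep as-is
        else:
            parts.append(f"_x{ord(ch):X}")  # e.g. ":" -> "_x3A"
    return "".join(parts)


original = "TEST:A1B2.RAW.ABC-GG-1-A"
sanitized = sanitize_column_name(original)
print(sanitized)
```

Under these assumptions the sketch yields the sanitized name quoted in the example, which is why a reader that looks up the original name in the Parquet file will not find it.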
I think Iceberg should optionally allow writing column names without this sanitization. It could be made configurable, with the current behavior kept as the default for backward compatibility.
Query engine
None
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.