Please describe why this is necessary.
Today, data skipping predicates reference logical column names but are evaluated as-is against both Delta stats and Parquet row group stats, both of which use physical column names. When a table uses column mapping, the predicates are useless because they reference columns that appear not to exist. We currently don't validate that a referenced column actually exists (which is probably a bug in its own right), so the problem silently manifests as a complete lack of data skipping.
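To illustrate the failure mode, here is a minimal sketch, not the actual kernel code. It assumes a hypothetical `MaxStats` map holding per-file max values keyed by physical column name, and a hypothetical `can_skip_gt` helper for the predicate `column > threshold`. The physical name `col-a1b2` is made up for the example.

```rust
use std::collections::HashMap;

/// Hypothetical per-file stats: max value of each column, keyed by PHYSICAL name.
type MaxStats = HashMap<String, i64>;

/// Returns true if a file can be skipped for the predicate `column > threshold`.
/// A column that is absent from the stats yields "cannot skip". That default is
/// safe, but it also hides the fact that the column name was wrong.
fn can_skip_gt(stats: &MaxStats, column: &str, threshold: i64) -> bool {
    match stats.get(column) {
        // No row can satisfy `column > threshold` if even the max is too small.
        Some(max) => *max <= threshold,
        // Unknown column: no stats to reason about, so keep the file.
        None => false,
    }
}
```

With stats recorded under `col-a1b2`, a lookup by the logical name (say, `id`) finds nothing, so every file is kept and no error is reported.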
Describe the functionality you are proposing.
We need to rewrite logical column name references in data skipping predicates to use their physical counterparts.
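One possible shape for the rewrite, as a sketch under stated assumptions: `Pred` is a hypothetical, heavily simplified predicate tree (not the real expression type), columns are flat names rather than nested paths, and the logical-to-physical mapping is assumed to be available as a plain `HashMap`. Returning `None` for an unmapped column is an illustrative choice that would also surface the missing validation mentioned above.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified predicate tree for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Pred {
    Column(String),
    Literal(i64),
    Lt(Box<Pred>, Box<Pred>),
    And(Box<Pred>, Box<Pred>),
}

/// Rewrites every logical column reference in `pred` to its physical name.
/// Returns `None` if any referenced column has no entry in `mapping`, so an
/// unknown column is detected instead of silently disabling data skipping.
fn to_physical(pred: &Pred, mapping: &HashMap<String, String>) -> Option<Pred> {
    Some(match pred {
        Pred::Column(logical) => Pred::Column(mapping.get(logical)?.clone()),
        Pred::Literal(value) => Pred::Literal(*value),
        Pred::Lt(left, right) => Pred::Lt(
            Box::new(to_physical(left, mapping)?),
            Box::new(to_physical(right, mapping)?),
        ),
        Pred::And(left, right) => Pred::And(
            Box::new(to_physical(left, mapping)?),
            Box::new(to_physical(right, mapping)?),
        ),
    })
}
```

The rewritten predicate could then be evaluated against Delta stats and Parquet row group stats unchanged, since both are keyed by physical names.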
Additional context
No response