fix: schema adapter doesn't map partial batches correctly #2735
Description
When pushdown_filters is enabled on the DataFusion SessionConfig for Parquet, map_partial_batch on the schema adapter is called with the batch used to evaluate the RowFilter that pushdown_filters creates. That batch contains only the column(s) referenced by the RowFilter's predicate. If table_schema contains a non-nullable column that is not in the batch, the cast fails.
Example error:
This change uses only those columns from table_schema that also exist in the batch.
See https://docs.rs/datafusion/latest/datafusion/datasource/schema_adapter/trait.SchemaMapper.html#tymethod.map_partial_batch
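As a rough illustration of the idea (not the code in this PR), the sketch below models a schema as a list of fields and keeps only the table_schema fields that are present in the partial batch. The `Field` struct, the `field` helper, and `project_to_batch` are hypothetical stand-ins and do not use the real Arrow or DataFusion types.

```rust
// Simplified stand-in for an Arrow field; hypothetical, not the real arrow type.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

// Hypothetical convenience constructor for the example.
fn field(name: &str, nullable: bool) -> Field {
    Field { name: name.to_string(), nullable }
}

/// Keep only the table-schema fields whose names appear in the batch,
/// preserving table-schema order. Non-nullable columns that the batch
/// does not carry are dropped instead of being cast from missing data.
fn project_to_batch(table_schema: &[Field], batch_columns: &[&str]) -> Vec<Field> {
    table_schema
        .iter()
        .filter(|f| batch_columns.contains(&f.name.as_str()))
        .cloned()
        .collect()
}

fn main() {
    let table_schema = vec![
        field("id", false), // non-nullable, not part of the predicate
        field("name", true),
        field("ts", false), // non-nullable, not part of the predicate
    ];
    // With filter pushdown, the batch holds only the predicate column(s).
    let batch_columns = ["name"];
    let projected = project_to_batch(&table_schema, &batch_columns);
    println!("{:?}", projected);
}
```

Casting the partial batch against the projected schema, rather than the full table_schema, means the non-nullable `id` and `ts` columns in this example are never required of a batch that does not contain them.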
Related Issue(s)
Documentation