fix: correct map field names #2182

Merged
merged 3 commits into from
Feb 14, 2024

Conversation

emcake
Contributor

@emcake emcake commented Feb 11, 2024

Description

This changes the field names inside map types to `key` and `value` instead of `keys` and `values`. This matches the Parquet spec for encoding map columns: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#maps

I found this while trying to load a checkpoint from a table, which was failing with spurious errors. PyArrow is tolerant of this to a degree, but other readers like parquet4s (used in e.g. Delta Standalone) took issue with it.
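
For illustration, here is a minimal sketch of the three-level map layout described in the linked Parquet spec section, rendered as schema text. The helper names `parquet_map_schema` and `has_spec_field_names` are hypothetical and are not part of delta-rs or any Parquet library; the outer group is assumed to be `optional`, and the column name and physical types in the usage line are placeholders.

```python
def parquet_map_schema(name, key_type, value_type, value_required=False):
    """Render a MAP-annotated group as Parquet schema text.

    Hypothetical helper for illustration only. The layout follows the
    Parquet LogicalTypes description of maps: an outer group annotated
    with MAP, a repeated inner group named key_value, a required `key`
    field and a `value` field.
    """
    value_repetition = "required" if value_required else "optional"
    return "\n".join([
        f"optional group {name} (MAP) {{",  # assumption: outer group is optional
        "  repeated group key_value {",
        f"    required {key_type} key;",    # map keys must be required
        f"    {value_repetition} {value_type} value;",
        "  }",
        "}",
    ])


def has_spec_field_names(entry_field_names):
    """Return True if the map entry fields are named exactly key, value."""
    return list(entry_field_names) == ["key", "value"]


if __name__ == "__main__":
    # Placeholder column name and types, chosen only for the example.
    print(parquet_map_schema("configuration", "binary", "binary"))
```

With this sketch, `has_spec_field_names(["keys", "values"])` is false, which corresponds to the naming that stricter readers rejected before this change.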

@github-actions github-actions bot added the binding/rust Issues for the Rust crate label Feb 11, 2024
@ion-elgreco ion-elgreco merged commit a5a2cca into delta-io:main Feb 14, 2024
20 checks passed
@roeap
Collaborator

roeap commented Feb 14, 2024

@emcake @ion-elgreco - sorry for being late to the discussion. Just to create awareness: there is no "right" way for these names, AFAIK. Different Parquet writers and readers behave differently, and different Arrow implementations also take different positions on this.

So "right" only applies to a given implementation.

I have some work queued to try to solve some of this more generally, but the changes in this PR may not be long-lived, depending on which implementation we end up aligning with.
