fix: correct map field names (#2182)
# Description
This changes the field names inside map types to `key` and `value`
instead of `keys` and `values`. This matches the Parquet spec for
encoding map columns:
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#maps
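
To make the naming rule concrete, here is a minimal, stdlib-only sketch. The `MapFieldNames` struct and the `has_spec_field_names` helper are hypothetical names made up for illustration; they are not part of delta-rs or arrow-rs. The string values mirror the constants touched by this commit.

```rust
// Hypothetical model of the names used for a map column's nested fields.
// Illustrative only: not a delta-rs or arrow-rs type.
#[derive(Debug, PartialEq)]
struct MapFieldNames {
    root: &'static str,
    key: &'static str,
    value: &'static str,
}

// Names written before this commit.
const OLD_NAMES: MapFieldNames = MapFieldNames {
    root: "entries",
    key: "keys",
    value: "values",
};

// Names written after this commit.
const NEW_NAMES: MapFieldNames = MapFieldNames {
    root: "entries",
    key: "key",
    value: "value",
};

/// Returns true when the two inner fields use the singular `key` and
/// `value` names that the Parquet spec gives for map columns.
fn has_spec_field_names(names: &MapFieldNames) -> bool {
    names.key == "key" && names.value == "value"
}

fn main() {
    // The old plural names fail the check; the new singular names pass.
    assert!(!has_spec_field_names(&OLD_NAMES));
    assert!(has_spec_field_names(&NEW_NAMES));
    println!("{:?}", NEW_NAMES);
}
```

A strict reader could apply a check along these lines to inner field names, which would explain why the plural names are rejected by some readers and accepted by more lenient ones.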

I found this while trying to load a checkpoint from a table, which was
failing with spurious errors. PyArrow tolerates the old names to a degree,
but other readers such as parquet4s (used in, e.g., Delta Standalone) took
issue with them.

---------

Co-authored-by: R. Tyler Croy <[email protected]>
emcake and rtyler authored Feb 14, 2024
1 parent dcb5fc3 commit a5a2cca
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions crates/core/src/kernel/arrow/mod.rs
@@ -14,8 +14,8 @@ pub(crate) mod extract;
 pub(crate) mod json;

 const MAP_ROOT_DEFAULT: &str = "entries";
-const MAP_KEY_DEFAULT: &str = "keys";
-const MAP_VALUE_DEFAULT: &str = "values";
+const MAP_KEY_DEFAULT: &str = "key";
+const MAP_VALUE_DEFAULT: &str = "value";
 const LIST_ROOT_DEFAULT: &str = "item";

 impl TryFrom<ActionType> for ArrowField {
@@ -846,9 +846,9 @@ mod tests {
         let entry_offsets_buffer = Buffer::from(entry_offsets.to_byte_slice());
         let keys_data = StringArray::from_iter_values(keys);

-        let keys_field = Arc::new(Field::new("keys", ArrowDataType::Utf8, false));
+        let keys_field = Arc::new(Field::new("key", ArrowDataType::Utf8, false));
         let values_field = Arc::new(Field::new(
-            "values",
+            "value",
             values.data_type().clone(),
             values.null_count() > 0,
         ));