Skip to content

Commit

Permalink
ARROW-10042: [Rust] Fix tests involving ArrayData/Buffer equality
Browse files Browse the repository at this point in the history
The tests added, unignored, and modified in this PR should illustrate the behavior I'm going for.

This gets 2 ignored Parquet -> Arrow roundtrip tests to pass because they were only failing due to differences in their `ArrayData`'s `BufferData` capacities.

This leaves `BufferData`'s `PartialEq` implementation as-is (that is, taking the equality of `capacity` into account) and does not modify any of the tests in rust/arrow/src/array/{array, builder, union}.rs or in rust/arrow/src/compute/{kernels/cast, kernels/filter, util}.rs that are testing the internal implementation of `Buffer`.

This *does* change `PartialEq` for `ArrayData` to [not assert to fix the problems noted](#8546 (comment)) by @jorgecarleitao .

This has the downside of making test failures where `ArrayData`s should be equal but aren't harder to interpret because the `assert_eq!`s aren't as granular. We can't use `debug_assert_eq!` either, because then we can't write tests that do `assert_ne!(array_data1, array_data2)` (I have included an example of such a test). I think to solve this downside, we should add a function only used in tests that does the more granular assertions. I am leaving that to a future PR.

Closes #8590 from carols10cents/bufferdata-eq

Authored-by: Carol (Nichols || Goulding) <[email protected]>
Signed-off-by: alamb <[email protected]>
  • Loading branch information
carols10cents authored and alamb committed Nov 10, 2020
1 parent 18b9281 commit 14501b2
Show file tree
Hide file tree
Showing 3 changed files with 19 additions and 16 deletions.
7 changes: 7 additions & 0 deletions rust/arrow/src/array/data.rs
Original file line number Diff line number Diff line change
Expand Up @@ -446,4 +446,11 @@ mod tests {
assert_eq!(2, new_data.offset());
assert_eq!(data.null_count() - 1, new_data.null_count());
}

#[test]
fn test_equality() {
let int_data = ArrayData::builder(DataType::Int32).build();
let float_data = ArrayData::builder(DataType::Float32).build();
assert_ne!(int_data, float_data);
}
}
14 changes: 11 additions & 3 deletions rust/arrow/src/buffer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -845,7 +845,7 @@ mod tests {
#[test]
fn test_buffer_data_equality() {
let buf1 = Buffer::from(&[0, 1, 2, 3, 4]);
let mut buf2 = Buffer::from(&[0, 1, 2, 3, 4]);
let buf2 = Buffer::from(&[0, 1, 2, 3, 4]);
assert_eq!(buf1, buf2);

// slice with same offset should still preserve equality
Expand All @@ -854,12 +854,20 @@ mod tests {
let buf4 = buf2.slice(2);
assert_eq!(buf3, buf4);

// Different capacities should still preserve equality
let mut buf2 = MutableBuffer::new(65);
buf2.write_all(&[0, 1, 2, 3, 4])
.expect("write should be OK");

let buf2 = buf2.freeze();
assert_eq!(buf1, buf2);

// unequal because of different elements
buf2 = Buffer::from(&[0, 0, 2, 3, 4]);
let buf2 = Buffer::from(&[0, 0, 2, 3, 4]);
assert_ne!(buf1, buf2);

// unequal because of different length
buf2 = Buffer::from(&[0, 1, 2, 3]);
let buf2 = Buffer::from(&[0, 1, 2, 3]);
assert_ne!(buf1, buf2);
}

Expand Down
14 changes: 1 addition & 13 deletions rust/parquet/src/arrow/arrow_writer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -911,17 +911,7 @@ mod tests {
let expected_data = expected_batch.column(i).data();
let actual_data = actual_batch.column(i).data();

assert_eq!(expected_data.data_type(), actual_data.data_type());
assert_eq!(expected_data.len(), actual_data.len());
assert_eq!(expected_data.null_count(), actual_data.null_count());
assert_eq!(expected_data.offset(), actual_data.offset());
assert_eq!(expected_data.buffers(), actual_data.buffers());
assert_eq!(expected_data.child_data(), actual_data.child_data());
// Null counts should be the same, not necessarily bitmaps
// A null bitmap is optional if an array has no nulls
if expected_data.null_count() != 0 {
assert_eq!(expected_data.null_bitmap(), actual_data.null_bitmap());
}
assert_eq!(expected_data, actual_data);
}
}

Expand Down Expand Up @@ -1198,7 +1188,6 @@ mod tests {
}

#[test]
#[ignore] // Binary support isn't correct yet - buffers don't match
fn binary_single_column() {
let one_vec: Vec<u8> = (0..SMALL_SIZE as u8).collect();
let many_vecs: Vec<_> = std::iter::repeat(one_vec).take(SMALL_SIZE).collect();
Expand All @@ -1209,7 +1198,6 @@ mod tests {
}

#[test]
#[ignore] // Large binary support isn't correct yet - buffers don't match
fn large_binary_single_column() {
let one_vec: Vec<u8> = (0..SMALL_SIZE as u8).collect();
let many_vecs: Vec<_> = std::iter::repeat(one_vec).take(SMALL_SIZE).collect();
Expand Down

0 comments on commit 14501b2

Please sign in to comment.