(DEMO) tests pass when validate_full is forced #987

Closed
wants to merge 5 commits into from

@alamb alamb commented Nov 30, 2021

Built on #921

This PR demonstrates that most of the tests pass if I temporarily force ArrayData::new_unchecked() to check all inputs, which gives me additional confidence that validate_full() is validating inputs as it is supposed to.

The only test that fails is related to the arrow_reader not having its nullability set correctly, which I think is a bug in the array reader and which I will file separately.
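To illustrate the approach, here is a minimal sketch of temporarily forcing an "unchecked" constructor to run full validation. It does not use the arrow crate: `ToyArrayData`, `FORCE_VALIDATION`, and the offset rules below are hypothetical stand-ins chosen for illustration, and the real `ArrayData::validate_full()` checks far more than this.

```rust
/// Hypothetical demo switch: when true, the "unchecked" constructor validates anyway.
const FORCE_VALIDATION: bool = true;

/// Toy stand-in for a variable-length array description (NOT arrow's `ArrayData`).
#[derive(Debug, Clone, PartialEq)]
pub struct ToyArrayData {
    len: usize,
    offsets: Vec<i32>,
    values: Vec<u8>,
}

impl ToyArrayData {
    /// Full validation (assumed rules): `len + 1` offsets, non-negative,
    /// non-decreasing, and never pointing past the end of `values`.
    pub fn validate_full(&self) -> Result<(), String> {
        if self.offsets.len() != self.len + 1 {
            return Err(format!(
                "expected {} offsets, got {}",
                self.len + 1,
                self.offsets.len()
            ));
        }
        let mut prev = 0i32;
        for (i, &off) in self.offsets.iter().enumerate() {
            if off < prev {
                return Err(format!("offset {} at index {} is below {}", off, i, prev));
            }
            if off as usize > self.values.len() {
                return Err(format!(
                    "offset {} at index {} exceeds values length {}",
                    off,
                    i,
                    self.values.len()
                ));
            }
            prev = off;
        }
        Ok(())
    }

    /// Checked constructor: always validates.
    pub fn try_new(len: usize, offsets: Vec<i32>, values: Vec<u8>) -> Result<Self, String> {
        let data = Self { len, offsets, values };
        data.validate_full()?;
        Ok(data)
    }

    /// "Unchecked" constructor. Normally it would skip validation; with the demo
    /// switch on, every caller is validated, so a test suite that still passes
    /// shows the validation accepts everything the existing code constructs.
    pub unsafe fn new_unchecked(len: usize, offsets: Vec<i32>, values: Vec<u8>) -> Self {
        let data = Self { len, offsets, values };
        if FORCE_VALIDATION {
            data.validate_full().expect("forced validation failed");
        }
        data
    }
}
```

With the switch enabled, any call site that passes inconsistent buffers to `new_unchecked` panics, which is how a forced check can surface either a latent bug in a caller or an over-strict rule in the validator.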

@github-actions github-actions bot added the arrow Changes to the arrow crate label Nov 30, 2021
}
};

// Bug discovery mechanism: call validate_full here

This is the demo: it is not for merging, but for demonstrating that the tests pass even when we run validate_full() on them.

@alamb alamb changed the title (DEMO) All tests pass when validate_full is forced (DEMO) tests pass when validate_full is forced Nov 30, 2021
@alamb alamb force-pushed the alamb/remove_special_checks branch from ca5a429 to 72906d5 Compare November 30, 2021 21:40
@github-actions github-actions bot added the parquet Changes to the parquet crate label Nov 30, 2021
@@ -670,6 +670,7 @@ mod tests {
}

#[test]
#[ignore]

@alamb alamb Nov 30, 2021

This test fails as follows:

---- arrow::arrow_reader::tests::test_read_maps stdout ----
thread 'arrow::arrow_reader::tests::test_read_maps' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidArgumentError("Child type mismatch for Struct([Field { name: \"key\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Map(Field { name: \"key_value\", data_type: Struct([Field { name: \"key\", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, false), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: None }]). Expected Map(Field { name: \"key_value\", data_type: Struct([Field { name: \"key\", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, false) but child data had Map(Field { name: \"key_value\", data_type: Struct([Field { name: \"key\", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: None }, false)")', arrow/src/array/data.rs:308:34
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

(The issue is that the nullability of the child field is declared differently on the Struct's datatype than on the child ArrayData, which I think is an actual bug in the array reader. I will file a follow-on ticket.)
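The mismatch can be reproduced in miniature. The sketch below uses hypothetical toy types (`ToyType`, `ToyField`, `ToyData`, `validate_children`), not the arrow crate's `DataType`, `Field`, or `ArrayData`, and it assumes that the check is a plain equality comparison between the type declared on the parent Struct and the type carried by the child data. Under that assumption, a Map whose `key_value` field differs only in `nullable` is enough to fail the comparison.

```rust
/// Toy data types (hypothetical; a tiny subset modelled for illustration).
#[derive(Debug, Clone, PartialEq)]
pub enum ToyType {
    Int32,
    Boolean,
    Utf8,
    Map(Box<ToyField>),
    Struct(Vec<ToyField>),
}

/// Toy field: a name, a type, and a nullability flag that takes part in equality.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyField {
    pub name: String,
    pub data_type: ToyType,
    pub nullable: bool,
}

/// Toy array data: the type it claims to be plus its child data.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyData {
    pub data_type: ToyType,
    pub children: Vec<ToyData>,
}

/// Convenience constructor for a field.
pub fn field(name: &str, data_type: ToyType, nullable: bool) -> ToyField {
    ToyField { name: name.to_string(), data_type, nullable }
}

/// Builds Map<Int32, Boolean>; only the nullability of `key_value` varies.
pub fn map_type(entries_nullable: bool) -> ToyType {
    let entries = ToyType::Struct(vec![
        field("key", ToyType::Int32, false),
        field("value", ToyType::Boolean, false),
    ]);
    ToyType::Map(Box::new(field("key_value", entries, entries_nullable)))
}

/// For a Struct, each child's type must equal the type declared on the
/// corresponding field, including nested nullability flags.
pub fn validate_children(data: &ToyData) -> Result<(), String> {
    let fields = match &data.data_type {
        ToyType::Struct(fields) => fields,
        _ => return Ok(()), // only Struct parents are checked in this sketch
    };
    if fields.len() != data.children.len() {
        return Err(format!(
            "expected {} children, got {}",
            fields.len(),
            data.children.len()
        ));
    }
    for (f, child) in fields.iter().zip(&data.children) {
        if f.data_type != child.data_type {
            return Err(format!(
                "Child type mismatch for field {:?}: expected {:?} but child data had {:?}",
                f.name, f.data_type, child.data_type
            ));
        }
    }
    Ok(())
}
```

Declaring the Struct's `value` field as `map_type(false)` while supplying child data typed `map_type(true)` makes `validate_children` return an error, mirroring the shape of the failure above. Making both sides agree on nullability lets the check pass.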

@codecov-commenter

Codecov Report

Merging #987 (72906d5) into master (6a6e7f7) will decrease coverage by 0.00%.
The diff coverage is 87.93%.

@@            Coverage Diff             @@
##           master     #987      +/-   ##
==========================================
- Coverage   82.31%   82.31%   -0.01%     
==========================================
  Files         168      168              
  Lines       48763    49008     +245     
==========================================
+ Hits        40139    40339     +200     
- Misses       8624     8669      +45     
| Impacted Files | Coverage Δ |
|---|---|
| parquet/src/arrow/arrow_reader.rs | 88.88% <ø> (-0.44%) ⬇️ |
| arrow/src/array/data.rs | 82.37% <85.88%> (+2.57%) ⬆️ |
| arrow/src/array/array_binary.rs | 93.20% <100.00%> (-0.22%) ⬇️ |
| arrow/src/array/array_boolean.rs | 94.48% <100.00%> (-0.05%) ⬇️ |
| arrow/src/array/array_dictionary.rs | 88.75% <100.00%> (+0.28%) ⬆️ |
| arrow/src/array/array_list.rs | 94.46% <100.00%> (-1.07%) ⬇️ |
| arrow/src/array/array_primitive.rs | 94.05% <100.00%> (-0.04%) ⬇️ |
| arrow/src/array/array_string.rs | 97.08% <100.00%> (-0.83%) ⬇️ |
| arrow/src/buffer/immutable.rs | 97.84% <100.00%> (ø) |
| parquet/src/arrow/array_reader.rs | 75.69% <0.00%> (-1.07%) ⬇️ |

... and 8 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

alamb commented Dec 4, 2021

This PR has served its purpose. RIP

@alamb alamb closed this Dec 4, 2021
@alamb alamb deleted the alamb/remove_special_checks branch December 4, 2021 15:21