@xiaoxmeng @mbasmanova @pedroerp @czentgr @majetideepak @yingsu00 @aditi-pandit @FelixYBW @hitarth @jaystarshot
-
Can you list the PRs and their status somewhere we can track them? We can test the Spark UTs with complex data types enabled once all current PRs are merged. Later we can test with the Parquet files from the parquet-mr UTs.
-
@qqibrow What's blocking this solution? Why can't we implement it now?
-
@mbasmanova We had a sync back at VeloxCon. It seems no group has the bandwidth right now. Also, even after the fuzzer approach is implemented, we will still need this method for validation.
-
Groups from Pinterest, Uber, IBM, and Intel had a discussion about complex type support in Parquet:
Ying has created #8103 to track the long-term solution. I don’t have a rough estimate yet, but it will take some time to support all the features supported in the Presto Parquet test framework. More details can be found in the issue thread. Here are the pros and cons:
Presto Unit Test:
Pros:
Cons:
Fuzzer Test:
Pros:
Cons:
LMKWDYT @mbasmanova @FelixYBW @yingsu00
-
Overview
We’ve developed a test solution that tests the Velox native Parquet reader using Presto unit tests. We want to discuss whether we can add it to CI in either the Velox or the Presto project.
Why
High level implementation
Testing result
Test Location Options
Long term Solution
The long-term solution for testing is to use a fuzzer to test Velox Parquet. Using Presto unit tests will be an interim solution until that is implemented.