Add one-level list encoding support in parquet reader #9848
Is this included in the output schema?
In ORC, the names of nested columns are generated from each column's index in its parent's list of children. This gives a uniform way to access the nested columns of lists, maps, and structs. I don't know enough about Parquet to tell whether the same logic can apply here.
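To make the index-based naming idea concrete, here is a minimal sketch in Python. The `Column` class and the `name_children_by_index` and `paths` helpers are hypothetical and exist only for illustration; they are not part of the reader under review or of any ORC or Parquet library, and they only model the scheme as described in this comment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Column:
    """Hypothetical schema node, used only for this illustration."""
    kind: str
    name: str = ""
    children: List["Column"] = field(default_factory=list)


def name_children_by_index(col: Column) -> Column:
    """Name every nested column after its position in the parent's child list."""
    for index, child in enumerate(col.children):
        child.name = str(index)
        name_children_by_index(child)
    return col


def paths(col: Column, prefix: str = "") -> List[str]:
    """Return the dotted path of the column and of all its descendants."""
    full = f"{prefix}.{col.name}" if prefix else col.name
    result = [full]
    for child in col.children:
        result.extend(paths(child, full))
    return result


# A top-level list column "a" whose element is a struct with two fields.
root = Column("list", "a", [
    Column("struct", children=[Column("int32"), Column("string")]),
])
name_children_by_index(root)
print(paths(root))
```

With this scheme a nested column is addressed purely by position (for example `a.0.1`), so the same lookup works whether the parent is a list, a map, or a struct.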
I could find very little spec information on this one-level list encoding, let alone on how to name the elements. I will check `parquet-cpp` to see how they handle it and remove the `TODO` once I find a proper fix, which may take some time and will probably land in a follow-up PR.