String-based path column projection #182

Closed
alamb opened this issue Apr 26, 2021 · 0 comments · Fixed by #6871
Labels
enhancement: Any new improvement worthy of a entry in the changelog
parquet: Changes to the parquet crate

Comments

Contributor

alamb commented Apr 26, 2021

Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-11618

There is currently no way to select a column by its path, e.g. 'a.b.c'. We have to select the column by its index, which is not trivial for nested structures.

For example, suppose a record has the following schema, where each field's path is shown in parentheses and each leaf column's index in square brackets:

{code}
schema:
  a [struct] ("a")
    b [struct] ("a.b")
      c [int32] ("a.b.c") [0]
      d [struct] ("a.b.d")
        e [int32] ("a.b.d.e") [1]
        f [bool] ("a.b.d.f") [2]
      g [int64] ("a.b.g") [3]
{code}

If one wants to select 'a.b.d', they need to know that it spans 2 leaf columns (indices 1 and 2). This is inconvenient, and can push readers to read whole records just to avoid working out the indices.

A string-based projection would allow one to select columns 1 and 2 via "a.b.d", or column 3 via "a.b.g".
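
To make the request concrete, here is a minimal sketch of how a dotted path could be resolved to leaf column indices. It is not the parquet crate's API: `resolve_projection` is a hypothetical helper, and it assumes the reader can already list every leaf column's full dotted path in index order. It also assumes field names contain no '.' themselves, which a real implementation would have to handle.

```rust
/// Hypothetical helper (not part of the parquet crate): returns the indices
/// of all leaf columns whose dotted path equals `selector` or is nested
/// beneath it. `leaf_paths` must be ordered by leaf column index.
fn resolve_projection(leaf_paths: &[&str], selector: &str) -> Vec<usize> {
    leaf_paths
        .iter()
        .enumerate()
        .filter(|(_, path)| {
            // Keep the leaf if it is the selected path itself, or if the
            // remainder starts with '.', meaning it sits under the selected
            // group. The '.' check stops "a.b" from matching "a.bc".
            path.strip_prefix(selector)
                .map_or(false, |rest| rest.is_empty() || rest.starts_with('.'))
        })
        .map(|(idx, _)| idx)
        .collect()
}
```

With the leaf paths from the schema above (`["a.b.c", "a.b.d.e", "a.b.d.f", "a.b.g"]`), the selector "a.b.d" resolves to indices 1 and 2, "a.b.g" to index 3, and "a.b" to all four leaves. The resulting indices could then be passed to whatever index-based projection the reader already supports.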

@alamb alamb added the arrow Changes to the arrow crate label Apr 26, 2021
@jorgecarleitao jorgecarleitao added parquet Changes to the parquet crate and removed arrow Changes to the arrow crate labels Apr 26, 2021
@jorgecarleitao jorgecarleitao changed the title [Parquet] String-based path column projection String-based path column projection Apr 26, 2021
@alamb alamb added the enhancement Any new improvement worthy of a entry in the changelog label Dec 18, 2024