Multiple files per partition for CSV Json and Avro exec plans #1122

Closed
rdettai opened this issue Oct 15, 2021 · 0 comments · Fixed by #1138
Labels
enhancement New feature or request

rdettai commented Oct 15, 2021

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently, only ParquetExec supports multiple files per partition. The CSV, JSON and Avro execution plans should work the same way.

Describe the solution you'd like
I will create a generic abstraction, FileStream, that converts a list of files into a single record batch stream, and then replace NdJsonStream, CsvStream and AvroStream with it.
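To make the idea concrete, here is a minimal sketch of what such an abstraction could look like. It is not the design from the eventual PR: the struct layout, the `open` parameter, the `Batch` alias and the `fake_open` helper are all hypothetical, and it uses a synchronous std `Iterator` to stay self-contained, whereas the real implementation would presumably be an async stream of Arrow record batches.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for an Arrow record batch (assumption for this sketch).
type Batch = Vec<String>;

/// Hypothetical sketch: yields the batches of several files, one file after
/// another, opening each file lazily with a format-specific `open` function.
struct FileStream<F, I>
where
    F: FnMut(&str) -> I,
    I: Iterator<Item = Batch>,
{
    files: VecDeque<String>, // files of this partition that are not opened yet
    open: F,                 // format-specific reader factory (CSV, JSON, Avro, ...)
    current: Option<I>,      // reader of the file currently being consumed
}

impl<F, I> FileStream<F, I>
where
    F: FnMut(&str) -> I,
    I: Iterator<Item = Batch>,
{
    fn new(files: Vec<String>, open: F) -> Self {
        Self { files: files.into(), open, current: None }
    }
}

impl<F, I> Iterator for FileStream<F, I>
where
    F: FnMut(&str) -> I,
    I: Iterator<Item = Batch>,
{
    type Item = Batch;

    fn next(&mut self) -> Option<Batch> {
        loop {
            if let Some(reader) = self.current.as_mut() {
                if let Some(batch) = reader.next() {
                    return Some(batch);
                }
                // The current file is exhausted; drop its reader.
                self.current = None;
            }
            // Open the next file, or end the stream when none are left.
            let path = self.files.pop_front()?;
            self.current = Some((self.open)(path.as_str()));
        }
    }
}

/// Hypothetical opener used for illustration: pretends every file holds
/// two single-row batches, except paths starting with "empty".
fn fake_open(path: &str) -> std::vec::IntoIter<Batch> {
    if path.starts_with("empty") {
        return Vec::new().into_iter();
    }
    vec![vec![format!("{}:0", path)], vec![format!("{}:1", path)]].into_iter()
}
```

With this shape, each format only has to supply its own `open` function, and the logic for walking the file list lives in one place. For example, `FileStream::new(vec!["a.csv".to_string(), "b.csv".to_string()], fake_open)` iterates over the batches of `a.csv` and then those of `b.csv`.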

Additional context
I will submit a PR for this as soon as #1120 is merged (in order to avoid conflicts on various abstractions).
