Allow PARQUET format for uploading data. #609
Would you consider trying nanoparquet instead? It's very new but has no dependencies, so we could add it to Imports.
I agree that nanoparquet is a good alternative because it has no dependencies. The only downside I see is that you would have to write to disk to read the raw data, as it lacks an output stream buffer implementation.
If you think the advantages outweigh the disadvantages, I could start testing and adapting the code.
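For reference, the write-to-disk workaround described above could look roughly like this. It is only a sketch: `df_to_parquet_raw()` is a hypothetical helper name, not code from this PR, and it assumes `nanoparquet::write_parquet()` is given a file path.

```r
# Hypothetical helper (not part of this PR): serialise a data frame to
# Parquet via a temporary file and return the file contents as a raw vector.
df_to_parquet_raw <- function(df) {
  path <- tempfile(fileext = ".parquet")
  # Remove the temporary file when the function exits, even on error.
  on.exit(unlink(path), add = TRUE)

  # nanoparquet writes to a file path, so we round-trip through disk.
  nanoparquet::write_parquet(df, path)

  # Read the whole file back as raw bytes, ready to use as an upload body.
  readBin(path, what = "raw", n = file.size(path))
}

bytes <- df_to_parquet_raw(data.frame(x = 1:3, y = c("a", "b", "c")))
```

The extra cost is one temporary file per upload, which is cleaned up as soon as the bytes are in memory.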
Ah, I didn't think about that, but I suspect it's still worthwhile given the lighter dependencies. Do you mind filing a nanoparquet issue to add a stream buffer output?
I will start adapting the code to use nanoparquet. If BufferedOutputStream support is added in the future, the code will need another update.
r-lib/nanoparquet#31
Can you leave these headings off? We add them as part of the final release process.