Feature request: add zstd compression support #1342
Comments
What do you think about this, @johnkerl? Thank you.
@aborruso for comparison let's first look at Now for
@aborruso can you check out head and try this? No worries if not; please let me know ... Also you can take a peek at head docs here:
Wow, it works great. I was already using zstd with the prepipe, but it seemed very convenient and important for Miller to support it natively and directly. Thank you very much.
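For readers unfamiliar with the prepipe workaround mentioned in the last comment, here is a minimal sketch of what such a command line could look like. It assumes Miller's `--prepipe` flag, which runs a decompression command with the input file on stdin, and `zstd -dc` (decompress to stdout). The file name `data.csv.zst`, the helper name, and the `head -n 5` verb are hypothetical and only for illustration; the exact command used in the thread is not shown.

```python
import shlex


def miller_zstd_prepipe_argv(path, verbs=("head", "-n", "5")):
    """Build an mlr argument list that decompresses `path` via `zstd -dc`.

    Hypothetical helper for illustration; not part of Miller.
    """
    # --prepipe takes the whole decompression command as one argument;
    # Miller feeds it the input file on stdin.
    return ["mlr", "--prepipe", "zstd -dc", "--icsv", "--opprint", *verbs, path]


argv = miller_zstd_prepipe_argv("data.csv.zst")  # hypothetical file name
# Print the equivalent shell command, with quoting applied where needed.
print(shlex.join(argv))
```

The list form can be passed directly to `subprocess.run(argv)` when both `mlr` and `zstd` are installed; the printed string is the same command as it would be typed in a shell.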
Miller is the data tool I use the most. Another tool that I use a lot is duckdb.
It supports zstd (and gzip) compressed CSV. zstd compression and decompression can be extremely fast: I can compress a 4.5 GB CSV file in 3 seconds (on a machine with 16 GB of RAM and a 12th Gen Intel(R) Core(TM) i7-1280P 2.00 GHz).
The output is a 160 MB compressed csv file.
And it's possible to run a duckdb SUMMARIZE on it in 8.5 seconds.
The CSV has 1745439 rows and 199 columns.
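To put the figures above in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: it assumes decimal units (1 GB = 1000 MB), which the numbers quoted above do not specify, and it treats the reported timings as exact.

```python
# Figures quoted in the issue; decimal units (1 GB = 1000 MB) are an assumption.
input_mb = 4.5 * 1000      # uncompressed CSV size
output_mb = 160            # zstd-compressed size
compress_seconds = 3       # reported compression time
summarize_seconds = 8.5    # reported duckdb SUMMARIZE time
rows = 1_745_439

ratio = input_mb / output_mb                     # size reduction factor
compress_mb_per_s = input_mb / compress_seconds  # input consumed per second
rows_per_s = rows / summarize_seconds            # rows summarized per second

print(f"compression ratio: about {ratio:.0f}x")
print(f"compression throughput: about {compress_mb_per_s:.0f} MB/s")
print(f"SUMMARIZE: about {rows_per_s:,.0f} rows/s")
```

Under these assumptions the file shrinks by a factor of roughly 28, and the compressor consumes input at about 1.5 GB per second.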
A big credit goes to duckdb, but part of the credit goes to this compression format.
I am opening this issue to ask whether zstd support could be enabled for compressed data in Miller.
Thank you