EXPORT: support option to compress output files using gzip #45579
Comments
I think we'll want an approach somewhat similar to #45326. We'll need to update the options here: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/export.go#L59 And then the translation into a distsql spec here: And then to do the actual compression, I think we'll want to wrap Finally, there are some tests in exportcsv_test.go that should provide a basis for adding a compression case to them. |
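For readers following along, the wrapping idea mentioned above can be sketched with the Go standard library. This is only a minimal illustration, not the code from the CockroachDB tree: it assumes the CSV writer targets an in-memory buffer, and `compressCSV` and `gunzip` are hypothetical helper names made up for this example.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/csv"
	"fmt"
	"io"
)

// compressCSV is a hypothetical helper: it layers a csv.Writer on top of a
// gzip.Writer, which in turn writes into an in-memory buffer.
func compressCSV(rows [][]string) ([]byte, error) {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	w := csv.NewWriter(gz)
	// WriteAll writes every record and flushes the csv writer into gz.
	if err := w.WriteAll(rows); err != nil {
		return nil, err
	}
	// Close flushes any remaining compressed data and writes the gzip footer.
	if err := gz.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// gunzip is a hypothetical helper that decompresses data for verification.
func gunzip(data []byte) (string, error) {
	r, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return "", err
	}
	defer r.Close()
	out, err := io.ReadAll(r)
	return string(out), err
}

func main() {
	data, err := compressCSV([][]string{{"k", "v"}, {"1", "one"}})
	if err != nil {
		panic(err)
	}
	text, err := gunzip(data)
	if err != nil {
		panic(err)
	}
	fmt.Print(text)
}
```

The ordering matters in a sketch like this: the CSV layer has to be flushed before the gzip layer is closed, otherwise buffered rows never reach the compressed stream.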
I would like to confirm my understanding of the expected outcome. First of all, as far as I understand, I need to plug into export.go, i.e.
Next, I need to add supporting logic into exportcsv.go the So some questions:
|
I'm not really sure why you need, but I might be missing something since I'm not familiar with the code base yet. Also, given that you only add an option in https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/export.go#L59, don't you also need to update the protobuf so you can read it later in https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/importccl/exportcsv.go#L100? |
It seems that if we add compression options, then there is no need to change anything in |
I believe this issue can be closed? @dt ? |
This commit extends EXPORT functionality by enabling compression of the exported stream, as suggested in cockroachdb#45579. Currently only gzip is supported, and the export clause to use compression looks as follows:

```
export into csv 's3://export.csv' with compression = gzip from select * from foo;
```

Signed-off-by: Artem Barger <[email protected]>

Release note (sql change): support option to compress output files using gzip

Release justification: none
45978: importccl: support option to compress output files using gzip r=dt a=C0rWin

Fix #45579

This commit extends EXPORT functionality by enabling compression of the exported stream, as suggested in #45579. Currently only gzip is supported, and the export clause to use compression looks as follows:

```
export into csv 's3://export.csv' with compression = gzip from select * from foo;
```

Signed-off-by: Artem Barger <[email protected]>

Release note (sql change): support option to compress output files using gzip

Co-authored-by: Artem Barger <[email protected]>
When running EXPORT, one option that may help produce smaller files and fewer writes is compression. This could be a simple option in the export statement:
export into csv 's3://export.csv' with delimiter = '|', compression = gzip from select * from kv;
This can save users from kicking off another process to compress data and give them smaller files to work with for faster pipeline processing. It seems like we already have something similar for imports: #26796