-
Notifications
You must be signed in to change notification settings - Fork 306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add compression option ZSTD. #1890
Conversation
Before we merge, there are a couple items I wanna investigate. |
tests/unit/test_enums.py
Outdated
# limitations under the License. | ||
|
||
|
||
def test_compression_enums(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder how useful this test is? Seems an awful lot like a change-detector test to me. I'd be fine adding the constant without adding the test.
Alternatively, maybe there's a system test we could write to make sure this is synced with the bigquery discovery document? But even then, compression isn't a true enum. The allowed values are only listed in the documentation string from what I can tell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove this test.
In terms of attempting to ensure that this matches the discovery docs... the terms referenced by the docs are present in the description, as you note, so we would need some mechanism to extract them from the discovery doc, which feels somewhat fragile (ie extract all words that are ALL CAPS and deduplicate them). Thoughts?
"JobConfigurationExtract": {
...
"properties": {
"compression": {
"description": "Optional. The compression type to use for exported files.
Possible values include DEFLATE, GZIP, NONE, SNAPPY, and ZSTD. The
default value is NONE. Not all compression formats are support for all
file formats. DEFLATE is only supported for Avro. ZSTD is only supported
for Parquet. Not applicable when extracting models.",
"type": "string"
},
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Without a structured representation of the allowed values, it's too fragile.
Based on a PR submitted by EthanSteinberg.
I added a test to confirm that the enum is correctly populated with only the allowed options.
For future me, including this link directly to the list of current export formats and the allowable compression types.
Closing Ethan's PR.