Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Default to zstd compression for Parquet in new tables #8593

Merged
merged 1 commit into from
Sep 21, 2023

Conversation

aokolnychyi
Copy link
Contributor

This PR is an updated and rebased version of #8299.

Assert.assertEquals(Collections.singletonMap("dummy", "test"), table.properties());
Map<String, String> expectedProperties = defaultProperties();
expectedProperties.put("dummy", "test");
Assert.assertEquals(expectedProperties, table.properties());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to have tests for specific catalogs, but I don't think that setting this property is a general behavior of all catalogs that we want to validate. I'd remove these tests because this is up to the catalog implementation and it is fine if a catalog implementation doesn't apply this default. It can still be a valid implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting to modify all tests to ensure dummy key is present instead of validating the actual set of properties?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Tests like these should ensure that the property passed in is present in properties, not that we have the exact set of properties.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the tests should assert that the expected properties are a subset of actual properties and should not expect or assert that there are any specific default properties added. That's the basic contract here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Updated tests.

@aokolnychyi aokolnychyi force-pushed the zstd-default branch 2 times, most recently from 22f9039 to a10d3c5 Compare September 20, 2023 21:32
@rdblue rdblue merged commit 2e291c2 into apache:master Sep 21, 2023
47 checks passed
@rdblue
Copy link
Contributor

rdblue commented Sep 21, 2023

Thanks, @szehon-ho, @aokolnychyi, and @kbendick!

@Fokko
Copy link
Contributor

Fokko commented Sep 28, 2023

@Fokko
Copy link
Contributor

Fokko commented Sep 29, 2023

The write.parquet.compression-codec property is also set if the file-format is Avro 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants