PARQUET-2333: Support bzip2 and xz compression in the to-avro subcommand #1131

Merged 1 commit into apache:master on Aug 16, 2023

@sekikn (Contributor) commented Aug 5, 2023

Make sure you have checked all steps below.

Jira

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

I've added the following tests to ToAvroCommandTest:

  • testToAvroCommandWithBzip2Compression
  • testToAvroCommandWithXzCompression

Commits

  • My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All public functions and classes in the PR contain Javadoc that explains what they do

@@ -110,6 +110,12 @@
<artifactId>avro</artifactId>
<version>${avro.version}</version>
</dependency>
<dependency>
<groupId>org.tukaani</groupId>
@wgtmac (Member) commented on this line:

Does it have any license issues? And what about its dependencies?

@sekikn (Contributor, Author) replied:

I don't think there is a license issue: its Java implementation is in the public domain, which falls under Category A. It also doesn't appear to have any dependencies of its own.

For precedent, Druid, HBase, and Spark already include it:
https://github.com/apache/druid/blob/druid-27.0.0/pom.xml#L527-L531
https://github.com/apache/hbase/blob/rel/2.5.5/pom.xml#L1470-L1474
https://github.com/apache/spark/blob/v3.4.1/pom.xml#L1489-L1493

@wgtmac (Member) left a comment

Thanks for confirming!

@wgtmac merged commit f8465a2 into apache:master on Aug 16, 2023
9 checks passed