PARQUET-2361: Reduce failure rate of unit test testParquetFileWithBlo… #1167

Closed
wants to merge 1 commit into from

fengjiajie
Contributor

@fengjiajie fengjiajie commented Oct 4, 2023

…omFilterWithFpp

Change-Id: Ic230f197b0996333a082bb05bd201963d05d862e

[INFO] Results:
[INFO] 
Error:  Failures: 
Error:    TestParquetWriter.testParquetFileWithBloomFilterWithFpp:342
[INFO]  

Several different PRs and CI runs have hit this failure:

  1. Bump jmh.version from 1.21 to 1.36 #1062
  2. https://github.com/apache/parquet-mr/actions/runs/5420924489/job/14680382407
  3. https://github.com/apache/parquet-mr/actions/runs/6336014897
  4. https://github.com/apache/parquet-mr/actions/runs/6381223319
  5. https://github.com/apache/parquet-mr/actions/runs/6394826826/job/17357106390

The unit test generates its test data from random strings without a fixed seed. It expects the number of false positives reported by the Bloom filter to be consistent with the configured false positive probability (FPP). With only a small number of checks, the observed count can deviate from that expectation by chance, which makes the test fail intermittently. A simple fix is therefore to increase the number of checks performed against the Bloom filter, so that the observed false positive rate stays closer to the configured value. I did not switch to a fixed seed because that would make the test effective only for one specific data set. If a fixed seed is preferred, I can modify the PR accordingly.
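To illustrate the argument above, here is a minimal, self-contained sketch; it is not the code of the actual test. It models each lookup of an absent key as an independent false positive with probability `fpp`, so the false positive count is binomial and its relative standard deviation is sqrt((1 - fpp) / (trials * fpp)). The class name `FppTrialSketch`, both method names, the FPP value of 0.01, and the trial counts are all hypothetical and chosen only for illustration.

```java
import java.util.Random;

public class FppTrialSketch {

    /**
     * Relative standard deviation of the false positive count when each of
     * `trials` lookups is an independent false positive with probability `fpp`
     * (binomial model): sqrt((1 - fpp) / (trials * fpp)).
     */
    public static double relativeStdDev(double fpp, long trials) {
        return Math.sqrt((1.0 - fpp) / (trials * fpp));
    }

    /**
     * Simulates `trials` lookups of absent keys and counts how many come back
     * as false positives. This stands in for a real Bloom filter; it is only
     * a model of one (assumption for illustration).
     */
    public static long simulateFalsePositives(Random random, double fpp, long trials) {
        long falsePositives = 0;
        for (long i = 0; i < trials; i++) {
            if (random.nextDouble() < fpp) {
                falsePositives++;
            }
        }
        return falsePositives;
    }

    public static void main(String[] args) {
        double fpp = 0.01;            // hypothetical configured FPP
        Random random = new Random(); // deliberately unseeded, as in the PR description

        for (long trials : new long[] {10_000L, 100_000L, 1_000_000L}) {
            long observed = simulateFalsePositives(random, fpp, trials);
            System.out.printf(
                "trials=%d observedRate=%.5f relativeStdDev=%.4f%n",
                trials, (double) observed / trials, relativeStdDev(fpp, trials));
        }
    }
}
```

Under this model the relative spread is roughly 10% at 10,000 lookups and roughly 1% at 1,000,000, so an assertion with a fixed tolerance around the configured FPP fails far less often as the number of lookups grows, without needing a fixed seed.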

Make sure you have checked all steps below.

Jira

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain Javadoc that explains what they do

@fengjiajie fengjiajie closed this Oct 4, 2023