Improve `AggregateFuzz` testing: generate random queries #12847

alamb · 2024-10-10T11:57:44Z

Which issue does this PR close?

Rationale for this change

Now that we have a great aggregation fuzz testing framework from @Rachelint, let's use it to increase coverage.

Specifically, while thinking about how to improve test coverage, I concluded that the key constraint is which types each aggregate function supports. Thus it makes sense to test each function with its own combinations of argument types.

What changes are included in this PR?

Add a query generator to randomly create queries
Update aggregate fuzz testing to cover various permutations of types and aggregates
List other potential things to test

Are these changes tested?

Only tests

Are there any user-facing changes?

No, only tests

alamb · 2024-10-15T19:15:23Z

test-utils/src/string_gen.rs

@@ -62,7 +62,7 @@ impl StringBatchGenerator {
        let mut cases = vec![];
        let mut rng = thread_rng();
        for null_pct in [0.0, 0.01, 0.1, 0.5] {
-            for _ in 0..100 {
+            for _ in 0..10 {


Reduce the number of strings in the input to reduce test time

alamb · 2024-10-15T19:17:59Z

datafusion/core/tests/fuzz_cases/aggregate_fuzz.rs

+//
+// Notes on tests:
+//
+// Since the supported types differ for each aggregation function, the tests


This is my key observation -- if we structure the tests by aggregate function, I think that is most likely to permit the greatest coverage (as the framework now varies queries, number of columns, etc.)

alamb

I ran some code coverage on this PR using

cargo llvm-cov --test fuzz -- aggregate_fuzz

The coverage in aggregation has improved, but still has a ways to go:

Filename                                                              Regions  Missed    Cover  Functions  Missed  Executed  Lines  Missed    Cover  Branches  Missed  Cover
datafusion/physical-optimizer/src/combine_partial_final_agg.rs             54       9   83.33%         10       2    80.00%     97      10   89.69%         0       0      -
datafusion/physical-optimizer/src/limit_pushdown.rs                        92      65   29.35%         15       8    46.67%    190     132   30.53%         0       0      -
datafusion/physical-optimizer/src/limited_distinct_aggregation.rs          63      48   23.81%          9       4    55.56%    119      89   25.21%         0       0      -
datafusion/physical-optimizer/src/output_requirements.rs                   56      14   75.00%         23       5    78.26%    150      40   73.33%         0       0      -
datafusion/physical-optimizer/src/topk_aggregation.rs                      90      61   32.22%         10       5    50.00%    106      75   29.25%         0       0      -
datafusion/physical-plan/src/aggregates/group_values/bytes.rs              25       3   88.00%          9       0   100.00%     77       2   97.40%         0       0      -
datafusion/physical-plan/src/aggregates/group_values/bytes_view.rs         25      25    0.00%          9       9     0.00%     77      77    0.00%         0       0      -
datafusion/physical-plan/src/aggregates/group_values/column.rs            130      32   75.38%         17       1    94.12%    167      19   88.62%         0       0      -
datafusion/physical-plan/src/aggregates/group_values/group_column.rs       84      21   75.00%         19       0   100.00%    202      21   89.60%         0       0      -
datafusion/physical-plan/src/aggregates/group_values/mod.rs                19      10   47.37%          1       0   100.00%     21       8   61.90%         0       0      -
datafusion/physical-plan/src/aggregates/group_values/null_builder.rs       27       0  100.00%          6       0   100.00%     54       0  100.00%         0       0      -
datafusion/physical-plan/src/aggregates/group_values/primitive.rs          48       2   95.83%         15       2    86.67%    102       4   96.08%         0       0      -
datafusion/physical-plan/src/aggregates/group_values/row.rs                92      92    0.00%         16      16     0.00%    176     176    0.00%         0       0      -
datafusion/physical-plan/src/aggregates/mod.rs                            422     205   51.42%        102      35    65.69%    852     311   63.50%         0       0      -
datafusion/physical-plan/src/aggregates/no_grouping.rs                     62      62    0.00%         11      11     0.00%    146     146    0.00%         0       0      -
datafusion/physical-plan/src/aggregates/order/full.rs                      25       6   76.00%          7       1    85.71%     47       7   85.11%         0       0      -
datafusion/physical-plan/src/aggregates/order/mod.rs                       32       2   93.75%          6       0   100.00%     56       1   98.21%         0       0      -
datafusion/physical-plan/src/aggregates/order/partial.rs                   61      13   78.69%          9       0   100.00%    104       8   92.31%         0       0      -
datafusion/physical-plan/src/aggregates/row_hash.rs                       347     112   67.72%         31       5    83.87%    503      92   81.71%         0       0      -
datafusion/physical-plan/src/aggregates/topk/hash_table.rs                 97      97    0.00%         36      36     0.00%    198     198    0.00%         0       0      -
datafusion/physical-plan/src/aggregates/topk/heap.rs                      149     149    0.00%         38      38     0.00%    295     295    0.00%         0       0      -
datafusion/physical-plan/src/aggregates/topk/priority_map.rs               17      17    0.00%          5       5     0.00%     64      64    0.00%         0       0      -
datafusion/physical-plan/src/aggregates/topk_stream.rs                     83      83    0.00%          5       5     0.00%    100     100    0.00%         0       0      -

Full report is here: cov.zip

alamb · 2024-10-15T19:27:04Z

datafusion/core/tests/fuzz_cases/aggregate_fuzz.rs

+// The test framework handles varying combinations of arguments (data types),
+// sortedness, and grouping parameters
+//
+// TODO: Test on floating point values (where output needs to be compared with some


I plan to work on these tests next. First I will add coverage for StringView / BinaryView

alamb · 2024-10-15T19:27:59Z

datafusion/core/tests/fuzz_cases/aggregation_fuzzer/fuzzer.rs

@@ -305,3 +327,172 @@ impl AggregationFuzzTestTask {
        )
    }
 }
+
+/// Pretty prints the `RecordBatch`es, limited to the first 100 rows
+fn format_batches_with_limit(batches: &[RecordBatch]) -> impl std::fmt::Display {


Without limiting the size of the output, the output is overwhelming.

I think it may be unnecessary to output the batches?
What we need seems to be a way to reproduce the failure?

I found that seeing the differences in the output made it easier for me to understand what was wrong.

Specifically, when I added a test for AVG and it failed intermittently for me, it was easy to diagnose "integer overflow" because the values of the averages were e19 or something like that.
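As a rough illustration of the "limited output" idea discussed above, here is a minimal sketch that caps how many rows are printed. It is not the PR's `format_batches_with_limit`: that function takes `&[RecordBatch]`, whereas this stand-in works on plain `String` rows so it has no dependencies. The names `LimitedRows` and `format_rows_with_limit` are hypothetical.

```rust
use std::fmt;

/// Hypothetical stand-in for pretty-printed batches: each row is already text.
struct LimitedRows<'a> {
    rows: &'a [String],
    limit: usize,
}

impl fmt::Display for LimitedRows<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print at most `limit` rows.
        for row in self.rows.iter().take(self.limit) {
            writeln!(f, "{row}")?;
        }
        // Say how much was cut so the reader knows the output is partial.
        let omitted = self.rows.len().saturating_sub(self.limit);
        if omitted > 0 {
            writeln!(f, "... ({omitted} more rows omitted)")?;
        }
        Ok(())
    }
}

/// Returns a value that prints at most `limit` rows when formatted.
fn format_rows_with_limit(rows: &[String], limit: usize) -> impl fmt::Display + '_ {
    LimitedRows { rows, limit }
}
```

Returning `impl Display` rather than a `String` means the text is only built if a failure message is actually formatted.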

alamb · 2024-10-15T19:28:57Z

datafusion/core/tests/fuzz_cases/aggregate_fuzz.rs

+    let data_gen_config = baseline_config();
+
+    // Queries like SELECT min(a) FROM fuzz_table GROUP BY b
+    let query_builder = QueryBuilder::new()


The major "innovation" of this PR is to automatically generate the queries as well to increase coverage (e.g. different numbers of group columns, etc)
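To make the idea concrete, the following is a minimal sketch of random query generation, assuming each aggregate function carries its own list of allowed argument columns. It is not the PR's `QueryBuilder`: the `QueryGen` and `XorShift` types, their fields, and the column names are hypothetical, and a small xorshift generator stands in for the `rand` crate so the sketch stays dependency-free.

```rust
/// Tiny deterministic PRNG (xorshift64); the seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in `0..n`; `n` must be greater than zero.
    fn below(&mut self, n: usize) -> usize {
        (self.next() % n as u64) as usize
    }
}

/// Hypothetical generator (an assumption, not the PR's API).
struct QueryGen {
    table: String,
    /// (function name, columns the function may be applied to); must be non-empty
    aggregates: Vec<(String, Vec<String>)>,
    group_by_columns: Vec<String>,
    max_group_by: usize,
}

impl QueryGen {
    fn generate(&self, rng: &mut XorShift) -> String {
        // Pick 1 to 3 aggregate expressions, possibly repeating a function.
        let num_aggs = 1 + rng.below(3);
        let mut aggs = Vec::new();
        for i in 0..num_aggs {
            let (func, args) = &self.aggregates[rng.below(self.aggregates.len())];
            let arg = &args[rng.below(args.len())];
            aggs.push(format!("{func}({arg}) AS col{i}"));
        }

        // Pick between 0 and `max_group_by` distinct group by columns.
        let mut candidates = self.group_by_columns.clone();
        let num_groups = rng.below(self.max_group_by.min(candidates.len()) + 1);
        let mut group_by = Vec::new();
        for _ in 0..num_groups {
            let idx = rng.below(candidates.len());
            group_by.push(candidates.swap_remove(idx));
        }

        // Assemble: SELECT <group cols>, <aggregates> FROM <table> [GROUP BY ...]
        let mut items = group_by.clone();
        items.extend(aggs);
        let mut sql = format!("SELECT {} FROM {}", items.join(", "), self.table);
        if !group_by.is_empty() {
            sql.push_str(&format!(" GROUP BY {}", group_by.join(", ")));
        }
        sql
    }
}
```

With `min` allowed on columns `a` and `b`, and `b` as a group by candidate, this can produce queries of the same shape as `SELECT min(a) FROM fuzz_table GROUP BY b`. A fixed seed keeps any failing query reproducible.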

alamb · 2024-10-15T19:33:54Z

FYI @Rachelint

Rachelint · 2024-10-16T16:07:57Z

datafusion/core/tests/fuzz_cases/aggregation_fuzzer/fuzzer.rs

+
+    /// Generate a random number of aggregate functions (potentially repeating).
+    ///
+    fn random_aggregate_functions(&self) -> Vec<String> {


I think the challenge in randomly generating aggregate functions is that they take different arguments.

Yes, that is right -- I improved the comments to clarify.

Rachelint · 2024-10-16T16:11:57Z

datafusion/core/tests/fuzz_cases/aggregation_fuzzer/fuzzer.rs

+        }
+        self.table_name(query_builder.table_name())
+    }
+
    pub fn add_sql(mut self, sql: &str) -> Self {


It seems we can remove `pub` from `add_sql` now?

Rachelint · 2024-10-16T16:21:56Z

datafusion/core/tests/fuzz_cases/aggregation_fuzzer/fuzzer.rs

+    ///
+    /// Limited to 3 group by columns to ensure coverage for large groups. With
+    /// larger numbers of columns, each group has many fewer values.
+    fn random_group_by(&self) -> Vec<String> {


It seems the dataset is currently sorted by the utf8 and u8 columns.

But the group by columns are generated randomly.

Should we do something to ensure that streaming aggregation (partially and fully sorted) is covered?

Yes, this is a good point. I think there should be some way to generate lower cardinality columns. I will do that in a follow-on PR.

Done in #12990
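For readers unfamiliar with why cardinality matters here, the sketch below shows one way to produce a sorted, low-cardinality column, so that each group spans many adjacent rows. It is only an illustration under stated assumptions and does not reflect how #12990 implements it: the function names are hypothetical, and a simple linear congruential generator replaces the `rand` crate.

```rust
/// Hypothetical helper: draw `num_rows` values from only `num_distinct`
/// distinct keys, then sort them so equal keys are adjacent.
fn low_cardinality_sorted_column(num_rows: usize, num_distinct: u64, seed: u64) -> Vec<u64> {
    assert!(num_distinct > 0, "need at least one distinct value");
    let mut state = seed;
    let mut values: Vec<u64> = (0..num_rows)
        .map(|_| {
            // One step of a 64-bit linear congruential generator.
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            // Use the high bits, which are better distributed than the low bits.
            (state >> 33) % num_distinct
        })
        .collect();
    values.sort_unstable();
    values
}

/// Counts runs of equal adjacent values; on sorted input this equals the
/// number of groups.
fn count_runs(values: &[u64]) -> usize {
    let boundaries = values.windows(2).filter(|w| w[0] != w[1]).count();
    boundaries + usize::from(!values.is_empty())
}
```

With 1000 rows and 4 distinct keys, the sorted column contains at most 4 runs, so each group holds many rows instead of roughly one row per group as with high-cardinality random data.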

Rachelint · 2024-10-16T16:22:41Z

The query generator seems really excellent!

goldmedal

Thanks @alamb, it looks good to me 👍. I just left some comments about typos.
I saw there are some follow-up PRs waiting for this. I think we can merge it if there are no more comments by tomorrow.

alamb · 2024-10-21T20:32:50Z

Thank you @goldmedal

goldmedal · 2024-10-22T09:43:50Z

Thanks @alamb and @Rachelint for reviewing!

alamb changed the title ~~Alamb/aggregate fuzz coverage~~ Improve AggregateFuzz testing Oct 10, 2024

alamb mentioned this pull request Oct 13, 2024

Implement GroupColumn support for StringView / ByteView (faster grouping performance) #12809

Merged

alamb force-pushed the alamb/aggregate_fuzz_coverage branch from 4d0cbf5 to 4c5d621 Compare October 14, 2024 17:07

github-actions bot added the core Core DataFusion crate label Oct 14, 2024

alamb force-pushed the alamb/aggregate_fuzz_coverage branch 2 times, most recently from 37636b3 to d0aca65 Compare October 15, 2024 19:12

alamb commented Oct 15, 2024

alamb force-pushed the alamb/aggregate_fuzz_coverage branch from d0aca65 to 252623f Compare October 15, 2024 19:15

alamb added the development-process Related to development process of DataFusion label Oct 15, 2024

alamb commented Oct 15, 2024

alamb force-pushed the alamb/aggregate_fuzz_coverage branch from 252623f to 42ffada Compare October 15, 2024 19:25

github-actions bot removed the development-process Related to development process of DataFusion label Oct 15, 2024

Add random queries into aggregate fuzz tester

4ad1ec4

alamb force-pushed the alamb/aggregate_fuzz_coverage branch from 42ffada to 4ad1ec4 Compare October 15, 2024 19:26

alamb commented Oct 15, 2024

alamb marked this pull request as ready for review October 15, 2024 19:33

alamb changed the title ~~Improve AggregateFuzz testing~~ Improve AggregateFuzz testing: random queries Oct 15, 2024

Rachelint reviewed Oct 16, 2024

Merge remote-tracking branch 'apache/main' into alamb/aggregate_fuzz_…

9d92d14

…coverage

alamb changed the title ~~Improve AggregateFuzz testing: random queries~~ Improve AggregateFuzz testing: generate random queries Oct 17, 2024

alamb added 2 commits October 17, 2024 11:46

Merge remote-tracking branch 'apache/main' into alamb/aggregate_fuzz_…

bdbfa57

…coverage

Address review comments

e61792e

This was referenced Oct 17, 2024

Aggregation fuzz testing #12114

Open

Increase fuzz testing of streaming group by / low cardinality columns #12990

Merged

goldmedal approved these changes Oct 21, 2024

LeslieKid mentioned this pull request Oct 21, 2024

feat: Add Date32/Date64 in aggregate fuzz testing #13041

Merged

alamb and others added 2 commits October 21, 2024 16:32

Update datafusion/core/tests/fuzz_cases/aggregation_fuzzer/fuzzer.rs

d428d53

Co-authored-by: Jax Liu <[email protected]>

Update datafusion/core/tests/fuzz_cases/aggregation_fuzzer/fuzzer.rs

0b2e9a2

Co-authored-by: Jax Liu <[email protected]>

Merge remote-tracking branch 'apache/main' into alamb/aggregate_fuzz_…

90ef789

…coverage

goldmedal merged commit c22abb4 into apache:main Oct 22, 2024
24 checks passed

alamb deleted the alamb/aggregate_fuzz_coverage branch October 22, 2024 15:28

jonathanc-n mentioned this pull request Nov 7, 2024

Add boolean columns for fuzz testing #13297

Closed
