Improve AggregateFuzz testing: generate random queries #12847
Conversation
@@ -62,7 +62,7 @@ impl StringBatchGenerator {
        let mut cases = vec![];
        let mut rng = thread_rng();
        for null_pct in [0.0, 0.01, 0.1, 0.5] {
-            for _ in 0..100 {
+            for _ in 0..10 {
Reduce the number of strings in the input to reduce test time
//
// Notes on tests:
//
// Since the supported types differ for each aggregation function, the tests
This is my key observation -- if we structure the tests by aggregate function, I think that is most likely to permit the greatest coverage (as the framework now varies queries, number of columns, etc)
I ran some code coverage on this PR using
cargo llvm-cov --test fuzz -- aggregate_fuzz
The coverage in aggregation has improved, but still has a ways to go:
Filename  Regions Missed Cover  Functions Missed Executed  Lines Missed Cover  Branches Missed Cover
datafusion/physical-optimizer/src/combine_partial_final_agg.rs 54 9 83.33% 10 2 80.00% 97 10 89.69% 0 0 -
datafusion/physical-optimizer/src/limit_pushdown.rs 92 65 29.35% 15 8 46.67% 190 132 30.53% 0 0 -
datafusion/physical-optimizer/src/limited_distinct_aggregation.rs 63 48 23.81% 9 4 55.56% 119 89 25.21% 0 0 -
datafusion/physical-optimizer/src/output_requirements.rs 56 14 75.00% 23 5 78.26% 150 40 73.33% 0 0 -
datafusion/physical-optimizer/src/topk_aggregation.rs 90 61 32.22% 10 5 50.00% 106 75 29.25% 0 0 -
datafusion/physical-plan/src/aggregates/group_values/bytes.rs 25 3 88.00% 9 0 100.00% 77 2 97.40% 0 0 -
datafusion/physical-plan/src/aggregates/group_values/bytes_view.rs 25 25 0.00% 9 9 0.00% 77 77 0.00% 0 0 -
datafusion/physical-plan/src/aggregates/group_values/column.rs 130 32 75.38% 17 1 94.12% 167 19 88.62% 0 0 -
datafusion/physical-plan/src/aggregates/group_values/group_column.rs 84 21 75.00% 19 0 100.00% 202 21 89.60% 0 0 -
datafusion/physical-plan/src/aggregates/group_values/mod.rs 19 10 47.37% 1 0 100.00% 21 8 61.90% 0 0 -
datafusion/physical-plan/src/aggregates/group_values/null_builder.rs 27 0 100.00% 6 0 100.00% 54 0 100.00% 0 0 -
datafusion/physical-plan/src/aggregates/group_values/primitive.rs 48 2 95.83% 15 2 86.67% 102 4 96.08% 0 0 -
datafusion/physical-plan/src/aggregates/group_values/row.rs 92 92 0.00% 16 16 0.00% 176 176 0.00% 0 0 -
datafusion/physical-plan/src/aggregates/mod.rs 422 205 51.42% 102 35 65.69% 852 311 63.50% 0 0 -
datafusion/physical-plan/src/aggregates/no_grouping.rs 62 62 0.00% 11 11 0.00% 146 146 0.00% 0 0 -
datafusion/physical-plan/src/aggregates/order/full.rs 25 6 76.00% 7 1 85.71% 47 7 85.11% 0 0 -
datafusion/physical-plan/src/aggregates/order/mod.rs 32 2 93.75% 6 0 100.00% 56 1 98.21% 0 0 -
datafusion/physical-plan/src/aggregates/order/partial.rs 61 13 78.69% 9 0 100.00% 104 8 92.31% 0 0 -
datafusion/physical-plan/src/aggregates/row_hash.rs 347 112 67.72% 31 5 83.87% 503 92 81.71% 0 0 -
datafusion/physical-plan/src/aggregates/topk/hash_table.rs 97 97 0.00% 36 36 0.00% 198 198 0.00% 0 0 -
datafusion/physical-plan/src/aggregates/topk/heap.rs 149 149 0.00% 38 38 0.00% 295 295 0.00% 0 0 -
datafusion/physical-plan/src/aggregates/topk/priority_map.rs 17 17 0.00% 5 5 0.00% 64 64 0.00% 0 0 -
datafusion/physical-plan/src/aggregates/topk_stream.rs 83 83 0.00% 5 5 0.00% 100 100 0.00% 0 0 -
Full report is here: cov.zip
// The test framework handles varying combinations of arguments (data types),
// sortedness, and grouping parameters
//
// TODO: Test on floating point values (where output needs to be compared with some
I plan to work on these tests next. First I will add coverage for StringView / BinaryView
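For illustration, here is a standalone sketch of the approximate comparison that the floating point TODO above implies. This is not code from the PR, just the general idea of comparing within a relative tolerance rather than requiring bitwise equality:

```rust
// Standalone sketch: compare two floating point aggregate results within a
// relative tolerance instead of requiring exact equality.
fn approx_eq(a: f64, b: f64, rel_tol: f64) -> bool {
    if a == b {
        // covers exact matches, including identical infinities
        return true;
    }
    let diff = (a - b).abs();
    // scale the tolerance by the larger magnitude so large and small values
    // are treated consistently
    diff <= rel_tol * a.abs().max(b.abs())
}
```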
@@ -305,3 +327,172 @@ impl AggregationFuzzTestTask {
        )
    }
}

/// Pretty prints the `RecordBatch`es, limited to the first 100 rows
fn format_batches_with_limit(batches: &[RecordBatch]) -> impl std::fmt::Display {
Without limiting its size, the output is overwhelming.
I think maybe it is unnecessary to output the batches? What we actually need is a way to reproduce the failure, right?
I found that seeing the differences in the output made it easier for me to understand what was wrong.
Specifically, when I added a test for AVG and it failed intermittently, it was easy to recognize "integer overflow" because the average values were around 1e19 or something like that.
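For illustration, a minimal sketch of the row-limiting idea, assuming the arrow crate's pretty_format_batches. The actual helper in this PR returns impl Display and caps output at 100 rows, and may be implemented differently:

```rust
// Standalone sketch of limiting printed output for failing fuzz runs.
use arrow::record_batch::RecordBatch;
use arrow::util::pretty::pretty_format_batches;

/// Pretty print at most `limit` rows across all batches so that a failing
/// fuzz run does not flood the terminal with output.
fn format_batches_with_limit(batches: &[RecordBatch], limit: usize) -> String {
    let mut remaining = limit;
    let mut truncated = Vec::new();
    for batch in batches {
        if remaining == 0 {
            break;
        }
        // take only as many rows as are still allowed
        let take = batch.num_rows().min(remaining);
        truncated.push(batch.slice(0, take));
        remaining -= take;
    }
    pretty_format_batches(&truncated)
        .map(|formatted| formatted.to_string())
        .unwrap_or_else(|e| format!("error formatting batches: {e}"))
}
```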
    let data_gen_config = baseline_config();

    // Queries like SELECT min(a) FROM fuzz_table GROUP BY b
    let query_builder = QueryBuilder::new()
The major "innovation" of this PR is to automatically generate the queries as well to increase coverage (e.g. different numbers of group columns, etc)
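For illustration, here is a standalone sketch of the query generation idea. It is not the PR's actual QueryBuilder; the table name fuzz_table comes from the comment above, everything else is illustrative (rand 0.8 style API):

```rust
// Standalone sketch: vary the aggregate argument and the number of group by
// columns on every call so that repeated fuzz runs exercise different plans.
use rand::seq::SliceRandom;
use rand::Rng;

fn random_query(rng: &mut impl Rng, function: &str, columns: &[&str]) -> String {
    // 0..=3 distinct group by columns; fewer columns means larger groups
    let n_group_by = rng.gen_range(0..=3);
    let group_by: Vec<&str> = columns.choose_multiple(rng, n_group_by).copied().collect();

    // a random argument column for the aggregate function
    let arg = *columns.choose(rng).expect("at least one column");

    if group_by.is_empty() {
        format!("SELECT {function}({arg}) FROM fuzz_table")
    } else {
        let group_list = group_by.join(", ");
        format!("SELECT {group_list}, {function}({arg}) FROM fuzz_table GROUP BY {group_list}")
    }
}
```

In the PR itself this role is played by QueryBuilder (see random_aggregate_functions and random_group_by below).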
Changed the title from "AggregateFuzz testing" to "AggregateFuzz testing: random queries"
FYI @Rachelint
    /// Generate a random number of aggregate functions (potentially repeating).
    ///
    fn random_aggregate_functions(&self) -> Vec<String> {
I think the challenge with randomly generating aggregate_functions is their different argument types.
Yes, that is right -- I improved the comments to clarify this.
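For illustration, a standalone sketch of one way to handle the different-arguments problem: pair each function only with argument columns whose types it supports. The function list and Column struct here are illustrative, not the PR's API:

```rust
// Standalone sketch: constrain the randomly chosen argument column to the
// types the chosen aggregate function supports.
use rand::seq::SliceRandom;
use rand::Rng;

struct Column {
    name: &'static str,
    is_numeric: bool,
}

/// Pick a random (function, argument) pair where the argument type is valid
/// for the chosen function.
fn random_aggregate_call(rng: &mut impl Rng, columns: &[Column]) -> String {
    let functions = ["min", "max", "sum", "avg", "count"];
    let function = *functions.choose(rng).expect("non-empty function list");

    // restrict the candidate argument columns based on the chosen function
    let candidates: Vec<&Column> = columns
        .iter()
        .filter(|c| match function {
            // sum and avg only make sense on numeric columns in this sketch
            "sum" | "avg" => c.is_numeric,
            // min, max and count accept any supported column type
            _ => true,
        })
        .collect();

    let arg = candidates.choose(rng).expect("at least one valid column");
    format!("{function}({})", arg.name)
}
```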
        }
        self.table_name(query_builder.table_name())
    }

    pub fn add_sql(mut self, sql: &str) -> Self {
Seems like we can remove pub from add_sql now?
    ///
    /// Limited to 3 group by columns to ensure coverage for large groups. With
    /// larger numbers of columns, each group has many fewer values.
    fn random_group_by(&self) -> Vec<String> {
It seems the dataset is currently sorted by the utf8 and u8 columns, but the group by columns are generated randomly.
Should we do something to ensure that streaming aggregation (partial and fully sorted) is covered?
Yes, this is a good point. I think there should be some way to generate lower cardinality columns. I will do that as a follow-on PR.
Done in #12990
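For illustration, a standalone sketch of the lower-cardinality idea, so that sorted inputs contain long runs of equal group keys and the streaming (partial and fully sorted) aggregation paths get exercised. The actual approach in #12990 may differ:

```rust
// Standalone sketch: draw every row from a small pool of distinct values so
// the resulting column has low cardinality.
use arrow::array::StringArray;
use rand::seq::SliceRandom;
use rand::Rng;

/// Generate `len` string values drawn from a pool of `distinct` values
fn low_cardinality_strings(rng: &mut impl Rng, len: usize, distinct: usize) -> StringArray {
    let pool: Vec<String> = (0..distinct).map(|i| format!("group_{i}")).collect();
    StringArray::from_iter_values((0..len).map(|_| pool.choose(rng).unwrap().clone()))
}
```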
The query generator seems really excellent!
Changed the title from "AggregateFuzz testing: random queries" to "AggregateFuzz testing: generate random queries"
Thanks @alamb, it looks good to me 👍. I just left some comments about typos.
I saw there are some follow-up PRs waiting on this. I think we can merge it tomorrow if there are no more comments.
Co-authored-by: Jax Liu <[email protected]>
Thank you @goldmedal
Thanks @alamb and @Rachelint for reviewing!
Which issue does this PR close?
Part of #12114
Rationale for this change
Now that we have a great aggregation fuzz testing framework from @Rachelint, let's use it to increase coverage.
Specifically, while thinking about how to improve test coverage, I think the key constraint is which types are supported by which aggregate functions. Thus it makes sense to test different functions in different combinations.
What changes are included in this PR?
Are these changes tested?
Only tests
Are there any user-facing changes?
No, only tests