feat: Add Time/Interval/Decimal/Utf8View in aggregate fuzz testing #13226
This PR looks great -- thank you @LeslieKid
@@ -338,6 +338,10 @@ impl GroupsAccumulator for MinMaxBytesAccumulator {
/// This is a heuristic to avoid allocating too many small buffers
fn capacity_to_view_block_size(data_capacity: usize) -> u32 {
    let max_block_size = 2 * 1024 * 1024;
    // Avoid block size equal to zero when calling `with_fixed_block_size()`.
    if data_capacity == 0 {
        return 1;
`data_capacity` might be zero, which caused the aggregation fuzz tests to panic with the message "Block size must be greater than 0".
So I modified this function to ensure that the block size is never 0 in this case. But I'm not sure whether this is a bug...
I bet there is something / somewhere that is passing in an empty batch -- and a small optimization might be to avoid doing so.
Do you happen to have the stack trace still around?
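For readers following the thread, here is a minimal, self-contained sketch of the guarded heuristic. Only the zero check and the 2 MiB constant come from the diff; the rest of the function body is not shown in this PR, so the clamping logic below is an assumption about what a "capped block size" heuristic might look like, not the actual DataFusion code.

```rust
/// Sketch of the guarded heuristic: map a requested data capacity to a
/// view block size that is never zero and never above 2 MiB.
fn capacity_to_view_block_size(data_capacity: usize) -> u32 {
    let max_block_size: u32 = 2 * 1024 * 1024;

    // Guard from the diff: an empty input would otherwise produce a block
    // size of zero, which the builder rejects with
    // "Block size must be greater than 0".
    if data_capacity == 0 {
        return 1;
    }

    // ASSUMPTION (not shown in the diff): cap the capacity at the maximum.
    if data_capacity >= max_block_size as usize {
        max_block_size
    } else {
        data_capacity as u32
    }
}
```

The guard keeps the function total for every `usize` input. Skipping empty batches at the call site, as suggested above, would be a complementary change rather than a replacement for it.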
Thank you @LeslieKid -- this is really nice
use rand::Rng;

/// Randomly generate decimal arrays
pub struct DecimalArrayGenerator {
This is really nice
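The fields and methods of `DecimalArrayGenerator` are not shown in this diff, so the following is only an illustrative sketch of how random decimal values bounded by a precision could be produced. Everything in it is hypothetical: the `DecimalSketchGenerator` type, its fields, and the small xorshift PRNG, which stands in for the `rand` crate so the example needs no dependencies. It returns a plain `Vec<Option<i128>>` rather than an Arrow array.

```rust
/// Tiny xorshift64 PRNG (stand-in for `rand`; the seed must be non-zero).
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical stand-in for a decimal generator; not the PR's struct.
struct DecimalSketchGenerator {
    /// Total number of decimal digits (1..=38 fits in an i128).
    precision: u8,
    /// Number of values to produce per call.
    num_values: usize,
    /// Fraction of values that should be null, in [0.0, 1.0].
    null_pct: f64,
    rng: XorShift64,
}

impl DecimalSketchGenerator {
    /// Generate unscaled decimal values with at most `precision` digits.
    fn generate(&mut self) -> Vec<Option<i128>> {
        // Values must satisfy |v| < 10^precision.
        let bound = 10i128.pow(self.precision as u32);
        (0..self.num_values)
            .map(|_| {
                // Map 53 random bits to a float in [0, 1) for the null test.
                let unit = (self.rng.next_u64() >> 11) as f64 / (1u64 << 53) as f64;
                if unit < self.null_pct {
                    None
                } else {
                    // Combine two 64-bit draws into one 128-bit value.
                    let hi = self.rng.next_u64() as u128;
                    let lo = self.rng.next_u64() as u128;
                    let raw = ((hi << 64) | lo) as i128;
                    // Remainder keeps the sign and bounds the magnitude.
                    Some(raw % bound)
                }
            })
            .collect()
    }
}
```

Reducing with `%` is slightly biased toward small magnitudes, which is usually acceptable for fuzz input but would not be for statistical sampling.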
basic_random_data!(IntervalYearMonthType);
basic_random_data!(Decimal128Type);

impl RandomNativeData for Date64Type {
👍
Thanks again @LeslieKid
Which issue does this PR close?
Part of #12114.
Rationale for this change
Supporting more types in the fuzzer framework's dataset generator is needed to improve aggregation fuzzer coverage.
What changes are included in this PR?
- `Interval` and `Time` types for `PrimitiveArrayGenerator`.
- `DecimalArrayGenerator` to support `Decimal` type.
- `Utf8View` type.
Are these changes tested?
Are there any user-facing changes?