-
Notifications
You must be signed in to change notification settings - Fork 784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add overflow-checking variants of arithmetic scalar dyn kernels #2713
Conversation
7a10d91
to
54d44d2
Compare
cc @sunchao |
/// | ||
/// This detects overflow and returns an `Err` for that. For an non-overflow-checking variant, | ||
/// use `add_scalar_dyn` instead. | ||
pub fn add_scalar_checked_dyn<T>(array: &dyn Array, scalar: T::Native) -> Result<ArrayRef> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious: do we have benchmark to track how much slower add_scalar_checked_dyn
is comparing to add_scalar_dyn
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is anything like the non-scalar kernels, it is about 10x slower. Aside from the branching costs, it prevents LLVM from vectorising it correctly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I wonder if we should point that out in the doc of this method, in case it's not obvious to the users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, it will be much slower. As by default (ansi-mode disabled) in our case, non-checked kernels will be used. So most of time users will use faster one, except they have special need to use checked kernels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I'm going to add a few lines mentioning that.
@@ -834,12 +834,34 @@ where | |||
/// Add every value in an array by a scalar. If any value in the array is null then the | |||
/// result is also null. The given array must be a `PrimitiveArray` of the type same as | |||
/// the scalar, or a `DictionaryArray` of the value type same as the scalar. | |||
/// | |||
/// This doesn't detect overflow. Once overflowing, the result will wrap around. | |||
/// For an overflow-checking variant, use `add_scalar_checked_dyn` instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps explain a bit when it will return Err
arrow/src/compute/kernels/arity.rs
Outdated
F: Fn(T::Native) -> Result<T::Native>, | ||
{ | ||
downcast_dictionary_array! { | ||
array => try_unary_dict::<_, F, T>(array, op), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm how do we know the dictionary value type matches T
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, there is no type-bound for the dictionary value type. Just do a simple test. At runtime downcast_ref
will fail in unary_dict
. I will address it in other PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally as the op is provided by users, I suppose that users know dictionary value is same type as the scalar. But it is good to return a meaningful Err instead of runtime panic. I will do it in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow up sounds fine to me. Perhaps we can just check the type here:
downcast_dictionary_array! {
array => if array.values().data_type() == &T::DATA_TYPE {
try_unary_dict::<_, F, T>(array, op)
} else {
// throw error
},
t => {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right, actually I thought to handle it at try_unary_dict
. But this fix looks okay as try_unary_dict
is currently used here and not public. I may fix at try_unary_dict
at another followup.
arrow/src/compute/kernels/arity.rs
Outdated
F: Fn(T::Native) -> Result<T::Native>, | ||
{ | ||
downcast_dictionary_array! { | ||
array => try_unary_dict::<_, F, T>(array, op), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow up sounds fine to me. Perhaps we can just check the type here:
downcast_dictionary_array! {
array => if array.values().data_type() == &T::DATA_TYPE {
try_unary_dict::<_, F, T>(array, op)
} else {
// throw error
},
t => {
Thanks. |
Benchmark runs are scheduled for baseline = 2a0fc77 and contender = 7594db6. 7594db6 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #2712.
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?