Describe the bug, including details regarding any error messages, version, and platform.
There are two variants of cumsum: cumulative_sum always wraps around on overflow, and cumulative_sum_checked always returns Status::Invalid on overflow, regardless of the check_overflow parameter in CumulativeSumOptions.
For example:
CumulativeSumOptions options(/*start=*/0, /*skip_nulls=*/true, /*check_overflow=*/true);
auto res = CallFunction("cumulative_sum", {ArrayFromJSON(int8(), "[127, 1]")}, &options);
std::cout << res->make_array()->ToString() << std::endl;
// prints [127, -128], but user might expect an error returned.
I believe it's more of an ambiguity than a bug. The documentation of cumulative_sum also doesn't mention check_overflow at all, and check_overflow is not exposed to pyarrow. So IMO the best approach to avoid confusion is to remove check_overflow from CumulativeSumOptions.
The only place where check_overflow is used is in the convenience function CumulativeSum defined in api_vector.h, which calls cumulative_sum_checked or cumulative_sum depending on the parameter. It can be replaced by a bool parameter in the convenience function.
Component(s)
C++
### Rationale for this change
There are two variants of cumsum: `cumulative_sum` always wraps around on overflow, and `cumulative_sum_checked` always returns `Status::Invalid` on overflow, regardless of the `check_overflow` parameter in `CumulativeSumOptions`.
For example:
```cpp
CumulativeSumOptions options(/*start=*/0, /*skip_nulls=*/true, /*check_overflow=*/true);
auto res = CallFunction("cumulative_sum", {ArrayFromJSON(int8(), "[127, 1]")}, &options);
std::cout << res->make_array()->ToString() << std::endl;
// prints [127, -128], but user might expect an error returned.
```
I believe it's more of an ambiguity than a bug. The documentation of `cumulative_sum` also doesn't mention `check_overflow` at all, and `check_overflow` is not exposed to pyarrow. So IMO the best approach to avoid confusion is to remove `check_overflow` from `CumulativeSumOptions`.
The only place where `check_overflow` is used is in the C++ convenience function `CumulativeSum` defined in api_vector.h, which calls `cumulative_sum_checked` or `cumulative_sum` depending on the parameter. It can be replaced by a bool parameter in the C++ convenience function.
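To make the proposed split concrete, here is a minimal standalone sketch that does not use Arrow at all. The functions `CumulativeSumWrapping`, `CumulativeSumChecked`, and this `CumulativeSum` are hypothetical stand-ins written over `std::vector<int8_t>`, with `std::nullopt` playing the role of `Status::Invalid`. It only illustrates the idea that the overflow behavior is selected by a bool argument of the convenience function rather than by a field of the options struct. The actual Arrow signatures may differ.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical stand-in for the "cumulative_sum" kernel: wraps on overflow.
std::vector<int8_t> CumulativeSumWrapping(const std::vector<int8_t>& values) {
  std::vector<int8_t> out;
  out.reserve(values.size());
  uint8_t acc = 0;  // unsigned arithmetic wraps modulo 256 without undefined behavior
  for (int8_t v : values) {
    acc = static_cast<uint8_t>(acc + static_cast<uint8_t>(v));
    // Conversion back to int8_t is modular in C++20 and on mainstream compilers.
    out.push_back(static_cast<int8_t>(acc));
  }
  return out;
}

// Hypothetical stand-in for the "cumulative_sum_checked" kernel:
// returns std::nullopt (in place of Status::Invalid) on overflow.
std::optional<std::vector<int8_t>> CumulativeSumChecked(
    const std::vector<int8_t>& values) {
  std::vector<int8_t> out;
  out.reserve(values.size());
  int acc = 0;  // wider accumulator so the range check itself cannot overflow
  for (int8_t v : values) {
    acc += v;
    if (acc > std::numeric_limits<int8_t>::max() ||
        acc < std::numeric_limits<int8_t>::min()) {
      return std::nullopt;
    }
    out.push_back(static_cast<int8_t>(acc));
  }
  return out;
}

// Convenience wrapper: the bool argument alone decides which variant runs.
std::optional<std::vector<int8_t>> CumulativeSum(
    const std::vector<int8_t>& values, bool check_overflow = false) {
  if (check_overflow) {
    return CumulativeSumChecked(values);
  }
  return CumulativeSumWrapping(values);
}
```

With this shape, calling the sketch's `CumulativeSum({127, 1}, /*check_overflow=*/true)` reports an error, while the default wraps around, so a caller can no longer set a flag that the underlying kernel silently ignores.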
### What changes are included in this PR?
1. Remove `check_overflow` from `CumulativeSumOptions`.
2. Add a `bool check_overflow` parameter to the C++ convenience function `CumulativeSum`.
### Are these changes tested?
It doesn't affect the current tests.
### Are there any user-facing changes?
Yes, but only for the C++ convenience function, because `check_overflow` is not used in the kernel implementation and is not exposed to pyarrow anyway.
* Closes: #35789
Authored-by: Jin Shang <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>