[C++] Add element-wise power() compute function #27714

Closed
asfimport opened this issue Mar 5, 2021 · 7 comments

Comments

@asfimport

asfimport commented Mar 5, 2021

It would be nice to have an element-wise power() compute function.

I.e. in analogy to numpy.power().

Reporter: ARF / @ARF1

Related issues:

Note: This issue was originally created as ARROW-11871. Please see the migration documentation for further details.

@asfimport

Joris Van den Bossche / @jorisvandenbossche:
Adding a power kernel would indeed be nice.

One behavioural aspect that has come up in pandas is what to do with nulls in the case of power(null, 0) or power(1, null): propagate the null value (as is otherwise always done for element-wise arithmetic operations), or return an actual result in these cases (1 in both). Reference to the pandas issue: pandas-dev/pandas#29997
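To make the two candidate behaviours concrete, here is a minimal pure-Python sketch. It is not Arrow or pandas code: `None` stands in for a null, and the names `power_propagate` and `power_shortcut` are hypothetical, chosen only for illustration.

```python
from typing import Optional

# Assumption of this sketch: None plays the role of an Arrow null.
Number = Optional[float]


def power_propagate(base: Number, exponent: Number) -> Number:
    """Null in, null out: the usual rule for element-wise arithmetic."""
    if base is None or exponent is None:
        return None
    return base ** exponent


def power_shortcut(base: Number, exponent: Number) -> Number:
    """Return 1 when the result does not depend on the other operand."""
    # x ** 0 is 1 for any x, and 1 ** y is 1 for any y.
    if exponent == 0 or base == 1:
        return 1.0
    return power_propagate(base, exponent)
```

The two functions differ only for power(null, 0) and power(1, null); for every other input they agree.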

@asfimport

ARF / @ARF1:
@jorisvandenbossche For what it's worth, in my book null != 0. To me null has absolutely nothing to do with the value 0. To me null indicates an invalid or non-existent value and the name null is merely a (maybe unfortunate) historical artifact.

As a consequence in my opinion, the following should hold: power(null, 0) == null as well as power(1, null) == null.
I read this as: (either) one of two operands of a binary operator is invalid or missing, hence the result is invalid or missing as well.

With this convention, if a user wants a different behaviour they can always use fill_null(0) to ensure that power(fill_null(null, 0), 0) == 1 and power(1, fill_null(null, 0)) == 1. The converse is not true.
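As a rough illustration of that workaround, the sketch below uses plain Python lists with `None` as the null marker. The `fill_null` and `power` helpers here are stand-ins written for this example, not the Arrow compute functions, and `power` is assumed to propagate nulls.

```python
def fill_null(values, fill_value):
    """Replace every None (null) with fill_value."""
    return [fill_value if v is None else v for v in values]


def power(bases, exponents):
    """Element-wise power that propagates nulls (the convention argued for here)."""
    return [
        None if b is None or e is None else b ** e
        for b, e in zip(bases, exponents)
    ]


bases = [2, None, 3]
zeros = [0, 0, 0]

# Default: the null in the base survives into the result.
propagated = power(bases, zeros)

# Opt-in: fill the null first, then every x ** 0 evaluates to 1.
filled = power(fill_null(bases, 0), zeros)
```

Under null propagation the user opts in to the non-null result with one extra call; nothing is lost by default.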

Also, I believe explicit is better than implicit...

@asfimport

Joris Van den Bossche / @jorisvandenbossche:
To clarify, my comment above was not about null being regarded as 0, but rather about the interpretation of null as "some unknown value". You can then argue that the result is not unknown for power(null, 0), because power(<any value>, 0) is always 1, whatever value is passed as the first argument.

@asfimport

Neal Richardson / @nealrichardson:
This is a duplicate of ARROW-11070, right?

@asfimport

Joris Van den Bossche / @jorisvandenbossche:
Indeed!

@asfimport

ARF / @ARF1:
@jorisvandenbossche Thanks for your explanations. I accept that in the end this depends on the semantics of null. Two different programmers can legitimately understand null to mean different things. How a programmer understands null dictates the correct behaviour of power().

A programmer that understands null in an array to mean "this is a value of the defined datatype, I just don't know what it is" will expect power(<any value including any unknown value>, 0) == 1.

A programmer that understands null in an array to mean "this value does not exist, it is fundamentally invalid" will expect power(null, 0) == null.

It would seem to me the solution is to leave the choice to the user and allow her/him to specify the desired behaviour as an option to power. Then the debate becomes only "what should the default behaviour be?" ;-) In this case, I would reverse my opinion and would argue, at least for pyarrow, to default to the Python behaviour of float('NaN')**0.0 == 1.
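A sketch of what such an option could look like, again in plain Python with `None` as the null marker. The `NullHandling` enum and the `null_handling` parameter are hypothetical names invented for this example; nothing here reflects an actual Arrow option. The last lines show the built-in Python float behaviour referred to above.

```python
from enum import Enum


class NullHandling(Enum):
    """Hypothetical option values, not an Arrow API."""
    PROPAGATE = "propagate"          # null operand -> null result
    SHORT_CIRCUIT = "short_circuit"  # x ** 0 and 1 ** y are 1 even if x or y is null


def power(bases, exponents, null_handling=NullHandling.PROPAGATE):
    """Element-wise power over lists, with a user-selectable null policy."""
    out = []
    for b, e in zip(bases, exponents):
        if null_handling is NullHandling.SHORT_CIRCUIT and (e == 0 or b == 1):
            out.append(1)
        elif b is None or e is None:
            out.append(None)
        else:
            out.append(b ** e)
    return out


# Python floats already short-circuit for NaN in both positions.
nan = float("nan")
nan_to_zero = nan ** 0.0
one_to_nan = 1.0 ** nan
```

With this shape, the remaining question is only which enum value the `null_handling` default should be.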

If Arrow has to specify a unique semantic interpretation of null and cannot allow user choice, I believe, however, that power(null, 0) == null is the better choice due to greater versatility: as I tried to explain in my previous comment, this interpretation allows the user to obtain the alternative behaviour by using power(fill_null(null, 0), 0) == 1.

Conversely, if Arrow standardized on power(null, 0) == 1, there is nothing the user can do to get the alternative behaviour. Once a value becomes non-null, there is no way to recover its original null-ness.
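The loss of information can be shown with a small plain-Python example (`None` as the null marker; `fill_null` is a stand-in helper written for this sketch, not the Arrow function):

```python
def fill_null(values, fill_value):
    """Replace every None (null) with fill_value."""
    return [fill_value if v is None else v for v in values]


with_null = [0, None, 5]
without_null = [0, 0, 5]

# After filling, the two inputs are indistinguishable: the null-ness is gone.
same_after_fill = fill_null(with_null, 0) == fill_null(without_null, 0)

# The only way to get it back is to have kept the validity mask separately.
null_mask = [v is None for v in with_null]
restored = [None if is_null else v
            for v, is_null in zip(fill_null(with_null, 0), null_mask)]
```

The same applies to a kernel that returns 1 for power(null, 0): the output alone no longer says which inputs were null.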

Please feel free to close this issue as a duplicate. I searched for issues relating to power and did not find ARROW-11070.

@asfimport

Joris Van den Bossche / @jorisvandenbossche:
Yes, I already closed it.

"It would seem to me the solution is to leave the choice to the user and allow her/him to specify the desired behaviour as an option to power."

Indeed, if there are different downstream applications that might need either behaviour, an option might be best. That's the main reason I brought up the issue.
