Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Provisional support for NEP 18 (__array_function__ protocol) #200

Merged
merged 16 commits into from
Oct 10, 2022

Conversation

neutrinoceros
Copy link
Member

@neutrinoceros neutrinoceros commented Aug 19, 2021

My initial motivation here was to add some unit representation to error messages when comparing two unyt_array instance via functions from numpy.testing, like so:

import numpy as np
import unyt as un

a = [1, 2, 3] * un.cm
b = [1, 2, 3] * un.km
np.testing.assert_array_equal(a, b)

which yields, on master:

...
AssertionError:
Arrays are not equal

Mismatched elements: 3 / 3 (100%)
Max absolute difference: 299997.
Max relative difference: 0.99999
 x: unyt_array([1, 2, 3])
 y: unyt_array([1, 2, 3])

and on this branch

previous version
...
AssertionError:
Arrays are not equal

Mismatched elements: 3 / 3 (100%)
Max absolute difference: 299997. cm
Max relative difference: 0.99999 dimensionless
 x: unyt_array([1, 2, 3] cm)
 y: unyt_array([1, 2, 3] km)

edit:

AssertionError:
Arrays are not equal

Mismatched elements: 3 / 3 (100%)
Max absolute difference: 299997., units='cm'
Max relative difference: 0.99999, units='dimensionless'
 x: unyt_array([1, 2, 3], units='cm')
 y: unyt_array([1, 2, 3], units='km')

Incidentally, it turns out that fixing this necessitated a kick-off implementation of NEP 18, so this work laid the foundation to solve:

More broadly, implementing NEP 18 for any was the topic of #139.
Granted I need to take more time to check that I'm not going against the original intentions here. My current approach is that, since covering the whole numpy public API in one go seems like an gigantic task, I'm implementing unyt_array.__array_function__ with a fallthrough condition: if a special case isn't implemented yet, just fallback to the raw numpy implem (which is currently the behaviour for all functions subject to NEP 18). This way we can add support for more and more functions in a progressive way.
I'm going to set the bar low(ish) for now, and try and fix the already reported cases, as reported above, as a first step.

An important question to address is: what should be done in the general case where we don't have a custom implementation for an array function ?

  1. transparently default to the raw numpy implementation without a warning (this is de facto what is done as of unyt 2.8.0, and will remain the case until NEP 18 isn't at least partially implemented).
  2. same as 1. but emit a warning (possibly, we could have a whitelist of functions that are known to be perfectly fine without a specialised implementation, for which no warning would be emitted) along the lines of
UserWarning: calling `numpy.FUNC` on a unyt_array. Results may hold incorrect units. A future version of unyt will remove this warning, and possibly change the behaviour of this function to be dimensionally correct.
  1. Error out

Option 1 this is the current implementation of this PR because I think it is the less disruptive or noisy one. My personal opinion is that it's probably okay to have incomplete support for NEP 18 for a couple release, as long as it is clearly stated in the release notes.

@neutrinoceros neutrinoceros marked this pull request as draft August 19, 2021 16:41
@neutrinoceros neutrinoceros force-pushed the nep18_repr branch 2 times, most recently from 5045c28 to 67129b0 Compare August 19, 2021 20:42
@neutrinoceros neutrinoceros added the bug Something isn't working label Aug 19, 2021
@neutrinoceros neutrinoceros linked an issue Aug 19, 2021 that may be closed by this pull request
@neutrinoceros neutrinoceros linked an issue Aug 19, 2021 that may be closed by this pull request
@neutrinoceros neutrinoceros force-pushed the nep18_repr branch 3 times, most recently from 84562d9 to 3a51a41 Compare August 20, 2021 12:07
@neutrinoceros
Copy link
Member Author

neutrinoceros commented Aug 20, 2021

I've gone the long way and started implemented a bunch of functions with obvious requirements.
TODO:

inverse-like functions

  • np.linalg.inv
  • np.linalg.tensorinv
  • np.linalg.pinv

product-like functions

  • np.dot
  • np.vdot
  • np.inner
  • np.outer
  • np.matmul
  • np.kron

histogram functions (as requested for #130)

  • np.histogram
  • np.histogram2d
  • np.histogramdd

Pifalls I found along the way

  • np.linalg.pinv, np.matmul and likely other functions support "stacks of arrays" as their arguments. In this mode, NEP 18 doesn't help, as far as I can tell, so there is nothing we can do at the moment (as far as I understand), besides mentioning this limitation in docs, and possibly as a custom warning too.
  • np.ma.dot isn't as trivial to implement than say np.dot, and the errors I got with my naive attempts were extremely similar o Creating masked array raises AttributeError #169, which was recently fixed by Fix unyt_array.__getitem__ to support numpy masked arrays #170. I think it's probably okay to leave this function out of the scope of the present PR.

@neutrinoceros neutrinoceros force-pushed the nep18_repr branch 3 times, most recently from 1da1198 to 092f9b3 Compare August 20, 2021 18:50
@neutrinoceros neutrinoceros marked this pull request as ready for review August 20, 2021 20:00
@neutrinoceros neutrinoceros changed the title [WIP] start adding NEP 18 support Provisional support for NEP 18 Aug 20, 2021
@neutrinoceros neutrinoceros changed the title Provisional support for NEP 18 Provisional support for NEP 18 (__array_function protocol) Aug 20, 2021
@neutrinoceros neutrinoceros changed the title Provisional support for NEP 18 (__array_function protocol) Provisional support for NEP 18 (__array_function__ protocol) Aug 20, 2021
@neutrinoceros neutrinoceros changed the title Provisional support for NEP 18 (__array_function__ protocol) ENH: Provisional support for NEP 18 (__array_function__ protocol) Aug 22, 2021
@chrishavlin chrishavlin mentioned this pull request Mar 2, 2022
6 tasks
@neutrinoceros
Copy link
Member Author

pre-commit.ci autofix

@neutrinoceros
Copy link
Member Author

@jzuhone do you think you'd be able to review this ? Is there someone else I could ask ?

@jzuhone
Copy link
Contributor

jzuhone commented Aug 19, 2022

@neutrinoceros I can try to look at it next week

@neutrinoceros
Copy link
Member Author

@jzuhone I would like to propose we bundle this PR in the next release alongside the dask layer. IMO, each of these would justify a release on their own, so it's up to you :)

@jzuhone
Copy link
Contributor

jzuhone commented Sep 22, 2022

@chrishavlin or @matthewturk could you look at this one?

@neutrinoceros
Copy link
Member Author

I rebased the branch but didn't write the history. I added two commit, the first of which corresponds to #288, and the second one adds deprecation warnings as requested.

@neutrinoceros
Copy link
Member Author

pre-commit.ci autofix

@neutrinoceros
Copy link
Member Author

(rebased again now that #288 went in)

Copy link
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple minor nits from the changes you made today. Will merge once they're resolved.

unyt/tests/test_unyt_array.py Outdated Show resolved Hide resolved
unyt/_deprecation.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

np.histogram does not support unyt_array Using np.linalg.inv with unyt_array has no effect on units
3 participants