Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEP-18 and python scalars #950

Closed
keewis opened this issue Dec 22, 2019 · 7 comments
Closed

NEP-18 and python scalars #950

keewis opened this issue Dec 22, 2019 · 7 comments

Comments

@keewis
Copy link
Contributor

keewis commented Dec 22, 2019

We now implement __array_function__ which makes pint.Quantity seem like a numpy duck array. However, if we use pint to wrap python scalars, this is wrong: python scalars don't have attributes like shape or ndim.

I don't know how we could fix this but it is something we definitely should not leave as it is right now.

@jthielen
Copy link
Contributor

jthielen commented Dec 22, 2019

Would you be able to explain further what is problematic and/or wrong about the current behavior?

For instance, when a Quantity wraps a Python scalar, there is the following behavior:

>>> from pint import Quantity
>>> x = Quantity(2, 'meter')
>>> x
<Quantity(2, 'meter')>
>>> x.shape
Traceback (most recent call last):
  File ".../quantity.py", line 1575, in __getattr__
    return getattr(self._magnitude, item)
AttributeError: 'int' object has no attribute 'shape'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../quantity.py", line 1579, in __getattr__
    "has attribute '{}'".format(self._magnitude, item)
AttributeError: Neither Quantity object nor its magnitude (2) has attribute 'shape'
>>> x.ndim
Traceback (most recent call last):
  File ".../quantity.py", line 1575, in __getattr__
    return getattr(self._magnitude, item)
AttributeError: 'int' object has no attribute 'ndim'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../quantity.py", line 1579, in __getattr__
    "has attribute '{}'".format(self._magnitude, item)
AttributeError: Neither Quantity object nor its magnitude (2) has attribute 'ndim'

And when a NumPy scalar is wrapped:

>>> y = Quantity(np.arange(3), 'meter')[-1]
>>> y
<Quantity(2, 'meter')>
>>> y.shape
()
>>> y.ndim
0

Pint Quantity scalars also need to respond to __array_function__ for much of the NumPy functionality to work, such as multiplication (which gets routed through np.multiply):

>>> np.arange(2) * Quantity(2, 'meter')
<Quantity([0 2], 'meter')>

At least to me, this all seems like reasonable, expected behavior for a flexible wrapper class.

Also, on a related topic, there has previously been discussion of splitting the Quantity class into a scalar and a sequence class, but that now seems to be disfavored (see #753).

@keewis
Copy link
Contributor Author

keewis commented Dec 22, 2019

I've been trying to use xarray's assert_allclose, which assumes everything a DataArray can contain is an array (the DataArray constructor makes sure of that but does not try to convert objects that have the __array_function__ attribute). The moment an object tries to disguise itself as a duck array without actually being one by setting __array_function__, this approach fails.

So the question is: what can we reasonably expect from an object that implements __array_function__? In other words, what is the minimal interface for a duck array? I would have expected attributes like shape and ndim but I might be wrong to assume that.

@jthielen
Copy link
Contributor

So the question is: what can we reasonably expect from an object that implements __array_function__? In other words, what is the minimal interface for a duck array? I would have expected attributes like shape and ndim but I might be wrong to assume that.

I think these are two different questions.

While based on xarray's implementation, I had initially assumed that an object having __array_function__ was declaring it itself to be a duck array, but after the discussion in #883, and rereading the relevant NEPs (13, 18, 22, 30), I no longer believe this to be the case. Based on these (and especially this), I think an object implementing __array_function__ (and not returning NotImplemented) is simply declaring that it supports functions in the NumPy API. Instead, an object declaring itself a duck array doesn't seem well-defined right now, at least until NEP 30 (with an explicit __duckarray__ protocol) is implemented. With that in mind, NEP 22 makes it clear that shape, ndim, and dtype are expected attributes on a duck array.

So, I would consider a Pint Quantity to always support __array_function__, but only sometimes be a duck array (when it wraps a ndarray or another duck array).

However, this is all just my interpretation of the relevant documentation, and I very well may be mistaken. If you get the chance, @crusaderky and @shoyer, would you mind weighing in?

@keewis
Copy link
Contributor Author

keewis commented Dec 23, 2019

Sorry about that, I didn't read the NEPs carefully enough which lead me to consider implementing __array_function__ the same as marking the object as a duck array.

Would it be reasonable for pint to define shape, ndim and dtype if the installed numpy version supports __array_function__ (and is enabled for 1.16)?

@jthielen
Copy link
Contributor

Would it be reasonable for pint to define shape, ndim and dtype if the installed numpy version supports __array_function__ (and is enabled for 1.16)?

I'm honestly not sure!

shape and ndim seem reasonable enough to have in general/irrespective of __array_function__ (following NumPy scalar/0d-array behavior of being () and 0 respectively for scalars), but dtype is questionable since NEP 22 implies that dtype needs to be a NumPy dtype, which wouldn't be the case for a generic Python scalar.

If all these attributes are needed in the way NumPy expects them, perhaps it would just be better to have the scalars converted in the first place? It would likely need to be opt-in, so something like the current UnitRegistry(force_ndarray=True) behavior can be extended to a UnitRegistry(force_ndarray_like=True), where scalars are converted to 0d NumPy arrays, but duck arrays below Quantities on the type casting hierarchy are not downcast to ndarrays (xref #878 (comment), since it's come up before).

@keewis
Copy link
Contributor Author

keewis commented Dec 23, 2019

👍 for adding force_ndarray_like when the support for wrapping duck arrays is implemented.

thanks for the trick with force_ndarray, it solves my use case. I think this is resolved, should we close this?

@jthielen
Copy link
Contributor

I think this is resolved, should we close this?

I'd say go ahead if you want. I can add notes on force_ndarray_like and the scalar vs. duck array distinctions to #845.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants