-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split in two Quantity Classes for Scalar and Array objects #753
Comments
The new array_function support is probably relevant: https://www.numpy.org/neps/nep-0018-array-function-protocol.html |
Thanks for opening this! I think the reasoning behind Pandas' ExtensionTypes to to get around some difficulties with numpy. One example being int32 not being able to store nan, so we get this behaviour in the
Which makes 1 and 2 undesirable. From a pint-pandas perspective, an issue I found was having a single class for both Scalars and sequences. It would be good to have QuantityScalar and QuantitySequence to inherit from Quantity. The array_function looks promising. I followed the example implementation and was able to get concatenate to return a quantity: @implements(concatenate)
def _(arrays, axis=0, out=None):
out_units = arrays[0].units
arrays = [q.to(out_units) for q in arrays]
return ureg.Quantity(np.concatenate(arrays, axis=axis, out=out), out_units)
my_array = ureg.Quantity([1,2],"m")
my_array_mm = ureg.Quantity([1,2],"mm")
concatenate([my_array,my_array_mm])
(1.02.00.0010.002)meter
While you'd usually get np.concatenate([my_array,my_array_mm])
C:\Users\A\Anaconda3\envs\py36\lib\site-packages\pint\quantity.py:1377: UnitStrippedWarning: The unit of the quantity is stripped.
warnings.warn("The unit of the quantity is stripped.", UnitStrippedWarning)
array([1, 2, 1, 2]) Will need to get the latest numpy version to try it properly (so it's using np.concatenate instead of concatenate like in https://nbviewer.jupyter.org/gist/shoyer/1f0a308a06cd96df20879a1ddb8f0006 ) |
@andrewgsavage Not sure if the code is right:. Why not: But in any case, it looks promising. My proposal would be that one side: On the other side, we start brainstorming how to do it with QuantityScalar and QuantitySequence. |
Yea that'd be better, surprised I didn't get a warning message. Sounds good. Are you thinking in a separate module like pint-pandas? wrt scalar/sequence, I was thinking something like this: |
Not sure. It seems to me that 99% of the people who use pint for science will use numpy. I would just create a numpy module inside pint (unless there are good reasons not to do it). I put some comments on #764 |
Could you elaborate on how this would be helpful? I don't understand the motivation for the separate scalar/sequence classes in pint. NumPy does have separate scalars/sequences, but it is generally regarded as a design mistake. The main virtue of separate scalars is that they are immutable and hashable, so you can use them as dictionary keys, but the separate code paths add a significant amount of complexity, both internally in NumPy and externally for its users. NumPy does not even manage to keep the distinction straight: it will convert 0-dimensional arrays into scalars when used in many operations. |
I too am personally against the idea of splitting the classes. I'm also seriously concerned about numpy scalar arrays (ndim=0) - they should be considered carefully early in the design phase. |
I'm sorry for the very delayed response in regards to this and the changes of #875...grad school can get busy sometimes. 😬 My previous agreement with the split quantity class change (e.g. #764 (comment), #875 (comment)) was mostly motivated by two things:
1 was looking to be taken care of by #764 initially or later #875 with a follow-up PR, and I was able to test that 2 held (especially in fixing #751 / #760 / #881). I was indifferent about the quantity class split itself. When it comes to reviewing implementation details, though, I still have a lot to learn and I feel it's a bit beyond my level of experience. So, I'm very thankful for @crusaderky's and @shoyer's informative comments to clarify the problems with #875, and I defer to their judgement on the matter. With all this in mind, however, I'm still antsy to see a working NEP 18 implementation in pint make a release soon (MetPy's planned v1.0 release at the end of 2019 needs improved xarray support in the current unit-aware calculations, and having NEP 18 support in both xarray and pint would be extremely helpful for this). Given the discussion here and in #875 (comment), and how the timeline has gone on so far, would the best way to move forward be:
? I'd be glad to still help where I can to get this moving along (in particular, I could help in fixing #751 and #760 and reimplementing |
@crusaderky No, I am just a downstream user who really wants to get pint working smoothly with xarray (my main contributions are to MetPy). @cpascual Do you have any thoughts here? You are the one other merging reviewer I found in |
When Pint was in its infancy there were several options ahead of us to support numpy arrays:
Good: Works with numpy.
Bad: requires numpy for everything.
Good: works with numpy without requiring it for non numpy operations
Bad: makes things more complicated.
Good: easy to implement
Bad: not very efficient
Good: works with numpy without requiring it for non numpy operations
Bad: a custom dtype difficult to get right and distribute. Fragile.
Good: works with numpy without requiring it for non numpy operations. Easy to implement
Bad: Some numpy operations are not supported
The current solutions has served us well (very well, I would say) and we were able to enforce correctness for operations with units without requiring users to import a particular math package and without monkey patching. So there is a strong reason to keep things like they are. But there are a few reasons to move forward
To recap the reason why some numpy operations are not supported is because within numpy the fuction
asanyarray
is called multiple times and this copies the content of the array but looses the type (Quantity) because Quantity is not anp.array
subclass. We even had a prototype of solution (2) which you can see a very outdated version in the_npsubclass
Please join the discussion.
The text was updated successfully, but these errors were encountered: