NamedArray tracking issue #8238
Comments
For the operators we will run into the following issue: Currently If you want to make NamedArray independent from xarray, you need to find a way to solve this. I assume NumPy has the same issues, maybe check over there how it works?
@headtr1ck or @Illviljan would one of you be able to handle moving over unary and binary ops and maybe even
I can have a look at it :)
How much should NamedArray conform to the array API standard? For example, the array API does not implement the methods, I'm leaning towards being strict with the array API and removing them, because they are causing some truly difficult typing issues (#8281, #8294).
@TomNicholas do you have time to move over
The API is certainly going to be a superset of the array API (that doesn't even have NaN skipping reductions!)
I don't think typing should limit the API we offer.
Can we add these back please? As of now, our plan is to include them. https://github.com/pydata/xarray/blob/main/design_notes/named_array_design_doc.md#attributes-to-be-preserved-from-xarrayvariable
We should also remember that the array API is not fixed in stone - it can and already has changed to include more functions. We should be raising issues there to get functions we want added. See data-apis/array-api#187 (comment) and data-apis/array-api#669 for example.
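To make the "NaN-skipping reductions" point above concrete, here is a minimal sketch using plain NumPy as a stand-in. It only illustrates the behavioural difference between a NaN-propagating reduction (the kind of `mean` the array API standard specifies) and a NaN-skipping one. It is not NamedArray's implementation, and the variable names are illustrative.

```python
import numpy as np

# Sample data with one missing value (illustrative only).
data = np.array([1.0, np.nan, 3.0])

# A plain reduction propagates NaN: one missing value poisons the result.
plain = np.mean(data)

# A NaN-skipping reduction ignores missing values before reducing.
# NumPy exposes this as a separate function, np.nanmean.
skipped = np.nanmean(data)

print(plain, skipped)
```

Supporting the second behaviour is one reason an API might need to be a superset of the standard rather than a strict implementation of it.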
+1
I'll have a look now!
We should slow down with the PRs, haha. The merge conflicts will be a nightmare
Hey all! Don't want to hijack this thread, but wanted to respond to a few of the comments above regarding the Array API.
@Illviljan In the Array API standard, we've preferred functional APIs over methods and have the equivalent of
@dcherian There has been some discussion about adding support for NaN reductions to the specification (see data-apis/array-api#621); however, there is still some debate as to what form that should take. E.g., whether we actually want
@TomNicholas Indeed! It may be good to list all the current Array API pain points (if not done already), so that we can potentially prioritize. We could also potentially arrange for xarray devs to present at one of the Consortium workgroup meetings. Regardless, thanks for considering the Array API standard. We'd love to get additional feedback on the specification over on the specification repo. :)
@andersy005 will there be a function similar to xarray/xarray/tests/test_namedarray.py Lines 73 to 75 in 72abfdf
@maxrjones, currently, the constructor of
This is helpful, thank you! I will proceed by marking these as xfail for now, so it would be easy to test in NamedArray if added in the future.
@andersy005 @scottyhq @dcherian I'm at the NumFOCUS project summit and recently spoke with a master's student in physical oceanography who chose to use only NumPy rather than Xarray for her code because of Xarray's performance overhead. There may be computational patterns that would make Xarray workable, but this could also be a good use case for testing NamedArray on a real-world science case when it comes time to demonstrate its utility.
Thank you for the update, @maxrjones! @dcherian and I have discussed putting together a write-up/post that provides a clear overview of the current state of NamedArray and highlights the next steps needed to bring it to completion. In the coming weeks, I’m going to take a stab at this post. It would be extremely useful to connect with users, especially those interested in using NamedArray features outside of Xarray, which could shape the next phases of development.
FYI, after speaking with this student, I think her troubles were really a classic case of difficulty configuring dask for a larger-than-memory workload, nothing to do with the overhead of xarray or NamedArray.
We do need something like this. I've had multiple conversations at this summit with people who are interested in named-array-like features, but I'm not sure what to tell them in terms of readiness. (See for example https://github.com/pydims/pydims)
Also I don't know where the correct issue for this is but there is a strong case made here that a minimal implementation like
@andersy005 I think it would be good to keep a running list of NamedArray tasks. I'll start with a rough sketch, please update/edit as you like.
- NamedArray base class (initial refactor for NamedArray #8075)
- VariableArithmetic to NamedArrayArithmetic (Migrate VariableArithmetic to NamedArrayArithmetic #8244)
- *Indexer objects to .oindex and .vindex on ExplicitlyIndexed array classes
- generate_reductions.py ? (Move Variable aggregations to NamedArray #8304)
- formatting.py
- parallelcompat.py
- pycompat.py (Migrate VariableArithmetic to NamedArrayArithmetic #8244)
- test_variable.py test both NamedArray and Variable
- NamedArray.shape does not support unknown dimensions #8291
- axis kwarg? #8333
- xarray.core/* by importing namedarray functionality into xarray.core/*
xref #3981
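
For readers unfamiliar with the `.oindex` / `.vindex` item in the list above: the names suggest outer (orthogonal) and vectorized (pointwise) indexing. The sketch below shows that distinction with plain NumPy only. `np.ix_` and integer-array indexing are used here as stand-ins, on the assumption that the two accessors separate these two semantics. It does not reproduce xarray's ExplicitlyIndexed classes.

```python
import numpy as np

# A small 3x4 array with values 0..11 (illustrative only).
arr = np.arange(12).reshape(3, 4)
rows = np.array([0, 2])
cols = np.array([1, 3])

# Outer / orthogonal indexing: every selected row crossed with every
# selected column, giving a (2, 2) block.
outer = arr[np.ix_(rows, cols)]

# Vectorized / pointwise indexing: index arrays are paired element-wise,
# giving the points (0, 1) and (2, 3) as a 1-D result.
vectorized = arr[rows, cols]

print(outer)
print(vectorized)
```

Keeping the two behaviours behind separate, explicitly named entry points avoids having to infer the intended semantics from the type of the indexer object.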