Useful functions not in the Array API Standard #193

Open
TomNicholas opened this issue May 31, 2023 · 4 comments
Labels
array api; enhancement (New feature or request); xarray-integration (Uses or required for cubed-xarray integration)

Comments

@TomNicholas
Member

There are a few NumPy functions that xarray calls on wrapped arrays but that are not (yet) in the Array API Standard (see data-apis/array-api#187 (comment)). Cubed could choose to implement these to facilitate full integration.

FYI, of the functions on that list, xarray currently uses:

np.clip
np.diff
np.pad
np.repeat
np.take
np.tile
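As a rough illustration of why some of these are easy to support, here is a minimal sketch of np.diff and np.clip written with only slicing, subtraction, and elementwise minimum/maximum. The helper names diff_via_slicing and clip_via_min_max are hypothetical, and NumPy stands in for any array library; this is not how xarray or cubed implement them.

```python
import numpy as np


def diff_via_slicing(x, axis=-1):
    """First-order difference along `axis` using only slicing and subtraction.

    Hypothetical helper for illustration only.
    """
    upper = [slice(None)] * x.ndim
    lower = [slice(None)] * x.ndim
    upper[axis] = slice(1, None)   # elements 1..n-1 along `axis`
    lower[axis] = slice(None, -1)  # elements 0..n-2 along `axis`
    return x[tuple(upper)] - x[tuple(lower)]


def clip_via_min_max(x, lo, hi):
    """Clamp values to [lo, hi] with elementwise maximum then minimum.

    Hypothetical helper for illustration only.
    """
    return np.minimum(np.maximum(x, lo), hi)


x = np.arange(12.0).reshape(3, 4) ** 2

# Both helpers agree with the NumPy reference functions.
assert np.array_equal(diff_via_slicing(x, axis=1), np.diff(x, axis=1))
assert np.array_equal(clip_via_min_max(x, 5.0, 50.0), np.clip(x, 5.0, 50.0))
```

Functions such as pad, repeat, and tile change the output shape in ways that interact with chunking, so they likely need more care than these two.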

Of particular interest to me personally is np.pad. It's used within xarray's .pad method, which in turn is used within xGCM's apply_as_grid_ufunc. As a result, pad was an important part of the test case that exposed memory management problems with dask's distributed scheduler. I can't really close the loop by trying out cubed on that full original problem unless pad is available in cubed.

pad is also interesting because a parallel implementation isn't trivial: dask's pad implementation uses map_blocks in some cases but more complicated tricks in others. For my purposes above, though, I wouldn't need to implement more than one or two of the mode kwarg options.
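To make the chunking issue concrete, here is a minimal sketch of a block-wise constant-mode pad along one axis, using plain NumPy arrays as stand-ins for chunks. The function pad_blockwise is hypothetical and is not dask's or cubed's implementation; it assumes the blocks are laid out in order along a single axis and that only mode="constant" is needed.

```python
import numpy as np


def pad_blockwise(blocks, pad_width, axis=0, constant=0):
    """Pad a sequence of blocks laid out along `axis` by padding only the edge blocks.

    Hypothetical helper for illustration only: interior blocks pass through
    unchanged, the first block is padded "before" and the last block "after".
    """
    before, after = pad_width
    last = len(blocks) - 1
    out = []
    for i, block in enumerate(blocks):
        widths = [(0, 0)] * block.ndim
        widths[axis] = (before if i == 0 else 0, after if i == last else 0)
        out.append(np.pad(block, widths, mode="constant", constant_values=constant))
    return out


x = np.arange(24).reshape(6, 4)
blocks = np.split(x, 3, axis=0)  # three "chunks" of shape (2, 4)

padded_blocks = pad_blockwise(blocks, (1, 2), axis=0)
padded = np.concatenate(padded_blocks, axis=0)

# Matches padding the whole array at once.
expected = np.pad(x, ((1, 2), (0, 0)), mode="constant")
assert np.array_equal(padded, expected)
```

Note that the edge blocks grow (here the block heights become 3, 2, 4), so the chunks are no longer uniform. Modes such as "reflect" or "wrap" can also need data from other blocks, which is one reason a general parallel pad is harder than this sketch suggests.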

I would be interested in submitting a PR to add pad, if that's something you would welcome, @tomwhite. (I mentioned this on the xarray tracker in pydata/xarray#7848 (comment), but it's really a cubed question.)

cc @jbusecke

TomNicholas added the enhancement, array api, and xarray-integration labels May 31, 2023
@tomwhite
Member

tomwhite commented Jun 1, 2023

> I would be interested in submitting a PR for adding pad

That would be very welcome! Just implementing the cases you need seems like a good way forward. (We might want to put it in a different namespace as it's not part of the array API, but we can discuss that later.)

BTW take is already implemented here.

@jbusecke

jbusecke commented Jun 5, 2023

That would be really cool @TomNicholas. Happy to test out prototypes whenever you think that is useful!

@tomwhite
Member

Xarray doesn't delegate to np.diff, so this already works with Cubed:

>>> import xarray as xr
>>> import numpy as np
>>> import cubed.random
>>> da = xr.DataArray(cubed.random.random((3, 4), chunks=(2, 2)), dims=["x", "y"])
>>> d = da.diff("y")
>>> from numpy.testing import assert_array_equal
>>> assert_array_equal(d.values, np.diff(da.values, axis=1))

@tomwhite
Member

tomwhite commented Oct 1, 2024

clip was added in #583
