Regression: "TypeError: Vectorized indexing is not supported" with xarray 2024.10.0 + sparse #9694

Open
5 tasks done
khaeru opened this issue Oct 29, 2024 · 3 comments
Labels
array API standard (Support for the Python array API standard), bug, regression, topic-arrays (related to flexible array support)

khaeru commented Oct 29, 2024

What happened?

Code that worked with xarray 2024.9.0 has begun to fail with xarray 2024.10.0, even though no breaking changes were advertised.

What did you expect to happen?

The MCVE below runs with xarray 2024.9.0, giving the output shown under "Relevant log output".

I'd expect it to run the same way with xarray 2024.10.0.

Minimal Complete Verifiable Example

import pandas as pd
import xarray as xr
from numpy import nan

# Create a series
s = pd.Series(
    [nan, nan, 1.0, nan, nan, nan, 2, 3, 4, nan, 5, 6, 7, 8, 9],
    index=pd.MultiIndex.from_product(
        [["x0", "x1", "x2"], ["y0", "y1", "y2", "y3", "y4"]], names=list("xy")
    ),
)

# Create indexers
newdim = {"newdim": ["nd0", "nd1", "nd2"]}
x_idx = xr.DataArray(["x2", "x1", "x2"], coords=newdim)
y_idx = xr.DataArray(["y4", "y2", "y0"], coords=newdim)

for sparse in (False, True):
    # Create a DataArray
    da = xr.DataArray.from_series(s, sparse=sparse)

    # Do vectorized indexing
    print(da.sel(x=x_idx, y=y_idx))

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

With xarray 2024.9.0:

<xarray.DataArray (newdim: 3)> Size: 24B
array([9., 3., 5.])
Coordinates:
    x        (newdim) object 24B 'x2' 'x1' 'x2'
    y        (newdim) object 24B 'y4' 'y2' 'y0'
  * newdim   (newdim) <U3 36B 'nd0' 'nd1' 'nd2'
<xarray.DataArray (newdim: 3)> Size: 48B
<COO: shape=(3,), dtype=float64, nnz=3, fill_value=nan>
Coordinates:
    x        (newdim) object 24B 'x2' 'x1' 'x2'
    y        (newdim) object 24B 'y4' 'y2' 'y0'
  * newdim   (newdim) <U3 36B 'nd0' 'nd1' 'nd2'

With xarray 2024.10.0:

<xarray.DataArray (newdim: 3)> Size: 24B
array([9., 3., 5.])
Coordinates:
    x        (newdim) object 24B 'x2' 'x1' 'x2'
    y        (newdim) object 24B 'y4' 'y2' 'y0'
  * newdim   (newdim) <U3 36B 'nd0' 'nd1' 'nd2'
Traceback (most recent call last):
  File "/home/khaeru/vc/genno/bug.py", line 23, in <module>
    print(da.sel(x=x_idx, y=y_idx))
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/khaeru/.venv/3.12/lib/python3.12/site-packages/xarray/core/dataarray.py", line 1675, in sel
    ds = self._to_temp_dataset().sel(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/khaeru/.venv/3.12/lib/python3.12/site-packages/xarray/core/dataset.py", line 3237, in sel
    result = self.isel(indexers=query_results.dim_indexers, drop=drop)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/khaeru/.venv/3.12/lib/python3.12/site-packages/xarray/core/dataset.py", line 3070, in isel
    return self._isel_fancy(indexers, drop=drop, missing_dims=missing_dims)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/khaeru/.venv/3.12/lib/python3.12/site-packages/xarray/core/dataset.py", line 3126, in _isel_fancy
    new_var = var.isel(indexers=var_indexers)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/khaeru/.venv/3.12/lib/python3.12/site-packages/xarray/core/variable.py", line 1049, in isel
    return self[key]
           ~~~~^^^^^
  File "/home/khaeru/.venv/3.12/lib/python3.12/site-packages/xarray/core/variable.py", line 816, in __getitem__
    data = indexing.apply_indexer(indexable, indexer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/khaeru/.venv/3.12/lib/python3.12/site-packages/xarray/core/indexing.py", line 1029, in apply_indexer
    return indexable.vindex[indexer]
           ~~~~~~~~~~~~~~~~^^^^^^^^^
  File "/home/khaeru/.venv/3.12/lib/python3.12/site-packages/xarray/core/indexing.py", line 369, in __getitem__
    return self.getter(key)
           ^^^^^^^^^^^^^^^^
  File "/home/khaeru/.venv/3.12/lib/python3.12/site-packages/xarray/core/indexing.py", line 1589, in _vindex_get
    raise TypeError("Vectorized indexing is not supported")
TypeError: Vectorized indexing is not supported

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.12.7 (main, Oct 3 2024, 15:15:22) [GCC 14.2.0]
python-bits: 64
OS: Linux
OS-release: 6.11.0-9-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: ('en_CA', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2024.10.0
pandas: 2.2.2
numpy: 1.26.4
scipy: 1.13.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: 1.4.0
dask: 2024.5.1
distributed: None
matplotlib: 3.9.0
cartopy: None
seaborn: 0.13.1
numbagg: None
fsspec: 2023.12.2
cupy: None
pint: 0.24.1
sparse: 0.15.4
flox: None
numpy_groupies: None
setuptools: 69.0.3
pip: 24.2
conda: None
pytest: 8.3.3
mypy: 1.11.2
IPython: 8.17.1
sphinx: 8.1.3

khaeru added the labels "bug" and "needs triage" (Issue that has not been reviewed by xarray team member) on Oct 29, 2024
@keewis
Collaborator

keewis commented Oct 29, 2024

thanks for the report!

This was introduced in #9530, and it was definitely not intentional to break vectorized indexing with sparse (looks like I forgot to add a release note entry, which I would have put into "new features").

The issue here is the order of preference for the indexing adapters, which was switched from preferring __array_function__ to preferring __array_namespace__ if both are present. However, it looks like I missed sparse when investigating whether any array type in use already implemented both (and our tests didn't catch them either).
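To make the ordering problem concrete, here is a minimal sketch. It is not xarray's dispatch code: the adapter names and the `pick_adapter` helper are hypothetical stand-ins. The only thing taken from the comment above is that an array type exposing both `__array_function__` and `__array_namespace__` is routed differently depending on which protocol is checked first.

```python
class BothProtocols:
    """Hypothetical stand-in for an array type that implements both
    protocols, as this thread says sparse does."""

    def __array_function__(self, func, types, args, kwargs):
        return NotImplemented

    def __array_namespace__(self, api_version=None):
        return None


class FunctionOnly:
    """Hypothetical stand-in for a type with only __array_function__."""

    def __array_function__(self, func, types, args, kwargs):
        return NotImplemented


def pick_adapter(array, prefer_namespace):
    """Return the name of the (made-up) adapter that would wrap `array`."""
    checks = [
        ("__array_function__", "function-protocol adapter"),
        ("__array_namespace__", "array-API adapter"),
    ]
    if prefer_namespace:
        checks.reverse()
    for attr, adapter in checks:
        if hasattr(array, attr):
            return adapter
    return "fallback adapter"


arr = BothProtocols()
old_choice = pick_adapter(arr, prefer_namespace=False)  # order before the switch
new_choice = pick_adapter(arr, prefer_namespace=True)   # order after the switch
print(old_choice, "->", new_choice)
```

A type that implements only one protocol is unaffected by the switch; only types with both change adapters, which would be consistent with the `TypeError` raised from `_vindex_get` in the traceback.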

There are two things I believe we can do (besides reverting if we need more time to figure out a good way to resolve this):

  1. allow specifying the preferred protocol by setting an attribute on the array type (with the default being "__array_function__")
  2. work around the Array API not including vectorized indexing / indexing with an integer array

not sure which one will be better in the end
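Option 1 could look roughly like the sketch below. The attribute name `_preferred_array_protocol` and the `preferred_protocol` helper are invented for illustration; nothing in this thread specifies what such an attribute would be called or how it would be validated.

```python
DEFAULT_PROTOCOL = "__array_function__"  # the default suggested in option 1
KNOWN_PROTOCOLS = ("__array_function__", "__array_namespace__")


def preferred_protocol(array_type):
    """Return the protocol an array type asks for, else the default.

    `_preferred_array_protocol` is a hypothetical attribute name.
    """
    choice = getattr(array_type, "_preferred_array_protocol", DEFAULT_PROTOCOL)
    if choice not in KNOWN_PROTOCOLS:
        raise ValueError(f"unknown protocol: {choice!r}")
    if not hasattr(array_type, choice):
        raise TypeError(f"{array_type.__name__} does not implement {choice}")
    return choice


class LegacyArray:
    """Implements both protocols and states no preference."""

    def __array_function__(self, func, types, args, kwargs):
        return NotImplemented

    def __array_namespace__(self, api_version=None):
        return None


class NamespaceArray(LegacyArray):
    """Implements both protocols and opts in to the array API path."""

    _preferred_array_protocol = "__array_namespace__"


print(preferred_protocol(LegacyArray))     # __array_function__
print(preferred_protocol(NamespaceArray))  # __array_namespace__
```

With an opt-in like this, existing array types that implement both protocols would keep their previous behavior unless they explicitly ask for the other one.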

keewis added the labels "topic-arrays" (related to flexible array support) and "array API standard" (Support for the Python array API standard), and removed the label "needs triage" (Issue that has not been reviewed by xarray team member), on Oct 29, 2024
@dcherian
Contributor

work around the Array API not including vectorized indexing / indexing with an integer array

Let's move ahead with this. We already have the code in explicit_indexing_adapter, we just need to figure out the right IndexingSupport enum variant for whatever array api prescribes.
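The general idea of decomposing a vectorized indexer for an array that only supports outer (orthogonal) indexing can be sketched in plain Python. This illustrates the technique only and is not the code in `explicit_indexing_adapter`; the helper names are made up, and nested lists stand in for the array. The data and integer positions correspond to the MCVE above (`x2, x1, x2` and `y4, y2, y0`).

```python
from math import nan


def decompose_vectorized(indexers):
    """Split pointwise indexers into an outer indexer plus remapped positions.

    Per dimension, the outer indexer holds the sorted unique positions and
    the remapped indexer points into the reduced block they select.
    """
    outer, remapped = [], []
    for idx in indexers:
        unique = sorted(set(idx))
        position = {value: i for i, value in enumerate(unique)}
        outer.append(unique)
        remapped.append([position[value] for value in idx])
    return outer, remapped


def outer_index_2d(data, rows, cols):
    """Orthogonal indexing: every selected row crossed with every column."""
    return [[data[r][c] for c in cols] for r in rows]


def pointwise_index_2d(block, rows, cols):
    """Vectorized (pointwise) indexing on the small in-memory block."""
    return [block[r][c] for r, c in zip(rows, cols)]


# Dense layout of the MCVE series: rows are x0..x2, columns are y0..y4.
data = [
    [nan, nan, 1.0, nan, nan],
    [nan, 2.0, 3.0, 4.0, nan],
    [5.0, 6.0, 7.0, 8.0, 9.0],
]
x_pos = [2, 1, 2]  # x2, x1, x2
y_pos = [4, 2, 0]  # y4, y2, y0

(outer_x, outer_y), (inner_x, inner_y) = decompose_vectorized([x_pos, y_pos])
block = outer_index_2d(data, outer_x, outer_y)      # what the array must support
result = pointwise_index_2d(block, inner_x, inner_y)  # done in memory afterwards
print(result)  # [9.0, 3.0, 5.0]
```

The array itself only has to answer the outer-indexing request; the pointwise step runs on the smaller block. The result matches the values printed for the dense case under xarray 2024.9.0.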

@shoyer
Member

shoyer commented Oct 30, 2024

work around the Array API not including vectorized indexing / indexing with an integer array

Let's move ahead with this. We already have the code in explicit_indexing_adapter, we just need to figure out the right IndexingSupport enum variant for whatever array api prescribes.

The array API has discussed adding support for vectorized indexing in the near future: data-apis/array-api#669

Hopefully this happens!

khaeru added a commit to khaeru/genno that referenced this issue Nov 26, 2024
Still present in xarray 2024.11.0.