[Bug]: Dataset.where(x, drop=True) behaves inconsistent #6227

headtr1ck · 2022-02-01T08:40:30Z

What happened?

I tried to select a subset of the data using where (sel did not work in this case) and to shorten the affected dimensions with "drop=True".
This works fine on DataArrays and on Datasets with only a single dimension, but fails as soon as the Dataset has two dimensions on different variables.
The dimensions are left untouched and the data contains NaNs, just as with "drop=False" (see example).

I am actually not sure what the expected behavior is; maybe I am wrong and this is correct due to some broadcasting rules?
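To make the broadcasting question concrete, here is a minimal sketch of one possible explanation. It assumes (not confirmed here from the xarray source) that the per-variable conditions get combined into a single mask over all dimensions before anything is dropped. The names stacked, combined, keep_x and keep_y are only illustrative.

```python
import xarray as xr

ds = xr.Dataset({"a": ("x", [1, 2, 3]), "b": ("y", [2, 3, 4])})
cond = ds > 2

# to_array() broadcasts all variables against each other,
# so the stacked condition has dims "variable", "x" and "y".
stacked = cond.to_array()

# Combine the variables: True wherever either "a" or "b" satisfies the condition.
combined = stacked.any("variable")

# An index along x (or y) survives if any entry of its row (or column) is True.
keep_x = combined.any("y").values.tolist()
keep_y = combined.any("x").values.tolist()

print(keep_x)  # [True, True, True]
print(keep_y)  # [True, True, True]
```

Under that assumption every index along x and y has at least one True entry once the two conditions are broadcast against each other, so nothing would be dropped. That would match the output I see, but I have not verified that this is what where does internally.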

What did you expect to happen?

I expected the relevant dims to be shortened.
If, after ds.where with "drop=False", all variables along a dimension have some NaNs, then with "drop=True" I expect that dimension to be shortened and the NaNs removed.

Minimal Complete Verifiable Example

import xarray as xr

# this works
ds = xr.Dataset({"a": ("x", [1, 2, 3])})
ds.where(ds > 2, drop=True)

# returns:
# <xarray.Dataset>
# Dimensions:  (x: 1)
# Dimensions without coordinates: x
# Data variables:
#     a        (x) float64 3.0

# this doesn't
ds = xr.Dataset({"a": ("x", [1, 2, 3]), "b": ("y", [2, 3, 4])})
ds.where(ds > 2, drop=True)

# returns:
# <xarray.Dataset>
# Dimensions:  (x: 3, y: 3)
# Dimensions without coordinates: x, y
# Data variables:
#     a        (x) float64 nan nan 3.0
#     b        (y) float64 nan 3.0 4.0
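For comparison, a possible workaround sketch: since where with "drop=True" works on DataArrays, the condition can be applied to each data variable separately and the Dataset rebuilt from the results. This is only an illustration under that assumption, not the fix that was eventually merged, and the name trimmed is made up for the example.

```python
import xarray as xr

ds = xr.Dataset({"a": ("x", [1, 2, 3]), "b": ("y", [2, 3, 4])})

# Apply where(drop=True) to every DataArray on its own, so each variable
# is only trimmed along its own dimension, then rebuild the Dataset.
trimmed = xr.Dataset(
    {name: da.where(da > 2, drop=True) for name, da in ds.data_vars.items()}
)

print(dict(trimmed.sizes))          # {'x': 1, 'y': 2}
print(trimmed["a"].values.tolist())  # [3.0]
print(trimmed["b"].values.tolist())  # [3.0, 4.0]
```

Note that this rebuilds the Dataset from the data variables only, so Dataset-level attributes would have to be copied over separately if they matter.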

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.9.1 (default, Jan 13 2021, 15:21:08)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.49.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 0.20.2
pandas: 1.3.5
numpy: 1.21.5
scipy: 1.7.3
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
setuptools: 49.2.1
pip: 22.0.2
conda: None
pytest: 6.2.5
IPython: 8.0.0
sphinx: None


headtr1ck added the "bug" and "needs triage" labels Feb 1, 2022

dcherian removed the "needs triage" label Apr 9, 2022

headtr1ck mentioned this issue May 1, 2022

Improved Dataset broadcasting #6549

Open

headtr1ck mentioned this issue Jun 12, 2022

Fix Dataset.where with drop=True and mixed dims #6690

Merged


max-sixty closed this as completed in #6690 Jun 12, 2022
