Inconsistent behavior between DatasetRolling.construct and DataArrayRolling.construct with stride > 1. #7021
Comments
Thanks for the report. I agree that both should give the same result, but the code paths are indeed different; I have not looked into the actual root cause. It could be that this is also not very thoroughly tested (or used!):
Line 289 in b018442
Line 721 in b018442
By the way, a copy-pastable example would be appreciated. |
Thanks for the response. Here is a straightforward example:
import xarray as xr

dummy = list(range(100))
x, y, z = [xr.DataArray(dummy, dims=['t']) for _ in range(3)]
ds = xr.Dataset(
    {'x': x, 'y': y, 'z': z}
)
print(x.rolling(t=4).construct('w', stride=4).shape)
print(ds.rolling(t=4).construct('w', stride=4).x.shape)
Results:
I had a hunch that the problem comes from this part, though I'm not quite sure what self._mapping_to_list does here; I haven't looked it up yet.
Lines 764 to 772 in b018442
Since I only had one dimension to deal with, removing this loop solves the problem for me. |
It's been half a year and I found myself stuck on this inconsistent behavior again. Another problem I found but haven't mentioned yet is that
This time, I've actually identified a cause for this problem:
Lines 789 to 791 in b018442
.isel({d: slice(None, None, s) for d, s in zip(self.dim, strides)})
I still can't figure out what the original intention of this call was.
Solution
Removing the .isel call quoted above.
Test case
>>> import numpy as np
>>> import xarray as xr
>>> # Borrowed from the example in `DataArray.__doc__`.
>>> test_arr = xr.DataArray(np.arange(8).reshape(2, 4), dims=('a', 'b'))
>>> test_dset = xr.Dataset(data_vars={i: test_arr for i in range(3)})
DataArray
>>> test_arr.rolling(b=2).construct('window_dim', stride=2)
<xarray.DataArray (a: 2, b: 2, window_dim: 2)>
array([[[nan,  0.],
        [ 1.,  2.]],

       [[nan,  4.],
        [ 5.,  6.]]])
Dimensions without coordinates: a, b, window_dim
Dataset
>>> test_dset.rolling(b=2).construct('window_dim', stride=2)
<xarray.Dataset>
Dimensions:  (a: 2, b: 2, window_dim: 2)
Dimensions without coordinates: a, b, window_dim
Data variables:
    0        (a, b, window_dim) float64 nan 0.0 1.0 2.0 nan 4.0 5.0 6.0
    1        (a, b, window_dim) float64 nan 0.0 1.0 2.0 nan 4.0 5.0 6.0
    2        (a, b, window_dim) float64 nan 0.0 1.0 2.0 nan 4.0 5.0 6.0
>>> test_dset.rolling(b=2).construct('window_dim', stride=2)[0]
<xarray.DataArray 0 (a: 2, b: 2, window_dim: 2)>
array([[[nan,  0.],
        [ 1.,  2.]],

       [[nan,  4.],
        [ 5.,  6.]]])
Dimensions without coordinates: a, b, window_dim |
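For readers following along, here is a minimal sketch of why an extra strided slice would shrink the result. It assumes (not verified against the xarray source here) that each data variable has already been strided once before the Dataset-level .isel slice quoted above runs. Plain Python lists stand in for the rolled dimension, and `stride_once` is a hypothetical helper, not part of xarray.

```python
def stride_once(positions, stride):
    """Keep every `stride`-th entry, like isel(dim=slice(None, None, stride))."""
    return positions[::stride]


n, stride = 100, 4
window_starts = list(range(n))  # one window per position before striding

once = stride_once(window_starts, stride)  # stride applied a single time
twice = stride_once(once, stride)          # ASSUMPTION: stride applied a second time

print(len(once))   # 25
print(len(twice))  # 7
```

Under that assumption, a single stride keeps 25 of the 100 window positions, while striding the already strided result keeps only 7.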
What is your issue?
INSTALLED VERSIONS
commit: None
python: 3.8.10 | packaged by conda-forge | (default, May 11 2021, 07:01:05)
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-73-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.0
xarray: 2022.6.0
pandas: 1.4.2
numpy: 1.19.5
scipy: 1.7.0
netCDF4: 1.6.0
pydap: None
h5netcdf: 1.0.2
h5py: 3.1.0
Nio: None
zarr: 2.12.0
cftime: 1.6.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.4
dask: 2021.06.2
distributed: 2021.06.2
matplotlib: 3.5.3
cartopy: None
seaborn: 0.12.0
numbagg: None
fsspec: 2021.07.0
cupy: 9.2.0
pint: None
sparse: 0.13.0
flox: None
numpy_groupies: None
setuptools: 49.6.0.post20210108
pip: 21.1.3
conda: 4.10.3
pytest: 6.2.4
IPython: 7.24.1
sphinx: None
Reproducing the problem
I have an xarray Dataset with a single dimension, as specified. (Or any arbitrary xarray Dataset.)
When a rolling operation with non-overlapping windows is applied to a DataArray, it works as one would normally expect: 11058688 / 256 = 43198.
However, applying the same operation to the Dataset gives a different result.
I don't see any reason why rolling on a Dataset and on a DataArray should behave differently. Shouldn't rolling on a Dataset just repeat the DataArray rolling on every data variable? This differing behavior is not mentioned in the documentation either.
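The expected window count quoted above can be checked with a small sketch. It assumes that a stride of s keeps every s-th window position along a dimension of length n, so that non-overlapping windows (window equal to stride) yield ceil(n / s) windows. `expected_windows` is a hypothetical helper for illustration, not an xarray function.

```python
import math


def expected_windows(n, stride):
    """Number of positions kept by slice(None, None, stride) over n items."""
    return len(range(0, n, stride))


n, window = 11058688, 256
count = expected_windows(n, window)

print(count)  # 43198
# The count matches the ceiling of n / window.
assert count == math.ceil(n / window)
```

Any result for the Dataset that differs from this count along the rolled dimension would indicate the inconsistency described here.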