The reduction APIs for groupby, groupby_bins, resample, and rolling differ from one another, especially for multi-dimensional arrays.
import numpy as np
import xarray as xr
import pandas as pd

time = pd.date_range('2000-01-01', freq='6H', periods=365*4)
ds = xr.Dataset({'foo': (('time', 'x'), np.random.randn(365*4, 5)), 'time': time,
                 'x': [0, 1, 2, 1, 0]})

ds.rolling(time=2).mean()  # result dims : ('time', 'x')
ds.resample(time='M').mean()  # result dims : ('time', 'x')
ds['foo'].resample(time='M').mean()  # result dims : ('time', ) maybe a bug #2362
ds.groupby('time.month').mean()  # result dims : ('month', )
ds.groupby_bins('time', 3).mean()  # result dims : ('time_bins', )
In rolling and resample (for Dataset), a reduction without arguments is carried out along the grouped dimension.
In rolling, reduction along any other dimension is not possible.
In groupby and groupby_bins, the reduction is applied to each grouped object; without arguments, it reduces along all the dimensions of each grouped object.
I think rolling's API is the cleanest, but I am not sure it is worth changing these APIs.
The possible options would be:
Change the APIs of groupby and groupby_bins so that they share a similar API with rolling.
Document clearly how to perform resample or groupby with multidimensional arrays.
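For the documentation option, a minimal sketch of what such an example might look like is given below. It passes dim explicitly so the result does not depend on the default behavior discussed in this issue. The dataset is a simplified stand-in for the one above (daily timestamps and unique x values are assumptions made for illustration).

```python
import numpy as np
import pandas as pd
import xarray as xr

# Illustrative dataset: two years of daily data on a small 'x' axis.
time = pd.date_range('2000-01-01', freq='D', periods=730)
ds = xr.Dataset(
    {'foo': (('time', 'x'), np.random.randn(730, 5))},
    coords={'time': time, 'x': np.arange(5)},
)

# Reduce only along the grouped dimension: 'x' is kept in the result.
monthly = ds.groupby('time.month').mean(dim='time')

# Collapse 'x' in a separate, explicit step when a per-month scalar is wanted.
monthly_all = monthly.mean(dim='x')

print(monthly['foo'].dims)
print(monthly_all['foo'].dims)
```

Naming the dimension in every reduction makes the intent visible at the call site, which is helpful when the same code is later applied to arrays with more dimensions.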
This does seem to be a little inconsistent currently.
My original reasoning for the default groupby behavior was that it felt more consistent with the behavior of non-grouped reductions, which reduce across all dimensions.
But it's probably less useful, and results in a lot of redundant code. I can only think of a few times when I've actually wanted this behavior, rather than summing over only the grouped dimension. Especially when going from 1D -> ND, this is a likely source of errors.
So instead, we could change this to:
ds.groupby('time.month').mean() # result dims : ('month', 'x')
ds.groupby('time.month').mean(dim=None) # result dims : ('month',)
Or maybe we could add a special constant xarray.ALL_DIMS to indicate all dimensions? This is probably the most readable version:
ds.groupby('time.month').mean(dim=xarray.ALL_DIMS) # result dims : ('month',)
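To make the sentinel idea concrete, here is a rough sketch in plain Python of how such a constant could be told apart from an omitted argument. This is not xarray's implementation: the `_AllDims` class and the `resolve_reduce_dims` helper are hypothetical names invented for illustration, and the sketch assumes that an omitted dim (None) means "only the grouped dimension".

```python
class _AllDims:
    """Hypothetical sentinel meaning 'reduce over every dimension'."""

    def __repr__(self):
        return 'ALL_DIMS'


# Single shared instance, compared by identity (assumed design, not xarray's).
ALL_DIMS = _AllDims()


def resolve_reduce_dims(dim, grouped_dim, group_dims):
    """Return the tuple of dimensions to reduce over for one group.

    dim=None      -> only the grouped dimension (assumed default)
    dim=ALL_DIMS  -> every dimension of the grouped object
    dim='x'       -> just the named dimension
    dim=['a','b'] -> the listed dimensions
    """
    if dim is None:
        return (grouped_dim,)
    if dim is ALL_DIMS:
        return tuple(group_dims)
    if isinstance(dim, str):
        return (dim,)
    return tuple(dim)


print(resolve_reduce_dims(None, 'time', ('time', 'x')))
print(resolve_reduce_dims(ALL_DIMS, 'time', ('time', 'x')))
```

A dedicated sentinel keeps None free to mean "use the default", so the two intents cannot be confused at the call site.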
From #2356