Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods (pydata#3936)

* DataArray.indices_min() and DataArray.indices_max() methods

These return dicts of the indices of the minimum or maximum of a
DataArray over several dimensions.

* Update whats-new.rst and api.rst with indices_min(), indices_max()

* Fix type checking in DataArray._unravel_argminmax()

* Fix expected results for TestReduce3D.test_indices_max()

* Respect global default for keep_attrs

* Merge behaviour of indices_min/indices_max into argmin/argmax

When argmin or argmax are called with a sequence for 'dim', they now
return a dict with the indices for each dimension in dim.
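
As an illustration of this merged behaviour, a minimal sketch (the data and dimension names here are arbitrary):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.array([[3, 2, 1], [1, -5, 2]]), dims=("x", "y"))

# A single dimension still returns integer indices along that dimension.
da.argmin(dim="x")          # <xarray.DataArray (y: 3)> array([1, 1, 0])

# A sequence of dimensions returns a dict of index DataArrays ...
indices = da.argmin(dim=["x", "y"])

# ... which can be passed straight to isel() to recover the minimum value.
da.isel(indices)            # <xarray.DataArray ()> array(-5)
```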

* Basic overload of argmin() and argmax() for Dataset

If a single dim is passed to Dataset.argmin() or Dataset.argmax(), pass
through to _argmin_base or _argmax_base. If a sequence is passed for dim,
raise an exception, because the result for each DataArray would be a dict,
which cannot be stored in a Dataset.
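
A minimal sketch of the resulting Dataset behaviour (assuming the exception is a ValueError; the variable and dimension names are arbitrary):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": (("x", "y"), np.array([[3, 2], [1, -5]]))})

# A single dimension is passed through and returns a Dataset of indices.
ds.argmin(dim="x")

# A sequence of dimensions would require a dict per variable, which cannot be
# stored in a Dataset, so it raises (a later commit in this PR also makes
# dim=... an error for Dataset.argmin()/argmax()).
try:
    ds.argmin(dim=["x", "y"])
except ValueError as err:
    print(err)
```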

* Update Variable and dask tests with _argmin_base, _argmax_base

The basic numpy-style argmin() and argmax() methods were renamed when
adding support for handling multiple dimensions in DataArray.argmin()
and DataArray.argmax(). Variable.argmin() and Variable.argmax() are
therefore renamed as Variable._argmin_base() and
Variable._argmax_base().

* Update api-hidden.rst with _argmin_base and _argmax_base

* Explicitly defined class methods override injected methods

If a method (such as 'argmin') has been explicitly defined on a class
(so that hasattr(cls, "argmin")==True), then do not inject that method,
as it would override the explicitly defined one. Instead inject a
private method, prefixed by "_injected_" (such as '_injected_argmin'), so
that the injected method is available to the explicitly defined one.

Do not perform the hasattr check on binary ops, because this breaks
some operations (e.g. addition between DataArray and int in
test_dask.py).
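
A minimal, illustrative sketch of the injection rule described above (not xarray's actual ops.py code, and note this approach is reverted further down in this PR in favour of removing argmin/argmax from ops.py entirely):

```python
def inject_reduce_methods(cls, methods):
    """Attach reduction methods to cls without clobbering explicit ones."""
    for name, func in methods.items():
        if hasattr(cls, name):
            # An explicitly defined method exists (e.g. argmin), so expose the
            # injected implementation under a private name it can delegate to.
            setattr(cls, f"_injected_{name}", func)
        else:
            setattr(cls, name, func)
```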

* Move StringAccessor back to bottom of DataArray class definition

* Revert use of _argmin_base and _argmax_base

Now not needed because of change to injection in ops.py.

* Move implementation of argmin, argmax from DataArray to Variable

Makes argmin and argmax more general (they are now available for Variable),
and it is straightforward for DataArray to wrap the Variable versions.

* Update tests for change to coordinates on result of argmin, argmax

* Add 'out' keyword to argmin/argmax methods - allow numpy call signature

When np.argmin(da) is called, numpy passes an 'out' keyword argument to
argmin/argmax. This argument needs to be accepted to avoid errors (but an
exception is raised if out is not None).
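
A minimal sketch of the (since-reverted) signature change described above; the parameter handling and error message are illustrative, not the actual xarray code:

```python
def argmin(self, dim=None, axis=None, out=None, keep_attrs=None, skipna=None):
    # np.argmin(da) forwards an 'out' argument; accept it in the signature but
    # reject any non-None value, since an output array is not supported here.
    if out is not None:
        raise ValueError("cannot use the 'out' argument with this argmin")
    ...  # proceed with the normal argmin implementation
```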

* Update and correct docstrings for argmin and argmax

* Correct suggested replacement for da.argmin() and da.argmax()

* Remove use of _injected_ methods in argmin/argmax

* Fix typo in name of argminmax_func

Co-Authored-By: keewis <[email protected]>

* Mark argminmax argument to _unravel_argminmax as a string

Co-Authored-By: keewis <[email protected]>

* Hidden internal methods don't need to appear in docs

* Basic docstrings for Dataset.argmin() and Dataset.argmax()

* Set stacklevel for DeprecationWarning in argmin/argmax methods

* Revert "Explicitly defined class methods override injected methods"

This reverts commit 8caf2b8.

* Revert "Add 'out' keyword to argmin/argmax methods - allow numpy call signature"

This reverts commit ab480b5.

* Remove argmin and argmax from ops.py

* Use self.reduce() in Dataset.argmin() and Dataset.argmax()

Replaces need for "_injected_argmin" and "_injected_argmax".
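
A minimal sketch of the approach described above (not the verbatim xarray code; assumes a numpy-style argmin reducer is available in duck_array_ops):

```python
from xarray.core import duck_array_ops

def argmin(self, dim=None, **kwargs):
    # Delegate to Dataset.reduce(), which applies the reducer variable by
    # variable; this removes the need for an "_injected_argmin" helper.
    if dim is None or isinstance(dim, str):
        return self.reduce(duck_array_ops.argmin, dim=dim, **kwargs)
    raise ValueError("Dataset.argmin() does not support sequences of dimensions")
```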

* Whitespace after 'title' lines in docstrings

* Remove tests of np.argmax() and np.argmin() functions from test_units.py

Applying numpy functions to xarray objects is not necessarily expected
to work, and the wrapping of argmin() and argmax() is broken by the
xarray-specific interface of the argmin() and argmax() methods of Variable,
DataArray and Dataset.

* Clearer deprecation warnings in Dataset.argmin() and Dataset.argmax()

Also, the previously suggested workaround was not correct. Remove the
suggestion, as there is no workaround (but the removed behaviour is unlikely
to be useful).

* Add unravel_index to duck_array_ops, use in Variable._unravel_argminmax
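
Conceptually, unravel_index converts the flat index of the minimum over the stacked dimensions back into one index per dimension. A plain-numpy sketch of that step (not the Variable._unravel_argminmax code itself):

```python
import numpy as np

data = np.array([[3, 2, 1], [1, -5, 2]])      # dims ("x", "y")

# Flat index of the minimum over both dimensions ...
flat_index = data.ravel().argmin()             # 4

# ... unravelled into one index per dimension.
x_idx, y_idx = np.unravel_index(flat_index, data.shape)
print(x_idx, y_idx, data[x_idx, y_idx])        # 1 1 -5
```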

* Filter argmin/argmax DeprecationWarnings in tests

* Correct test for exception for nan in test_argmax

* Remove injected argmin and argmax methods from api-hidden.rst

* flake8 fixes

* Tidy up argmin/argmax following code review

Co-authored-by: Deepak Cherian <[email protected]>

* Remove filters for warnings from argmin/argmax from tests

Pass an explicit axis or dim argument instead to avoid the warning.
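
For example, a sketch of the test change described above (the array is arbitrary):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(6).reshape(2, 3), dims=("x", "y"))

# No dim/axis: emits the DeprecationWarning about the change in default behaviour.
da.argmin()

# Passing an explicit argument avoids the warning:
da.argmin(dim="x")    # indices along one dimension
da.argmin(dim=...)    # dict of indices over all dimensions
```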

* Swap order of reduce_dims checks in Dataset.reduce()

Prefer to pass reduce_dims=None when possible, including for variables
with only one dimension. Avoids an error if an 'axis' keyword was
passed.

* revert the changes to Dataset.reduce

* use dim instead of axis

* use dimension instead of Ellipsis

* Make passing 'dim=...' to Dataset.argmin() or Dataset.argmax() an error

* Better docstrings for Dataset.argmin() and Dataset.argmax()

* Update doc/whats-new.rst

Co-authored-by: keewis <[email protected]>

Co-authored-by: Stephan Hoyer <[email protected]>
Co-authored-by: keewis <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
Co-authored-by: Keewis <[email protected]>
5 people authored Jun 29, 2020
1 parent a64cf2d commit bdcfab5
Showing 11 changed files with 1,415 additions and 44 deletions.
20 changes: 0 additions & 20 deletions doc/api-hidden.rst
@@ -41,8 +41,6 @@

core.rolling.DatasetCoarsen.all
core.rolling.DatasetCoarsen.any
core.rolling.DatasetCoarsen.argmax
core.rolling.DatasetCoarsen.argmin
core.rolling.DatasetCoarsen.count
core.rolling.DatasetCoarsen.max
core.rolling.DatasetCoarsen.mean
@@ -68,8 +66,6 @@
core.groupby.DatasetGroupBy.where
core.groupby.DatasetGroupBy.all
core.groupby.DatasetGroupBy.any
core.groupby.DatasetGroupBy.argmax
core.groupby.DatasetGroupBy.argmin
core.groupby.DatasetGroupBy.count
core.groupby.DatasetGroupBy.max
core.groupby.DatasetGroupBy.mean
@@ -85,8 +81,6 @@
core.resample.DatasetResample.all
core.resample.DatasetResample.any
core.resample.DatasetResample.apply
core.resample.DatasetResample.argmax
core.resample.DatasetResample.argmin
core.resample.DatasetResample.assign
core.resample.DatasetResample.assign_coords
core.resample.DatasetResample.bfill
@@ -110,8 +104,6 @@
core.resample.DatasetResample.dims
core.resample.DatasetResample.groups

core.rolling.DatasetRolling.argmax
core.rolling.DatasetRolling.argmin
core.rolling.DatasetRolling.count
core.rolling.DatasetRolling.max
core.rolling.DatasetRolling.mean
@@ -185,8 +177,6 @@

core.rolling.DataArrayCoarsen.all
core.rolling.DataArrayCoarsen.any
core.rolling.DataArrayCoarsen.argmax
core.rolling.DataArrayCoarsen.argmin
core.rolling.DataArrayCoarsen.count
core.rolling.DataArrayCoarsen.max
core.rolling.DataArrayCoarsen.mean
@@ -211,8 +201,6 @@
core.groupby.DataArrayGroupBy.where
core.groupby.DataArrayGroupBy.all
core.groupby.DataArrayGroupBy.any
core.groupby.DataArrayGroupBy.argmax
core.groupby.DataArrayGroupBy.argmin
core.groupby.DataArrayGroupBy.count
core.groupby.DataArrayGroupBy.max
core.groupby.DataArrayGroupBy.mean
@@ -228,8 +216,6 @@
core.resample.DataArrayResample.all
core.resample.DataArrayResample.any
core.resample.DataArrayResample.apply
core.resample.DataArrayResample.argmax
core.resample.DataArrayResample.argmin
core.resample.DataArrayResample.assign_coords
core.resample.DataArrayResample.bfill
core.resample.DataArrayResample.count
@@ -252,8 +238,6 @@
core.resample.DataArrayResample.dims
core.resample.DataArrayResample.groups

core.rolling.DataArrayRolling.argmax
core.rolling.DataArrayRolling.argmin
core.rolling.DataArrayRolling.count
core.rolling.DataArrayRolling.max
core.rolling.DataArrayRolling.mean
@@ -423,8 +407,6 @@

IndexVariable.all
IndexVariable.any
IndexVariable.argmax
IndexVariable.argmin
IndexVariable.argsort
IndexVariable.astype
IndexVariable.broadcast_equals
@@ -564,8 +546,6 @@
CFTimeIndex.all
CFTimeIndex.any
CFTimeIndex.append
CFTimeIndex.argmax
CFTimeIndex.argmin
CFTimeIndex.argsort
CFTimeIndex.asof
CFTimeIndex.asof_locs
7 changes: 7 additions & 0 deletions doc/whats-new.rst
@@ -54,6 +54,13 @@ Enhancements

New Features
~~~~~~~~~~~~
- :py:meth:`DataArray.argmin` and :py:meth:`DataArray.argmax` now support
sequences of 'dim' arguments; if a sequence is passed, they return a dict of
the indices of the minimum or maximum for each dimension, which can be passed
to :py:meth:`isel` to select the minimum or maximum value.
(:pull:`3936`)
By `John Omotani <https://github.com/johnomotani>`_, thanks to `Keisuke Fujii
<https://github.com/fujiisoup>`_ for work in :pull:`1469`.
- Added :py:meth:`xarray.infer_freq` for extending frequency inferring to CFTime indexes and data (:pull:`4033`).
By `Pascal Bourgault <https://github.com/aulemahal>`_.
- ``chunks='auto'`` is now supported in the ``chunks`` argument of
203 changes: 203 additions & 0 deletions xarray/core/dataarray.py
@@ -3819,6 +3819,209 @@ def idxmax(
keep_attrs=keep_attrs,
)

def argmin(
self,
dim: Union[Hashable, Sequence[Hashable]] = None,
axis: int = None,
keep_attrs: bool = None,
skipna: bool = None,
) -> Union["DataArray", Dict[Hashable, "DataArray"]]:
"""Index or indices of the minimum of the DataArray over one or more dimensions.
If a sequence is passed to 'dim', then result returned as dict of DataArrays,
which can be passed directly to isel(). If a single str is passed to 'dim' then
returns a DataArray with dtype int.
If there are multiple minima, the indices of the first one found will be
returned.
Parameters
----------
dim : hashable, sequence of hashable or ..., optional
The dimensions over which to find the minimum. By default, finds minimum over
all dimensions - for now returning an int for backward compatibility, but
this is deprecated, in future will return a dict with indices for all
dimensions; to return a dict with all dimensions now, pass '...'.
axis : int, optional
Axis over which to apply `argmin`. Only one of the 'dim' and 'axis' arguments
can be supplied.
keep_attrs : bool, optional
If True, the attributes (`attrs`) will be copied from the original
object to the new one. If False (default), the new object will be
returned without attributes.
skipna : bool, optional
If True, skip missing values (as marked by NaN). By default, only
skips missing values for float dtypes; other dtypes either do not
have a sentinel missing value (int) or skipna=True has not been
implemented (object, datetime64 or timedelta64).
Returns
-------
result : DataArray or dict of DataArray
See also
--------
Variable.argmin, DataArray.idxmin
Examples
--------
>>> array = xr.DataArray([0, 2, -1, 3], dims="x")
>>> array.min()
<xarray.DataArray ()>
array(-1)
>>> array.argmin()
<xarray.DataArray ()>
array(2)
>>> array.argmin(...)
{'x': <xarray.DataArray ()>
array(2)}
>>> array.isel(array.argmin(...))
<xarray.DataArray ()>
array(-1)
>>> array = xr.DataArray([[[3, 2, 1], [3, 1, 2], [2, 1, 3]],
... [[1, 3, 2], [2, -5, 1], [2, 3, 1]]],
... dims=("x", "y", "z"))
>>> array.min(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[ 1, 2, 1],
[ 2, -5, 1],
[ 2, 1, 1]])
Dimensions without coordinates: y, z
>>> array.argmin(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[1, 0, 0],
[1, 1, 1],
[0, 0, 1]])
Dimensions without coordinates: y, z
>>> array.argmin(dim=["x"])
{'x': <xarray.DataArray (y: 3, z: 3)>
array([[1, 0, 0],
[1, 1, 1],
[0, 0, 1]])
Dimensions without coordinates: y, z}
>>> array.min(dim=("x", "z"))
<xarray.DataArray (y: 3)>
array([ 1, -5, 1])
Dimensions without coordinates: y
>>> array.argmin(dim=["x", "z"])
{'x': <xarray.DataArray (y: 3)>
array([0, 1, 0])
Dimensions without coordinates: y, 'z': <xarray.DataArray (y: 3)>
array([2, 1, 1])
Dimensions without coordinates: y}
>>> array.isel(array.argmin(dim=["x", "z"]))
<xarray.DataArray (y: 3)>
array([ 1, -5, 1])
Dimensions without coordinates: y
"""
result = self.variable.argmin(dim, axis, keep_attrs, skipna)
if isinstance(result, dict):
return {k: self._replace_maybe_drop_dims(v) for k, v in result.items()}
else:
return self._replace_maybe_drop_dims(result)

def argmax(
self,
dim: Union[Hashable, Sequence[Hashable]] = None,
axis: int = None,
keep_attrs: bool = None,
skipna: bool = None,
) -> Union["DataArray", Dict[Hashable, "DataArray"]]:
"""Index or indices of the maximum of the DataArray over one or more dimensions.
If a sequence is passed to 'dim', then result returned as dict of DataArrays,
which can be passed directly to isel(). If a single str is passed to 'dim' then
returns a DataArray with dtype int.
If there are multiple maxima, the indices of the first one found will be
returned.
Parameters
----------
dim : hashable, sequence of hashable or ..., optional
The dimensions over which to find the maximum. By default, finds maximum over
all dimensions - for now returning an int for backward compatibility, but
this is deprecated, in future will return a dict with indices for all
dimensions; to return a dict with all dimensions now, pass '...'.
axis : int, optional
Axis over which to apply `argmax`. Only one of the 'dim' and 'axis' arguments
can be supplied.
keep_attrs : bool, optional
If True, the attributes (`attrs`) will be copied from the original
object to the new one. If False (default), the new object will be
returned without attributes.
skipna : bool, optional
If True, skip missing values (as marked by NaN). By default, only
skips missing values for float dtypes; other dtypes either do not
have a sentinel missing value (int) or skipna=True has not been
implemented (object, datetime64 or timedelta64).
Returns
-------
result : DataArray or dict of DataArray
See also
--------
Variable.argmax, DataArray.idxmax
Examples
--------
>>> array = xr.DataArray([0, 2, -1, 3], dims="x")
>>> array.max()
<xarray.DataArray ()>
array(3)
>>> array.argmax()
<xarray.DataArray ()>
array(3)
>>> array.argmax(...)
{'x': <xarray.DataArray ()>
array(3)}
>>> array.isel(array.argmax(...))
<xarray.DataArray ()>
array(3)
>>> array = xr.DataArray([[[3, 2, 1], [3, 1, 2], [2, 1, 3]],
... [[1, 3, 2], [2, 5, 1], [2, 3, 1]]],
... dims=("x", "y", "z"))
>>> array.max(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[3, 3, 2],
[3, 5, 2],
[2, 3, 3]])
Dimensions without coordinates: y, z
>>> array.argmax(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[0, 1, 1],
[0, 1, 0],
[0, 1, 0]])
Dimensions without coordinates: y, z
>>> array.argmax(dim=["x"])
{'x': <xarray.DataArray (y: 3, z: 3)>
array([[0, 1, 1],
[0, 1, 0],
[0, 1, 0]])
Dimensions without coordinates: y, z}
>>> array.max(dim=("x", "z"))
<xarray.DataArray (y: 3)>
array([3, 5, 3])
Dimensions without coordinates: y
>>> array.argmax(dim=["x", "z"])
{'x': <xarray.DataArray (y: 3)>
array([0, 1, 0])
Dimensions without coordinates: y, 'z': <xarray.DataArray (y: 3)>
array([0, 1, 2])
Dimensions without coordinates: y}
>>> array.isel(array.argmax(dim=["x", "z"]))
<xarray.DataArray (y: 3)>
array([3, 5, 3])
Dimensions without coordinates: y
"""
result = self.variable.argmax(dim, axis, keep_attrs, skipna)
if isinstance(result, dict):
return {k: self._replace_maybe_drop_dims(v) for k, v in result.items()}
else:
return self._replace_maybe_drop_dims(result)

# this needs to be at the end, or mypy will confuse with `str`
# https://mypy.readthedocs.io/en/latest/common_issues.html#dealing-with-conflicting-names
str = utils.UncachedAccessor(StringAccessor)