Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods (pydata#3936)

* DataArray.indices_min() and DataArray.indices_max() methods

These return dicts of the indices of the minimum or maximum of a
DataArray over several dimensions.

* Update whats-new.rst and api.rst with indices_min(), indices_max()

* Fix type checking in DataArray._unravel_argminmax()

* Fix expected results for TestReduce3D.test_indices_max()

* Respect global default for keep_attrs

* Merge behaviour of indices_min/indices_max into argmin/argmax

When argmin or argmax are called with a sequence for 'dim', they now
return a dict with the indices for each dimension in dim.
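
As an illustration of this merged behaviour, a minimal sketch (the data and dimension names here are arbitrary):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.array([[3, 2, 1], [1, -5, 2]]), dims=("x", "y"))

# A single dimension still returns integer indices along that dimension.
da.argmin(dim="x")          # <xarray.DataArray (y: 3)> array([1, 1, 0])

# A sequence of dimensions returns a dict of index DataArrays ...
indices = da.argmin(dim=["x", "y"])

# ... which can be passed straight to isel() to recover the minimum value.
da.isel(indices)            # <xarray.DataArray ()> array(-5)
```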

* Basic overload of argmin() and argmax() for Dataset

If a single dim is passed to Dataset.argmin() or Dataset.argmax(), pass
through to _argmin_base or _argmax_base. If a sequence is passed for dim,
raise an exception, because the result for each DataArray would be a dict,
which cannot be stored in a Dataset.
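
A minimal sketch of the resulting Dataset behaviour (assuming the exception is a ValueError; the variable and dimension names are arbitrary):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": (("x", "y"), np.array([[3, 2], [1, -5]]))})

# A single dimension is passed through and returns a Dataset of indices.
ds.argmin(dim="x")

# A sequence of dimensions would require a dict per variable, which cannot be
# stored in a Dataset, so it raises (a later commit in this PR also makes
# dim=... an error for Dataset.argmin()/argmax()).
try:
    ds.argmin(dim=["x", "y"])
except ValueError as err:
    print(err)
```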

* Update Variable and dask tests with _argmin_base, _argmax_base

The basic numpy-style argmin() and argmax() methods were renamed when
adding support for handling multiple dimensions in DataArray.argmin()
and DataArray.argmax(). Variable.argmin() and Variable.argmax() are
therefore renamed as Variable._argmin_base() and
Variable._argmax_base().

* Update api-hidden.rst with _argmin_base and _argmax_base

* Explicitly defined class methods override injected methods

If a method (such as 'argmin') has been explicitly defined on a class
(so that hasattr(cls, "argmin")==True), then do not inject that method,
as it would override the explicitly defined one. Instead inject a
private method, prefixed by "_injected_" (such as '_injected_argmin'), so
that the injected method is available to the explicitly defined one.

Do not perform the hasattr check on binary ops, because this breaks
some operations (e.g. addition between DataArray and int in
test_dask.py).
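
A minimal, illustrative sketch of the injection rule described above (not xarray's actual ops.py code, and note this approach is reverted further down in this PR in favour of removing argmin/argmax from ops.py entirely):

```python
def inject_reduce_methods(cls, methods):
    """Attach reduction methods to cls without clobbering explicit ones."""
    for name, func in methods.items():
        if hasattr(cls, name):
            # An explicitly defined method exists (e.g. argmin), so expose the
            # injected implementation under a private name it can delegate to.
            setattr(cls, f"_injected_{name}", func)
        else:
            setattr(cls, name, func)
```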

* Move StringAccessor back to bottom of DataArray class definition

* Revert use of _argmin_base and _argmax_base

Now not needed because of change to injection in ops.py.

* Move implementation of argmin, argmax from DataArray to Variable

Makes argmin and argmax more general (they are now available for Variable),
and it is straightforward for DataArray to wrap the Variable versions.

* Update tests for change to coordinates on result of argmin, argmax

* Add 'out' keyword to argmin/argmax methods - allow numpy call signature

When np.argmin(da) is called, numpy passes an 'out' keyword argument to
argmin/argmax. This argument needs to be accepted to avoid errors (but an
exception is raised if out is not None).
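
A minimal sketch of the (since-reverted) signature change described above; the parameter handling and error message are illustrative, not the actual xarray code:

```python
def argmin(self, dim=None, axis=None, out=None, keep_attrs=None, skipna=None):
    # np.argmin(da) forwards an 'out' argument; accept it in the signature but
    # reject any non-None value, since an output array is not supported here.
    if out is not None:
        raise ValueError("cannot use the 'out' argument with this argmin")
    ...  # proceed with the normal argmin implementation
```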

* Update and correct docstrings for argmin and argmax

* Correct suggested replacement for da.argmin() and da.argmax()

* Remove use of _injected_ methods in argmin/argmax

* Fix typo in name of argminmax_func

Co-Authored-By: keewis <[email protected]>

* Mark argminmax argument to _unravel_argminmax as a string

Co-Authored-By: keewis <[email protected]>

* Hidden internal methods don't need to appear in docs

* Basic docstrings for Dataset.argmin() and Dataset.argmax()

* Set stacklevel for DeprecationWarning in argmin/argmax methods

* Revert "Explicitly defined class methods override injected methods"

This reverts commit 8caf2b8.

* Revert "Add 'out' keyword to argmin/argmax methods - allow numpy call signature"

This reverts commit ab480b5.

* Remove argmin and argmax from ops.py

* Use self.reduce() in Dataset.argmin() and Dataset.argmax()

Replaces need for "_injected_argmin" and "_injected_argmax".
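
A minimal sketch of the approach described above (not the verbatim xarray code; assumes a numpy-style argmin reducer is available in duck_array_ops):

```python
from xarray.core import duck_array_ops

def argmin(self, dim=None, **kwargs):
    # Delegate to Dataset.reduce(), which applies the reducer variable by
    # variable; this removes the need for an "_injected_argmin" helper.
    if dim is None or isinstance(dim, str):
        return self.reduce(duck_array_ops.argmin, dim=dim, **kwargs)
    raise ValueError("Dataset.argmin() does not support sequences of dimensions")
```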

* Whitespace after 'title' lines in docstrings

* Remove tests of np.argmax() and np.argmin() functions from test_units.py

Applying numpy functions to xarray objects is not necessarily expected
to work, and the wrapping of argmin() and argmax() is broken by the
xarray-specific interface of the argmin() and argmax() methods of Variable,
DataArray and Dataset.

* Clearer deprecation warnings in Dataset.argmin() and Dataset.argmax()

Also, the previously suggested workaround was not correct. Remove the
suggestion, as there is no workaround (but the removed behaviour is unlikely
to be useful).

* Add unravel_index to duck_array_ops, use in Variable._unravel_argminmax
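
Conceptually, unravel_index converts the flat index of the minimum over the stacked dimensions back into one index per dimension. A plain-numpy sketch of that step (not the Variable._unravel_argminmax code itself):

```python
import numpy as np

data = np.array([[3, 2, 1], [1, -5, 2]])      # dims ("x", "y")

# Flat index of the minimum over both dimensions ...
flat_index = data.ravel().argmin()             # 4

# ... unravelled into one index per dimension.
x_idx, y_idx = np.unravel_index(flat_index, data.shape)
print(x_idx, y_idx, data[x_idx, y_idx])        # 1 1 -5
```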

* Filter argmin/argmax DeprecationWarnings in tests

* Correct test for exception for nan in test_argmax

* Remove injected argmin and argmax methods from api-hidden.rst

* flake8 fixes

* Tidy up argmin/argmax following code review

Co-authored-by: Deepak Cherian <[email protected]>

* Remove filters for warnings from argmin/argmax from tests

Pass an explicit axis or dim argument instead to avoid the warning.
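
For example, a sketch of the test change described above (the array is arbitrary):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(6).reshape(2, 3), dims=("x", "y"))

# No dim/axis: emits the DeprecationWarning about the change in default behaviour.
da.argmin()

# Passing an explicit argument avoids the warning:
da.argmin(dim="x")    # indices along one dimension
da.argmin(dim=...)    # dict of indices over all dimensions
```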

* Swap order of reduce_dims checks in Dataset.reduce()

Prefer to pass reduce_dims=None when possible, including for variables
with only one dimension. Avoids an error if an 'axis' keyword was
passed.

* revert the changes to Dataset.reduce

* use dim instead of axis

* use dimension instead of Ellipsis

* Make passing 'dim=...' to Dataset.argmin() or Dataset.argmax() an error

* Better docstrings for Dataset.argmin() and Dataset.argmax()

* Update doc/whats-new.rst

Co-authored-by: keewis <[email protected]>

Co-authored-by: Stephan Hoyer <[email protected]>
Co-authored-by: keewis <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
Co-authored-by: Keewis <[email protected]>
5 people authored Jun 29, 2020
1 parent a64cf2d commit bdcfab5
Showing 11 changed files with 1,415 additions and 44 deletions.
20 changes: 0 additions & 20 deletions doc/api-hidden.rst
@@ -41,8 +41,6 @@

core.rolling.DatasetCoarsen.all
core.rolling.DatasetCoarsen.any
core.rolling.DatasetCoarsen.argmax
core.rolling.DatasetCoarsen.argmin
core.rolling.DatasetCoarsen.count
core.rolling.DatasetCoarsen.max
core.rolling.DatasetCoarsen.mean
@@ -68,8 +66,6 @@
core.groupby.DatasetGroupBy.where
core.groupby.DatasetGroupBy.all
core.groupby.DatasetGroupBy.any
core.groupby.DatasetGroupBy.argmax
core.groupby.DatasetGroupBy.argmin
core.groupby.DatasetGroupBy.count
core.groupby.DatasetGroupBy.max
core.groupby.DatasetGroupBy.mean
@@ -85,8 +81,6 @@
core.resample.DatasetResample.all
core.resample.DatasetResample.any
core.resample.DatasetResample.apply
core.resample.DatasetResample.argmax
core.resample.DatasetResample.argmin
core.resample.DatasetResample.assign
core.resample.DatasetResample.assign_coords
core.resample.DatasetResample.bfill
@@ -110,8 +104,6 @@
core.resample.DatasetResample.dims
core.resample.DatasetResample.groups

core.rolling.DatasetRolling.argmax
core.rolling.DatasetRolling.argmin
core.rolling.DatasetRolling.count
core.rolling.DatasetRolling.max
core.rolling.DatasetRolling.mean
@@ -185,8 +177,6 @@

core.rolling.DataArrayCoarsen.all
core.rolling.DataArrayCoarsen.any
core.rolling.DataArrayCoarsen.argmax
core.rolling.DataArrayCoarsen.argmin
core.rolling.DataArrayCoarsen.count
core.rolling.DataArrayCoarsen.max
core.rolling.DataArrayCoarsen.mean
@@ -211,8 +201,6 @@
core.groupby.DataArrayGroupBy.where
core.groupby.DataArrayGroupBy.all
core.groupby.DataArrayGroupBy.any
core.groupby.DataArrayGroupBy.argmax
core.groupby.DataArrayGroupBy.argmin
core.groupby.DataArrayGroupBy.count
core.groupby.DataArrayGroupBy.max
core.groupby.DataArrayGroupBy.mean
@@ -228,8 +216,6 @@
core.resample.DataArrayResample.all
core.resample.DataArrayResample.any
core.resample.DataArrayResample.apply
core.resample.DataArrayResample.argmax
core.resample.DataArrayResample.argmin
core.resample.DataArrayResample.assign_coords
core.resample.DataArrayResample.bfill
core.resample.DataArrayResample.count
@@ -252,8 +238,6 @@
core.resample.DataArrayResample.dims
core.resample.DataArrayResample.groups

core.rolling.DataArrayRolling.argmax
core.rolling.DataArrayRolling.argmin
core.rolling.DataArrayRolling.count
core.rolling.DataArrayRolling.max
core.rolling.DataArrayRolling.mean
@@ -423,8 +407,6 @@

IndexVariable.all
IndexVariable.any
IndexVariable.argmax
IndexVariable.argmin
IndexVariable.argsort
IndexVariable.astype
IndexVariable.broadcast_equals
@@ -564,8 +546,6 @@
CFTimeIndex.all
CFTimeIndex.any
CFTimeIndex.append
CFTimeIndex.argmax
CFTimeIndex.argmin
CFTimeIndex.argsort
CFTimeIndex.asof
CFTimeIndex.asof_locs
7 changes: 7 additions & 0 deletions doc/whats-new.rst
@@ -54,6 +54,13 @@ Enhancements

New Features
~~~~~~~~~~~~
- :py:meth:`DataArray.argmin` and :py:meth:`DataArray.argmax` now support
sequences of 'dim' arguments; if a sequence is passed, they return a dict of
the indices of the minimum or maximum for each dimension, which can be passed
to :py:meth:`isel` to select the minimum or maximum value.
(:pull:`3936`)
By `John Omotani <https://github.com/johnomotani>`_, thanks to `Keisuke Fujii
<https://github.com/fujiisoup>`_ for work in :pull:`1469`.
- Added :py:meth:`xarray.infer_freq` for extending frequency inferring to CFTime indexes and data (:pull:`4033`).
By `Pascal Bourgault <https://github.com/aulemahal>`_.
- ``chunks='auto'`` is now supported in the ``chunks`` argument of
203 changes: 203 additions & 0 deletions xarray/core/dataarray.py
@@ -3819,6 +3819,209 @@ def idxmax(
keep_attrs=keep_attrs,
)

def argmin(
self,
dim: Union[Hashable, Sequence[Hashable]] = None,
axis: int = None,
keep_attrs: bool = None,
skipna: bool = None,
) -> Union["DataArray", Dict[Hashable, "DataArray"]]:
"""Index or indices of the minimum of the DataArray over one or more dimensions.
If a sequence is passed to 'dim', then result returned as dict of DataArrays,
which can be passed directly to isel(). If a single str is passed to 'dim' then
returns a DataArray with dtype int.
If there are multiple minima, the indices of the first one found will be
returned.
Parameters
----------
dim : hashable, sequence of hashable or ..., optional
The dimensions over which to find the minimum. By default, finds minimum over
all dimensions - for now returning an int for backward compatibility, but
this is deprecated, in future will return a dict with indices for all
dimensions; to return a dict with all dimensions now, pass '...'.
axis : int, optional
Axis over which to apply `argmin`. Only one of the 'dim' and 'axis' arguments
can be supplied.
keep_attrs : bool, optional
If True, the attributes (`attrs`) will be copied from the original
object to the new one. If False (default), the new object will be
returned without attributes.
skipna : bool, optional
If True, skip missing values (as marked by NaN). By default, only
skips missing values for float dtypes; other dtypes either do not
have a sentinel missing value (int) or skipna=True has not been
implemented (object, datetime64 or timedelta64).
Returns
-------
result : DataArray or dict of DataArray
See also
--------
Variable.argmin, DataArray.idxmin
Examples
--------
>>> array = xr.DataArray([0, 2, -1, 3], dims="x")
>>> array.min()
<xarray.DataArray ()>
array(-1)
>>> array.argmin()
<xarray.DataArray ()>
array(2)
>>> array.argmin(...)
{'x': <xarray.DataArray ()>
array(2)}
>>> array.isel(array.argmin(...))
<xarray.DataArray ()>
array(-1)
>>> array = xr.DataArray([[[3, 2, 1], [3, 1, 2], [2, 1, 3]],
... [[1, 3, 2], [2, -5, 1], [2, 3, 1]]],
... dims=("x", "y", "z"))
>>> array.min(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[ 1, 2, 1],
[ 2, -5, 1],
[ 2, 1, 1]])
Dimensions without coordinates: y, z
>>> array.argmin(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[1, 0, 0],
[1, 1, 1],
[0, 0, 1]])
Dimensions without coordinates: y, z
>>> array.argmin(dim=["x"])
{'x': <xarray.DataArray (y: 3, z: 3)>
array([[1, 0, 0],
[1, 1, 1],
[0, 0, 1]])
Dimensions without coordinates: y, z}
>>> array.min(dim=("x", "z"))
<xarray.DataArray (y: 3)>
array([ 1, -5, 1])
Dimensions without coordinates: y
>>> array.argmin(dim=["x", "z"])
{'x': <xarray.DataArray (y: 3)>
array([0, 1, 0])
Dimensions without coordinates: y, 'z': <xarray.DataArray (y: 3)>
array([2, 1, 1])
Dimensions without coordinates: y}
>>> array.isel(array.argmin(dim=["x", "z"]))
<xarray.DataArray (y: 3)>
array([ 1, -5, 1])
Dimensions without coordinates: y
"""
result = self.variable.argmin(dim, axis, keep_attrs, skipna)
if isinstance(result, dict):
return {k: self._replace_maybe_drop_dims(v) for k, v in result.items()}
else:
return self._replace_maybe_drop_dims(result)

def argmax(
self,
dim: Union[Hashable, Sequence[Hashable]] = None,
axis: int = None,
keep_attrs: bool = None,
skipna: bool = None,
) -> Union["DataArray", Dict[Hashable, "DataArray"]]:
"""Index or indices of the maximum of the DataArray over one or more dimensions.
If a sequence is passed to 'dim', then result returned as dict of DataArrays,
which can be passed directly to isel(). If a single str is passed to 'dim' then
returns a DataArray with dtype int.
If there are multiple maxima, the indices of the first one found will be
returned.
Parameters
----------
dim : hashable, sequence of hashable or ..., optional
The dimensions over which to find the maximum. By default, finds maximum over
all dimensions - for now returning an int for backward compatibility, but
this is deprecated, in future will return a dict with indices for all
dimensions; to return a dict with all dimensions now, pass '...'.
axis : int, optional
Axis over which to apply `argmax`. Only one of the 'dim' and 'axis' arguments
can be supplied.
keep_attrs : bool, optional
If True, the attributes (`attrs`) will be copied from the original
object to the new one. If False (default), the new object will be
returned without attributes.
skipna : bool, optional
If True, skip missing values (as marked by NaN). By default, only
skips missing values for float dtypes; other dtypes either do not
have a sentinel missing value (int) or skipna=True has not been
implemented (object, datetime64 or timedelta64).
Returns
-------
result : DataArray or dict of DataArray
See also
--------
Variable.argmax, DataArray.idxmax
Examples
--------
>>> array = xr.DataArray([0, 2, -1, 3], dims="x")
>>> array.max()
<xarray.DataArray ()>
array(3)
>>> array.argmax()
<xarray.DataArray ()>
array(3)
>>> array.argmax(...)
{'x': <xarray.DataArray ()>
array(3)}
>>> array.isel(array.argmax(...))
<xarray.DataArray ()>
array(3)
>>> array = xr.DataArray([[[3, 2, 1], [3, 1, 2], [2, 1, 3]],
... [[1, 3, 2], [2, 5, 1], [2, 3, 1]]],
... dims=("x", "y", "z"))
>>> array.max(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[3, 3, 2],
[3, 5, 2],
[2, 3, 3]])
Dimensions without coordinates: y, z
>>> array.argmax(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[0, 1, 1],
[0, 1, 0],
[0, 1, 0]])
Dimensions without coordinates: y, z
>>> array.argmax(dim=["x"])
{'x': <xarray.DataArray (y: 3, z: 3)>
array([[0, 1, 1],
[0, 1, 0],
[0, 1, 0]])
Dimensions without coordinates: y, z}
>>> array.max(dim=("x", "z"))
<xarray.DataArray (y: 3)>
array([3, 5, 3])
Dimensions without coordinates: y
>>> array.argmax(dim=["x", "z"])
{'x': <xarray.DataArray (y: 3)>
array([0, 1, 0])
Dimensions without coordinates: y, 'z': <xarray.DataArray (y: 3)>
array([0, 1, 2])
Dimensions without coordinates: y}
>>> array.isel(array.argmax(dim=["x", "z"]))
<xarray.DataArray (y: 3)>
array([3, 5, 3])
Dimensions without coordinates: y
"""
result = self.variable.argmax(dim, axis, keep_attrs, skipna)
if isinstance(result, dict):
return {k: self._replace_maybe_drop_dims(v) for k, v in result.items()}
else:
return self._replace_maybe_drop_dims(result)

# this needs to be at the end, or mypy will confuse with `str`
# https://mypy.readthedocs.io/en/latest/common_issues.html#dealing-with-conflicting-names
str = utils.UncachedAccessor(StringAccessor)