Squash merge pydata#5950
Squashed commit of the following:

commit 6916fa7
Author: Deepak Cherian <[email protected]>
Date:   Mon Nov 22 11:16:43 2021 -0700

    Update xarray/util/generate_reductions.py

    Co-authored-by: Illviljan <[email protected]>

commit cd8a898
Author: dcherian <[email protected]>
Date:   Sat Nov 20 14:37:17 2021 -0700

    add doctests

commit 19d82cd
Author: Illviljan <[email protected]>
Date:   Sat Nov 20 22:00:29 2021 +0100

    more reduce

commit 0f94bec
Author: Illviljan <[email protected]>
Date:   Sat Nov 20 20:48:27 2021 +0100

    another reduce

commit be33560
Author: Illviljan <[email protected]>
Date:   Sat Nov 20 20:28:39 2021 +0100

    one more reduce

commit 3d854e5
Author: Illviljan <[email protected]>
Date:   Sat Nov 20 20:21:26 2021 +0100

    more reduce edits

commit 2bbddaf
Author: Illviljan <[email protected]>
Date:   Sat Nov 20 20:12:31 2021 +0100

    make reduce args consistent

commit dfbe103
Merge: f03b675 dd28a57
Author: Illviljan <[email protected]>
Date:   Sat Nov 20 19:01:59 2021 +0100

    Merge branch 'generate-reductions-class' of https://github.com/dcherian/xarray into pr/5950

commit f03b675
Merge: 411d75d 7a201de
Author: Illviljan <[email protected]>
Date:   Sat Nov 20 19:01:42 2021 +0100

    Merge branch 'main' into pr/5950

commit dd28a57
Author: dcherian <[email protected]>
Date:   Sat Nov 20 10:57:22 2021 -0700

    updates

commit 6a9a124
Author: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Date:   Sat Nov 20 17:02:07 2021 +0000

    [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

commit 411d75d
Author: Illviljan <[email protected]>
Date:   Sat Nov 20 18:00:08 2021 +0100

    Now get normal code running as well

    Protocols are not needed anymore when subclassing/defining directly in the class.

    When adding a dummy method in DatasetResampleReductions, the order of subclassing had to be changed so that the correct reduce was used.
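
The subclass-ordering point can be sketched with plain Python MRO rules. The class names below echo the commit message, but the bodies are illustrative stand-ins, not xarray's actual code:

```python
class DatasetReductions:
    """Stand-in for the generated reductions mixin."""

    def reduce(self, func):
        return f"generated reduce: {func}"


class Resample:
    """Stand-in for a resample base class with its own reduce."""

    def reduce(self, func):
        return f"resample reduce: {func}"


# With the reductions mixin listed first, Python's MRO resolves
# ``reduce`` to the generated implementation; swapping the bases
# would pick Resample's version instead.
class DatasetResampleReductions(DatasetReductions, Resample):
    pass


print(DatasetResampleReductions().reduce("mean"))  # generated reduce: mean
```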

commit 5dcb5bf
Author: Illviljan <[email protected]>
Date:   Sat Nov 20 12:30:50 2021 +0100

    Attempt fixing typing errors

    Mixing in DatasetReduce fixes:
    xarray/tests/test_groupby.py:460: error: Invalid self argument "Dataset" to attribute function "mean" with type "Callable[[DatasetReduce, Optional[Hashable], Optional[bool], Optional[bool], KwArg(Any)], T_Dataset]"  [misc]

    Switching to "Dataset" as the returned type fixes:

    xarray/tests/test_groupby.py:77: error: Need type annotation for "expected"  [var-annotated]
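
The return-type change can be illustrated with a minimal sketch (hypothetical classes, not xarray's real signatures): having the reduction return the concrete `Dataset` class, rather than a TypeVar bound to a Protocol, lets mypy infer variables like `expected` without an explicit annotation.

```python
from typing import Optional


class Dataset:
    """Hypothetical stand-in for xarray.Dataset."""

    def __init__(self, total: float) -> None:
        self.total = total


class DatasetReduce:
    # Returning the concrete ``Dataset`` (instead of a Protocol-bound
    # TypeVar) gives callers a fully resolved return type.
    def mean(self, dim: Optional[str] = None) -> "Dataset":
        raise NotImplementedError


class DatasetGroupBy(DatasetReduce):
    def __init__(self, values: list) -> None:
        self.values = values

    def mean(self, dim: Optional[str] = None) -> "Dataset":
        return Dataset(sum(self.values) / len(self.values))


expected = DatasetGroupBy([1.0, 2.0, 3.0]).mean()  # inferred as Dataset
```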

commit 7a201de
Author: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Date:   Fri Nov 19 11:37:20 2021 -0700

    [pre-commit.ci] pre-commit autoupdate (pydata#5990)

commit 95394d5
Author: Illviljan <[email protected]>
Date:   Mon Nov 15 21:40:37 2021 +0100

    Use set_options for asv bottleneck tests (pydata#5986)

    * Use set_options for bottleneck tests

    * Use set_options in rolling

    * Update rolling.py

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * Update rolling.py

    * Update rolling.py

    * set_options not needed.

    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit b2d7cd8
Author: Kai Mühlbauer <[email protected]>
Date:   Mon Nov 15 18:33:43 2021 +0100

    Fix module name retrieval in `backend.plugins.remove_duplicates()`, plugin tests (pydata#5959)

    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit c7e9d96
Author: dcherian <[email protected]>
Date:   Wed Nov 10 11:49:47 2021 -0700

    Minor improvement

commit dea8fd9
Author: dcherian <[email protected]>
Date:   Mon Nov 8 16:18:07 2021 -0700

     Refactor

commit 9bb2c32
Author: dcherian <[email protected]>
Date:   Mon Nov 8 13:56:53 2021 -0700

    Reorder docstring to match numpy

commit 99bfe12
Author: dcherian <[email protected]>
Date:   Mon Nov 8 12:44:23 2021 -0700

    Fixes pydata#5898

commit 7f39cc0
Author: dcherian <[email protected]>
Date:   Mon Nov 8 12:39:00 2021 -0700

    Minor docstring improvements.

commit a04ed82
Author: dcherian <[email protected]>
Date:   Mon Nov 8 12:35:48 2021 -0700

    Small changes

commit 816e794
Author: dcherian <[email protected]>
Date:   Sun Nov 7 20:56:37 2021 -0700

    Generate DataArray, Dataset reductions too.

commit 569c67f
Author: dcherian <[email protected]>
Date:   Sun Nov 7 20:54:42 2021 -0700

    Add ddof for var, std
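
`ddof` (delta degrees of freedom) follows numpy semantics: the sum of squared deviations is divided by `N - ddof`, so `ddof=1` gives the unbiased sample variance. A quick sketch of the behavior (not xarray's implementation):

```python
import math


def var(values, ddof=0):
    """Sketch of the ``ddof`` parameter added for ``var``/``std``:
    divide the sum of squared deviations by N - ddof."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


def std(values, ddof=0):
    return math.sqrt(var(values, ddof=ddof))


print(var([1.0, 2.0, 3.0], ddof=1))  # 1.0
```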

commit 6b9a81a
Author: dcherian <[email protected]>
Date:   Sun Nov 7 20:35:52 2021 -0700

    Better generator for reductions.
dcherian committed Nov 26, 2021
1 parent cfd2c07 commit 3c51b1a
Showing 17 changed files with 438 additions and 401 deletions.
8 changes: 4 additions & 4 deletions .pre-commit-config.yaml
@@ -8,12 +8,12 @@ repos:
- id: check-yaml
# isort should run before black as black sometimes tweaks the isort output
- repo: https://github.com/PyCQA/isort
rev: 5.9.3
rev: 5.10.1
hooks:
- id: isort
# https://github.com/python/black#version-control-integration
- repo: https://github.com/psf/black
rev: 21.9b0
rev: 21.10b0
hooks:
- id: black
- id: black-jupyter
@@ -22,8 +22,8 @@ repos:
hooks:
- id: blackdoc
exclude: "generate_reductions.py"
- repo: https://gitlab.com/pycqa/flake8
rev: 3.9.2
- repo: https://github.com/PyCQA/flake8
rev: 4.0.1
hooks:
- id: flake8
# - repo: https://github.com/Carreau/velin
2 changes: 1 addition & 1 deletion asv_bench/asv.conf.json
@@ -62,7 +62,7 @@
"pandas": [""],
"netcdf4": [""],
"scipy": [""],
"bottleneck": ["", null],
"bottleneck": [""],
"dask": [""],
"distributed": [""],
"flox": [""],
8 changes: 0 additions & 8 deletions asv_bench/benchmarks/dataarray_missing.py
@@ -16,13 +16,6 @@ def make_bench_data(shape, frac_nan, chunks):
return da


def requires_bottleneck():
try:
import bottleneck # noqa: F401
except ImportError:
raise NotImplementedError()


class DataArrayMissingInterpolateNA:
def setup(self, shape, chunks, limit):
if chunks is not None:
@@ -46,7 +39,6 @@ def time_interpolate_na(self, shape, chunks, limit):

class DataArrayMissingBottleneck:
def setup(self, shape, chunks, limit):
requires_bottleneck()
if chunks is not None:
requires_dask()
self.da = make_bench_data(shape, 0.1, chunks)
92 changes: 56 additions & 36 deletions asv_bench/benchmarks/rolling.py
@@ -36,29 +36,45 @@ def setup(self, *args, **kwargs):
randn_long, dims="x", coords={"x": np.arange(long_nx) * 0.1}
)

@parameterized(["func", "center"], (["mean", "count"], [True, False]))
def time_rolling(self, func, center):
getattr(self.ds.rolling(x=window, center=center), func)().load()

@parameterized(["func", "pandas"], (["mean", "count"], [True, False]))
def time_rolling_long(self, func, pandas):
@parameterized(
["func", "center", "use_bottleneck"],
(["mean", "count"], [True, False], [True, False]),
)
def time_rolling(self, func, center, use_bottleneck):
with xr.set_options(use_bottleneck=use_bottleneck):
getattr(self.ds.rolling(x=window, center=center), func)().load()

@parameterized(
["func", "pandas", "use_bottleneck"],
(["mean", "count"], [True, False], [True, False]),
)
def time_rolling_long(self, func, pandas, use_bottleneck):
if pandas:
se = self.da_long.to_series()
getattr(se.rolling(window=window, min_periods=window), func)()
else:
getattr(self.da_long.rolling(x=window, min_periods=window), func)().load()

@parameterized(["window_", "min_periods"], ([20, 40], [5, 5]))
def time_rolling_np(self, window_, min_periods):
self.ds.rolling(x=window_, center=False, min_periods=min_periods).reduce(
getattr(np, "nansum")
).load()

@parameterized(["center", "stride"], ([True, False], [1, 1]))
def time_rolling_construct(self, center, stride):
self.ds.rolling(x=window, center=center).construct(
"window_dim", stride=stride
).sum(dim="window_dim").load()
with xr.set_options(use_bottleneck=use_bottleneck):
getattr(
self.da_long.rolling(x=window, min_periods=window), func
)().load()

@parameterized(
["window_", "min_periods", "use_bottleneck"], ([20, 40], [5, 5], [True, False])
)
def time_rolling_np(self, window_, min_periods, use_bottleneck):
with xr.set_options(use_bottleneck=use_bottleneck):
self.ds.rolling(x=window_, center=False, min_periods=min_periods).reduce(
getattr(np, "nansum")
).load()

@parameterized(
["center", "stride", "use_bottleneck"], ([True, False], [1, 1], [True, False])
)
def time_rolling_construct(self, center, stride, use_bottleneck):
with xr.set_options(use_bottleneck=use_bottleneck):
self.ds.rolling(x=window, center=center).construct(
"window_dim", stride=stride
).sum(dim="window_dim").load()


class RollingDask(Rolling):
@@ -87,24 +103,28 @@ def setup(self, *args, **kwargs):


class DataArrayRollingMemory(RollingMemory):
@parameterized("func", ["sum", "max", "mean"])
def peakmem_ndrolling_reduce(self, func):
roll = self.ds.var1.rolling(x=10, y=4)
getattr(roll, func)()
@parameterized(["func", "use_bottleneck"], (["sum", "max", "mean"], [True, False]))
def peakmem_ndrolling_reduce(self, func, use_bottleneck):
with xr.set_options(use_bottleneck=use_bottleneck):
roll = self.ds.var1.rolling(x=10, y=4)
getattr(roll, func)()

@parameterized("func", ["sum", "max", "mean"])
def peakmem_1drolling_reduce(self, func):
roll = self.ds.var3.rolling(t=100)
getattr(roll, func)()
@parameterized(["func", "use_bottleneck"], (["sum", "max", "mean"], [True, False]))
def peakmem_1drolling_reduce(self, func, use_bottleneck):
with xr.set_options(use_bottleneck=use_bottleneck):
roll = self.ds.var3.rolling(t=100)
getattr(roll, func)()


class DatasetRollingMemory(RollingMemory):
@parameterized("func", ["sum", "max", "mean"])
def peakmem_ndrolling_reduce(self, func):
roll = self.ds.rolling(x=10, y=4)
getattr(roll, func)()

@parameterized("func", ["sum", "max", "mean"])
def peakmem_1drolling_reduce(self, func):
roll = self.ds.rolling(t=100)
getattr(roll, func)()
@parameterized(["func", "use_bottleneck"], (["sum", "max", "mean"], [True, False]))
def peakmem_ndrolling_reduce(self, func, use_bottleneck):
with xr.set_options(use_bottleneck=use_bottleneck):
roll = self.ds.rolling(x=10, y=4)
getattr(roll, func)()

@parameterized(["func", "use_bottleneck"], (["sum", "max", "mean"], [True, False]))
def peakmem_1drolling_reduce(self, func, use_bottleneck):
with xr.set_options(use_bottleneck=use_bottleneck):
roll = self.ds.rolling(t=100)
getattr(roll, func)()
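
The benchmarks above rely on `xr.set_options(use_bottleneck=...)` acting as a context manager that restores the previous value on exit. A minimal sketch of that pattern (illustrative only — xarray's real implementation lives in `xarray/core/options.py` and validates its option names):

```python
# Module-level options dict, in the spirit of xarray's OPTIONS.
OPTIONS = {"use_bottleneck": True}


class set_options:
    """Sketch of an options context manager: apply new values on
    construction, restore the previous ones on exit."""

    def __init__(self, **kwargs):
        self.old = {k: OPTIONS[k] for k in kwargs}
        OPTIONS.update(kwargs)

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        OPTIONS.update(self.old)


with set_options(use_bottleneck=False):
    inside = OPTIONS["use_bottleneck"]  # False inside the block
outside = OPTIONS["use_bottleneck"]  # restored to True afterwards
```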
2 changes: 2 additions & 0 deletions doc/user-guide/computation.rst
@@ -107,6 +107,8 @@ Xarray also provides the ``max_gap`` keyword argument to limit the interpolation
data gaps of length ``max_gap`` or smaller. See :py:meth:`~xarray.DataArray.interpolate_na`
for more.

.. _agg:

Aggregation
===========

6 changes: 6 additions & 0 deletions doc/whats-new.rst
@@ -36,6 +36,8 @@ Bug fixes
~~~~~~~~~
- Fix plot.line crash for data of shape ``(1, N)`` in _title_for_slice on format_item (:pull:`5948`).
By `Sebastian Weigand <https://github.com/s-weigand>`_.
- Fix a regression in the removal of duplicate backend entrypoints (:issue:`5944`, :pull:`5959`)
By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.

Documentation
~~~~~~~~~~~~~
@@ -49,6 +51,10 @@ Documentation
Internal Changes
~~~~~~~~~~~~~~~~

- Use ``importlib`` to replace functionality of ``pkg_resources`` in
backend plugins tests. (:pull:`5959`).
By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.


.. _whats-new.0.20.1:

10 changes: 6 additions & 4 deletions xarray/backends/plugins.py
@@ -23,15 +23,17 @@ def remove_duplicates(entrypoints):
# check if there are multiple entrypoints for the same name
unique_entrypoints = []
for name, matches in entrypoints_grouped:
matches = list(matches)
# remove equal entrypoints
matches = list(set(matches))
unique_entrypoints.append(matches[0])
matches_len = len(matches)
if matches_len > 1:
selected_module_name = matches[0].module_name
all_module_names = [e.module_name for e in matches]
all_module_names = [e.value.split(":")[0] for e in matches]
selected_module_name = all_module_names[0]
warnings.warn(
f"Found {matches_len} entrypoints for the engine name {name}:"
f"\n {all_module_names}.\n It will be used: {selected_module_name}.",
f"\n {all_module_names}.\n "
f"The entrypoint {selected_module_name} will be used.",
RuntimeWarning,
)
return unique_entrypoints
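
The deduplication shown in the hunk above can be exercised standalone. This sketch uses a namedtuple stand-in for `importlib.metadata.EntryPoint`, with only the `name` and `value` fields the code touches:

```python
import itertools
import warnings
from collections import namedtuple

# Hypothetical stand-in for importlib.metadata.EntryPoint.
EntryPoint = namedtuple("EntryPoint", ["name", "value"])


def remove_duplicates(entrypoints):
    """Group entrypoints by engine name, drop exact duplicates, and
    warn when distinct entrypoints share a name (mirroring the diff)."""
    entrypoints = sorted(entrypoints, key=lambda ep: ep.name)
    grouped = itertools.groupby(entrypoints, key=lambda ep: ep.name)
    unique = []
    for name, matches in grouped:
        # remove equal entrypoints
        matches = list(set(matches))
        unique.append(matches[0])
        if len(matches) > 1:
            all_module_names = [ep.value.split(":")[0] for ep in matches]
            warnings.warn(
                f"Found {len(matches)} entrypoints for the engine name {name}:"
                f"\n {all_module_names}.\n "
                f"The entrypoint {all_module_names[0]} will be used.",
                RuntimeWarning,
            )
    return unique


eps = [
    EntryPoint("h5netcdf", "h5netcdf_a:open"),
    EntryPoint("h5netcdf", "h5netcdf_a:open"),  # exact duplicate: dropped silently
    EntryPoint("scipy", "scipy_backend:open"),
]
assert [ep.name for ep in remove_duplicates(eps)] == ["h5netcdf", "scipy"]
```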