Skip to content

Commit

Permalink
Merge branch 'main' into mfdataset_allow_numpy
Browse files Browse the repository at this point in the history
  • Loading branch information
Illviljan committed Oct 29, 2021
2 parents a6af772 + 867646f commit 3444281
Show file tree
Hide file tree
Showing 120 changed files with 2,112 additions and 967 deletions.
5 changes: 5 additions & 0 deletions .git-blame-ignore-revs
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
# black PR 3142
d089df385e737f71067309ff7abae15994d581ec

# isort PR 1924
0e73e240107caee3ffd1a1149f0150c390d43251
74 changes: 74 additions & 0 deletions .github/workflows/benchmarks.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
name: Benchmark

on:
pull_request:
types: [opened, reopened, synchronize, labeled]
workflow_dispatch:

jobs:
benchmark:
if: ${{ contains( github.event.pull_request.labels.*.name, 'run-benchmark') && github.event_name == 'pull_request' || github.event_name == 'workflow_dispatch' }}
name: Linux
runs-on: ubuntu-20.04
env:
ASV_DIR: "./asv_bench"

steps:
# We need the full repo to avoid this issue
# https://github.com/actions/checkout/issues/23
- uses: actions/checkout@v2
with:
fetch-depth: 0

- name: Setup Miniconda
uses: conda-incubator/setup-miniconda@v2
with:
# installer-url: https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
installer-url: https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh

- name: Setup some dependencies
shell: bash -l {0}
run: |
pip install asv
sudo apt-get update -y
- name: Run benchmarks
shell: bash -l {0}
id: benchmark
env:
OPENBLAS_NUM_THREADS: 1
MKL_NUM_THREADS: 1
OMP_NUM_THREADS: 1
ASV_FACTOR: 1.5
ASV_SKIP_SLOW: 1
run: |
set -x
# ID this runner
asv machine --yes
echo "Baseline: ${{ github.event.pull_request.base.sha }} (${{ github.event.pull_request.base.label }})"
echo "Contender: ${GITHUB_SHA} (${{ github.event.pull_request.head.label }})"
# Use mamba for env creation
# export CONDA_EXE=$(which mamba)
export CONDA_EXE=$(which conda)
# Run benchmarks for current commit against base
ASV_OPTIONS="--split --show-stderr --factor $ASV_FACTOR"
asv continuous $ASV_OPTIONS ${{ github.event.pull_request.base.sha }} ${GITHUB_SHA} \
| sed "/Traceback \|failed$\|PERFORMANCE DECREASED/ s/^/::error::/" \
| tee benchmarks.log
# Report and export results for subsequent steps
if grep "Traceback \|failed\|PERFORMANCE DECREASED" benchmarks.log > /dev/null ; then
exit 1
fi
working-directory: ${{ env.ASV_DIR }}

- name: Add instructions to artifact
if: always()
run: |
cp benchmarks/README_CI.md benchmarks.log .asv/results/
working-directory: ${{ env.ASV_DIR }}

- uses: actions/upload-artifact@v2
if: always()
with:
name: asv-benchmark-results-${{ runner.os }}
path: ${{ env.ASV_DIR }}/.asv/results
3 changes: 1 addition & 2 deletions .github/workflows/ci-additional.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,6 @@ jobs:
[
"py37-bare-minimum",
"py37-min-all-deps",
"py37-min-nep18",
"py38-all-but-dask",
"py38-flaky",
]
Expand Down Expand Up @@ -103,7 +102,7 @@ jobs:
$PYTEST_EXTRA_FLAGS
- name: Upload code coverage to Codecov
uses: codecov/codecov-action@v2.0.2
uses: codecov/codecov-action@v2.1.0
with:
file: ./coverage.xml
flags: unittests,${{ matrix.env }}
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/ci.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,7 @@ jobs:
path: pytest.xml

- name: Upload code coverage to Codecov
uses: codecov/codecov-action@v2.0.2
uses: codecov/codecov-action@v2.1.0
with:
file: ./coverage.xml
flags: unittests
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/upstream-dev-ci.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -122,7 +122,7 @@ jobs:
shopt -s globstar
python .github/workflows/parse_logs.py logs/**/*-log
- name: Report failures
uses: actions/github-script@v4.0.2
uses: actions/github-script@v4.1
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
Expand Down
12 changes: 7 additions & 5 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -13,9 +13,10 @@ repos:
- id: isort
# https://github.com/python/black#version-control-integration
- repo: https://github.com/psf/black
rev: 21.7b0
rev: 21.9b0
hooks:
- id: black
- id: black-jupyter
- repo: https://github.com/keewis/blackdoc
rev: v0.3.4
hooks:
Expand All @@ -30,20 +31,21 @@ repos:
# - id: velin
# args: ["--write", "--compact"]
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v0.910
rev: v0.910-1
hooks:
- id: mypy
# Copied from setup.cfg
exclude: "properties|asv_bench"
# `properies` & `asv_bench` are copied from setup.cfg.
# `_typed_ops.py` is added since otherwise mypy will complain (but notably only in pre-commit)
exclude: "properties|asv_bench|_typed_ops.py"
additional_dependencies: [
# Type stubs
types-python-dateutil,
types-pkg_resources,
types-PyYAML,
types-pytz,
typing-extensions==3.10.0.0,
# Dependencies that are typed
numpy,
typing-extensions==3.10.0.0,
]
# run this occasionally, ref discussion https://github.com/pydata/xarray/pull/3194
# - repo: https://github.com/asottile/pyupgrade
Expand Down
8 changes: 4 additions & 4 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@ to support our efforts.
History
-------

xarray is an evolution of an internal tool developed at `The Climate
Xarray is an evolution of an internal tool developed at `The Climate
Corporation`__. It was originally written by Climate Corp researchers Stephan
Hoyer, Alex Kleeman and Eugene Brevdo and was released as open source in
May 2014. The project was renamed from "xray" in January 2016. Xarray became a
Expand All @@ -131,16 +131,16 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

xarray bundles portions of pandas, NumPy and Seaborn, all of which are available
Xarray bundles portions of pandas, NumPy and Seaborn, all of which are available
under a "3-clause BSD" license:
- pandas: setup.py, xarray/util/print_versions.py
- NumPy: xarray/core/npcompat.py
- Seaborn: _determine_cmap_params in xarray/core/plot/utils.py

xarray also bundles portions of CPython, which is available under the "Python
Xarray also bundles portions of CPython, which is available under the "Python
Software Foundation License" in xarray/core/pycompat.py.

xarray uses icons from the icomoon package (free version), which is
Xarray uses icons from the icomoon package (free version), which is
available under the "CC BY 4.0" license.

The full text of these licenses are included in the licenses directory.
122 changes: 122 additions & 0 deletions asv_bench/benchmarks/README_CI.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,122 @@
# Benchmark CI

<!-- Author: @jaimergp -->
<!-- Last updated: 2021.07.06 -->
<!-- Describes the work done as part of https://github.com/scikit-image/scikit-image/pull/5424 -->

## How it works

The `asv` suite can be run for any PR on GitHub Actions (check workflow `.github/workflows/benchmarks.yml`) by adding a `run-benchmark` label to said PR. This will trigger a job that will run the benchmarking suite for the current PR head (merged commit) against the PR base (usually `main`).

We use `asv continuous` to run the job, which runs a relative performance measurement. This means that there's no state to be saved and that regressions are only caught in terms of performance ratio (absolute numbers are available but they are not useful since we do not use stable hardware over time). `asv continuous` will:

* Compile `scikit-image` for _both_ commits. We use `ccache` to speed up the process, and `mamba` is used to create the build environments.
* Run the benchmark suite for both commits, _twice_ (since `processes=2` by default).
* Generate a report table with performance ratios:
* `ratio=1.0` -> performance didn't change.
* `ratio<1.0` -> PR made it slower.
* `ratio>1.0` -> PR made it faster.

Due to the sensitivity of the test, we cannot guarantee that false positives are not produced. In practice, values between `(0.7, 1.5)` are to be considered part of the measurement noise. When in doubt, running the benchmark suite one more time will provide more information about the test being a false positive or not.

## Running the benchmarks on GitHub Actions

1. On a PR, add the label `run-benchmark`.
2. The CI job will be started. Checks will appear in the usual dashboard panel above the comment box.
3. If more commits are added, the label checks will be grouped with the last commit checks _before_ you added the label.
4. Alternatively, you can always go to the `Actions` tab in the repo and [filter for `workflow:Benchmark`](https://github.com/scikit-image/scikit-image/actions?query=workflow%3ABenchmark). Your username will be assigned to the `actor` field, so you can also filter the results with that if you need it.

## The artifacts

The CI job will also generate an artifact. This is the `.asv/results` directory compressed in a zip file. Its contents include:

* `fv-xxxxx-xx/`. A directory for the machine that ran the suite. It contains three files:
* `<baseline>.json`, `<contender>.json`: the benchmark results for each commit, with stats.
* `machine.json`: details about the hardware.
* `benchmarks.json`: metadata about the current benchmark suite.
* `benchmarks.log`: the CI logs for this run.
* This README.

## Re-running the analysis

Although the CI logs should be enough to get an idea of what happened (check the table at the end), one can use `asv` to run the analysis routines again.

1. Uncompress the artifact contents in the repo, under `.asv/results`. This is, you should see `.asv/results/benchmarks.log`, not `.asv/results/something_else/benchmarks.log`. Write down the machine directory name for later.
2. Run `asv show` to see your available results. You will see something like this:

```
$> asv show
Commits with results:
Machine : Jaimes-MBP
Environment: conda-py3.9-cython-numpy1.20-scipy
00875e67
Machine : fv-az95-499
Environment: conda-py3.7-cython-numpy1.17-pooch-scipy
8db28f02
3a305096
```

3. We are interested in the commits for `fv-az95-499` (the CI machine for this run). We can compare them with `asv compare` and some extra options. `--sort ratio` will show largest ratios first, instead of alphabetical order. `--split` will produce three tables: improved, worsened, no changes. `--factor 1.5` tells `asv` to only complain if deviations are above a 1.5 ratio. `-m` is used to indicate the machine ID (use the one you wrote down in step 1). Finally, specify your commit hashes: baseline first, then contender!

```
$> asv compare --sort ratio --split --factor 1.5 -m fv-az95-499 8db28f02 3a305096
Benchmarks that have stayed the same:
before after ratio
[8db28f02] [3a305096]
<ci-benchmark-check~9^2>
n/a n/a n/a benchmark_restoration.RollingBall.time_rollingball_ndim
1.23±0.04ms 1.37±0.1ms 1.12 benchmark_transform_warp.WarpSuite.time_to_float64(<class 'numpy.float64'>, 128, 3)
5.07±0.1μs 5.59±0.4μs 1.10 benchmark_transform_warp.ResizeLocalMeanSuite.time_resize_local_mean(<class 'numpy.float32'>, (192, 192, 192), (192, 192, 192))
1.23±0.02ms 1.33±0.1ms 1.08 benchmark_transform_warp.WarpSuite.time_same_type(<class 'numpy.float32'>, 128, 3)
9.45±0.2ms 10.1±0.5ms 1.07 benchmark_rank.Rank3DSuite.time_3d_filters('majority', (32, 32, 32))
23.0±0.9ms 24.6±1ms 1.07 benchmark_interpolation.InterpolationResize.time_resize((80, 80, 80), 0, 'symmetric', <class 'numpy.float64'>, True)
38.7±1ms 41.1±1ms 1.06 benchmark_transform_warp.ResizeLocalMeanSuite.time_resize_local_mean(<class 'numpy.float32'>, (2048, 2048), (192, 192, 192))
4.97±0.2μs 5.24±0.2μs 1.05 benchmark_transform_warp.ResizeLocalMeanSuite.time_resize_local_mean(<class 'numpy.float32'>, (2048, 2048), (2048, 2048))
4.21±0.2ms 4.42±0.3ms 1.05 benchmark_rank.Rank3DSuite.time_3d_filters('gradient', (32, 32, 32))
...
```

If you want more details on a specific test, you can use `asv show`. Use `-b pattern` to filter which tests to show, and then specify a commit hash to inspect:

```
$> asv show -b time_to_float64 8db28f02
Commit: 8db28f02 <ci-benchmark-check~9^2>
benchmark_transform_warp.WarpSuite.time_to_float64 [fv-az95-499/conda-py3.7-cython-numpy1.17-pooch-scipy]
ok
=============== ============= ========== ============= ========== ============ ========== ============ ========== ============
-- N / order
--------------- --------------------------------------------------------------------------------------------------------------
dtype_in 128 / 0 128 / 1 128 / 3 1024 / 0 1024 / 1 1024 / 3 4096 / 0 4096 / 1 4096 / 3
=============== ============= ========== ============= ========== ============ ========== ============ ========== ============
numpy.uint8 2.56±0.09ms 523±30μs 1.28±0.05ms 130±3ms 28.7±2ms 81.9±3ms 2.42±0.01s 659±5ms 1.48±0.01s
numpy.uint16 2.48±0.03ms 530±10μs 1.28±0.02ms 130±1ms 30.4±0.7ms 81.1±2ms 2.44±0s 653±3ms 1.47±0.02s
numpy.float32 2.59±0.1ms 518±20μs 1.27±0.01ms 127±3ms 26.6±1ms 74.8±2ms 2.50±0.01s 546±10ms 1.33±0.02s
numpy.float64 2.48±0.04ms 513±50μs 1.23±0.04ms 134±3ms 30.7±2ms 85.4±2ms 2.55±0.01s 632±4ms 1.45±0.01s
=============== ============= ========== ============= ========== ============ ========== ============ ========== ============
started: 2021-07-06 06:14:36, duration: 1.99m
```

## Other details

### Skipping slow or demanding tests

To minimize the time required to run the full suite, we trimmed the parameter matrix in some cases and, in others, directly skipped tests that ran for too long or require too much memory. Unlike `pytest`, `asv` does not have a notion of marks. However, you can `raise NotImplementedError` in the setup step to skip a test. In that vein, a new private function is defined at `benchmarks.__init__`: `_skip_slow`. This will check if the `ASV_SKIP_SLOW` environment variable has been defined. If set to `1`, it will raise `NotImplementedError` and skip the test. To implement this behavior in other tests, you can add the following attribute:

```python
from . import _skip_slow # this function is defined in benchmarks.__init__

def time_something_slow():
pass

time_something.setup = _skip_slow
```
19 changes: 19 additions & 0 deletions asv_bench/benchmarks/__init__.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
import itertools
import os

import numpy as np

Expand Down Expand Up @@ -46,3 +47,21 @@ def randint(low, high=None, size=None, frac_minus=None, seed=0):
x.flat[inds] = -1

return x


def _skip_slow():
"""
Use this function to skip slow or highly demanding tests.
Use it as a `Class.setup` method or a `function.setup` attribute.
Examples
--------
>>> from . import _skip_slow
>>> def time_something_slow():
... pass
...
>>> time_something.setup = _skip_slow
"""
if os.environ.get("ASV_SKIP_SLOW", "0") == "1":
raise NotImplementedError("Skipping this test...")
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/combine.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ class Combine:
def setup(self):
"""Create 4 datasets with two different variables"""

t_size, x_size, y_size = 100, 900, 800
t_size, x_size, y_size = 50, 450, 400
t = np.arange(t_size)
data = np.random.randn(t_size, x_size, y_size)

Expand Down
Loading

0 comments on commit 3444281

Please sign in to comment.