Support test data in benchmark workflows #4402

wjbenfold · 2021-11-08T16:03:34Z

🚀 Pull Request

Description

- Adds test data support to the GitHub Actions benchmarks (by downloading the Iris test data and setting the OVERRIDE_TEST_DATA_REPOSITORY environment variable).
- Adds an example benchmark (of area-weighted regridding) that uses the test data.

Currently the GitHub Actions benchmarks don't have access to the test data. I'm aiming to improve this.
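As a rough illustration of the mechanism described above, a benchmark can locate the downloaded data by reading OVERRIDE_TEST_DATA_REPOSITORY. This is a minimal sketch under stated assumptions, not the Iris implementation: the helper name `resolve_test_data_dir` is hypothetical, and only the environment variable name comes from this PR.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Environment variable named in this PR; everything else here is illustrative.
ENV_VAR = "OVERRIDE_TEST_DATA_REPOSITORY"


def resolve_test_data_dir(environ: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the test data directory named by ENV_VAR, or None if unusable.

    Hypothetical helper: it returns None when the variable is unset or
    when it does not point at an existing directory.
    """
    override = environ.get(ENV_VAR)
    if not override:
        return None
    path = Path(override).expanduser()
    # Use an absolute path so the result does not depend on the working directory.
    return path.resolve() if path.is_dir() else None
```

In a workflow, the variable would need to be set after the download step and before the benchmarks start, so that the benchmark process can see it.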

To do:

- Enable caching (by uncommenting) once the test data download is tested
- Revert noxfile logging changes
- Change the example benchmark to use new test data that shows significant slowness (once it's in the iris-test-data repo)
- What's new

Consult Iris pull request check list

trexfeathers · 2021-11-09T09:53:32Z

> Change the example benchmark to use new test data that shows significant slowness (once it's in the iris-test-data repo)

@wjbenfold if you figure out a decent pattern to host full size PP files online, and without slowing the benchmark process, that will potentially solve some other benchmarking headaches so I'd be interested in the result!

wjbenfold · 2021-11-09T16:13:35Z

I think this PR is now ready for review (save a what's new).

Reviewer questions:

- Are the changes to benchmarks/nox_asv_plugin.py, benchmarks/benchmarks/mixin.py and benchmarks/benchmarks/metadata_manager_factory.py as a result of changing .pre-commit-config.yaml acceptable? (This was done because the CI pre-commit was checking more aggressively than my local pre-commit, forcing me to pull the changes down to my local copy.)
- I've used the test data to access files for benchmarking, ignoring the code in benchmarks/benchmarks/__init__.py that sets a data path. Is this OK, and if so should __init__.py be tidied at all (given I don't think it has anything to do with test data for us)?

trexfeathers

Thanks @wjbenfold! Some comments for you.

I don't get how a performance improvement has been detected for this PR, which makes no changes to the regridding code? It may be a sign that performance of this one is highly variable; I'll try re-running it.

trexfeathers · 2021-11-10T16:05:02Z

> I don't get how a performance improvement has been detected for this PR, which makes no changes to the regridding code? It may be a sign that performance of this one is highly variable; I'll try re-running it.

From investigation this looks like it was an aberration, so no changes needed.

(For the record I was going to suggest adjusting the repeat attribute to force ASV to spend longer / do more repeats. Not needed here but a good tool to have).
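For reference, a sketch of what adjusting `repeat` could look like. In ASV, `repeat` is a class attribute on a timing benchmark and may be an integer or a `(min_repeat, max_repeat, max_time)` tuple. The class name, workload and chosen values below are hypothetical and are not taken from this PR.

```python
class TimeSumOfSquares:
    """Hypothetical ASV timing benchmark; ASV discovers `time_*` methods."""

    # Ask for at least 5 and at most 20 samples, stopping early once
    # roughly 60 seconds have been spent collecting them.
    repeat = (5, 20, 60.0)
    # Call the timed method once per sample.
    number = 1

    def setup(self):
        # Runs before timing, so building the data is not measured.
        self.values = list(range(100_000))

    def time_sum_of_squares(self):
        # The body of this method is what gets timed.
        sum(v * v for v in self.values)
```

Raising the minimum number of samples trades a longer benchmark run for a less noisy measurement, which is the kind of adjustment that might help when a benchmark reports a change that the code cannot explain.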

trexfeathers · 2021-11-10T16:09:32Z

> Are the changes to benchmarks/nox_asv_plugin.py, benchmarks/benchmarks/mixin.py and benchmarks/benchmarks/metadata_manager_factory.py as a result of changing .pre-commit-config.yaml acceptable? (This was done because the CI pre-commit was checking more aggressively than my local pre-commit, forcing me to pull the changes down to my local copy.)

👍

> I've used the test data to access files for benchmarking, ignoring the code in benchmarks/benchmarks/__init__.py that sets a data path. Is this OK, and if so should __init__.py be tidied at all (given I don't think it has anything to do with test data for us)?

👍 The data location stuff in __init__ will very likely get re-written as we establish new patterns to handle benchmark data as part of remote CI. A lot of this was written to support a fixed directory containing very large files, and that just isn't workable in CI.

trexfeathers

I'm happy that you've actioned my comments. Still need that What's New entry though!

Will Benfold and others added 3 commits November 8, 2021 15:53

Support test data in benchmark workflows

2095a70

Use existing test data for now to test

02f59c5

[pre-commit.ci] auto fixes from pre-commit.com hooks

fb26b03

trexfeathers reviewed Nov 8, 2021

.github/workflows/benchmark.yml (outdated, resolved)

Will Benfold and others added 10 commits November 8, 2021 16:16

Set OVERRIDE_TEST_DATA_REPOSITORY properly

2d1b780

Merge branch 'wjbenfold-ci-benchmarks-test-data' of github.com:wjbenfold/iris into wjbenfold-ci-benchmarks-test-data

760fff1

Manually set version number to mirror cirrus.yml

1acbea7

Add licence to top of file

90cf420

Specify absolute path of test data

2d9b8f9

Changes to regridding to remove unnecessary print statements

b944fa7

[pre-commit.ci] auto fixes from pre-commit.com hooks

4f7462e

Can't refer to environment variables in env block

30bd14a

Merge branch 'wjbenfold-ci-benchmarks-test-data' of github.com:wjbenfold/iris into wjbenfold-ci-benchmarks-test-data

b78ab47

Precommit checks benchmarks too

8bb1b46

Will Benfold and others added 12 commits November 9, 2021 10:27

Set the env var for data location

b0c856f

[pre-commit.ci] auto fixes from pre-commit.com hooks

a40a88c

Add some logging to noxfile for now

62b03f2

Merge branch 'wjbenfold-ci-benchmarks-test-data' of github.com:wjbenfold/iris into wjbenfold-ci-benchmarks-test-data

c0a5a57

Do we have the test data path set right?

0ab3125

Rearrange imports and fix logging

0bd99af

More logging because iris.config.TEST_DATA_DIR was None

6e76a92

More logging and changed mv

4f55a5f

Revert logging changes

f28cdd0

Enable caching

8f7032f

Update to use new test data

25f7d76

Trim out last bit of logging

04ec65c

wjbenfold marked this pull request as ready for review November 9, 2021 16:15

wjbenfold marked this pull request as draft November 9, 2021 16:16

Will Benfold added 5 commits November 9, 2021 16:20

Update cache test data directory

dc666e6

Reprompt ci

15521eb

Fix cache location

c3ba131

Bit of logging to reprompt CI

568bdff

Remove that logging

f57768c

wjbenfold marked this pull request as ready for review November 9, 2021 17:40

Added type hinting

43000d9

trexfeathers requested changes Nov 10, 2021

benchmarks/benchmarks/regridding.py (outdated, resolved)

benchmarks/benchmarks/regridding.py (outdated, resolved)

.github/workflows/benchmark.yml (resolved)

.github/workflows/benchmark.yml (outdated, resolved)

trexfeathers reviewed Nov 10, 2021

.github/workflows/benchmark.yml (resolved)

Will Benfold and others added 2 commits November 11, 2021 10:41

Review changes

abfd06a

Prevent piping hiding benchmark failures

fdfba71

Co-authored-by: Martin Yeo

trexfeathers approved these changes Nov 11, 2021

trexfeathers requested changes Nov 11, 2021

wjbenfold mentioned this pull request Nov 11, 2021

Updated whatsnew for PR 4402 #4410

Merged

trexfeathers approved these changes Nov 11, 2021

trexfeathers merged commit 94c1a6a into SciTools:main Nov 11, 2021

wjbenfold deleted the wjbenfold-ci-benchmarks-test-data branch November 11, 2021 15:38
