
RFC: refactor benchmarks setups and customize pytest's discovery rule to enable running benchmarks through pytest #116

Open · wants to merge 1 commit into main from enable_running_benchmarks_through_pytest
Conversation

neutrinoceros
Contributor

As suggested in #115 (comment), this is a first step towards enabling code coverage for benchmarks.

When run locally I get 4 failures that are only reproduced when the whole suite is run, which seems indicative of some test pollution. I'd like to get to the bottom of this before I undraft this PR.

@neutrinoceros neutrinoceros force-pushed the enable_running_benchmarks_through_pytest branch 3 times, most recently from bff6fcc to 9531a80 Compare April 18, 2024 21:02
@neutrinoceros
Contributor Author

Ok I resolved the test pollution issue by switching from setup_class to setup_method (which is also much simpler to plug into the existing framework). It's still a bit repetitive, and I'm not 100% sure that my custom test discovery rules are complete, but other than that, it's ready for feedback.
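
For context, here is a minimal sketch of the difference between the two hooks, using a hypothetical benchmark class rather than the actual code in this PR: setup_class runs once per class (so state can leak between test methods), while setup_method runs before every test method.

# Hypothetical illustration only, not the benchmarks in this repository.
class TimeExampleBenchmarks:
    def setup_method(self, method):
        # runs before every test/benchmark method, so each one gets fresh state
        self.data = list(range(1000))

    def time_sum(self):
        sum(self.data)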

Member

@pllim pllim left a comment

Thanks! Hopefully setup_method doesn't mean anything special to asv...

Maybe Zach can take this branch for a spin to be sure?

@pllim pllim requested a review from zacharyburnett April 18, 2024 21:19
@neutrinoceros neutrinoceros force-pushed the enable_running_benchmarks_through_pytest branch from 9531a80 to c6dd80e Compare April 18, 2024 21:21
@astrofrog
Member

I do sometimes wonder whether the benchmarks shouldn't just be part of our normal test suite - they are just tests whose execution time we happen to monitor. In principle we could even monitor the execution time of all our tests in the regular test suite! (But obviously the ones here are designed to be more consistent/minimal in what they measure.)

@pllim
Member

pllim commented Apr 18, 2024

whether the benchmarks shouldn't just be part of our normal test suite

Some are stress tests that can significantly lengthen CI run time (which @neutrinoceros tried hard to cut down recently).

@pllim
Member

pllim commented Apr 18, 2024

I guess cron is acceptable... 🤔

@nstarman
Member

nstarman commented Apr 18, 2024

Mostly unrelated to this PR, just bringing it to benchmarking-interested people's attention... https://docs.codspeed.io + pytest-benchmark.

@neutrinoceros neutrinoceros force-pushed the enable_running_benchmarks_through_pytest branch from c6dd80e to 84a744d Compare April 19, 2024 07:56
@neutrinoceros
Contributor Author

neutrinoceros commented Apr 19, 2024

I finished tweaking discovery rules to get (almost) every benchmark running with pytest. I'm explicitly excluding 3/238 benchmarks that pytest chokes on, but all 235 others run fine.
In one case, I needed to refactor the benchmark itself to use a temporary file instead of a shared, hardcoded file name. Hopefully this doesn't affect performance at all.
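
For illustration, such a refactor could look roughly like the sketch below, which uses Python's tempfile module; the class and file names are hypothetical and not the actual benchmark that was changed.

import os
import tempfile


# Hypothetical sketch of a file-writing benchmark using a per-test temporary
# file instead of a shared, hardcoded file name.
class TimeFileWrite:
    def setup_method(self, method):
        fd, self.path = tempfile.mkstemp(suffix=".dat")
        os.close(fd)

    def teardown_method(self, method):
        os.remove(self.path)

    def time_write(self):
        with open(self.path, "w") as f:
            f.write("x" * 1024)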

@pllim
Member

pllim commented Apr 19, 2024

Re: #116 (comment) -- @nstarman , please open a new issue so your idea doesn't get buried here. Thanks!

@nstarman
Member

nstarman commented Apr 19, 2024

Done in #117

@astrofrog
Member

Some are stress test that can significantly lengthen CI run time

Just to be clear, I'm not saying they would be enabled by default - more that they could live alongside the regular tests, albeit with a special marker/decorator to ensure they are not run by default.
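
For reference, one way to do that, sketched here with hypothetical names and not taken from this PR, is a benchmark marker plus a conftest.py hook that skips marked tests unless they are explicitly requested:

# conftest.py -- hypothetical sketch, not part of this PR
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--run-benchmarks",
        action="store_true",
        default=False,
        help="also run tests marked as benchmarks",
    )


def pytest_configure(config):
    config.addinivalue_line("markers", "benchmark: mark a test as a benchmark")


def pytest_collection_modifyitems(config, items):
    # skip benchmark-marked tests unless --run-benchmarks was passed
    if config.getoption("--run-benchmarks"):
        return
    skip_bench = pytest.mark.skip(reason="needs --run-benchmarks to run")
    for item in items:
        if "benchmark" in item.keywords:
            item.add_marker(skip_bench)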

@pllim
Member

pllim commented May 15, 2024

Can you please rebase to pick up #119 by @astrofrog ? Thanks!

Member

@astrofrog astrofrog left a comment

This is looking good so far! Just to put the idea on the table, we could always have a base class that defines:

class BaseBenchmarks:
    # forward pytest's per-test setup hook to the asv-style setup()
    def setup_method(self, *args, **kwargs):
        self.setup(*args, **kwargs)

Also, does this PR work with parametrized tests? (see e.g. #118)

EDIT: oh I see it doesn't, because that's what's failing in cosmology

pytest.ini Outdated
# customize test discovery to treat benchmarks as tests
python_files = *.py
python_classes = Time *Benchmarks
python_functions = time_*
Member

For future-proofing, we might want to support these different asv benchmark types:

https://asv.readthedocs.io/en/stable/benchmarks.html

specifically time_*, timeraw_*, mem_*, peakmem_*, and track_*, although the latter might not work because I think pytest doesn't like tests that return things.
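
For reference, the discovery rules quoted above could be widened along these lines; this is an untested sketch, and as noted, prefixes whose benchmarks return values may still trip pytest's warning about non-None returns.

# hypothetical, widened discovery rules (untested sketch);
# track_* is left out here because those benchmarks return values
python_files = *.py
python_classes = Time Mem Peakmem *Benchmarks
python_functions = time_* timeraw_* mem_* peakmem_*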

benchmarks/visualization/wcsaxes.py Outdated (review comment resolved)
@astrofrog
Member

A possible solution for parametrization might be to make use of pytest_generate_tests, which lets you customize how tests are parametrized.

@neutrinoceros neutrinoceros force-pushed the enable_running_benchmarks_through_pytest branch from 84a744d to ff413fe Compare May 15, 2024 14:21
@neutrinoceros
Contributor Author

Rebased and took the first batch of comments into account. Looking at parametrization now.

@pllim
Member

pllim commented May 15, 2024

Sorry, I just merged #118 so this will need another rebase to pick that up too. Thanks for your patience!

@astrofrog
Member

Just to put it on the table, another completely different idea to enable coverage is to run asv under the coverage tool, although we would need to configure things so that subprocesses work. But that's only in case parametrization ends up not working.
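
For reference, coverage.py does document a way to measure subprocesses; a rough, untested sketch of what that might look like here (paths and package names are assumptions):

# .coveragerc -- hypothetical sketch
[run]
parallel = True
source = astropy

# Each asv subprocess would need to call coverage.process_startup(), e.g. from
# a sitecustomize.py or .pth file containing:
#   import coverage; coverage.process_startup()
# with the environment pointing at the config file:
#   export COVERAGE_PROCESS_START=$PWD/.coveragerc
# Afterwards the per-process data files are merged with:
#   coverage combine && coverage report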

@neutrinoceros
Contributor Author

Regarding parametrization, I think this is how you'd do it:

def pytest_generate_tests(metafunc):
    # only parametrize benchmark classes that declare asv-style params
    if metafunc.cls is None or not hasattr(metafunc.cls, "params"):
        return

    name = metafunc.fixturenames[0]
    values = metafunc.cls.params

    metafunc.parametrize(name, values)

However, it still won't work with LambdaCDMBenchmarks because the setup and teardown methods take a cosmo argument that pytest interprets as a fixture, which isn't defined anywhere that it knows about (and for what it's worth, I also don't know where this argument comes from). As a result it just fails at collection with

E       fixture 'cosmo' not found
>       available fixtures: _xunit_setup_method_fixture_LambdaCDMBenchmarks, cache, capfd, capfdbinary, caplog, capsys, capsysbinary, class_mocker, cov, doctest_namespace, mocker, module_mocker, monkeypatch, no_cover, package_mocker, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, session_mocker, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

I don't know if we can work around this or how. This might be a dead end.

… to enable running benchmarks through pytest
@neutrinoceros neutrinoceros force-pushed the enable_running_benchmarks_through_pytest branch from ff413fe to 3bde28c Compare May 15, 2024 14:48
@neutrinoceros
Contributor Author

I rebased again, just in case we don't end up closing this.

@nstarman
Member

I also opened #120, suggesting we use pytest as the manager for benchmarking, building on the pytest ecosystem (e.g. pytest-benchmark).
