
RFC: refactor benchmarks setups and customize pytest's discovery rule to enable running benchmarks through pytest #116

Open · wants to merge 1 commit into main from enable_running_benchmarks_through_pytest
Conversation

neutrinoceros
Contributor

As suggested in #115 (comment), this is a first step towards enabling code coverage for benchmarks.

When run locally I get 4 failures that are only reproduced when the whole suite is run, which seems indicative of some test pollution. I'd like to get to the bottom of this before I undraft this PR.

@neutrinoceros neutrinoceros force-pushed the enable_running_benchmarks_through_pytest branch 3 times, most recently from bff6fcc to 9531a80 Compare April 18, 2024 21:02
@neutrinoceros
Contributor Author

Ok I resolved the test pollution issue by switching from setup_class to setup_method (which is also much simpler to plug into the existing framework). It's still a bit repetitive, and I'm not 100% sure that my custom test discovery rules are complete, but other than that, it's ready for feedback.
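
For context, here is a minimal sketch of the difference between the two hooks, using a hypothetical benchmark class rather than the actual code in this PR: setup_class runs once per class (so state can leak between test methods), while setup_method runs before every test method.

# Hypothetical illustration only, not the benchmarks in this repository.
class TimeExampleBenchmarks:
    def setup_method(self, method):
        # runs before every test/benchmark method, so each one gets fresh state
        self.data = list(range(1000))

    def time_sum(self):
        sum(self.data)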

Member

@pllim pllim left a comment

Thanks! Hopefully setup_method doesn't mean anything special to asv...

Maybe Zach can take this branch for a spin to be sure?

@pllim pllim requested a review from zacharyburnett April 18, 2024 21:19
@neutrinoceros neutrinoceros force-pushed the enable_running_benchmarks_through_pytest branch from 9531a80 to c6dd80e Compare April 18, 2024 21:21
@astrofrog
Member

I do sometimes wonder whether the benchmarks shouldn't just be part of our normal test suite - they are just tests whose execution time we happen to monitor. In principle we could even monitor the execution time of all our tests in the regular test suite! (But obviously the ones here are designed to be more consistent/minimal in what they measure.)

@pllim
Member

pllim commented Apr 18, 2024

whether the benchmarks shouldn't just be part of our normal test suite

Some are stress tests that can significantly lengthen CI run time (which @neutrinoceros tried hard to cut down recently).

@pllim
Member

pllim commented Apr 18, 2024

I guess cron is acceptable... 🤔

@nstarman
Member

nstarman commented Apr 18, 2024

Mostly unrelated to this PR, just bringing it to benchmarking-interested people's attention... https://docs.codspeed.io + pytest-benchmark.

@neutrinoceros neutrinoceros force-pushed the enable_running_benchmarks_through_pytest branch from c6dd80e to 84a744d Compare April 19, 2024 07:56
@neutrinoceros
Contributor Author

neutrinoceros commented Apr 19, 2024

I finished tweaking discovery rules to get (almost) every benchmark running with pytest. I'm explicitly excluding 3/238 benchmarks that pytest chokes on, but all 235 others run fine.
In one case, I needed to refactor the benchmark itself to use a temporary file instead of a shared, hardcoded file name. Hopefully this doesn't affect performance at all.
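
For illustration, such a refactor could look roughly like the sketch below, which uses Python's tempfile module; the class and file names are hypothetical and not the actual benchmark that was changed.

import os
import tempfile


# Hypothetical sketch of a file-writing benchmark using a per-test temporary
# file instead of a shared, hardcoded file name.
class TimeFileWrite:
    def setup_method(self, method):
        fd, self.path = tempfile.mkstemp(suffix=".dat")
        os.close(fd)

    def teardown_method(self, method):
        os.remove(self.path)

    def time_write(self):
        with open(self.path, "w") as f:
            f.write("x" * 1024)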

@pllim
Member

pllim commented Apr 19, 2024

Re: #116 (comment) -- @nstarman , please open a new issue so your idea doesn't get buried here. Thanks!

@nstarman
Member

nstarman commented Apr 19, 2024

Done in #117

@astrofrog
Member

Some are stress test that can significantly lengthen CI run time

Just to be clear, I'm not saying they would be enabled by default - more that they could live alongside the regular tests, albeit with a special marker/decorator to ensure they are not run by default.
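
For reference, one way to do that, sketched here with hypothetical names and not taken from this PR, is a benchmark marker plus a conftest.py hook that skips marked tests unless they are explicitly requested:

# conftest.py -- hypothetical sketch, not part of this PR
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--run-benchmarks",
        action="store_true",
        default=False,
        help="also run tests marked as benchmarks",
    )


def pytest_configure(config):
    config.addinivalue_line("markers", "benchmark: mark a test as a benchmark")


def pytest_collection_modifyitems(config, items):
    # skip benchmark-marked tests unless --run-benchmarks was passed
    if config.getoption("--run-benchmarks"):
        return
    skip_bench = pytest.mark.skip(reason="needs --run-benchmarks to run")
    for item in items:
        if "benchmark" in item.keywords:
            item.add_marker(skip_bench)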

@pllim
Member

pllim commented May 15, 2024

Can you please rebase to pick up #119 by @astrofrog ? Thanks!

Member

@astrofrog astrofrog left a comment

This is looking good so far! Just to put the idea on the table, we could always have a base class that defines:

class BaseBenchmarks:
    # forward pytest's per-test setup hook to the asv-style setup()
    def setup_method(self, *args, **kwargs):
        self.setup(*args, **kwargs)

Also, does this PR work with parametrized tests? (see e.g. #118)

EDIT: oh I see it doesn't, because that's what's failing in cosmology

pytest.ini Outdated
# customize test discovery to treat benchmarks as tests
python_files = *.py
python_classes = Time *Benchmarks
python_functions = time_*
Member

For future-proofing, we might want to support these different asv benchmark types:

https://asv.readthedocs.io/en/stable/benchmarks.html

specifically time_*, timeraw_*, mem_*, peakmem_*, and track_*, although the latter might not work because I think pytest doesn't like tests that return things.
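
For reference, the discovery rules quoted above could be widened along these lines; this is an untested sketch, and as noted, prefixes whose benchmarks return values may still trip pytest's warning about non-None returns.

# hypothetical, widened discovery rules (untested sketch);
# track_* is left out here because those benchmarks return values
python_files = *.py
python_classes = Time Mem Peakmem *Benchmarks
python_functions = time_* timeraw_* mem_* peakmem_*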

benchmarks/visualization/wcsaxes.py Outdated (review comment resolved)
@astrofrog
Member

A possible solution for parametrization might be to make use of pytest_generate_tests, which lets you customize how tests are parametrized.

@neutrinoceros neutrinoceros force-pushed the enable_running_benchmarks_through_pytest branch from 84a744d to ff413fe Compare May 15, 2024 14:21
@neutrinoceros
Contributor Author

Rebased and took the first batch of comments into account. Looking at parametrization now.

@pllim
Member

pllim commented May 15, 2024

Sorry, I just merged #118 so this will need another rebase to pick that up too. Thanks for your patience!

@astrofrog
Member

Just to put it on the table, another completely different idea to enable coverage is to run asv under the coverage tool, although we would need to configure things so that subprocesses work. But that's only in case parametrization ends up not working.
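
For reference, coverage.py does document a way to measure subprocesses; a rough, untested sketch of what that might look like here (paths and package names are assumptions):

# .coveragerc -- hypothetical sketch
[run]
parallel = True
source = astropy

# Each asv subprocess would need to call coverage.process_startup(), e.g. from
# a sitecustomize.py or .pth file containing:
#   import coverage; coverage.process_startup()
# with the environment pointing at the config file:
#   export COVERAGE_PROCESS_START=$PWD/.coveragerc
# Afterwards the per-process data files are merged with:
#   coverage combine && coverage report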

@neutrinoceros
Contributor Author

Regarding parametrization, I think this is how you'd do it:

def pytest_generate_tests(metafunc):
    # only parametrize benchmark classes that declare asv-style params
    if metafunc.cls is None or not hasattr(metafunc.cls, "params"):
        return

    name = metafunc.fixturenames[0]
    values = metafunc.cls.params

    metafunc.parametrize(name, values)

However, it still won't work with LambdaCDMBenchmarks because the setup and teardown methods take a cosmo argument that pytest interprets as a fixture, which isn't defined anywhere that it knows about (and for what it's worth, I also don't know where this argument comes from). As a result it just fails at collection with

E       fixture 'cosmo' not found
>       available fixtures: _xunit_setup_method_fixture_LambdaCDMBenchmarks, cache, capfd, capfdbinary, caplog, capsys, capsysbinary, class_mocker, cov, doctest_namespace, mocker, module_mocker, monkeypatch, no_cover, package_mocker, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, session_mocker, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
>       use 'pytest --fixtures [testpath]' for help on them.

I don't know if we can work around this or how. This might be a dead end.

… to enable running benchmarks through pytest
@neutrinoceros neutrinoceros force-pushed the enable_running_benchmarks_through_pytest branch from ff413fe to 3bde28c Compare May 15, 2024 14:48
@neutrinoceros
Contributor Author

I rebased again, just in case we don't end up closing this.

@nstarman
Member

I also opened #120, suggesting we use pytest as the manager for benchmarking, building on the pytest ecosystem (e.g. pytest-benchmark).
