
pytest style guide #1444

Closed
kain88-de opened this issue Jun 28, 2017 · 34 comments

@kain88-de
Member

kain88-de commented Jun 28, 2017

With @utkbansal almost done with the transition, @richardjgowers and I are starting to experiment with how tests could look now. I've started to see some patterns we should talk
about. Below are 3 items I noticed where it would be good to have a common
guideline for @utkbansal, ourselves, and new developers.

usage of temporary directories

Current

In our current approach we use a decorator from the tempdir library.

@run_in_tmpdir
def test_a():
    assert True

This is of course nice because there is a big visual clue that this function
is not run in the current directory. A problem with this approach right
now is that the temporary directory names aren't predictable, in contrast to
pytest.

pytest

The idiom in pytest is to use the tmpdir fixture.

def test_a(tmpdir):
    with tmpdir.as_cwd():
        assert True

tmpdir is a py.path.local object that assumes your normal file handling uses
streams. We do not handle file streams at all; rather, we work with file
names and open/close files inside the functions themselves. This means we always
have to use the as_cwd context manager, which adds an extra level of
indentation. An advantage, though, is that the temporary directory names
are predictable, as /tmp/pytest-$user/test-name/.

Another option for pytest is

def test_a(tmpdir):
    tmpfile = str(tmpdir.join('fname'))
    assert True

Richard's suggestion

Richard suggested another fixture we could use.

@pytest.fixture
def intmpdir(tmpdir):
    with tmpdir.as_cwd():
        yield

I don't like this option, simply because it is not as visible as in the other
two versions that the test is run in a different directory.

Way to Proceed

Have the run_in_tmpdir decorator create predictable folder names to run in, based
on $user and the function name. It should also be possible to specify the tmp directory
on the command line. I think this means we need to add something to conftest.py.
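
A rough, untested sketch of what such a decorator could look like (the directory layout and keeping the run_in_tmpdir name are just assumptions; the command-line override would still need a conftest.py option via pytest_addoption, which is not shown here):

import functools
import getpass
import os
import tempfile

def run_in_tmpdir(func):
    """Run the decorated test inside /tmp/pytest-<user>/<test name>/."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # predictable path based on $user and the test function name
        path = os.path.join(tempfile.gettempdir(),
                            'pytest-{0}'.format(getpass.getuser()),
                            func.__name__)
        if not os.path.isdir(path):
            os.makedirs(path)
        old_cwd = os.getcwd()
        os.chdir(path)
        try:
            return func(*args, **kwargs)
        finally:
            os.chdir(old_cwd)
    return wrapper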

Assertions

Current

Right now we use the assert_* functions provided by numpy. They are nice and very
expressive. They also work very nicely with array_likes.

def test_a():
    assert_equal(a, b)
    assert_almost_equal(a, b)
    assert_array_equal(a, b)
    assert_array_almost_equal(a, b)

assert_equal is actually pretty cool. It handles nested dictionaries correctly.

pytest

pytest prefers raw assert statements

def test_a():
    assert a == b                 # assert_equal
    assert pytest.approx(a) == b  # assert_almost_equal
    assert np.all(a == b)         # assert_array_equal
    assert pytest.approx(a) == b  # assert_array_almost_equal

Personally, I prefer the numpy functions for their expressiveness.

Grouping tests

This section does not apply to the _BaseTests classes that check that a child
class of an abstract class implements the API correctly. So here I'm not
talking about tests like the BaseReader tests.

Current

Right now we write base test classes that implement all tests that we want
to run for different inputs.

class _BaseTests(TestCase):
    __test__ = False

    def test_a(self):
        assert self.a

    def test_b(self):
        assert self.b


class TestA(_BaseTests):
    __test__ = True
    a = True
    b = False

class TestB(_BaseTests):
    __test__ = True
    a = False
    b = True

This is a lot of code. It isn't clear when reading the tests what the input will be.
Also, the actual test classes are now a collection of seemingly random values.

pytest

pytest prefers to use fixtures and provides a shortcut to create a fixture
that cycles through different values for an input variable.

@pytest.mark.parametrize('a', (True, False))
def test_a(a):
    assert a


@pytest.mark.parametrize('b', (False, True))
def test_b(b):
    assert b

I personally like this. I can immediately see which values will be applied to
the test. Here also, only the test values that I immediately need are listed.

@kain88-de
Member Author

skip test if module not found

Current

Right now we use skipif with a custom function.

@skipif(module_not_found("FooBar"))
def test_a():
    assert True

pytest

pytest already has a function for this called importorskip. After some usage testing, it does not seem to be meant as a decorator but rather as a normal function used inside the test.

def test_a():
    pytest.importorskip("FOOBAR")
    assert True

A cool effect of this comes together with fixtures. When a fixture uses the function, any test that uses that fixture is automatically skipped if the fixture is skipped. That makes it easier to develop tests: I no longer need to remember that a specific fixture needs the importorskip guard.

@pytest.fixture(scope='module')
def ncdf2dcd(tmpdir_factory):
    pytest.importorskip("netCDF4")
    return

@richardjgowers
Member

tmpdirs

So I agree that @run_in_tmpdir is the best solution, and could be improved by having deterministic names like pytest. If more people agree we can spin this out into an issue.

assertions

For assertions, for simple cases I like using assert value == ref, for more complex data structures, perhaps we need to use actual np.testing.assert_* functions.

grouping

With grouping tests, I do like using classes to group together similar tests. Here is an example of where we've parametrized a class which tests all the Cython functions under two different conditions. I like this class because its scope relates to a single thing in the code (in this case the cython compilation of distances.h). If we split this into functions, I wouldn't be as certain where the tests for that end and other tests begin.

So tl;dr, I agree with not using mixins, but we should parametrize classes not functions, so that we can still use classes for grouping tests.

WRT the mixins. Currently we use a BaseReaderTest, then mix this into each format in a different file for each format, so that the test structure roughly mirrors the package layout. Defining tests in a mixin also lets us define the Reader API as a set of tests, then apply this to each format individually.

With what you're proposing, we could also have a pytest.parametrize which ran over all formats?

@pytest.mark.parametrize('format', [ 
    'GRO', 'PDB', etc
])
class TestReaderAPI(object):
    def test_iteration(self, format):
        etc

@mnmelo
Member

mnmelo commented Jun 29, 2017

tempdir

I agree with the enhanced @run_in_tempdir approach (and as I understand we'd make no use of pytest's tmpdir fixture, right?)

Asserts

I see no point in moving away from the handy np.testing asserts unless there is added functionality if the raw asserts are used.
Reading back, in #724 @jbarnoud comments that there are indeed some pytest optimizations. What are we missing out on if we use numpy's asserts? Also, what about the fears in the same issue that asserts in tests might be skipped due to optimization?

Grouping

Makes sense to leverage pytest's power here for increased readability and less typing. If @richardjgowers' concerns on grouping of related tests can also be taken into account, then I'm all for it.

Skipif

I see no cons (only pros) in @kain88-de's suggestion. The only concern is that if we somehow bork the travis setup of optional dependencies, a number of tests could become silently skipped. This should still show up as a drop in coverage, anyway.

@kain88-de
Member Author

I updated the tmpdir options above. We can also use the tmpdir fixture without the context manager.

@kain88-de
Member Author

So tl;dr, I agree with not using mixins, but we should parametrize classes not functions, so that we can still use classes for grouping tests.

I tried that for the libmdaxdr tests

@pytest.mark.parametrize('xdr, fname', ((XTCFile, XTC_file),))
class TestAPI:
    @pytest.fixture
    def reader(xdr, fname):
        with xdr(fname) as f:
            yield f

    @staticmethod
    def test_a(reader):
        assert True

But these tests weren't executed in the end.

@kain88-de
Member Author

I see no point in moving away from the handy np.testing asserts unless there is added functionality if the raw asserts are used.
Reading back, in #724 @jbarnoud comments that there are indeed some pytest optimizations. What are we missing out on if we use numpy's asserts? Also, what about the fears in the same issue that asserts in tests might be skipped due to optimization?

See the FAQ about assert and optimizations with -O in pytest. I also like keeping the current asserts since we use them all over the code base and switching will be a pain.

@mnmelo
Member

mnmelo commented Jun 30, 2017

Ok, from what I gathered, with pytest there's no longer the risk of losing tests, even under optimization. This could make the case for replacing the use of np.testing.assert_ with plain assert. Nothing very serious, anyway. I might try it one of these days to check whether the extra information pytest gives with plain assert is worth the hassle.

Regarding the other more complex asserts, I'm cool with keeping the np.testing ones. They are already quite informative, and indeed much harder to replace.

@orbeckst
Member

orbeckst commented Jul 1, 2017

I would keep the fancy numpy.testing.assert_* and only swap out assert_ for plain assert.

(Personally, I like assert_equal(a, b, "reason") better than assert a == b but I think that comes down to individual preferences and I would rather people write tests the way that they feel comfortable about than not writing tests or getting pissed off during code review because we make them rewrite assert_equal into assert or vice versa – always assuming that there are no other functional differences that matter.)

For anything else I am happy with what comes out of this discussion. I just need a few examples to wrap my head around what pytest can do – I admit to not always finding it intuitive.

@kain88-de
Member Author

For anything else I am happy with what comes out of this discussion. I just need a few examples to wrap my head around what pytest can do – I admit to not always finding it intuitive.

See #1442. By now it shows off quite a range of things. It also includes suggestions from discussions on this issue.

@kain88-de
Member Author

grouping

With grouping tests, I do like using classes to group together similar tests. Here is an example of where we've parametrized a class which tests all the Cython functions under two different conditions. I like this class because its scope relates to a single thing in the code (in this case the cython compilation of distances.h). If we split this into functions, I wouldn't be as certain where the tests for that end and other tests begin.

So tl;dr, I agree with not using mixins, but we should parametrize classes not functions, so that we can still use classes for grouping tests.

For a common API yes. It does make things cleaner.

WRT the mixins. Currently we use a BaseReaderTest, then mix this into each format in a different file for each format, so that the test structure roughly mirrors the package layout. Defining tests in a mixin also lets us define the Reader API as a set of tests, then apply this to each format individually.

Yeah, I know. That is pretty much the only way to do this. I also do like that we activate those tests on a per-file basis. This makes it easy to see whether additional tests are run for a format reader. So I would consider this the only exception. For tests in a single file, mixins should be strongly discouraged.

@richardjgowers
Member

https://docs.pytest.org/en/latest/fixture.html#fixtures-can-introspect-the-requesting-test-context

I ran across this in the docs; it lets the fixture inspect the module it's called from. This could be interesting considering our current test pattern.

## in base.py
@pytest.fixture()
def reader(request):
    cls = request.module.READER
    fn = request.module.FILENAME

    with cls(fn) as f:
        yield f

class ReaderTest(object):
    def test_this(self, reader):
        assert reader.this == 'that'

## in test_gro.py
READER = GROReader
FILENAME = GRO

class TestGro(ReaderTest):
    def test_extra_gro_thing(self, reader):
        assert reader.gro

So we design a fixture which is made to inspect the test file for parameters, then just drop this into each format's test.

@kain88-de
Member Author

kain88-de commented Jul 5, 2017

A short summary

Strict Rules

Every test has to follow these rules

Temporary directory

Use the tmpdir fixture from pytest.

def test_a(tmpdir):
    tmpfile = str(tmpdir.join('foo'))
    assert True

skipif

Use the pytest skipif decorator pytest.mark.skipif(...). When you want to skip because a module is missing, pytest.importorskip has to be called inside the function/fixture.
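
A minimal sketch of both variants (the condition and module name are just placeholders):

import sys

import pytest

@pytest.mark.skipif(sys.version_info < (3, 0), reason="requires Python 3")
def test_needs_py3():
    assert True

def test_needs_netcdf():
    # skips this test if netCDF4 cannot be imported
    pytest.importorskip("netCDF4")
    assert True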

Known failure

Use pytest.mark.xfail. It's superior to our old plugin. We can even force-run xfail-marked tests to check whether they still fail.
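
For example (the reason string is only illustrative):

import pytest

@pytest.mark.xfail(reason="known floating point issue")
def test_known_failure():
    assert 0.1 + 0.2 == 0.3

Running pytest --runxfail reports xfail-marked tests as if they were not marked, so we can check whether they still fail.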

attribute decorators

Decorators like @dec.attr('issue') or @dec.slow() should be removed.

Suggested Style

Here we give suggestions but allow developers to choose the style they prefer.

Assertions

For simple equality comparisons prefer a plain assert. If you compare array_like objects, prefer the numpy.testing asserts like assert_array_equal.
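
A small sketch of what that looks like in practice (the values are arbitrary):

import numpy as np
from numpy.testing import assert_almost_equal, assert_array_equal

def test_scalar():
    # plain assert for simple comparisons
    assert 2 + 2 == 4

def test_arrays():
    # numpy.testing asserts for array_likes
    assert_array_equal(np.arange(3), [0, 1, 2])
    assert_almost_equal(1.0 / 3.0, 0.333333333, decimal=7)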

Grouping of tests

We should try to avoid Mixin classes like we used to write under nose. Having a single class group tests into logical units is fine. In general, pytest allows us a lot of flexibility to group tests thanks to fixtures. @richardjgowers' view and my own is that a developer should decide what she/he thinks is best when writing a test. @utkbansal will write a wiki page with different examples for grouping of tests with classes and fixtures.

@richardjgowers
Member

@kain88-de not trying to be annoying, but the comment above with the request.module.READER etc. is a pattern that allows mixins and would be most similar to our current test layout. I.e., we could convert BaseReaderTest to accept a pytest.fixture which inspects the module it's called from to return an appropriate Reader.


@kain88-de
Member Author

@kain88-de not trying to be annoying,

That's why this point is under suggested style. If mixins are a good solution then I'm OK with using them. It just states that pure functions with fixtures should be preferred. I'm aware that our reader tests are an edge case that would be better served with mixins.

@utkbansal
Member

utkbansal commented Jul 10, 2017

One thing that has been missed here is how to initialize variables that are not reusable resources per se. Take for example this class:

class TestAltloc(TestCase):
    def setUp(self):
        self.filename = PDB_full
        self.tempdir = tempdir.TempDir()
        self.outfile = os.path.join(self.tempdir.name, 'test.pdb')

    def tearDown(self):
        del self.tempdir

    def test_atomgroups(self):
        u = Universe(self.filename)
        segidB0 = len(u.select_atoms("segid B and (not altloc B)"))
        segidB1 = len(u.select_atoms("segid B and (not altloc A)"))
        assert_equal(segidB0, segidB1)
        altlocB0 = len(u.select_atoms("segid B and (altloc A)"))
        altlocB1 = len(u.select_atoms("segid B and (altloc B)"))
        assert_equal(altlocB0, altlocB1)
        sum = len(u.select_atoms("segid B"))
        assert_equal(sum, segidB0 + altlocB0)

    def test_bonds(self):
        u = Universe(self.filename, guess_bonds=True)
        # need to force topology to load before querying individual atom bonds
        bonds0 = u.select_atoms("segid B and (altloc A)")[0].bonds
        bonds1 = u.select_atoms("segid B and (altloc B)")[0].bonds
        assert_equal(len(bonds0), len(bonds1))

    def test_write_read(self):
        u = Universe(self.filename)
        u.select_atoms("all").write(self.outfile)
        u2 = Universe(self.outfile)
        assert_equal(len(u.atoms), len(u2.atoms))

filename and outfile aren't something I'd want to make fixtures for. Any thoughts on how we want to do this?

@utkbansal
Member

utkbansal commented Jul 10, 2017

As far as I have understood pytest, it favours a functional programming style. It's best to have everything as a function (you do have classes, but it's the same). Fixtures are a form of dependency injection: they initialize dependencies and then pass them to test functions as parameters.

And this explains why there is nothing to initialize stuff: initialization builds state, which is opposed by functional programming. If we follow this paradigm, I'd suggest that we just move things like filename and outfile into their respective methods and not care much about the repetition.

@kain88-de
Member Author

Use a fixture for the universe. If you check the tests carefully you will also notice that outfile is only needed in one test and therefore doesn't need to be defined globally in the class.

class TestAltloc(object):
    @pytest.fixture()
    def u(self):
        return mda.Universe(PDB_full, guess_bonds=True)

    def test_atomgroups(self, u):
        segidB0 = len(u.select_atoms("segid B and (not altloc B)"))
        segidB1 = len(u.select_atoms("segid B and (not altloc A)"))
        assert_equal(segidB0, segidB1)
        altlocB0 = len(u.select_atoms("segid B and (altloc A)"))
        altlocB1 = len(u.select_atoms("segid B and (altloc B)"))
        assert_equal(altlocB0, altlocB1)
        sum = len(u.select_atoms("segid B"))
        assert_equal(sum, segidB0 + altlocB0)

    def test_bonds(self, u):
        # need to force topology to load before querying individual atom bonds
        bonds0 = u.select_atoms("segid B and (altloc A)")[0].bonds
        bonds1 = u.select_atoms("segid B and (altloc B)")[0].bonds
        assert_equal(len(bonds0), len(bonds1))

    def test_write_read(self, u, tmpdir):
        outfile = str(tmpdir.join('test.pdb'))
        u.select_atoms("all").write(outfile)
        u2 = Universe(outfile)
        assert len(u.atoms) == len(u2.atoms)

@utkbansal
Member

@kain88-de You got me! It can be fixed in this case. But there will be more cases for this I think.

@kain88-de
Member Author

Mostly, when we have variables as you describe, they are defined globally as constants in the file, like for example PDB_full. I don't think this will be a problem.

@kain88-de
Member Author

Another method would be to scope them in a class

class TestA(object):
    filename = 'foobar'

    def test_a(self):
        assert self.filename == 'foobar'

@utkbansal
Member

Scoping in a class makes sense when there is significant reuse. Or do you want to always do it?

@kain88-de
Member Author

When there is significant reuse. It certainly doesn't make sense for a one-time use.

@kain88-de
Member Author

kain88-de commented Jul 11, 2017

test for raised exceptions/warnings

Current

There is only a decorator available, requiring us to mark the whole test. The implicit assumption is that the last line throws the exception; we can't know whether the setup code threw the same exception.

@raises(Exception)
def test_a():
    # setup
    raise Exception

pytest

pytest itself has a context manager that checks whether an exception is raised. I personally really like this because it separates the setup code from the code we want to test for exceptions. Another goody is pytest.warns, which does the same for warnings. There is also a similar context manager for deprecations (pytest.deprecated_call).

def test_a():
    # setup
    with pytest.raises(Exception):
        raise Exception
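
The matching context managers for warnings and deprecations look like this (a minimal sketch; the warning messages are placeholders):

import warnings

import pytest

def test_warns():
    with pytest.warns(UserWarning):
        warnings.warn("something to be aware of", UserWarning)

def test_deprecated():
    with pytest.deprecated_call():
        warnings.warn("use the new API instead", DeprecationWarning)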

pytest-raises

implements a mark decorator. This is the closest we can get to the old nose tests.

@pytest.mark.raises(Exception)
def test_a():
    # setup
    raise Exception

I myself would like to use the pytest context managers to test for exceptions and warnings. They allow us to ensure that the setup code isn't throwing the exception.

@richardjgowers
Member

@kain88-de we also have assert_raises(myfunc, arg1, arg2, kwarg1='this'), which is ugly as the function call isn't immediately obvious (myfunc gets called with all following things as args and kwargs). I prefer the with pytest.raises(ValueError): as it's immediately next to the thing that should raise the error, and the function call isn't mangled like the assert_raises way.

@jbarnoud
Contributor

+1 for the context manager.

@mnmelo
Member

mnmelo commented Jul 24, 2017

A small question: is there a preferred use of np.testing.assert_equal vs np.testing.assert_array_equal, or vice versa? As far as I can tell their behaviour is nearly identical, with assert_array_equal actually accepting more cases than its name and docs would have you expect (see the related and unresolved discussion in Numpy's repo).

FWIW I find assert_equal more readable, but wouldn't force it upon whoever prefers the alternative.

@jbarnoud
Contributor

After reading the discussion you link to, I find assert_array_equal too lenient. I do not think a scalar should be considered equal to a list/tuple/array of just that scalar.
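
A quick illustration of the difference (behaviour as I understand it; worth double-checking against the numpy version we pin):

from numpy.testing import assert_array_equal, assert_equal

# assert_array_equal broadcasts, so a scalar compares equal to an
# array_like filled with that scalar:
assert_array_equal(3, [3, 3, 3])   # passes

# assert_equal is stricter here:
assert_equal(3, [3, 3, 3])         # raises AssertionError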

@mnmelo
Member

mnmelo commented Jul 25, 2017

Ok, so let's tend to use the more readable and stricter assert_equal. (I actually was prompted to check the difference between the two when going through your changes in #1557).

@orbeckst
Member

orbeckst commented Aug 8, 2017

Who is going to write the Style Guide: Tests page? No-one is assigned but this issue is blocking #1489.

As far as I am concerned, a first rough draft that collects the agreed bits from this discussion would suffice. Basically, go through the whole thread, copy and paste the good bits, add some formatting, and link to it from the UnitTest and Style Guide pages.

@utkbansal
Member

@orbeckst That would be me. I prepared a draft yesterday, should be able to complete it by EOD.

@kain88-de
Member Author

There is a cool new feature to filter warnings in pytest 3.2.0 (docs). This will be useful to silence our topology parser warnings in the tests. My eventual goal is that there are no more warnings in our test suite, but that is a long way off. Until then, this at least lets us control which warnings come through.
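
For example, with the new mark (the warning text is only a placeholder standing in for the topology parser warnings):

import warnings

import pytest

@pytest.mark.filterwarnings('ignore:Failed to guess')
def test_without_warning_noise():
    # the matching warning is silenced for this test only
    warnings.warn('Failed to guess the mass for element X')
    assert True

There is also a filterwarnings ini option to apply filters to the whole test suite.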

@orbeckst
Member

Thanks @utkbansal , I am happy with Style Guide: Tests.

I added links to it to the Style Guide itself and to Unit Tests: Writing Test Cases.

As far as I am concerned, we can close the issue, but I'd like to give other devs an opportunity to chime in.

@orbeckst
Member

Looks good to me; if anyone else has comments, please raise a new issue or just edit the wiki page https://github.com/MDAnalysis/mdanalysis/wiki/Style-Guide%3A-Tests
