Add support for numpy arrays (and dicts) to approx. #2492

kalekundert · 2017-06-12T03:19:46Z

This pull request addresses #1994 and adds support for numpy arrays to approx(). It also adds support for dicts (and other mapping types), because doing so was low-hanging fruit and I thought the old behavior (accept dicts but only compare their keys) was potentially surprising.

Adding support for numpy arrays ended up requiring major refactoring of the approx code, due to some intricacies of whether the __eq__ operator is called on the left or right operand, and how that can be affected by inheritance. But the external interface is unchanged, the new code is all documented, and everything is tested.

At the moment, this code uses the same algorithm to compare numpy arrays as it does to compare anything else. However, @RonnyPfannschmidt may be right that it would be better to defer to numpy.allclose(), as he was commenting on the thread for #1994. It wouldn't be hard to get either behavior, but I think it's worth trying to come to a consensus on.

I don't really think it matters. The two algorithms will only give different answers if the difference between the two numbers being compared is extremely close to the tolerance. If those fine differences matter to you, both algorithms allow fine control over the precise tolerance. If they don't, both algorithms have defaults that can reasonably say whether two numbers are close enough to be considered the same or not. The advantage of using the pytest algorithm is that it allows us to print out the tolerance in the repr string, which I find useful for debugging.
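For readers weighing the two options, the difference can be sketched in plain Python. This is an illustration only, not the code in this pull request: the function names are hypothetical, and the formulas assume that approx() accepts a difference up to the larger of the relative and absolute tolerances, while numpy.allclose() accepts a difference up to their sum (atol + rtol * |expected|).

```python
def approx_style_close(actual, expected, rel=1e-6, abs_tol=1e-12):
    """Accept when the difference is within the larger of the two tolerances."""
    tolerance = max(rel * abs(expected), abs_tol)
    return abs(actual - expected) <= tolerance


def allclose_style_close(actual, expected, rtol=1e-5, atol=1e-8):
    """Accept when the difference is within the sum of the two tolerances."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Far from the tolerance, both checks agree.
both_accept = (
    approx_style_close(0.1 + 0.2, 0.3),
    allclose_style_close(0.1 + 0.2, 0.3),
)

# With identical tolerances, the checks disagree only in the narrow band
# between max(rel, abs) and rel + abs.
borderline = 1.0 + 1.5e-6
disagree = (
    approx_style_close(borderline, 1.0, rel=1e-6, abs_tol=1e-6),
    allclose_style_close(borderline, 1.0, rtol=1e-6, atol=1e-6),
)
```

Under these assumptions the two checks can only disagree when the difference falls in that narrow band, which is the situation described above as "extremely close to the tolerance".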

This fixes pytest-dev#1994. It turned out to require a lot of refactoring because subclassing numpy.ndarray was necessary to coerce python into calling the right `__eq__` operator.

coveralls · 2017-06-12T03:56:51Z

Coverage decreased (-0.2%) to 91.919% when pulling 89292f0 on kalekundert:features into 9bd8907 on pytest-dev:features.

RonnyPfannschmidt

this change also begs for additional documentation,
it's not necessary to push it through in this pr, but it would be good to have it before the next release

RonnyPfannschmidt · 2017-06-12T06:42:57Z

_pytest/python_api.py

+
+
+try:
+    import numpy as np


a module-level, always-imposed import of numpy doesn't sit well with me, please discuss @nicoddemus

it means every time numpy is installed, it will be imported for a pytest run - no matter whether the tests actually use it or not

I agree, we should try to avoid importing non-standard library at the module level.

Since ApproxNumpy subclasses np.ndarray (for reasons well explained in the docstring 👍 ), one way to implement this cleanly would be to move ApproxNumpy to its own module, and import it conditionally in approx.

This is a good point, I'll try to avoid importing numpy unless I really need it.

Another possibility might be to construct a class on the fly using type(). That would be a little more magical, but it has the advantage of keeping all the code in one file.

you can construct a class in a function and return it - combine that with a memoize pattern and it's good to go without type() usage
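A minimal sketch of that suggestion, assuming a module-level dict as the cache. All names here are hypothetical, and collections.OrderedDict stands in for numpy.ndarray so the snippet runs without numpy installed; in the real code the deferred import would be numpy.

```python
_approx_class_cache = {}


def get_lazy_approx_class():
    """Build the subclass on first call; return the cached class afterwards."""
    if 'cls' not in _approx_class_cache:
        # Deferred import: nothing is imported until an approx-like object
        # is actually requested.  OrderedDict stands in for numpy.ndarray.
        from collections import OrderedDict

        class LazyApprox(OrderedDict):
            """Hypothetical stand-in for ApproxNumpy."""

        _approx_class_cache['cls'] = LazyApprox
    return _approx_class_cache['cls']


first = get_lazy_approx_class()
second = get_lazy_approx_class()
```

Because the class object is cached, repeated calls return the same type, so isinstance checks and comparisons between instances created at different times keep working.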

nicoddemus

Overall awesome work @kalekundert!

nicoddemus · 2017-06-12T15:31:26Z

_pytest/python_api.py

+
+
+try:
+    import numpy as np


I agree, we should try to avoid importing non-standard library at the module level.

Since ApproxNumpy subclasses np.ndarray (for reasons well explained in the docstring 👍 ), one way to implement this cleanly would be to move ApproxNumpy to its own module, and import it conditionally in approx.

nicoddemus · 2017-06-12T15:37:55Z

_pytest/python_api.py

+        return iter(self.expected)
+
+    def _yield_comparisons(self, actual):
+        return zip(actual, self.expected)


I think this should be itertools.izip in Python 2. Please add something like this to _pytest.compat:

if _PY2: zip = itertools.izip

And then import it here:

from _pytest.compat import zip

Sounds good.

nicoddemus · 2017-06-12T15:39:51Z

_pytest/python_api.py

+
+    from collections import Mapping, Sequence
+    try:
+        String = basestring  # python2


Please use STRING_TYPES from _pytest.compat instead

nicoddemus · 2017-06-12T15:41:21Z

_pytest/python_api.py

@@ -49,6 +310,8 @@ class approx(object):

        >>> (0.1 + 0.2, 0.2 + 0.4) == approx((0.3, 0.6))
        True
+        >>> {'a': 0.1 + 0.2, 'b': 0.2 + 0.4} == approx({'a': 0.3, 'b': 0.6})


Here I think it's worth emphasizing that it will compare only the values

nicoddemus · 2017-06-12T15:41:44Z

_pytest/python_api.py

@@ -49,6 +310,8 @@ class approx(object):

        >>> (0.1 + 0.2, 0.2 + 0.4) == approx((0.3, 0.6))
        True
+        >>> {'a': 0.1 + 0.2, 'b': 0.2 + 0.4} == approx({'a': 0.3, 'b': 0.6})
+        True


Also please mention that it also supports numpy arrays

How about:

The same syntax also works on sequences of numbers::

    >>> (0.1 + 0.2, 0.2 + 0.4) == approx((0.3, 0.6))
    True

Dictionary *values*::

    >>> {'a': 0.1 + 0.2, 'b': 0.2 + 0.4} == approx({'a': 0.3, 'b': 0.6})
    True

And ``numpy`` arrays::

    <numpy array example>

That sounds good to me.

nicoddemus · 2017-06-12T15:45:40Z

testing/python/approx.py

+
+    def test_numpy_array(self):
+        try:
+            import numpy as np


You can use pytest.importorskip instead

Also I think we should add numpy as a tox dependency to run the tests in tox.ini.

Ah, good to know. I always learn something about using pytest whenever I submit a pull request!

nicoddemus · 2017-06-12T15:47:04Z

testing/python/approx.py

+
+    def test_numpy_array_wrong_shape(self):
+        try:
+            import numpy as np


Ditto for pytest.importorskip

tadeu · 2017-06-12T15:52:32Z

_pytest/python_api.py

+        """
+        Perform approximate comparisons for numpy arrays.
+
+        This class must inherit from numpy.ndarray in order to allow the approx 


Thanks for working on this, @kalekundert. I'll make some comments because I'm interested in using this :)

Nice documentation here, my gut reaction was "why inheritance when composition would be simpler", but this makes it clear that it's because approx also supports the syntax assert a == approx(b).

Do you think it's too late to break the approx interface and only allow assert approx(a) == b, since this could simplify the implementations a lot?

That's a good point. I think it is reasonable to document to users that approx should always be on the left side.

Btw, this wouldn't actually break anything because numpy arrays are not currently supported anyway. 😉

Ahh, but it would be inconsistent with approx usage for simple numbers and lists, since it currently supports both sides (it is indeed documented as being used on the right side (assert a == approx(b))).

I was thinking about this again... the problem with forcing it to be on the left side is that it would clash with the standard assert obtained == expected order, since the "approximation" is usually done on the "expected" value, i.e., this makes much more sense and is easier to read:

assert calculate_something() == approx(5.0, rel=1e-3) # must be in range 5.0 +- 0.005

So it seems ok to complicate the implementation a little...

I think it is reasonable to document to users that approx should always be on the left side.

Wait sorry, brain fart on my part, I totally meant on the right side:

assert f(x) == approx(3.0)

Which I think is the more natural way to write it, as it even reads well in English ("f(x) is approximately 3.0").

So I guess it is important to support this feature. Sorry for the noise.
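The operand-order issue discussed in this thread can be reproduced without numpy. The sketch below uses hypothetical stand-in classes; FakeArray plays the role of numpy.ndarray, whose __eq__ handles any operand itself instead of deferring. Python normally calls the left operand's __eq__ first, but gives priority to the right operand when its type is a subclass of the left operand's type and overrides the method.

```python
class FakeArray(object):
    """Stand-in for numpy.ndarray: its __eq__ handles any operand itself."""

    def __eq__(self, other):
        return 'FakeArray.__eq__'


class ComposedApprox(object):
    """Wraps an expected value without inheriting from FakeArray."""

    def __eq__(self, other):
        return 'ComposedApprox.__eq__'


class SubclassedApprox(FakeArray):
    """Inherits from FakeArray, so Python tries its __eq__ first."""

    def __eq__(self, other):
        return 'SubclassedApprox.__eq__'


# approx on the left: the approx object's __eq__ is used.
left_side = ComposedApprox() == FakeArray()

# approx on the right, composition only: the array's __eq__ wins.
right_side = FakeArray() == ComposedApprox()

# approx on the right, subclassing the array: the approx __eq__ wins.
right_side_subclass = FakeArray() == SubclassedApprox()
```

This is why composition alone is enough for assert approx(a) == b, while assert a == approx(b) needs the subclass.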

tadeu · 2017-06-12T17:04:14Z

_pytest/python_api.py

+            ensured by the approx() delegator function.
+            """
+            assert isinstance(expected, np.ndarray)
+            obj = super(ApproxNumpy, cls).__new__(cls, expected.shape)


Is this creating a copy of the array? Or allocating an array unnecessarily?

There's a "view casting" operation that could be helpful for eliminating this copy: https://docs.scipy.org/doc/numpy-1.12.0/user/basics.subclassing.html#view-casting

But maybe the shape here could be () (and the real shape kept on an attribute such as self.shape), since it seems the inheritance is being done only to make __eq__ work, as the real array is on ApproxBase.expected.

Yeah, I didn't think about this code too much when I wrote it. I'll check out view-casting, but my initial impression is that just passing () as the shape will be the simplest and least error-prone approach.

tadeu · 2017-06-12T17:12:54Z

_pytest/python_api.py

+
+        def __eq__(self, actual):
+            try:
+                actual = np.array(actual)


np.asarray should be more efficient if the argument is an array already

tadeu · 2017-06-12T17:19:36Z

_pytest/python_api.py

+
+    def __repr__(self):
+        item = lambda k, v: "'{0}': {1}".format(k, self._approx_scalar(v))
+        return '{' + ', '.join(item(k,v) for k,v in self.expected.items()) + '}'


Hmm, this code seems to be reimplementing dict repr. What about:

def __repr__(self):
    return repr({k: self._approx_scalar(v) for k, v in self.expected.items()})

Would it work?

I don't think it's a good idea to imitate a dict __repr__, or a list/tuple below - after all, those aren't dicts/lists/tuples, so if they act like they are, that can be pretty confusing.

The approx __repr__ is still distinguishable from the dict/list/tuple __repr__ because tolerances are included with each value. In other words, repr(approx({'a': 1.0})) will look like {'a': 1.0 ± 1e-6} rather than {'a': 1.0}.

Also, the reason I tried to mimic the format of the built-in data structures in the first place is that when I get assertion failures, I find it easier to spot the important differences if the actual and expected values are formatted the same. If they're formatted differently (i.e. one has braces and the other doesn't), my eye immediately goes to those differences first. So I think the braces improve clarity more than they might detract from it.

tadeu · 2017-06-12T17:56:37Z

testing/python/approx.py

+
+        assert a12 != approx(a21)
+        assert a21 != approx(a12)
+


I know this is a feature request 😄 , but it would be nice to have tests and documented behaviour for NaN's too, something like:

assert np.nan == approx(np.nan)
assert np.inf == approx(np.inf)
assert -np.inf == approx(-np.inf)
assert np.array([np.nan, np.inf, -np.inf]) == approx(np.array([np.nan, np.inf, -np.inf]))
assert np.nan != approx(np.inf)
assert np.inf != approx(-np.inf)
assert np.nan != approx(0.0)

Some applications of arrays rely on using NaN for missing values, so even though the default behaviour is NaN != NaN, it'd be nice to be able to test that the missing values are the same.

There are tests and documented behavior for float('inf') and float('nan'). Is that what you mean, or are you suggesting that np.inf and np.nan behave differently and should be tested/documented separately?

I like the idea of having an option to consider NaNs equal to each other, and it should be easy to implement.

Is that what you mean, or are you suggesting that np.inf and np.nan behave differently and should be tested/documented separately?

Ahh, no, I've used np.nan because I'm more used to that, but float('inf') and float('nan') should work in the same way, perhaps even better because they don't need numpy.

I like the idea of having an option to consider NaNs equal to each other, and it should be easy to implement.

Yeah, this was the main suggestion, and to test Numpy arrays with these values too. :)
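One way such an option could work for scalars is sketched below. The function name and the nan_ok flag are illustrative assumptions, not the code in this pull request; the sketch treats infinities as equal only to themselves and NaN as equal to NaN only when the flag is set.

```python
import math


def scalars_close(actual, expected, rel=1e-6, abs_tol=1e-12, nan_ok=False):
    """Approximate scalar comparison with explicit handling of inf and NaN."""
    if actual == expected:
        return True  # exact matches, including inf == inf
    if math.isnan(expected):
        # NaN never equals anything by default; opt in with nan_ok.
        return nan_ok and math.isnan(actual)
    if math.isinf(expected):
        return False  # an infinity only matches itself, handled above
    tolerance = max(rel * abs(expected), abs_tol)
    return abs(actual - expected) <= tolerance


nan = float('nan')
inf = float('inf')

default_nan = scalars_close(nan, nan)               # NaN != NaN by default
opted_in_nan = scalars_close(nan, nan, nan_ok=True)
same_inf = scalars_close(inf, inf)
opposite_inf = scalars_close(inf, -inf)
```

Applying the same function element by element would cover the array case requested above.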

tadeu · 2017-06-12T18:17:44Z

_pytest/python_api.py

+
+    def __repr__(self):
+        open, close = '()' if isinstance(self.expected, tuple) else '[]'
+        return open + ApproxBase.__repr__(self) + close


Watch out because tuples of one element should end with a ,, like (1.0,).
Suggestion:

return type(self.expected)(self._approx_scalar(x) for x in self._yield_expected())

A custom sequence can have an __init__ you know nothing about - this will break for any custom sequence which has an __init__ which doesn't work like that.

What about this:

def __repr__(self):
    seq_type = type(self.expected)
    if seq_type not in (tuple, list, set):
        seq_type = list
    return repr(seq_type(self._approx_scalar(x) for x in self._yield_expected()))
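A self-contained sketch of that idea shows how delegating to the built-in containers handles the one-element tuple case. ScalarWithTolerance and sequence_repr are hypothetical stand-ins for ApproxScalar and the sequence __repr__, and the tolerance format is an assumption made for the example.

```python
class ScalarWithTolerance(object):
    """Hypothetical stand-in for ApproxScalar; only the repr matters here."""

    def __init__(self, expected, rel=1e-6, abs_tol=1e-12):
        self.expected = expected
        self.tolerance = max(rel * abs(expected), abs_tol)

    def __repr__(self):
        return '{0} ± {1:.1e}'.format(self.expected, self.tolerance)


def sequence_repr(expected):
    """Delegate bracket and comma formatting to the built-in container."""
    seq_type = type(expected)
    if seq_type not in (tuple, list):
        # Unknown sequence types may not accept an iterable in __init__.
        seq_type = list
    return repr(seq_type(ScalarWithTolerance(x) for x in expected))


one_tuple = sequence_repr((1.0,))     # '(1.0 ± 1.0e-06,)'
two_list = sequence_repr([1.0, 2.0])  # '[1.0 ± 1.0e-06, 2.0 ± 2.0e-06]'
```

The trailing comma in the one-element tuple comes for free from tuple's own repr, so no special case is needed.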

tadeu · 2017-06-12T18:18:43Z

_pytest/python_api.py

+    def _approx_scalar(self, x):
+        return ApproxScalar(x, rel=self.rel, abs=self.abs)
+
+    def _yield_expected(self, actual):


This seems to be unnecessary, most use cases are just iterating on self.expected, and the actual parameter is not present in the signature of the implementations in the derived classes

Oops, copy-and-paste bug.

- Avoid importing numpy unless necessary.
- Mention numpy arrays and dictionaries in the docs.
- Add numpy to the list of tox dependencies.
- Don't unnecessarily copy arrays or allocate empty space for them.
- Use code from compat.py rather than writing py2/3 versions of things myself.
- Avoid reimplementing __repr__ for built-in types.
- Add an option to consider NaN == NaN, because sometimes people use NaN to mean "missing data".

coveralls · 2017-06-15T16:56:38Z

Coverage decreased (-0.03%) to 92.101% when pulling 8badb47 on kalekundert:features into 9bd8907 on pytest-dev:features.

coveralls · 2017-06-15T22:35:23Z

Coverage decreased (-0.03%) to 92.101% when pulling 5076955 on kalekundert:features into 9bd8907 on pytest-dev:features.

coveralls · 2017-06-16T02:24:25Z

Coverage decreased (-0.2%) to 91.917% when pulling 5d24968 on kalekundert:features into 9bd8907 on pytest-dev:features.

RonnyPfannschmidt · 2017-06-16T15:59:19Z

_pytest/python_api.py

-        return repr({
-            k: self._approx_scalar(v)
-            for k,v in self.expected.items()})
+        return repr(dict(


i propose using return "approx({data!r})".format(data=dict(...))

coveralls · 2017-06-16T16:05:25Z

Coverage decreased (-0.2%) to 91.917% when pulling 9597e67 on kalekundert:features into 9bd8907 on pytest-dev:features.

RonnyPfannschmidt · 2017-06-23T17:17:41Z

_pytest/python_api.py

+        # Create the delegate class on the fly.  This allow us to inherit from
+        # ``np.ndarray`` while still not importing numpy unless we need to.
+        import numpy as np
+        cls = type('ApproxNumpy', (ApproxNumpyBase, np.ndarray), {})


each use with numpy creates a new type here, i suggest an inner function with a mutable default argument to cache it

Good point. I added a class method to take care of creating and caching the new type. I think it made the code a little more clear, too.

coveralls · 2017-07-04T06:27:45Z

Coverage decreased (-0.2%) to 91.886% when pulling c111e9d on kalekundert:features into 9bd8907 on pytest-dev:features.

kalekundert · 2017-07-04T06:36:52Z

I think this is ready to be merged. Let me know if there's anything else that could be improved!

coveralls · 2017-07-04T17:02:21Z

Coverage decreased (-0.2%) to 91.886% when pulling 7a1a439 on kalekundert:features into 9bd8907 on pytest-dev:features.

kalekundert added 3 commits June 11, 2017 19:27

Add support for numpy arrays (and dicts) to approx.

9f3122f

This fixes pytest-dev#1994. It turned out to require a lot of refactoring because subclassing numpy.ndarray was necessary to coerce python into calling the right `__eq__` operator.

Resolve merge conflict due to approx being moved.

8c22aee

Add a changelog entry.

89292f0

RonnyPfannschmidt reviewed Jun 12, 2017

nicoddemus requested changes Jun 12, 2017

tadeu reviewed Jun 12, 2017

kalekundert added 2 commits June 15, 2017 14:52

Use autofunction to document approx.

b41852c

It used to be a class, but it's a function now.

Skip the numpy doctests.

5076955

They seem like more trouble than they're worth.

kalekundert added 2 commits June 15, 2017 18:46

Only test numpy with py27 and py35.

5d24968

Travis was not successfully installing numpy with python<=2.6, python<=3.3, or PyPy. I decided that it didn't make sense to use numpy for all the tests, so instead I made new testing environments specifically for numpy.

Remove a dict-comprehension.

4d02863

Not compatible with python26.

kalekundert added 2 commits June 15, 2017 20:34

Remove py36 from .travis.yml

d6000e5

I thought the file was just out of date, but adding py36 made Travis complain "InterpreterNotFound: python3.6", so I guess it was correct as it was.

Use sets to compare dictionary keys.

9597e67

RonnyPfannschmidt reviewed Jun 16, 2017

RonnyPfannschmidt reviewed Jun 23, 2017

kalekundert added 2 commits July 3, 2017 22:44

Add "approx" to all the repr-strings.

8524a57

Avoid making multiple ApproxNumpy types.

c111e9d

RonnyPfannschmidt approved these changes Jul 4, 2017

Use cls instead of ApproxNumpyBase.

7a1a439

Slightly more general, probably doesn't make a difference.

nicoddemus approved these changes Jul 5, 2017

RonnyPfannschmidt merged commit ef62b86 into pytest-dev:features Jul 6, 2017

kalekundert mentioned this pull request Jul 13, 2017

Approx with inequalities #2003

Closed

kalekundert deleted the features branch July 22, 2017 17:25
