-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Perceptual image hashing #2206
Perceptual image hashing #2206
Conversation
@@ -69,7 +68,7 @@ def show_replaced_by_check_graphic(test_case, tol=_DEFAULT_IMAGE_TOLERANCE): | |||
""" | |||
def replacement_show(): | |||
# form a closure on test_case and tolerance | |||
test_case.check_graphic(tol=tol) | |||
test_case.check_graphic() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the hope of refraining from a culture of tweaking graphical testing tolerances, I've closed the door to passing a tol
through to check_graphic
.
@@ -33,7 +33,7 @@ def test_atlantic_profiles(self): | |||
with fail_any_deprecation_warnings(): | |||
with add_examples_to_path(): | |||
import atlantic_profiles | |||
with show_replaced_by_check_graphic(self, tol=14.0): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This covers up an issue in the use of twiny
... changing the tolerance of an individual test is not healthy.
result = func(*args, **kwargs) | ||
return result | ||
return decorated_func | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A simple decorator to facilitate the use of a re-entrant lock to provide thread-safe plotting. A re-entrant lock is required here as iris.plot.contourf
calls iris.plot.contour
...
@@ -665,6 +686,7 @@ def _map_common(draw_method_name, arg_func, mode, cube, plot_defn, | |||
return plotfn(*new_args, **kwargs) | |||
|
|||
|
|||
@_locker | |||
def contour(cube, *args, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the lock for all public api plotting functions ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether this will have any unexpected behavioural impacts in normal usage of iris.plot
? I must admit that I expected that the lock would be applied within the scope of the tests only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm inclined to agree... it feels like this should be done when testing. We are mostly wrapping matplotlib calls, which I guess aren't thread safe either, so it doesn't seem consistent to lock our wrappers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dkillick @ajdawson I see your point, and I was in two minds whether to lock at this level. To be honest, I expected that discussion to have happen during the review 😉 ...
Regardless of testing, iris plotting is not thread safe for users thanks to mpl, hence why I locked the wrappers at the plotting level. The problem lies deep within iris plotting regarding calls to iris.plot._replace_axes_with_cartopy_axes
Locking at the graphics test level completely makes sense as it totally ensures that graphics tests are atomic (as I've already implemented), so at a minimum we definitely require to keep that level of locking. The plotting level of locking may help the user in a threaded context, but I'm less convinced of this ... So we may be able to compromise and drop the plotting level of locking ... I'll take a look and see how it behaves. But I'd be more than happy to do that.
@@ -143,7 +147,8 @@ | |||
plt.switch_backend('tkagg') | |||
_DISPLAY_FIGURES = True | |||
|
|||
_DEFAULT_IMAGE_TOLERANCE = 10.0 | |||
# Threading non re-entrant blocking lock to ensure thread-safe plotting. | |||
_lock = threading.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A non re-entrant lock is required here to ensure that there is no cross-pollination between graphic tests using the wrong plot figures or axes.
with open(result_fname, 'rb') as fi: | ||
sha1 = hashlib.sha1(fi.read()) | ||
return sha1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hot-fix didn't work and needs to be purged.
At this point the plot was already wrong i.e. another thread had already clobbered either the plot title or axes title/s, so re-saving the clobbered plot image was never going to yield the correct sha.
h = hexstr[i*2:i*2+2] | ||
v = int("0x" + h, 16) | ||
l.append([v & 2**i > 0 for i in range(8)]) | ||
return imagehash.ImageHash(np.array(l)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix has already been merged to imagehash
but the fix has not yet been pushed to pypi
... we can remove this function when the fix is available via pypi
.... or even conda-forge!
buffer = io.BytesIO() | ||
figure = plt.gcf() | ||
figure.savefig(buffer, format='png') | ||
buffer.seek(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Memory buffers are our friend ... only save to disk on failure, rather than save to disk and delete on success.
|
||
if sha1.hexdigest() not in expected: | ||
if np.all([hd > tol for hd in distances]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result image fails to meet expectation iff it is not similar to all our registered expected images.
@@ -40,6 +40,7 @@ | |||
@tests.skip_plot | |||
class IntegrationTest(tests.GraphicsTest): | |||
def setUp(self): | |||
super(IntegrationTest, self).setUp() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to make sure that the GraphicsTest.setUp
is called, as it aquires
the non re-entrant lock that protects plot figures for a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bjlittle do we need to follow this pattern for all instances of test classes inheriting from tests.GraphicsTest
? If so, you'll also need to update the following:
- In
lib/iris/tests
:test_analysis.TestRotatedPole
test_coordsystem.Test_LambertConformal
test_mapping.TestBasic
- In
testPlot
:Test1dPlotMultiArgs
,TestMissingCoord
,TestMissingCS
,TestAttributePositive
, TestPlotOtherCoordSystems
- All affected test classes in
lib/iris/tests/integration/test_grib_load.py
- Test class in
lib/iris/tests/integration/test_netcdftime.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dkillick Thanks. Yes, we do indeed need to follow a pattern here ...
A GraphicsTest
subclass must call the GraphicsTest.setUp
method (in order to acquire
the non-reentrant lock) if the subclass in question overrides the inherited setUp
method. If the subclass doesn't specialise the setUp
method then, thanks to inheritance, the right thing happens, in that the unittest
framework calls the (inherited) GraphicsTest.setUp
for the subclass (this behaviour also applies to GraphicsTest.tearDown
btw to release
the lock).
In all of the test cases that you identified above, there is no need to explicitly call GraphicsTest.setUp
, simply because the subclass does not specialise the setUp
method.
Note that, you incorrectly mentioned test_mapping.TestBasic
:
class TestBasic(tests.GraphicsTest):
def setUp(self):
super(TestBasic, self).setUp()
self.cube = iris.tests.stock.realistic_4d()
which does apply the pattern, and so does test_plot.Test1dPlotMultiArgs
:
class Test1dPlotMultiArgs(tests.GraphicsTest):
# tests for iris.plot using multi-argument calling convention
def setUp(self):
super(Test1dPlotMultiArgs, self).setUp()
self.cube1d = _load_4d_testcube()[0, :, 0, 0]
self.draw_method = iplt.plot
So, I'd argue that this is a non-issue, unless I'm missing something ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that this is a non-issue
Agreed. I was sure there must be a pattern to it, but I couldn't work out what the pattern was! Good ol' inheritance 😉
@@ -29,7 +29,6 @@ | |||
import codecs | |||
import contextlib | |||
from glob import glob | |||
import hashlib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hashlib
is dead ...
@@ -38,10 +37,12 @@ | |||
|
|||
from PIL import Image | |||
import filelock | |||
import imagehash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... long live imagehash
! 😉
h = hexstr[i*2:i*2+2] | ||
v = int("0x" + h, 16) | ||
l.append([v & 2**i > 0 for i in range(8)]) | ||
return imagehash.ImageHash(np.array(l)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again this will disappear ...
|
||
for expected, actual, diff in _failed_images_iter(): | ||
for expected, actual, diff in step_over_diffs(rdir, 'similar', False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixes the failed test runner --print-failed-images
capability, so that we can see graphical failures that occurred remotely on travis again 😄
Whoop! All the tests passed first time! 🍻 |
🍻 indeed marqh clinks glasses with @bjlittle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible this PR was merged a little hastily. Here are my extra review comments, which I'll reference in a new issue to increase the chance they're acted upon.
@@ -60,6 +60,10 @@ install: | |||
fi | |||
fi | |||
|
|||
# Perceptual image hashing (TBD: push recipe to conda-forge!) | |||
- conda install pip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW you don't need to install pip - it's already available on Travis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As part of the conda environment (and installing to it), though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pip is installed by default when you create a new conda environment (e.g. https://travis-ci.org/SciTools/iris/jobs/170017955#L253) so it doesn't need to be done separately. The result of this line is a no-op (https://travis-ci.org/SciTools/iris/jobs/170017955#L464).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup
@@ -665,6 +686,7 @@ def _map_common(draw_method_name, arg_func, mode, cube, plot_defn, | |||
return plotfn(*new_args, **kwargs) | |||
|
|||
|
|||
@_locker | |||
def contour(cube, *args, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether this will have any unexpected behavioural impacts in normal usage of iris.plot
? I must admit that I expected that the lock would be applied within the scope of the tests only.
@@ -40,6 +40,7 @@ | |||
@tests.skip_plot | |||
class IntegrationTest(tests.GraphicsTest): | |||
def setUp(self): | |||
super(IntegrationTest, self).setUp() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bjlittle do we need to follow this pattern for all instances of test classes inheriting from tests.GraphicsTest
? If so, you'll also need to update the following:
- In
lib/iris/tests
:test_analysis.TestRotatedPole
test_coordsystem.Test_LambertConformal
test_mapping.TestBasic
- In
testPlot
:Test1dPlotMultiArgs
,TestMissingCoord
,TestMissingCS
,TestAttributePositive
, TestPlotOtherCoordSystems
- All affected test classes in
lib/iris/tests/integration/test_grib_load.py
- Test class in
lib/iris/tests/integration/test_netcdftime.py
This PR provides a more stable and scalable approach to the new graphic testing framework, replacing cryptographic sha based image hashes with perceptual image hashes.
It relies on the support of the new perceptual image hash repository test-iris-imagehash.
Perceptual image hashes allows checking for the degree of similarity between images through measuring the hamming distance difference (a count of bit differences) between image hashes.
The major benefit of this perceptual image hash approach is that it is simple, scalable (already almost halved the number of required reference images) and makes us less sensitive to scientific software stack dependency changes.
This PR also addresses the issues raised in #2195. Normally, I would have pushed those changes in a separate PR, but there is clearly a major overlap here with regards to graphical testing. In short, the problem simply relates to iris plotting not being thread-safe, and our
travis-ci
testing usesnose
, which highlighted this problem in a typically dreaded, threaded sporadic way.Closes #2195