Visual Regression Tests

Rodrigo Vilar edited this page Jul 3, 2021 · 25 revisions

About

VexFlow comes with a visual regression test system. All VexFlow QUnit tests (the ones in tests/) that generate images are automatically included in these regression tests.

The goal of this system is to detect regressions in the rendered output without having to rely on human eyeballs, especially given the huge number of tests that exist today. It does this by calculating a perceptual hash (PHASH) of each test image and comparing it with the hash of a known-good "blessed" image. The larger the numeric distance between the two hashes, the more the two images differ.

The system also generates a diff image, which is an overlay of the two images, with the differences highlighted to ease debugging.

Crude Example

Below you can see an example of a blessed image, a current image, and the visual difference between them (PHASH difference value: 27.649).

Prerequisites

The test system relies on the open-source libraries RSVG and ImageMagick.

Installing on OS X with Homebrew: $ brew install librsvg imagemagick. NOTE: you might also need to run $ brew reinstall node to relink the new libraries.

Installing on Ubuntu Linux: $ apt-get install librsvg2-dev librsvg2-bin imagemagick
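
Before running the tests, it can help to confirm that the tools are actually on your PATH. The following is a minimal sketch, not part of VexFlow: missing_tools is a hypothetical helper, and the command names rsvg-convert and compare are assumptions about what the librsvg and ImageMagick packages install on your system.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of VexFlow): print the name of every
# listed command that cannot be found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example (assumed command names; adjust for your installation):
#   missing_tools rsvg-convert compare
# No output means every listed command was found.
```

If the helper prints a name, reinstall the corresponding package before running npm test.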

How to Test

After you install the dependencies, you can start the test by running the following command:

$ npm test
(or, on a headless machine) $ xvfb-run -a npm test

This will store the images of the failed tests (i.e. where there are visible differences) in build/images/diff. Visually inspect these images (including the diff image), and if you're happy with the output you can simply submit your changes. Be sure to include the before, after, and diff images in your pull request.

Alternatively, you can test against a particular commit (e.g. master or any other commit). Run the following commands to generate the reference images:

$ git checkout master
$ npm run reference

Then run the following command in your branch:

$ npm run test:reference

How it Works

The npm run generate command generates images from the current codebase. Files are named after their QUnit module and test name. The last known-good images are generated from the last released binaries and stored in build/images/blessed. The images from the current code are stored in build/images/current.

The npm run diff command calculates the PHASH values to detect images that have changed significantly. These images are stored in build/images/diff. The files results.txt and warnings.txt contain the complete results.

The current version of visual_regression.sh does the following:

  • For each SVG file in build/images/blessed:
      • Find the corresponding file in build/images/current.
      • Convert both to PNG.
      • Calculate the PHASH (perceptual hash) of each and report the difference.
      • Store the file name and difference value in build/images/diff/results.txt.
      • If the difference is greater than a predefined threshold (0.01), copy the current, blessed, and a special "diff-image" to build/images/diff. The diff-image visually highlights the differences.
  • Sort build/images/diff/results.txt by PHASH difference.
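
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of tools/visual_regression.sh: the function names, the PNG file suffixes, the "name value" layout of results.txt, and the descending sort order are all made up for the example. It assumes rsvg-convert (from librsvg) and ImageMagick's compare are installed, and that file names contain no whitespace.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; not the actual tools/visual_regression.sh.
# Function names and PNG suffixes below are hypothetical.

THRESHOLD=0.01  # threshold value quoted in the text

# Exit status 0 when $1 is numerically greater than $2.
# awk is used because shell arithmetic is integer-only.
exceeds_threshold() {
  awk -v d="$1" -v t="$2" 'BEGIN { exit !(d + 0 > t + 0) }'
}

# Compare one blessed SVG with the same-named file in the current directory.
compare_one() {
  local blessed_svg="$1" current_dir="$2" diff_dir="$3"
  local name current_svg b_png c_png d_png hash
  name=$(basename "$blessed_svg" .svg)
  current_svg="$current_dir/$name.svg"
  if [ ! -f "$current_svg" ]; then
    echo "no current image for $name" >&2
    return 0
  fi

  b_png="$diff_dir/$name.blessed.png"
  c_png="$diff_dir/$name.current.png"
  d_png="$diff_dir/$name.diff.png"

  # Convert both SVG files to PNG.
  rsvg-convert "$blessed_svg" -o "$b_png"
  rsvg-convert "$current_svg" -o "$c_png"

  # compare prints the metric on stderr and exits non-zero when the
  # images differ, so capture stderr and ignore the exit status.
  hash=$(compare -metric PHASH "$b_png" "$c_png" "$d_png" 2>&1) || true
  printf '%s %s\n' "$name" "$hash" >> "$diff_dir/results.txt"

  # Keep the three PNG files only when the difference exceeds the threshold.
  if ! exceeds_threshold "$hash" "$THRESHOLD"; then
    rm -f "$b_png" "$c_png" "$d_png"
  fi
}

# Process every blessed SVG, then sort the results, largest difference first.
run_all() {
  local blessed_dir="$1" current_dir="$2" diff_dir="$3" svg
  mkdir -p "$diff_dir"
  : > "$diff_dir/results.txt"
  for svg in "$blessed_dir"/*.svg; do
    [ -e "$svg" ] || continue
    compare_one "$svg" "$current_dir" "$diff_dir"
  done
  sort -k2,2nr -o "$diff_dir/results.txt" "$diff_dir/results.txt"
}
```

With the directories described above, the sketch would be invoked as run_all build/images/blessed build/images/current build/images/diff. The threshold comparison is delegated to awk because the difference values are decimals.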

You can examine the results.txt file and look for images that fall below the threshold. If you're seeing too many false positives, you can also adjust the threshold in tools/visual_regression.sh.
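
One way to scan results.txt for images that fall just below the threshold is a small awk filter. The sketch below is hypothetical: near_misses is not part of VexFlow, and it assumes each line of results.txt ends with the difference value as its last whitespace-separated field, which may not match the real file layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print result lines whose difference value is below
# the threshold but at least a given fraction of it.
# Assumes the last field of each line is the difference value.
near_misses() {
  local results="$1" threshold="${2:-0.01}" fraction="${3:-0.5}"
  awk -v t="$threshold" -v f="$fraction" '
    NF >= 2 && $NF + 0 < t + 0 && $NF + 0 >= t * f { print $0 }
  ' "$results"
}

# Example:
#   near_misses build/images/diff/results.txt 0.01 0.5
```

Images listed by this filter were treated as unchanged but came close to the cutoff, so they are reasonable candidates for a manual look before lowering or raising the threshold.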