# feat(ci): add test report check workflows (garris#1533) #91
## Summary
This introduces more thorough GitHub testing workflows. It toes the line of schema validation, but picks up some low-hanging fruit for potentially breaking changes in PRs. These could be considered complementary to the basic `npm` scripts already configured.

## What are they Testing?
Perhaps a portion of the `backstop test` command was accidentally commented out. Running `npm run sanity-test` does not catch this, but comparing the `report.json` does, as the report is never generated:

![[Pasted image 20240115001855.png]]
Maybe someone changes a report property name:

![[Pasted image 20240114234903.png]]

Both are contrived examples, but they provide a glimpse into what is possible.
## General Workflow Steps
Each workflow has the same basic structure, with caveats for `smoke` and `integration`:

1. Run the corresponding `npm` test script.
2. Compare the resulting `report.json` to a fixture of the corresponding test (`./test/__fixtures__/[npm-script-name].json`).
3. Before the `diff` is run, filter the `report.json` properties down to shape only (see the sketch after this list).
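A minimal sketch of steps 2 and 3. The report path here is an assumption (the real location depends on the backstop config), and it assumes the fixtures are stored pre-filtered:

```sh
# Sketch only: reduce the generated report to its "shape" (scalar values
# blanked, structure kept), then diff it against the stored fixture.
# A non-zero diff exit status fails the workflow step.
SHAPE='walk(if type == "object"
            then with_entries(.value |= if type == "object" or type == "array"
                                        then . else "" end)
            else . end)'

diff <(jq "$SHAPE" backstop_data/json_report/report.json) \
  test/__fixtures__/sanity-test.json
```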
## Smoke Caveat
I've seen a few smoke tests pass on GitHub but fail locally. For now, this is solved by deleting the properties we know will have different shapes (or not exist at all in a pass):
```sh
jq 'walk(if type == "object"
         then with_entries(.value |= if type == "object" or type == "array"
                                     then . else "" end)
         else . end)
    | del(.tests[].pair.diff, .tests[].pair.diffImage)' \
  test/__fixtures__/smoke-test.json
```
`diffImage` doesn't exist on passing tests, so it's removed before analyzing `report.json` (a sketch of that side follows below). The scenario's `"misMatchThreshold": 0.1` could also be bumped a bit, to be more forgiving.

![[Pasted image 20240115022206.png]]
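On the CI side, a hedged sketch of the same cleanup applied to the generated report before the compare, reusing the `SHAPE` filter and assumed report path from the earlier sketch:

```sh
# Sketch: shape-filter the generated report and drop the flaky properties,
# mirroring the fixture command above, then diff the two.
jq "$SHAPE | del(.tests[].pair.diff, .tests[].pair.diffImage)" \
  backstop_data/json_report/report.json > /tmp/smoke-shape.json
diff /tmp/smoke-shape.json test/__fixtures__/smoke-test.json
```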
## Integration Caveat
The `integration-test` script generates two reports: one when running `backstop reference`, and another after `backstop test`. We apply a fancy `bash` one-liner to find the most recently modified directory, and only `diff` the final report (a sketch of the idea below).
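Not the workflow's literal one-liner, but something in this spirit, with the directory layout assumed and `SHAPE` as defined earlier:

```sh
# Sketch: grab the most recently modified report directory, then shape-diff
# only its report.json against the integration fixture.
latest=$(ls -td backstop_data/json_report/*/ | head -n 1)
diff <(jq "$SHAPE" "${latest}report.json") \
  test/__fixtures__/integration-test.json
```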
## Details
First and foremost, the `.tests[].pair` object values are empty. Values will never be 1:1, due to system runtime differences, browser changes over time, etc. Only the data's shape is being tested in these workflows.

We use `jq` to traverse the `report.json` object and set any property value that is not an `array` or an `object` to an empty string (`""`), doing the same for nested properties inside any such `object` or `array`.

This allows us to test the general "shape" of the data we expect `backstop test` to produce, comparing it with the corresponding JSON files in `test/__fixtures__/`.
That ends up looking like this, which is the shape tested in `integration` and `sanity`:
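The field names below are assumed from a typical BackstopJS report rather than copied from the actual fixtures, but the idea holds: structure intact, every scalar blanked:

```json
{
  "testSuite": "",
  "tests": [
    {
      "pair": {
        "reference": "",
        "test": "",
        "selector": "",
        "fileName": "",
        "label": "",
        "viewportLabel": "",
        "diff": {
          "isSameDimensions": "",
          "rawMisMatchPercentage": "",
          "misMatchPercentage": ""
        }
      },
      "status": ""
    }
  ],
  "id": ""
}
```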
Happy to discuss in detail :)
## Integration
Runs `npm run integration-test`, then tests the resultant `report.json` from the last step of the project's integration test, `backstop test`. The GitHub workflow results in a pass/fail based on shape alone. The unfiltered A/B fixture-versus-CI `diff` is included in the workflow's summary for further analysis (a sketch of that step below).

![[Pasted image 20240114214019.png]]
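The summary itself is just GitHub's job-summary file; a hedged sketch of that step, not the workflow's literal code, with paths as assumed earlier:

````sh
# Sketch: append the unfiltered diff to the run's summary page.
# `|| true` keeps this reporting step from failing the job; pass/fail
# is decided by the shape diff, not by this.
{
  echo '### Unfiltered report diff'
  echo '```diff'
  diff -u test/__fixtures__/integration-test.json "${latest}report.json" || true
  echo '```'
} >> "$GITHUB_STEP_SUMMARY"
````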
## Workflows
### Sanity

Runs both `sanity-test` and `sanity-test-playwright`, then compares each `report.json` against its corresponding fixture.
### Smoke

Runs both `smoke-test` and `smoke-test-playwright`, then compares each `report.json` against its corresponding fixture.
### Docker Sanity/Smoke
Same, but uses Docker.
## Conclusion
A lot of options! Please let me know what you end up keeping and tossing. This is a level of testing I'm not sure the project needs, but it would help identify PRs that introduce side effects in the overall testing ecosystem.
Cheers!