Flakiness quality gate for new tests #24009

Closed
9 tasks
hjetpoluru opened this issue Apr 12, 2024 · 0 comments · Fixed by #24556
Assignees
Labels
release-12.2.0 (Issue or pull request that will be included in release 12.2.0), team-extension-platform

Comments

@hjetpoluru
Contributor

hjetpoluru commented Apr 12, 2024

What is this about?

  • Newly created tests may initially be flaky and require fine-tuning. Could we introduce a “check-new-flaky” CLI tool designed specifically to execute these tests multiple times? (See the sketch after the scenario below.)

Scenario

Feature: Check New Flaky CLI Tool
In order to reduce the flakiness of newly created tests
As a developer
I want a CLI tool that can execute these tests multiple times to identify flakiness

Scenario: Identifying flaky tests with the CLI tool

Given a set of newly created tests
When I execute these tests using the "check-new-flaky" CLI tool
Then the tool should run the tests multiple times
And identify any tests that exhibit flakiness
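
A minimal sketch of what such a tool could look like, written as a Node wrapper around the existing `test/e2e/run-e2e-test.js` runner. The file name `check-new-flaky.js`, the `RUNS` count, and the exact runner invocation are illustrative assumptions, not an actual implementation:

```javascript
// check-new-flaky.js — hypothetical sketch, not the actual tool.
// Runs each given spec several times via the existing e2e runner and
// reports any spec that fails at least once.
const { spawnSync } = require('node:child_process');

const RUNS = Number(process.env.RUNS || 5); // arbitrary repeat count
const specs = process.argv.slice(2);

const flaky = [];
for (const spec of specs) {
  let failures = 0;
  for (let i = 0; i < RUNS; i += 1) {
    const result = spawnSync('node', ['test/e2e/run-e2e-test.js', spec], {
      stdio: 'inherit',
    });
    if (result.status !== 0) {
      failures += 1;
    }
  }
  if (failures > 0) {
    flaky.push({ spec, failures });
  }
}

if (flaky.length > 0) {
  console.error('Flaky specs detected:');
  for (const { spec, failures } of flaky) {
    console.error(`  ${spec}: failed ${failures}/${RUNS} runs`);
  }
  process.exit(1);
}
console.log(`All specs passed ${RUNS}/${RUNS} runs.`);
```

Usage would be something like `node check-new-flaky.js test/e2e/tests/my-new-test.spec.js` (path hypothetical); the process exits non-zero if any run fails, which makes it usable as a quality gate in CI.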

Design

No response

Technical Details

@pedronfigueiredo provided the plan below to achieve this task:

  1. We currently have a script from Pedro that runs multiple tests locally.
  2. We need to convert it into a TypeScript script that modifies run-e2e-test.js to accept multiple test runs:
  • Collect test outputs as a text file
  • Optional: include a notification nudge at the end of the test runs (needs to be modified to be cross-platform)
  • Ability to run headless
  • Ideally we can reuse/expand test/e2e/run-e2e-test.js
  3. Create a GitHub Action that takes a code diff and extracts all e2e test cases that were added or modified:
  • Run these cases X times using the script from step 2 (a sketch of the diff-extraction step follows this list)
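
A rough sketch of the diff-extraction part of step 3, assuming the GitHub Action shells out to `git diff` from a checkout. The file name, function name, and base branch are illustrative assumptions:

```javascript
// get-changed-e2e-specs.js — illustrative sketch, not the actual
// implementation. Lists e2e spec files added (A) or modified (M)
// relative to a base branch.
const { execSync } = require('node:child_process');

function getChangedE2eSpecs(baseBranch = 'origin/develop') {
  const diff = execSync(
    `git diff --name-only --diff-filter=AM ${baseBranch}...HEAD`,
    { encoding: 'utf8' },
  );
  return diff
    .split('\n')
    .filter(
      (file) =>
        file.startsWith('test/e2e/') &&
        (file.endsWith('.spec.js') || file.endsWith('.spec.ts')),
    );
}

console.log(getChangedE2eSpecs().join('\n'));
```

The Action could then feed this list to the runner script from step 2 and execute each spec X times.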

Threat Modeling Framework

No response

Acceptance Criteria

No response

Stakeholder review needed before the work gets merged

  • Engineering (needed in most cases)
  • Design
  • Product
  • QA (automation tests are required to pass before merging PRs but not all changes are covered by automation tests - please review if QA is needed beyond automation tests)
  • Security
  • Legal
  • Marketing
  • Management (please specify)
  • Other (please specify)

References

No response

@gauthierpetetin changed the title from “Flakiness check for new test” to “Flakiness quality gate for new tests” on Apr 24, 2024
seaona added a commit that referenced this issue on Jul 1, 2024: … have been modified (#24556)


## **Description**
This PR adds a quality gate for new or modified e2e spec files. Whenever
a PR adds or modifies a test, that test is run additional times, in
order to prevent accidentally introducing flakiness. It works as
follows:

- Identifies any new or modified e2e file inside the test/ folder using
`git diff` and these two filters:
  - `file.filename.startsWith('test/e2e/')`
  - `file.filename.endsWith('.spec.js') || file.filename.endsWith('.spec.ts')`
- Copies each matching spec five times into the list of test paths to
execute. This number is arbitrary, and we could change it to any value
we want. The reason for duplicating entries instead of raising the
retry count is to benefit from parallelization, as @HowardBraham
pointed out in a comment.
- Since we already had a flag that supports re-running successful
tests, `--retry-until-failure`, I leveraged it inside the `for` loop
for each test: if a test case was identified as new/modified, the flag
is added so new tests fail fast without retries (see the sketch after
this list).
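
A minimal sketch of the duplication and fail-fast behavior described above, assuming the runner already knows the set of changed specs. Names such as `buildTestPaths`, `changedSpecs`, and the `REPEAT` constant are illustrative, not the PR's actual code:

```javascript
// Illustrative sketch of the quality-gate behavior, not the actual PR
// code. `allSpecs` is the normal list of spec paths for a CI bucket;
// `changedSpecs` is the set produced by the git-diff filters above.
const REPEAT = 5; // arbitrary; duplicates spread across parallel buckets

function buildTestPaths(allSpecs, changedSpecs) {
  const paths = [];
  for (const spec of allSpecs) {
    paths.push(spec);
    if (changedSpecs.has(spec)) {
      // Duplicate the changed spec instead of raising the retry count,
      // so the extra runs benefit from parallelization.
      for (let i = 0; i < REPEAT; i += 1) {
        paths.push(spec);
      }
    }
  }
  return paths;
}

function buildRunnerArgs(spec, changedSpecs) {
  const args = ['test/e2e/run-e2e-test.js', spec];
  if (changedSpecs.has(spec)) {
    // New/modified specs should fail fast: the pre-existing flag keeps
    // re-running a passing test and stops at the first failure.
    args.push('--retry-until-failure');
  }
  return args;
}
```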

### Incremental git fetch depth within shallow clone
We use git fetch with an incremental depth, as @danjm suggested. The CI
environment uses a shallow clone, so a plain git diff won't succeed:
it can't find the merge base. To fix that, we start with a git fetch
depth of 1 and keep increasing the depth (1, 10, 100) while the error
is `no merge base`, up to 100. If the git diff still fails, we then do
a full git fetch with the `--unshallow` flag.
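
A condensed sketch of that deepening strategy, written as a Node helper for illustration (not the PR's exact code; the error check is simplified):

```javascript
// Illustrative sketch of the incremental-deepening strategy, not the
// exact PR code. Deepens the shallow clone until `git diff` can find
// a merge base, then falls back to a full fetch.
const { execSync } = require('node:child_process');

function gitDiffWithDeepening(baseBranch = 'origin/develop') {
  for (const depth of [1, 10, 100]) {
    execSync(`git fetch --depth ${depth} origin`);
    try {
      return execSync(`git diff --name-only ${baseBranch}...HEAD`, {
        encoding: 'utf8',
      });
    } catch (error) {
      // The failure here is expected to be `no merge base` while the
      // shallow history is too short; retry with a larger depth.
    }
  }
  // Still no merge base after depth 100: fetch the full history.
  execSync('git fetch --unshallow origin');
  return execSync(`git diff --name-only ${baseBranch}...HEAD`, {
    encoding: 'utf8',
  });
}
```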

- [Update] This is the working result with the last commit:
https://app.circleci.com/pipelines/github/MetaMask/metamask-extension/89269/workflows/103b78a8-8f0d-4323-96b0-9e235c4cbc81/jobs/3296802
![Screenshot from 2024-06-26 11-39-19](https://github.com/MetaMask/metamask-extension/assets/54408225/a2a89d6a-3a73-48ba-91a3-20aeadc38573)


### New CI job
The git diff is done in a new CI job which runs at the beginning, in
parallel with prep-deps.







## **Related issues**

Fixes: #24009

## **Manual testing steps**

1. Check the CI runs (note that previous runs intentionally included
failing and changed tests, in order to exercise the different scenarios
described below)

## **Screenshots/Recordings**


[UPDATE with the new code changes]

- 🟢 Case 1: A test has changed -> it is rerun 1+5 times and succeeds
(the runs are distributed across different buckets)




https://github.com/MetaMask/metamask-extension/assets/54408225/c1456104-1f5f-4ef3-9364-4e435f8797f4



https://app.circleci.com/pipelines/github/MetaMask/metamask-extension/89277/workflows/7fce0a2e-773f-46da-8ab9-1dbec7992b58/jobs/3297267/parallel-runs/10?filterBy=ALL


- 🟢 Case 2: A test has changed, but it has a mistake in the
code (intentional, to simulate a flaky test) -> it fails immediately
and there are no further retries. The rest of the tests are retried
on failure as usual


- Case for the main test build:
https://app.circleci.com/pipelines/github/MetaMask/metamask-extension/89277/workflows/7fce0a2e-773f-46da-8ab9-1dbec7992b58/jobs/3297267/artifacts
- Case for the Flask-specific test:
https://app.circleci.com/pipelines/github/MetaMask/metamask-extension/89277/workflows/7fce0a2e-773f-46da-8ab9-1dbec7992b58/jobs/3297277/artifacts



- 🟢 Case 3: A PR has no test spec files changed -> nothing
different happens
- CI run: see the current CI run


## **Pre-merge author checklist**

- [ ] I’ve followed [MetaMask Coding
Standards](https://github.com/MetaMask/metamask-extension/blob/develop/.github/guidelines/CODING_GUIDELINES.md).
- [ ] I've completed the PR template to the best of my ability
- [ ] I’ve included tests if applicable
- [ ] I’ve documented my code using [JSDoc](https://jsdoc.app/) format
if applicable
- [ ] I’ve applied the right labels on the PR (see [labeling
guidelines](https://github.com/MetaMask/metamask-extension/blob/develop/.github/guidelines/LABELING_GUIDELINES.md)).
Not required for external contributors.

## **Pre-merge reviewer checklist**

- [ ] I've manually tested the PR (e.g. pull and build branch, run the
app, test code being changed).
- [ ] I confirm that this PR addresses all acceptance criteria described
in the ticket it closes and includes the necessary testing evidence,
such as recordings and/or screenshots.

---------

Co-authored-by: Mark Stacey <[email protected]>
Co-authored-by: Howard Braham <[email protected]>
@metamaskbot added the release-12.2.0 (Issue or pull request that will be included in release 12.2.0) label on Jul 1, 2024