
"Re-run failed jobs" will not work with a parallel workflow #574

Closed
mellis481 opened this issue Jun 22, 2022 · 17 comments

@mellis481

mellis481 commented Jun 22, 2022

After implementing a custom build ID to ensure I could re-run workflows that I've integrated with Cypress Dashboard and configured to run in parallel, I've run into an issue (oddly different from this one).

Here is my job FWIW:

- id: cypress-mocked-api-tests
  uses: cypress-io/github-action@v2
  with:
    wait-on: 'https://localhost:9001/index.js'
    start: npm run start:${{ env.NODE_ENV }}
    config-file: cypress/config/${{ env.NODE_ENV }}.json
    config: video=true,videoUploadOnPasses=false
    spec: '**/*.spec.ts'
    install: false
    record: true
    parallel: true
    group: 'Mocked-API'
    ci-build-id: ${{ needs.prepare.outputs.uuid }}
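For context, the `needs.prepare.outputs.uuid` reference above implies a separate setup job that generates the build ID. A minimal sketch of what such a job might look like (job name, step id, and output name are assumptions on my part, not taken from the reporter's actual workflow):

```yaml
# Hypothetical "prepare" job that emits the uuid consumed above via
# needs.prepare.outputs.uuid. Names here are illustrative assumptions.
prepare:
  runs-on: ubuntu-latest
  outputs:
    uuid: ${{ steps.uuid.outputs.value }}
  steps:
    - id: uuid
      # uuidgen produces a fresh id on every run of THIS job; note that
      # "Re-run failed jobs" does not re-run this job, which matters later
      # in this thread.
      run: echo "value=$(uuidgen)" >> "$GITHUB_OUTPUT"
```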

This job will load balance all my spec files across five containers under a "Mocked-API" group. This works great and I can re-run all jobs without issue.

On a recent run, one of the five containers failed because one test failed. I thought I'd test how "Re-run failed jobs" worked on just the failed container job. My hope/expectation was that it would be smart enough to know which spec files it ran when the entire workflow executed originally (six spec files containing 22 tests) and run those. Instead it ran zero spec files and completed successfully. It seems the matrix-level orchestration that is needed does not occur when only a failed container job is re-run. It looks like someone else has run into this issue too and is trying to solve it by disabling the "Re-run failed jobs" option in GitHub (which doesn't seem possible).

This is a fairly big problem because it resulted in the group (which I've configured as a status check in my trunk branch protection rule) passing and the PR being able to be merged when it had never successfully run all tests.

@conversayShawn
Contributor

@mellis481 We recommend passing the GITHUB_TOKEN secret (created automatically by the GitHub Action) as an environment variable. This allows every build to be identified correctly and avoids confusion when re-running a build.

You can find an example here: https://github.com/cypress-io/github-action#record-test-results-on-cypress-dashboard
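For reference, the recommended setup looks roughly like the following sketch (abbreviated; the `with:` values besides `record`/`parallel` are placeholders, not a complete job):

```yaml
# Sketch of the recommendation: expose GITHUB_TOKEN (and the Cypress record
# key) to the action as environment variables so the Dashboard can tie jobs
# back to the correct workflow run.
- uses: cypress-io/github-action@v2
  env:
    CYPRESS_RECORD_KEY: ${{ secrets.CYPRESS_RECORD_KEY }}
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  with:
    record: true
    parallel: true
```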

@mellis481
Author

mellis481 commented Jun 28, 2022

@conversayShawn That did nothing. This is what happened:

  • I added the GITHUB_TOKEN as an env variable to my cypress-io/github-action@v2 job in my PR workflow.
  • I added a failing test to my suite.
  • I ran the workflow which is configured to run in parallel using 5 containers. The test failed on Machine 5.
  • I executed "Re-run failed jobs".
  • On the second workflow run, Machine 5 executed 0 tests and passed.

@ilovegithub2

We are seeing exactly the same issue: re-running only the failed jobs runs no tests but marks each job as passed.

Here is our configuration:

      - name: Run integration tests
        timeout-minutes: 20
        uses: cypress-io/github-action@v4
        env:
          CYPRESS_RECORD_KEY: ${{ secrets.CYPRESS_RECORD_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          ci-build-id: ${{ needs.prepare.outputs.uuid }}
          config: baseUrl=${{ format('https://pr{0}-www.build.{1}', github.event.number, env.CBR_PROJECT_DOMAIN) }}
          wait-on: ${{ format('https://pr{0}-www.build.{1}', github.event.number, env.CBR_PROJECT_DOMAIN) }}
          wait-on-timeout: 120
          browser: chrome
          record: true
          parallel: true
          group: merge
          install: false
          working-directory: tests/web

@mgambati

Same case here.
Tests pass without execution after retrying failed jobs.


@admah
Contributor

admah commented Sep 13, 2022

There were recently some changes in our services repo that may have taken care of this issue. Can someone retest with 10.7.0 or later and post results? Thanks!

@mellis481
Author

> There were recently some changes in our services repo that may have taken care of this issue. Can someone retest with 10.7.0 or later and post results? Thanks!

@admah Thanks for contributing to this thread! I just tested with 10.8.0 and it did NOT work correctly. What I'm seeing now is different and not nearly as problematic as the initially-reported issue (the most egregious part of which was passing a workflow after re-running a workflow with a failing Cypress test), but still incorrect. To provide more details...

I added a failing test to my repo that is currently configured to balance my 39 Cypress spec files across five containers. As expected, the job for the container that had the new failing test failed while all other jobs completed successfully.
[screenshot: workflow run with one failed container job]

I then selected "Re-run failed jobs". When I did this, it created a new workflow run which essentially copied the jobs that completed successfully in the first run and started re-running the one failing job. When I went into Cypress Dashboard to inspect this re-run further, I found that it was running specs in only one container (good), but it was running all 39 specs in that container (bad/wacky).
[screenshot: Cypress Dashboard showing all 39 specs run in a single container]

It should have re-run only the specs that it originally ran in the first run in that container (in my case 7 specs). The failing test in this workflow run did end up failing the job and, subsequently, the workflow as desired, but it's, of course, undesirable for "Re-run failed jobs" to re-run all Cypress specs. It's not re-running failed (Cypress) jobs at that point; it's re-running all Cypress tests using the number of containers that failed in the original run.

@admah
Contributor

admah commented Sep 14, 2022

@mellis481 thanks for the screenshots and additional context. That's very helpful. I was able to get some more clarity on this from our Cloud team.

Here is the current status:

  • Before, there was an issue where all re-runs got a PASS, regardless of actual status. This issue has been fixed.
  • Currently, if a re-run is initiated, all specs get run on the machines available. That is not optimal. The Cloud team is looking into the connection between GH Actions and Cypress in order to set up re-runs to be accurate and efficient.

@mellis481
Author

mellis481 commented Sep 14, 2022

@admah I'm glad the update from your Cloud team matches my findings (in far fewer words 😄).

Hoping additional info on the second bullet will be shared in this thread when available.

@admah
Contributor

admah commented Sep 14, 2022

@mellis481 yes, I will be providing updates as they're available.

I will be closing this and updating in #531 since this is a duplicate of that issue.

@admah
Contributor

admah commented Sep 14, 2022

Duplicate of #531

@cgraham-rs

I do not believe this ticket is a dupe of #531

This issue documents a scenario where using Re-run failed jobs runs exactly 0 tests and then emits a false pass.

#531 documents a scenario where re-run executes ALL tests in the suite instead of JUST the failed tests. This is a completely separate issue.

@jennifer-shehane
Member

@cgraham-rs If you're experiencing this behavior, it is because there is not a unique ci-build-id associated with the rerun. We try to interpret unique buildIds on our side, but if you're encountering this you can pass a unique buildId via the ci-build-id flag: https://docs.cypress.io/cloud/features/smart-orchestration/parallelization#Linking-CI-machines-for-parallelization-or-grouping

@cgraham-rs

@jennifer-shehane As I mentioned in #431, the "Robust custom build id" documentation instructs us to manually craft a ci-build-id in a separate job from the test job. It does mention that "...if the user re-runs the workflow a new unique build id is generated...", which is true. But this causes problems when a user chooses "Re-run failed jobs", which re-runs ONLY the test job and not the setup job, so the test job keeps the same ci-build-id from the previous run.
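One possible way to sidestep the stale-id problem described above (an assumption on my part, not official Cypress guidance): build the id inside the test job itself from `github.run_id` and `github.run_attempt`, since `run_attempt` increments on every re-run, so even "Re-run failed jobs" sees a fresh value without needing the setup job to re-execute:

```yaml
# Sketch (assumption, not official guidance): derive the ci-build-id in the
# test job itself. github.run_id is stable across re-runs of a workflow run,
# while github.run_attempt increments on each re-run, so the combination is
# unique per attempt.
- uses: cypress-io/github-action@v4
  with:
    record: true
    parallel: true
    ci-build-id: ${{ github.run_id }}-${{ github.run_attempt }}
```

Note this would make the re-run register as a new Cypress Cloud run, so a single re-run container may still be handed every spec, as reported earlier in this thread; it avoids the false pass, not the over-running.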

@jennifer-shehane
Member

@cgraham-rs This is an issue on our radar - that re-running failed jobs has a less than ideal experience in most CIs. We intend to look more into addressing this.

@cgraham-rs

@jennifer-shehane A false pass is a really serious issue. AFAIK there's currently no open ticket tracking this specific problem. My suggestion is that this ticket be re-opened until the specific scenario where GitHub's "Re-run failed jobs" runs 0 tests and emits a false pass is resolved.

@danjohansenconsulting

@cgraham-rs we have an initiative in Cypress Cloud that is a precursor for support of Re-run failed jobs. That initial work is scheduled for Q1. Once we have a solution that launches we will announce the release of that in Cloud.
