ci: send test failure notifications when not a PR #7934

michaelfig · 2023-06-15T17:55:01Z

refs: #7887

Description

Make test failures visible on Slack when they are not in the context of a PR, to prevent them from going unnoticed.

This is mainly so that the Endo integration tests are properly surfaced to anybody who's interested in them.

Security Considerations

n/a

Scaling Considerations

Documentation Considerations

n/a

Testing Considerations

n/a

warner · 2023-06-15T18:04:26Z

.github/workflows/test-all-packages.yml

@@ -119,39 +119,64 @@ jobs:
      #- name: yarn test (everything)
      #  run: yarn ${{ steps.vars.outputs.test }}
      - name: yarn test (access-token)
+        if: always()


these are unfortunate.. why do they become necessary?

> these are unfortunate

There's already plenty of boilerplate in the test files, and better to be consistent than to be surgical when copy-pasta might fail to preserve the best behaviour.

> why do they become necessary?

I want all the tests to run, even if some previous one failed (but still report the job status as failure if it did). It's been a longstanding annoyance for me that I have to push-wait-fix over and over again, gradually unveiling the errors in my code, rather than seeing all the errors my PR introduces at once the way that a local yarn test would have shown.

We probably want success() || failure() or !cancelled() rather than always(), because the latter matches even if the job is cancelled (see https://docs.github.com/en/actions/learn-github-actions/expressions#always).

> the latter matches even if the job is cancelled

TIL that always() is not a synonym for success() || failure(). :)
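As a rough illustration of the three conditions discussed above, here is a minimal sketch of how they might look on sequential test steps. The job layout and the packages/access-token path are assumptions for illustration, not taken from the PR.

```yaml
# Sketch only: job name, runner, and the access-token path are hypothetical.
jobs:
  test-quick:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: yarn test (access-token)
        # always(): runs after an earlier failure AND after a cancellation
        if: always()
        run: cd packages/access-token && yarn test
      - name: yarn test (vat-data)
        # success() || failure(): runs after an earlier failure,
        # but is skipped once the job has been cancelled
        if: success() || failure()
        run: cd packages/vat-data && yarn test
      - name: yarn test (swing-store)
        # A bare leading "!" is YAML tag syntax, so this form needs
        # the ${{ }} wrapper (or quotes)
        if: ${{ !cancelled() }}
        run: cd packages/swing-store && yarn test
```

In each case the job is still reported as failed if any step failed; the condition only decides whether later steps get a chance to run.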

Oh, ok, I guess it makes sense that the default is to not execute step 2 if the previous step 1 failed, and in our big batch of sequential tests, we don't want that short-circuiting.

I think we'd want a few early steps (like performing the git clone, or rebuilding the Node cache) to be mandatory; running cd packages/vat-data && yarn test should only happen if the rebuild-cache step succeeded. But we don't want it to depend upon whether the previous e.g. cd packages/swing-store && yarn test succeeded.

Is there a way to express that? Would adding if: success() || failure() to each yarn test step accidentally cause it to attempt to run all tests even if the preliminaries failed?

Oh damn, I had forgotten we had some tests that ran sequentially.

@warner I believe (but I'm not sure) that the whole test-quick wouldn't run if the build failed.

Reading those GitHub docs now, I think that yes, adding if: success() || failure() will cause the test steps to run even if the preliminaries have failed. The only way to avoid that (while still running all test cases even if one test case fails) is to add a step, sitting after the preliminaries and before the batch of yarn test steps and guarded by an if: failure() clause, whose script cancels the workflow run (really just the job).

That sounds messy, and the preliminaries don't fail very often. While it'd be confusing to run the tests (and publish their results?) even though e.g. the git clone failed, it'd probably be messier to try to avoid it.
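One possible alternative, not proposed in this thread, is to gate each test step on the recorded outcome of a preliminary step via the steps context. The sketch below assumes a single hypothetical preliminary step with id prepare; the real workflow's preliminaries may look quite different.

```yaml
# Sketch only: the "prepare" step and its command are placeholders.
jobs:
  test-quick:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: prepare (stand-in for the real preliminaries)
        id: prepare
        run: yarn install
      - name: yarn test (swing-store)
        # Runs whether or not earlier tests failed, but only if the
        # preliminary step itself succeeded
        if: (success() || failure()) && steps.prepare.outcome == 'success'
        run: cd packages/swing-store && yarn test
      - name: yarn test (vat-data)
        if: (success() || failure()) && steps.prepare.outcome == 'success'
        run: cd packages/vat-data && yarn test
```

If the checkout fails, the prepare step is skipped by default, its outcome is then not 'success', and the test steps are skipped too. The cost is a longer condition on every test step, which is the kind of boilerplate the thread was hoping to avoid.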

> the whole test-quick wouldn't run if the build failed.

Correct. All the test-* jobs have needs: build so they won't run on failure to create the build cache. The build cache restore can fail, but that's pretty rare nowadays.
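For readers unfamiliar with job dependencies, a minimal sketch of the needs: build arrangement described above might look like this. The step contents are placeholders, not the repository's actual build or test commands.

```yaml
# Sketch only: step contents are hypothetical.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: yarn install && yarn build
  test-quick:
    # By default a job that needs another job is skipped
    # when the needed job fails
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: yarn test (vat-data)
        # Step-level condition only affects steps inside this job
        if: success() || failure()
        run: cd packages/vat-data && yarn test
```

The step-level if: conditions therefore cannot cause tests to run after a failed build job; they only matter once test-quick has started.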

warner

Looks good, some suggestions but not critical.

warner · 2023-06-20T20:11:12Z

.github/workflows/test-all-packages.yml

        run: cd packages/xsnap && yarn ${{ steps.vars.outputs.test }} | $TEST_COLLECT
      # explicitly test the XS worker, for visibility
      - name: yarn test (SwingSet XS Worker)
-        if: matrix.engine != 'xs'
+        if: success() || failure() && matrix.engine != 'xs'


maybe ( success() || failure() ) && (matrix.engine != 'xs') to be clear

Absolutely! This was an incorrect search-and-replace of always().
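The difference between the two forms can be sketched as follows, assuming that && binds more tightly than || in GitHub Actions expressions, as it does in most C-like languages. The run commands are placeholders.

```yaml
# Sketch only: the run commands are placeholders.
steps:
  - name: yarn test (SwingSet XS Worker), as first written
    # Under the stated assumption this parses as
    #   success() || (failure() && matrix.engine != 'xs')
    # so the step still runs on the 'xs' engine whenever all
    # earlier steps passed
    if: success() || failure() && matrix.engine != 'xs'
    run: echo "placeholder"
  - name: yarn test (SwingSet XS Worker), with explicit grouping
    # The engine check now applies in both the success and failure cases
    if: (success() || failure()) && matrix.engine != 'xs'
    run: echo "placeholder"
```

Explicit parentheses make the intent readable without relying on precedence rules at all.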

warner · 2023-06-20T20:16:35Z

.github/workflows/test-all-packages.yml

+        if: failure() && github.event_name != 'pull_request'
+        uses: ./.github/actions/notify-status
+        with:
+          webhook: ${{ secrets.SLACK_WEBHOOK_URL }}


Given the number of times this with: block appears (20, by my count), would it make sense to replace notify-status with notify-slack-and-email, and to move these with: secrets.* lookups into the action? It would make the action less generic, but it already lives in our own repo, and it's not like we're going to send email notifications to different addresses on a per-job basis. Or maybe have the action use e.g. secrets.SLACK_WEBHOOK_URL as the default, so each job could override it with, say, a specific Slack channel, while cutting out the 100 lines needed to provide the same secrets to every invocation.

secrets.* are not available in actions. You need to pass them explicitly.
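To illustrate why the with: block has to be repeated, here is a minimal sketch of an action that receives the webhook as an input. It assumes notify-status is a composite action; the file contents below are hypothetical and are not the repository's actual action.

```yaml
# Hypothetical .github/actions/notify-status/action.yml
name: notify-status
description: Post a job failure notice to Slack
inputs:
  webhook:
    description: Slack incoming-webhook URL, passed in by the calling workflow
    required: true
runs:
  using: composite
  steps:
    - name: Post to Slack
      shell: bash
      env:
        # The secrets context is not available here, so the value
        # must arrive through an input
        WEBHOOK: ${{ inputs.webhook }}
      run: |
        curl -sS -X POST -H 'Content-Type: application/json' \
          --data '{"text":"CI job failed"}' "$WEBHOOK"
```

Because the secret can only enter through an input, an input default cannot fall back to secrets.SLACK_WEBHOOK_URL either, so each calling job still supplies it in its own with: block.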

mhofman · 2023-06-20T20:07:23Z

.github/workflows/test-all-packages.yml

@@ -119,39 +119,64 @@ jobs:
      #- name: yarn test (everything)
      #  run: yarn ${{ steps.vars.outputs.test }}
      - name: yarn test (access-token)
+        if: always()


Oh damn, I had forgotten we had some tests that ran sequentially.

@warner I believe (but I'm not sure) that the whole test-quick wouldn't run if the build failed.

mhofman · 2023-06-20T20:16:35Z

.github/workflows/test-all-packages.yml

        run: cd packages/xsnap && yarn ${{ steps.vars.outputs.test }} | $TEST_COLLECT
      # explicitly test the XS worker, for visibility
      - name: yarn test (SwingSet XS Worker)
-        if: always() && matrix.engine != 'xs'
+        if: success() || failure() && matrix.engine != 'xs'


What's the precedence here? Maybe the following is safer?

Suggested change

-        if: success() || failure() && matrix.engine != 'xs'
+        if: (success() || failure()) && matrix.engine != 'xs'

ci: send test failure notifications when not a PR

michaelfig changed the title from "ci" to "ci: send test failure notifications when not a PR" Jun 15, 2023

michaelfig requested review from arirubinstein and warner June 15, 2023 17:57

michaelfig marked this pull request as ready for review June 15, 2023 17:58

warner reviewed Jun 15, 2023

michaelfig requested review from warner and gibson042 June 15, 2023 20:30

michaelfig self-assigned this Jun 20, 2023

michaelfig requested a review from mhofman June 20, 2023 18:24

warner approved these changes Jun 20, 2023

mhofman approved these changes Jun 20, 2023

michaelfig added 2 commits June 20, 2023 19:19

ci(notify-status): upgrade deprecated actions

00eb1f1

ci: send test failure notifications when not a PR

25ae4f3

michaelfig force-pushed the mfig-ci-notifications branch from 555c2e1 to 25ae4f3 Compare June 21, 2023 01:22

michaelfig enabled auto-merge June 21, 2023 01:22

michaelfig added this pull request to the merge queue Jun 21, 2023

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 21, 2023

michaelfig added this pull request to the merge queue Jun 21, 2023

Merged via the queue into master with commit 2c5dd59 Jun 21, 2023

michaelfig deleted the mfig-ci-notifications branch June 21, 2023 15:39

mhofman pushed a commit that referenced this pull request Aug 7, 2023

Merge pull request #7934 from Agoric/mfig-ci-notifications

ac1c923

ci: send test failure notifications when not a PR

mhofman pushed a commit that referenced this pull request Aug 7, 2023

Merge pull request #7934 from Agoric/mfig-ci-notifications

93c313b

ci: send test failure notifications when not a PR

mhofman pushed a commit that referenced this pull request Jan 12, 2024

Merge pull request #7934 from Agoric/mfig-ci-notifications

08e1bb2

ci: send test failure notifications when not a PR
