Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: send test failure notifications when not a PR #7934

Merged
merged 2 commits into from
Jun 21, 2023

Conversation

michaelfig
Copy link
Member

@michaelfig michaelfig commented Jun 15, 2023

refs: #7887

Description

Make test failures visible on Slack when they are not in the context of a PR, to prevent them from going unnoticed.

This is mainly so that the Endo integration tests are properly surfaced to anybody who's interested in them.

Security Considerations

n/a

Scaling Considerations

Documentation Considerations

n/a

Testing Considerations

n/a

@michaelfig michaelfig changed the title ci ci: send test failure notifications when not a PR Jun 15, 2023
@michaelfig michaelfig marked this pull request as ready for review June 15, 2023 17:58
@@ -119,39 +119,64 @@ jobs:
#- name: yarn test (everything)
# run: yarn ${{ steps.vars.outputs.test }}
- name: yarn test (access-token)
if: always()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are unfortunate.. why do they become necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are unfortunate

There's already plenty of boilerplate in the test files, and better to be consistent than to be surgical when copy-pasta might fail to preserve the best behaviour.

why do they become necessary?

I want all the tests to run, even if some previous one failed (but still report the job status as failure if it did). It's been a longstanding annoyance for me that I have to push-wait-fix over and over again, gradually unveiling the errors in my code, rather than seeing all the errors my PR introduces at once the way that a local yarn test would have shown.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want success() || failure() or ! cancelled() rather than always(), because the latter matches even if the job is cancelled: https://docs.github.com/en/actions/learn-github-actions/expressions#always .

Copy link
Member Author

@michaelfig michaelfig Jun 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the latter matches even if the job is cancelled

TIL that always() is not a synonym for success() || failure(). :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, ok, I guess it makes sense that the default is to not execute step 2 if the previous step 1 failed, and in our big batch of sequential tests, we don't want that short-circuiting.

I think we'd want a few early steps (like performing the git clone, or rebuilding the Node cache) to be mandatory; running cd packages/vat-data && yarn test should only happen if the rebuild-cache step succeeded. But we don't want it to depend upon whether the previous e.g. cd packages/swing-store && yarn test succeeded.

Is there a way to express that? Would adding if: success() || failure() to each yarn test step accidentally cause it to attempt to run all tests even if the preliminaries failed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh damn I had forgot we had some tests that ran sequentially.

@warner I believe (but I'm not sure) that the whole test-quick wouldn't run if the build failed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading those github docs now, I think that yes, adding if : success() || failure() will cause the test steps to run even if the prelimiinaries have failed, and the only way to avoid that (while still running all test cases even if one test case fails) is to add something that cancels the workflow run (except really just the job) from within the script, guarded by a if: failure() clause, in a step that sits after the preliminaries and before the batch of yarn test steps.

Sounds messy, and the preliminaries don't fail too frequently, and while it'd be confusing to perform the test (and publish their results?) even though the e.g. git clone failed, it'd probably be more messy to try and avoid it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the whole test-quick wouldn't run if the build failed.

Correct. All the test-* jobs have needs: build so they won't run on failure to create the build cache. The build cache restore can fail, but that's pretty rare nowadays.

@michaelfig michaelfig self-assigned this Jun 20, 2023
@michaelfig michaelfig requested a review from mhofman June 20, 2023 18:24
Copy link
Member

@warner warner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, some suggestions but not critical.

run: cd packages/xsnap && yarn ${{ steps.vars.outputs.test }} | $TEST_COLLECT
# explicitly test the XS worker, for visibility
- name: yarn test (SwingSet XS Worker)
if: matrix.engine != 'xs'
if: success() || failure() && matrix.engine != 'xs'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe ( success() || failure() ) && (matrix.engine != 'xs') to be clear

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely! This was an incorrect search-and-replace of always().

if: failure() && github.event_name != 'pull_request'
uses: ./.github/actions/notify-status
with:
webhook: ${{ secrets.SLACK_WEBHOOK_URL }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the number of times this with: block appears (20, by my count), would it make sense to replace notify-status with notify-slack-and-email, and to move these with: secrets.* lookups into the action? It would make the action less generic, but it's already living in our own repo anyways, and it's not like we're going to send email notifications to different addresses on a per-job basis. Or maybe have the action use e.g. secrets.SLACK_WEBHOOK_URL as the default, so each job could override it with like a specific slack channel, but you could otherwise cut out the 100 lines needed to provide the same secrets to every invocation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

secrets.* are not available in actions. You need to pass them explicitly.

@@ -119,39 +119,64 @@ jobs:
#- name: yarn test (everything)
# run: yarn ${{ steps.vars.outputs.test }}
- name: yarn test (access-token)
if: always()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh damn I had forgot we had some tests that ran sequentially.

@warner I believe (but I'm not sure) that the whole test-quick wouldn't run if the build failed.

run: cd packages/xsnap && yarn ${{ steps.vars.outputs.test }} | $TEST_COLLECT
# explicitly test the XS worker, for visibility
- name: yarn test (SwingSet XS Worker)
if: always() && matrix.engine != 'xs'
if: success() || failure() && matrix.engine != 'xs'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the precedence here? Maybe the following is safer ?

Suggested change
if: success() || failure() && matrix.engine != 'xs'
if: (success() || failure()) && matrix.engine != 'xs'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

@michaelfig michaelfig added this pull request to the merge queue Jun 21, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 21, 2023
@michaelfig michaelfig added this pull request to the merge queue Jun 21, 2023
Merged via the queue into master with commit 2c5dd59 Jun 21, 2023
@michaelfig michaelfig deleted the mfig-ci-notifications branch June 21, 2023 15:39
mhofman pushed a commit that referenced this pull request Aug 7, 2023
ci: send test failure notifications when not a PR
mhofman pushed a commit that referenced this pull request Aug 7, 2023
ci: send test failure notifications when not a PR
mhofman pushed a commit that referenced this pull request Jan 12, 2024
ci: send test failure notifications when not a PR
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants