Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom GitHub action "restore-node" can fail on job retry #9708

Closed
gibson042 opened this issue Jul 13, 2024 · 0 comments · Fixed by #9709
Closed

Custom GitHub action "restore-node" can fail on job retry #9708

gibson042 opened this issue Jul 13, 2024 · 0 comments · Fixed by #9709
Labels
bug Something isn't working

Comments

@gibson042
Copy link
Member

gibson042 commented Jul 13, 2024

Describe the bug

Our custom restore-node GitHub action captures the relevant endo branch in local file endo-sha.txt (which is also staged in git), then associates the agoric-sdk directory with an @actions/cache entry keyed in part upon that branch, then executes yarn install and yarn build only on cache miss, then finally does a "dirty check" that tolerates staged files (cf. #8168) but not untracked files. However, if the cache was populated by a job that removed its own endo-sha.txt (such as workflow integration "getting-started" with matrix.cli value "registry/yarn", "registry/npm", or "registry/npx", all of which include a "Start local NPM registry" step that executes git reset --hard HEAD), then the restored git directory will be unaware of the local file and report it as untracked, failing the action and its containing job.

This happened in the CI for #9688, where:

  1. getting-started (registry/yarn) won the race to save the cache entry
  2. getting-started (link-cli/yarn) failed due to what looks like flakiness (Possible race in integration testing #9710)
  3. a retry of getting-started (link-cli/yarn) failed in restore-node

Expected behavior

restore-node should not expect that the endo-sha.txt file it creates is guaranteed to be included in a cache entry created at the end of its containing job. Specifically, I think it should tolerate absence of the file in its "dirty check" step.

@gibson042 gibson042 added the bug Something isn't working label Jul 13, 2024
@mergify mergify bot closed this as completed in #9709 Jul 16, 2024
@mergify mergify bot closed this as completed in 0b15167 Jul 16, 2024
mergify bot added a commit that referenced this issue Jul 16, 2024
…9709)

Fixes #9708

## Description
Tolerates removal of endo-sha.txt by e.g. `git reset --hard HEAD` contributing to a restored cache state. Also includes minor cleanup of GitHub actions files.

### Security Considerations
No relevant change.

### Scaling Considerations
n/a

### Documentation Considerations
n/a

### Testing Considerations
It's difficult to test these kinds of changes, but the proof of the pudding is in the taste.

### Upgrade Considerations
n/a
mergify bot pushed a commit that referenced this issue Jul 16, 2024
Ref #9709
Ref #9708

## Description
As of #9709, git state may not include untracked files after our custom [restore-node GitHub action](https://github.com/Agoric/agoric-sdk/blob/7bde505dd23289218a19bc86f8e94bbff48b65e0/.github/actions/restore-node/action.yml), in which case the after-merge workflow job "dev-canary" will execute a [no-op `git stash` followed later by a no-op `git stash apply`](https://github.com/Agoric/agoric-sdk/blob/7bde505dd23289218a19bc86f8e94bbff48b65e0/.github/workflows/after-merge.yml#L78-L84), the latter of which exits with a non-zero status code that fails the step and thereby the containing job.

This PR fixes things to tolerate absence of any stashed changes.

### Security Considerations
No relevant change.

### Scaling Considerations
n/a

### Documentation Considerations
n/a

### Testing Considerations
It's difficult to test these kinds of changes, but the proof of the pudding is in the taste.

### Upgrade Considerations
n/a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant