Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid duplicating CI runs on pull requests. #3950

Closed
wants to merge 1 commit into from
Closed

Conversation

sengi
Copy link
Contributor

@sengi sengi commented Apr 13, 2023

The CI workflow for devdocs really hammers the GitHub API because of the way devdocs works currently — sometimes so much that it fails because of rate-limiting — so it's worth avoiding running duplicate jobs.

This is the simplest way to avoid duplicate workflow runs; the only downside is that we'll no longer automatically trigger on third-party pull requests — but our process for handling external contributions already covers that.

If we want to make it automatic for external contributions in future, that's entirely doable — just involves some slightly more fancy YAML (hence not doing it in this PR, because I'm already two yaks deep and I just want to stop the tests from flaking).

The CI workflow for devdocs really hammers the GitHub API because of the way devdocs works currently — sometimes so much that it fails because of rate-limiting — so it's worth avoiding running duplicate jobs.

This is the simplest way to avoid duplicate workflow runs; the only downside is that we'll no longer automatically trigger on third-party pull requests — but our [process for handling external contributions](https://docs.publishing.service.gov.uk/manual/merge-pr.html#a-change-from-an-external-contributor) already covers that.

If we want to make it automatic for external contributions in future, that's entirely doable — just involves some slightly more fancy YAML.
@sengi sengi enabled auto-merge April 13, 2023 21:22
@ChrisBAshton
Copy link
Contributor

Quick drive-by comment - we did once cut down to one run instead of two, but that got reverted in #3818 for reasons outlined in the PR. In that instance, it was push that was removed (as opposed to pull_request here) so I'd just want to double-check that we're not walking into the same situation as before 👀

@kevindew
Copy link
Member

Yeah ends up similar situation as previous PR. If we drop pull_request we also lose testing of a branch merged into master, as push will just test the specific commit.

This also goes against what we have documented as our practice for GitHub Actions in: https://docs.publishing.service.gov.uk/manual/test-and-build-a-project-with-github-actions.html#when-the-ci-workflow-should-run as per RFC 123

It is becoming increasing clear we need a new agreed approach as all the CI files replatforming created don't use the documented approach, so we have a mixture across our stack.

It's clear that people don't find push and pull_request intuitive given this being another time the approach is questioned in this repo. Though I think rather than changing it in just this repo we should work towards updating the docs with an agreed approach and then proliferating that out.

The compromise I'd suggest we take is we have:

  • push (on main branch only) - needed to build a merge into main
  • pull_request - runs when a pull request is opened, tests against a merge of main branch not the individual commit, will not run until someone opens a PR
  • workflow_call - can be run manually by someone to run CI against an individual commit prior to a PR, since we know from @theseanything's poll that this is a common use case

I imagine that'd cover all the use cases and remove the duplicate run concern

@theseanything
Copy link
Contributor

@kevindew - was just a bit confused about when you said:

If we drop pull_request we also lose testing of a branch merged into master, as push will just test the specific commit.

Do you mean if we've only got push enabled, it doesn't trigger for merge commits on the main branch?

@kevindew
Copy link
Member

kevindew commented Apr 14, 2023

@kevindew - was just a bit confused about when you said:

If we drop pull_request we also lose testing of a branch merged into master, as push will just test the specific commit.

Do you mean if we've only got push enabled, it doesn't trigger for merge commits on the main branch?

Nope, that does get triggered then

What I meant that for a given commit push and pull_request run against two different targets. push runs against the exact commit you pushed up whereas pull_request runs against the commit hash being merged into default branch (i.e. same behaviour of: govuk-jenkinslib)

@sengi sengi disabled auto-merge April 17, 2023 10:07
@sengi
Copy link
Contributor Author

sengi commented Apr 17, 2023

Right now the situation is pretty bad — I was unable to make a simple documentation change last week because the tests started failing because all the duplicate runs caused us to hit a GitHub ratelimit.

I understand there's plenty more room for improvement here, but it's important that we don't let the lack of an ideal solution prevent us from fixing real, practical problems that we have right now.

push runs against the exact commit you pushed up whereas pull_request runs against the commit hash being merged into default branch

Aren't those always going to be the same, provided the branch is reasonably up-to-date? (and there's an option to require that, if need be)

@sengi
Copy link
Contributor Author

sengi commented Apr 17, 2023

I definitely want to understand if this change would cause some important regression that I've missed. If not though, I think we need to not let perfect be the enemy of good — otherwise people are gonna be discouraged from improving documentation.

@sengi
Copy link
Contributor Author

sengi commented Apr 17, 2023

+1 to adding a manual trigger (workflow_dispatch) btw — I considered doing that in this PR but I wanted to keep this as simple as possible as it was supposed to be a quick fix for a broken situation. Oh well.

@kevindew
Copy link
Member

push runs against the exact commit you pushed up whereas pull_request runs against the commit hash being merged into default branch

Aren't those always going to be the same, provided the branch is reasonably up-to-date? (and there's an option to require that, if need be)

They're only the same if the branch is ahead of main, otherwise it's the result of the merge.

I definitely want to understand if this change would cause some important regression that I've missed. If not though, I think we need to not let perfect be the enemy of good — otherwise people are gonna be discouraged from improving documentation.

The main problem I see with this is it creates a fresh inconsistency, so we will now have a third approach to GitHub Action CI triggering and it again deviates from the documented approach. I can understand your frustration, but I think just moving ahead with this makes us further away from consistency. If you do just want to get something done quick it'd at least make more sense to me to use the convention used in app CI files rather than create a new one.

I think we just need to agree a team that is responsible for proposing an approach (Platform Reliability or Replatforming), getting some feedback and documenting the agreement

@sengi
Copy link
Contributor Author

sengi commented Apr 17, 2023

I think we need to be careful about our overall goal here. What problem are we trying to solve via "consistency"?

@kevindew
Copy link
Member

I think we need to be careful about our overall goal here. What problem are we trying to solve via "consistency"?

A quick list I'd have is: surprise when working in different projects, superfluous discussions about approaches (like this) because there's no consistency, ability to quickly tell when project is configured "right" or "wrong"

Can you think why we wouldn't want to be consistent in our approach to when CI runs?

kevindew added a commit that referenced this pull request Apr 18, 2023
This has been produced following the discussiona in:
#3950 and
#3816

This is to try resolve a consistency issue we have in GOV.UK where we
have an agreed approach to using GitHub Actions documented which doesn't
reflect the configurations used across GOV.UK.

Since we agreed the approach of running CI GitHub Action on `[push,
pull_request]` and adopted it has become clear that this is not
unanimously popular and, due to the appearance of duplicate jobs, can
imply misconfiguration.

The motivation to document a new approach is to try form consensus on an
approach that can be good enough for most cases and cover common
concerns. This can then reduce the time we spend debating how to
configure individual projects.

Aspects I've wanted to cover in this are:

- Stay broadly consistent with approach replatforming popularised since
  that is what is predominantly in use
- Remove the appearance of duplicate runs by default in pull requests
- Allow CI for external contributors (our [current process for dealing
  with them][1] seems more of a legacy Jenkins concern)
- Offer an approach for running CI without opening a PR, as that seems
  to be a common point of contention (it's a key part of some workflows,
  yet not a concern for others).

A risk of this approach is that the use of workflow_dispatch is somewhat
ignored and thus unreliable. This doesn't seem a huge concern to me as I
imagine this would be fixed if it's used frequently or be a sign that it
can be removed.

[1]: https://docs.publishing.service.gov.uk/manual/merge-pr.html#a-change-from-an-external-contributor
@kevindew
Copy link
Member

@sengi @ChrisBAshton @theseanything I've had a go at implementing and documenting a consistent approach. Could you take a look and let me know if you have concerns with it? If not, I'll share more broadly.

@sengi
Copy link
Contributor Author

sengi commented Apr 18, 2023

What do I need to do to make this change acceptable?

If it's a question of keeping all the repos the same then I don't mind applying the same change everywhere else.

If there's a problem with the logic of the change (leaving aside the question of the other repos for a moment) then what would be an acceptable fix?

TIA! 🙇

@sengi
Copy link
Contributor Author

sengi commented Apr 18, 2023

Oh, thanks! We commented at almost exactly the same time. Snap! 😅 I'll take a look now :)

@sengi
Copy link
Contributor Author

sengi commented Apr 18, 2023

Closing in favour of #3961, which solves the problem in a more general way that we can reuse elsewhere (thanks @kevindew!)

@sengi sengi closed this Apr 18, 2023
@sengi sengi deleted the sengi/fix-dup-ci-runs branch April 18, 2023 12:00
kevindew added a commit that referenced this pull request Apr 18, 2023
This has been produced following the discussiona in:
#3950 and
#3816

This is to try resolve a consistency issue we have in GOV.UK where we
have an agreed approach to using GitHub Actions documented which doesn't
reflect the configurations used across GOV.UK.

Since we agreed the approach of running CI GitHub Action on `[push,
pull_request]` and adopted it has become clear that this is not
unanimously popular and, due to the appearance of duplicate jobs, can
imply misconfiguration.

The motivation to document a new approach is to try form consensus on an
approach that can be good enough for most cases and cover common
concerns. This can then reduce the time we spend debating how to
configure individual projects.

Aspects I've wanted to cover in this are:

- Stay broadly consistent with approach replatforming popularised since
  that is what is predominantly in use
- Remove the appearance of duplicate runs by default in pull requests
- Allow CI for external contributors (our [current process for dealing
  with them][1] seems more of a legacy Jenkins concern)
- Offer an approach for running CI without opening a PR, as that seems
  to be a common point of contention (it's a key part of some workflows,
  yet not a concern for others).

A risk of this approach is that the use of workflow_dispatch is somewhat
ignored and thus unreliable. This doesn't seem a huge concern to me as I
imagine this would be fixed if it's used frequently or be a sign that it
can be removed.

[1]: https://docs.publishing.service.gov.uk/manual/merge-pr.html#a-change-from-an-external-contributor
kevindew added a commit that referenced this pull request Apr 21, 2023
This has been produced following the discussiona in:
#3950 and
#3816

This is to try resolve a consistency issue we have in GOV.UK where we
have an agreed approach to using GitHub Actions documented which doesn't
reflect the configurations used across GOV.UK.

Since we agreed the approach of running CI GitHub Action on `[push,
pull_request]` and adopted it has become clear that this is not
unanimously popular and, due to the appearance of duplicate jobs, can
imply misconfiguration.

The motivation to document a new approach is to try form consensus on an
approach that can be good enough for most cases and cover common
concerns. This can then reduce the time we spend debating how to
configure individual projects.

Aspects I've wanted to cover in this are:

- Stay broadly consistent with approach replatforming popularised since
  that is what is predominantly in use
- Remove the appearance of duplicate runs by default in pull requests
- Allow CI for external contributors (our [current process for dealing
  with them][1] seems more of a legacy Jenkins concern)
- Offer an approach for running CI without opening a PR, as that seems
  to be a common point of contention (it's a key part of some workflows,
  yet not a concern for others).

A risk of this approach is that the use of workflow_dispatch is somewhat
ignored and thus unreliable. This doesn't seem a huge concern to me as I
imagine this would be fixed if it's used frequently or be a sign that it
can be removed.

[1]: https://docs.publishing.service.gov.uk/manual/merge-pr.html#a-change-from-an-external-contributor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants