Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

golangci-lint GitHub action is sometimes not triggered on PRs opened by dependabot #9566

Closed
sbueringer opened this issue Oct 18, 2023 · 22 comments
Labels
area/ci Issues or PRs related to ci help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/flake Categorizes issue or PR as related to a flaky test. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@sbueringer
Copy link
Member

sbueringer commented Oct 18, 2023

What steps did you take and what happened?

Sometimes when dependabot opens new bump PRs the golangci-lint GitHub action is not triggered

The PR is then stuck and can't be merged because it's waiting for "lint"

What did you expect to happen?

Action should be triggered as usual

Cluster API version

main

Kubernetes version

Anything else you would like to add?

Workaround is to manually edit the PR description. Potentially also works to edit the PR title but I didn't try that yet

Label(s) to be applied

/kind bug
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 18, 2023
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If CAPI contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sbueringer sbueringer added area/ci Issues or PRs related to ci and removed kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 18, 2023
@sbueringer
Copy link
Member Author

/triage accepted

@k8s-ci-robot k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Oct 18, 2023
@killianmuldoon
Copy link
Contributor

I wonder if this is down to our dependabot action somehow - that's the main diff IMO between these PRs and the ones we create manually.

This is the current action: https://github.com/kubernetes-sigs/cluster-api/blob/main/.github/workflows/pr-dependabot.yaml

It checks out the code, runs make generate and then commits the result. I don't have any idea how that could be impacting the lint action right now, but it's a first place to look.

@sbueringer
Copy link
Member Author

Maybe somehow. We should try to gather more data, e.g. under which circumstances it exactly happens (e.g. is the pr-dependabot action pushing changes in these cases)

@killianmuldoon
Copy link
Contributor

I did a bit of digging and it looks like we only have this problem on PRs which have 2 commits i.e. those that ran the github action and made an additional change. Most recently: #9564

@killianmuldoon
Copy link
Contributor

Okay - I think I found the root cause for this. When using the repo's own repository taken workflows aren't triggered:

When you use the repository's GITHUB_TOKEN to perform tasks, events triggered by the GITHUB_TOKEN, with the exception of workflow_dispatch and repository_dispatch, will not create a new workflow run. This prevents you from accidentally creating recursive workflow runs. For example, if a workflow run pushes code using the repository's GITHUB_TOKEN, a new workflow will not run even when the repository contains a workflow configured to run when push events occur.

The pr-dependabot action uses EndBug/add-and-commit to commit the generated code which warns specifically about this issue.

That's because you're checking out the repo using the built-in GITHUB_TOKEN secret: GitHub sees that the push event has been triggered by a commit generated by CI, and doesn't run any further checks to avoid unintentional check loops.

The advice is to use a Personal Access Token in the action, which I don't think we want to do. I'm not sure what the best solution to this is right now, but it's working as intended from the GitHub side.

I have the subjective feeling that this is happening more often, but I'm not sure that's right. Maybe changing the schedule for dependabot PRs to happen on different days results in more generated commits?

@killianmuldoon
Copy link
Contributor

Only curious as to why this doesn't seem to happen in the same way on the controller-runtime repo: e.g. kubernetes-sigs/controller-runtime#2521

@fabriziopandini
Copy link
Member

/kind flake
(not really a test flake but probably the best fit)
/priority important-long-term

@k8s-ci-robot k8s-ci-robot added the kind/flake Categorizes issue or PR as related to a flaky test. label Apr 11, 2024
@k8s-ci-robot
Copy link
Contributor

@fabriziopandini: The label(s) priority/important-long-term cannot be applied, because the repository doesn't have them.

In response to this:

/kind flake
(not really a test flake but probably the best fit)
/priority important-long-term

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fabriziopandini
Copy link
Member

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Apr 11, 2024
@sbueringer
Copy link
Member Author

If someone has time, might be worth to spend some more time on research to see if there is now some solution to this.
This is really really enoying (I'm usually modifying 5-10 PRs per week to trigger the lint action manually)

(cc @adityabhatia)

@fabriziopandini fabriziopandini added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label May 6, 2024
@sbueringer
Copy link
Member Author

sbueringer commented May 7, 2024

I was trying to make the following work:

  • dependabot action should trigger lint action after the commit

I made the following changes:

pr-dependabot.yaml

permissions:
  # Add
  actions: write # Allow to trigger the lint action.


    # Add this as last step
    - name: Trigger lint action
      uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea # v7.0.1
      with:
        github-token: ${{ secrets.GITHUB_TOKEN }}
        script: |
          await github.rest.actions.createWorkflowDispatch({
            owner: context.repo.owner,
            repo: context.repo.repo,
            workflow_id: 'pr-golangci-lint.yaml',
            ref: context.ref,
          })

pr-golangci-lint.yaml

on:
  # Add
  workflow_dispatch:

This actually triggers the lint action with the right branch, the problem is that the lint action is not reporting back to the PR...

@sbueringer
Copy link
Member Author

Okay spend a few hours trying a lot of stuff. I think this is simply not possible either:

@sbueringer
Copy link
Member Author

sbueringer commented May 8, 2024

This leads me to, let's get rid of the golangci-lint GitHub action. Let's instead just run it as part of the verify job. I think this won't extend the runtime of that job significantly but let's see.

I think there is basically no benefit today to run the linter in an action and the output is relatively hard to understand anyway (compared to a make lint locally)

(more generically this means we shouldn't run any mandatory checks in GitHub actions, but I think this only really applies to the lint action today)

Steps:

  1. [WIP] 🌱 Run lint as part of verify #10575
  2. Update test-infra so lint is not mandatory anymore
  3. Drop golangci-lint action in CAPI

@sbueringer
Copy link
Member Author

@killianmuldoon @chrischdi @fabriziopandini Opinions?

@chrischdi
Copy link
Member

I'm okay with that 👍

@sbueringer
Copy link
Member Author

Just had an additional idea for #10571. Let's wait until we see the impact of that. Maybe afterwards the dependabot action is very rarely run (today it's run on every bump PR)

@stmcginnis
Copy link
Contributor

I think there is basically no benefit today to run the linter in an action

The one benefit - I think - is that any lint issues found will actually have comments on the offending line of code in the PR diff view. That makes it very easy for newcomers to understand what the problem is and where to find it.

@sbueringer
Copy link
Member Author

sbueringer commented May 8, 2024

Good point. I'm not sure if that works at the moment (but I would assume it's at least fixable)

@sbueringer
Copy link
Member Author

Just realized (when discussing it with @chrischdi) that the current action implementation has the following issue:

  • Let's assume we have a PR ("PR 1") which changes the linter configuration & the linter on the PR is green
  • Now in parallel a few other PRs are merged
  • The action on PR 1 will only be re-run if it's either explicitly rebased or if some of the other PRs lead to merge conflicts with PR 1 (which then requires a rebase)

This way we recently ended up with a few linter findings on main.

@sbueringer
Copy link
Member Author

sbueringer commented Jun 17, 2024

Okay as we now tried the new dependabot configuration for a while I would close this issue as "won't fix".

Basically we now have to manually change a PR description / title once per week in the worst case. This seems okay and not worth additional effort at this point.

(In a lot of cases we don't have to do it at all, as dependabot is running go mod tiday and that seems to be enough in a lot of cases, but not always)

/close

@k8s-ci-robot
Copy link
Contributor

@sbueringer: Closing this issue.

In response to this:

Okay as we now tried the new dependabot configuration for a while I would close this issue as "won't fix".

Basically we now have to manually change a PR description / title once per week in the worst case. This seems okay and not worth additional effort at this point.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ci Issues or PRs related to ci help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/flake Categorizes issue or PR as related to a flaky test. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

No branches or pull requests

6 participants