Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged but not deleted push branch halts automation #164

Closed
plockaby opened this issue May 11, 2021 · 6 comments · Fixed by #177
Closed

Merged but not deleted push branch halts automation #164

plockaby opened this issue May 11, 2021 · 6 comments · Fixed by #177
Labels
bug Something isn't working

Comments

@plockaby
Copy link

I've got the basic image update automation configuration set up:

apiVersion: image.toolkit.fluxcd.io/v1alpha2
kind: ImageUpdateAutomation
metadata:
  name: flux-system
  namespace: flux-system
spec:
  git:
    checkout:
      ref:
        branch: main
    push:
      branch: pending
    commit:
      author:
        email: [email protected]
        name: fluxcd
      messageTemplate: '{{range .Updated.Images}}{{println .}}{{end}}'
  interval: 1m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  update:
    path: ./flux-development
    strategy: Setters

It pushes to GitHub on a branch called "pending". If I merge "pending" but do not delete pending then I end up with this error message in my logs, repeating forever. As long as this error message is happening automation updates stop happening The moment that I delete the "pending" branch from GitHub the error message goes away and automation resumes.

{"level":"error","ts":"2021-05-11T00:21:28.837Z","logger":"controller-runtime.manager.controller.imageupdateautomation","msg":"Reconciler error","reconciler group":"image.toolkit.fluxcd.io","reconciler kind":"ImageUpdateAutomation","name":"flux-system","namespace":"flux-system","error":"object not found"}

I'd also like to ask if the image automation will automatically detect changes to the image even when the tag doesn't change. For example, if I'm deploying haproxy:2.2.14 the maintainers will periodically rebuild their image when their base image changes (e.g. they rebuild 2.2.14 because of a CVE in a dependent library like SSL) and they'll release a new "version" of 2.2.14 with the same exact tag but with a different hash. Can/does flux2 pick that up and deploy the new image?

@plockaby plockaby changed the title Problem with custom Problem with custom push branch May 11, 2021
@yebyen
Copy link

yebyen commented May 11, 2021

For example, if I'm deploying haproxy:2.2.14 the maintainers will periodically rebuild their image when their base image changes (e.g. they rebuild 2.2.14 because of a CVE in a dependent library like SSL) and they'll release a new "version" of 2.2.14 with the same exact tag but with a different hash. Can/does flux2 pick that up and deploy the new image

This is not possible with any version of Flux today, since Flux works from git, there is no change to apply, and since the image metadata is the only place where Flux could look to know that something has changed, and Flux no longer does anything with image metadata as it is commonly rate limited and expensive to pull the metadata from each image tag one by one, there would not currently be any way for Flux to know if something has changed in the manner you are describing.

If you have imagePullPolicy set to Always, you can trigger this kind of update by recycling the pods with eg. kubectl rollout restart. There isn't a good way I can suggest to detect the change, but if you run the rollout restart command on a schedule (in a CronJob for example), your deploys will be receiving those updates with that interval.

The rest of your report does sound like a bug. Flux should be smart enough to catch up the pending branch with the base branch if it can be fast-forwarded (but apparently it isn't!)

@stefanprodan stefanprodan added the bug Something isn't working label May 26, 2021
@squaremo
Copy link
Member

squaremo commented May 26, 2021

I'd also like to ask if the image automation will automatically detect changes to the image even when the tag doesn't change. For example, if I'm deploying haproxy:2.2.14 the maintainers will periodically rebuild their image when their base image changes (e.g. they rebuild 2.2.14 because of a CVE in a dependent library like SSL) and they'll release a new "version" of 2.2.14 with the same exact tag but with a different hash. Can/does flux2 pick that up and deploy the new image?

See #165, which I think is exactly what you want. (EDIT: oh I see you already found it!)

@squaremo squaremo changed the title Problem with custom push branch Merged but not deleted push branch halts automation May 27, 2021
@squaremo
Copy link
Member

It pushes to GitHub on a branch called "pending". If I merge "pending" but do not delete pending then I end up with this error message in my logs, repeating forever.

I've reproduced this by following the image automation guide. That should make it easier to debug!

@squaremo
Copy link
Member

squaremo commented May 27, 2021

OK the short answer is: cloning with depth=1 screws the branching here up.

The longer answer: the "push branch" part of the API here is implemented like this:

  1. clone the repo at the checkout branch
  2. try to fetch the remote push branch into a local branch
  3. if the push branch cannot be found locally, create the push branch locally, starting at HEAD (the checkout branch)
  4. if the push branch can be found locally, check it out
  5. commit, push, etc.

In usual circumstances, after 2.) there's either a local push branch, or not. But when the clone is shallow, the ref is found in 4.) but it can't be checked out. The contract of fetch (both in git and in the gogit library) is that all objects needed to populate the history of the fetched ref are also fetched, so this feels like a bug in gogit to me. I'll try to log an issue there, for a start.

@squaremo
Copy link
Member

squaremo commented Jun 1, 2021

This issue looks pretty familiar: go-git/go-git#207 (so I won't file one myself, for now).

While that (or the yet to be diagnosed but similar problem here) is not fixed upstream in go-git, the most straight-forward thing to do would be to not use shallow clones.

@squaremo
Copy link
Member

squaremo commented Jun 2, 2021

By manipulating the tests and changing which of {go-git, libgit2} is used for each of cloning, fetching the push branch, and push to the upstream, I see this in summary:

Clone Fetch Push outcome
go-git any any many tests fail
libgit2 any any all tests pass

In other words, if libgit2 is used to clone, it all works no matter what's used for other operations; and if go-git is used to clone, at least some things will fail. So I think my comments above are incorrect, in that go-git does not uniquely have a problem with shallow clones vs fetching and pushing.

There is some code that does local operations (changing branches, creating a commit) that always uses go-git -- I have not tried reimplementing that, to see if it makes a difference. I hope not, because that would be pretty disappointing, and using libgit2 for those operations is painful.

The simplest solution seems to be just ignoring the GitImplementation field in the GitRepository specification, and using libgit2. Once using libgit2, there's no reason not to use it for fetching and pushing, since it is better than go-git once you set aside cloning (shallow clones and submodules). This will remove a lot of complexity in bookkeeping.

squaremo added a commit that referenced this issue Jun 2, 2021
source-controller/pkg/git does shallow clones when using the go-git
implementation, and apparently this causes problems when fetching a
branch that has been merged at the origin:

    #164

So far as I can tell, getting a shallow clone breaks the automation,
no matter whether go-git or libgit2 is used for operations after
cloning. So: just use libgit2 for cloning, which means non-shallow
clones; and, for fetch and push, since there's no functional
difference between the implementations for those.

Signed-off-by: Michael Bridgen <[email protected]>
squaremo added a commit that referenced this issue Jun 2, 2021
source-controller/pkg/git does shallow clones when using the go-git
implementation, and apparently this causes problems when fetching a
branch that has been merged at the origin:

    #164

So far as I can tell, getting a shallow clone breaks the automation,
no matter whether go-git or libgit2 is used for operations after
cloning. So: just use libgit2 for cloning, which means non-shallow
clones; and, for fetch and push, since there's no functional
difference between the implementations for those.

Signed-off-by: Michael Bridgen <[email protected]>
squaremo added a commit that referenced this issue Jun 2, 2021
source-controller/pkg/git does shallow clones when using the go-git
implementation, and apparently this causes problems when fetching a
branch that has been merged at the origin:

    #164

So far as I can tell, getting a shallow clone breaks the automation,
no matter whether go-git or libgit2 is used for operations after
cloning. So: just use libgit2 for cloning, which means non-shallow
clones; and, for fetch and push, since there's no functional
difference between the implementations for those.

Signed-off-by: Michael Bridgen <[email protected]>
squaremo added a commit that referenced this issue Jun 2, 2021
source-controller/pkg/git does shallow clones when using the go-git
implementation, and apparently this causes problems when fetching a
branch that has been merged at the origin:

    #164

So far as I can tell, getting a shallow clone breaks the automation,
no matter whether go-git or libgit2 is used for operations after
cloning. So: just use libgit2 for cloning, which means non-shallow
clones; and, for fetch and push, since there's no functional
difference between the implementations for those.

Signed-off-by: Michael Bridgen <[email protected]>
@squaremo squaremo mentioned this issue Jun 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants