clusterctl upgrades fail if a component is branched but not released #8966

Closed
mnaser opened this issue Jul 5, 2023 · 14 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@mnaser

mnaser commented Jul 5, 2023

What steps did you take and what happened?

The upgrade process appears to fail while checking for new releases when a component has been branched but not yet released. You can reproduce it with the following steps:

Setup:

kind create cluster
sudo curl -Lo /usr/local/bin/clusterctl https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.4.4/clusterctl-linux-amd64
sudo chmod +x /usr/local/bin/clusterctl
clusterctl init \
  --core cluster-api:v1.4.4 \
  --bootstrap kubeadm:v1.4.4 \
  --control-plane kubeadm:v1.4.4 \
  --infrastructure openstack:v0.7.1

Error:

❯ clusterctl upgrade plan
Checking cert-manager version...
Cert-Manager is already up to date

Checking new release availability...
Error: failed to read "metadata.yaml" from the repository for provider "infrastructure-openstack": failed to get GitHub release v0.8.0-alpha.0: failed to read release "v0.8.0-alpha.0": GET https://api.github.com/repos/kubernetes-sigs/cluster-api-provider-openstack/releases/tags/v0.8.0-alpha.0: 404 Not Found []

There was a suggestion by @mdbooth that it might have been fixed in #8253; however, I upgraded clusterctl to 1.5, which should include this fix:

sudo curl -Lo /usr/local/bin/clusterctl https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.5.0-beta.0/clusterctl-darwin-arm64
sudo chmod +x /usr/local/bin/clusterctl

and alas:

❯ clusterctl upgrade plan
Checking cert-manager version...
Cert-Manager is already up to date

Checking new release availability...
Error: failed to read "metadata.yaml" from the repository for provider "infrastructure-openstack": release not found for version v0.8.0-alpha.0, please retry later or set "GOPROXY=off" to get the current stable release: 404 Not Found

What did you expect to happen?

The clusterctl command should not fail when a project has been branched but not yet released.

Cluster API version

Tried both:

clusterctl version: &version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.0-beta.0", GitCommit:"2b0dd2eda7ba9c41e481a1c395d9dfef637dca19", GitTreeState:"clean", BuildDate:"2023-06-27T15:59:44Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"darwin/arm64"}
clusterctl version: &version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.4", GitCommit:"00dbf7b9f6322d7ebd06ae2efa703b23354dd37d", GitTreeState:"clean", BuildDate:"2023-06-27T15:57:57Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"darwin/arm64"}

Kubernetes version

No response

Anything else you would like to add?

No response

Label(s) to be applied

/kind bug
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 5, 2023
@killianmuldoon
Contributor

/triage accepted

The error: Error: failed to read "metadata.yaml" from the repository for provider "infrastructure-openstack": release not found for version v0.8.0-alpha.0, please retry later or set "GOPROXY=off" to get the current stable release: 404 Not Found is what's expected at this point. Have you tried running the command with GOPROXY=off as described?

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 5, 2023
@sbueringer
Member

Q: Is this related to #7889 in any way?

@killianmuldoon
Contributor

Q: Is this related to #7889 in any way?

It's a combination of this and the fix from #8253 AFAICT.

@mnaser
Author

mnaser commented Jul 5, 2023

Indeed, using the following works:

GOPROXY=off clusterctl upgrade plan

However, it feels like it should fall back automatically, perhaps? In this case, it has broken all of our CI and new deployments that test Cluster API upgrades.
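
Until clusterctl falls back on its own, a CI job could wrap the call itself. The following is a minimal sketch and not part of clusterctl: `clusterctl_with_fallback` is a hypothetical helper name, and it assumes the failure can be recognized by the "release not found for version" text that v1.5.0-beta.0 prints above. It retries once with GOPROXY=off and passes any other failure through unchanged.

```shell
# Hypothetical CI helper (not shipped with clusterctl).
# Runs `clusterctl "$@"`; if that fails with the "release not found" error,
# retries once with GOPROXY=off. Prints the output of the last attempt and
# returns its exit status.
clusterctl_with_fallback() {
  local output status
  # First attempt with the caller's environment (goproxy enabled by default).
  if output=$(clusterctl "$@" 2>&1); then
    status=0
  else
    status=$?
  fi
  if [ "$status" -ne 0 ]; then
    case "$output" in
      # Assumption: this substring identifies the tag-without-release case.
      *"release not found for version"*)
        echo "release not found via goproxy; retrying with GOPROXY=off" >&2
        if output=$(GOPROXY=off clusterctl "$@" 2>&1); then
          status=0
        else
          status=$?
        fi
        ;;
    esac
  fi
  printf '%s\n' "$output"
  return "$status"
}
```

In a pipeline this would be invoked as `clusterctl_with_fallback upgrade plan`. Matching on the message text is brittle, since the wording differs between clusterctl versions (v1.4.4 prints a different error above), so this is a stopgap rather than a fix.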

@vincepri
Member

vincepri commented Jul 6, 2023

Catching up on this thread: @killianmuldoon, was the original fix put in place because we try to check goproxy tags before asking GitHub for a release? I assume this helps with rate limiting?

@killianmuldoon
Contributor

Exactly - the original PR introducing Go modules for this is #7192, and the issue covering GitHub rate limiting is #3982.

It comes down to balancing the clusterctl UX for users who aren't using a GitHub token. Currently, by default (with goproxy enabled), there's a window between a tag being created and a release being published in which running clusterctl init without specifying a version fails. Without goproxy enabled, there's a constant chance of failures related to the GitHub rate limit.
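
That window can be observed from the outside by comparing what the Go module proxy lists with what GitHub has published. A rough sketch, assuming the public proxy at proxy.golang.org, the module path sigs.k8s.io/cluster-api-provider-openstack, and the GitHub endpoint shown in the error message in this issue; the function names are made up for illustration, and this is not how clusterctl itself is implemented:

```shell
# Versions known to the Go module proxy, one per line.
# $1: Go module path (assumed: sigs.k8s.io/cluster-api-provider-openstack)
goproxy_versions() {
  curl -fsSL "https://proxy.golang.org/$1/@v/list"
}

# HTTP status code of GitHub's "get release by tag" endpoint.
# $1: owner/repo, $2: tag
github_release_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    "https://api.github.com/repos/$1/releases/tags/$2"
}

# Read tags on stdin and print those that have no published release (404).
# $1: owner/repo
# $2: status command to use (optional, defaults to github_release_status)
unreleased_tags() {
  local repo="$1" check="${2:-github_release_status}" tag
  while IFS= read -r tag || [ -n "$tag" ]; do
    [ -n "$tag" ] || continue
    if [ "$("$check" "$repo" "$tag")" = "404" ]; then
      printf '%s\n' "$tag"
    fi
  done
}
```

A possible invocation is `goproxy_versions sigs.k8s.io/cluster-api-provider-openstack | unreleased_tags kubernetes-sigs/cluster-api-provider-openstack`. It makes one unauthenticated GitHub API request per tag, so it runs into the same rate limit discussed here; only a 404 is reported as a missing release, any other status is ignored.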

@vincepri
Member

vincepri commented Jul 6, 2023

It sounds like it would make sense to try to retrieve the tag with goproxy=off/direct if there is one set?

@CecileRobertMichon
Contributor

See similar reports in kubernetes-sigs/cluster-api-provider-azure#3679

@sbueringer
Member

sbueringer commented Jul 7, 2023

Q: Is this related to #7889 in any way?

It's a combination of this and the fix from #8253 AFAICT.

Those are the same issue right? (Issue + fix PR). I assume once we have a release with 8253 the issue should be resolved? (we just didn't cherry-pick it in time for the last 1.4 release)

@killianmuldoon
Contributor

Those are the same issue right? (Issue + fix PR). I assume once we have a release with 8253 the issue should be resolved? (we just didn't cherry-pick it in time for the last 1.4 release)

This problem still occurs with 1.5-beta.0. The error reported in this issue is Error: failed to read "metadata.yaml" from the repository for provider "infrastructure-openstack": release not found for version v0.8.0-alpha.0, please retry later or set "GOPROXY=off" to get the current stable release: 404 Not Found which is from the fix in #8253.

@sbueringer
Member

sbueringer commented Jul 7, 2023

Ah I see. I somehow remembered that we fixed the issue by ignoring this release and picking another one instead.

But if I look at the PR now we just changed the error message? (which is probably why @chrischdi wrote "Partially fixing #7889" in the PR description)

So basically we never implemented:

Or try to adjust our code to use the latest - 1 tag instead or try to fallback to detect without goproxy immediately (requires some refactoring)
#7889 (comment)
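
As an illustration of the "latest - 1" idea quoted above, the selection could look roughly like the sketch below. It is not clusterctl code: `first_released_version` is a hypothetical helper, the input is assumed to be sorted newest first already, and the check command passed in stands for whatever lookup decides that a release has been published.

```shell
# Hypothetical helper: read candidate versions on stdin (newest first) and
# print the first one for which the check command succeeds.
# "$@": check command; it is called with the version appended as its last
#       argument and must exit 0 when a release exists for that version.
# Returns 1 if no candidate passes the check.
first_released_version() {
  local version
  while IFS= read -r version || [ -n "$version" ]; do
    [ -n "$version" ] || continue
    # Detach stdin so the check cannot consume the remaining candidates.
    if "$@" "$version" </dev/null; then
      printf '%s\n' "$version"
      return 0
    fi
  done
  return 1
}
```

With the versions from this issue, feeding it v0.8.0-alpha.0 followed by v0.7.1 and a check that fails for the unreleased alpha tag would select v0.7.1. Sorting the candidates is deliberately left to the caller, because a plain version sort does not order pre-release tags the way semantic versioning does.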

@killianmuldoon
Contributor

Yeah - I didn't think there was any consensus on that fix, but the bug seems common enough that someone might want to fix it. I reopened the original issue and added help wanted. I think we should close this issue as a duplicate.

/close

In favor of the re-opened issue at #7889

@k8s-ci-robot
Contributor

@killianmuldoon: Closing this issue.


@sbueringer
Member

Yup makes sense, thx. I'm mostly just trying to figure out how it all fits together :)
