Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REQUEST: Create a Github token for Cluster API jobs #4165

Closed
killianmuldoon opened this issue Apr 14, 2023 · 18 comments
Closed

REQUEST: Create a Github token for Cluster API jobs #4165

killianmuldoon opened this issue Apr 14, 2023 · 18 comments

Comments

@killianmuldoon
Copy link
Contributor

Organization or Repo

kubernetes-sigs/cluster-api

User affected

No response

Describe the issue

We're having an issue with rate-limits in some of the Cluster API end-to-end tests running on Prow. Adding a github token would make the tests much less flaky.

The token only requires basic read permissions in order to prevent rate-limiting.

@sbueringer

@killianmuldoon
Copy link
Contributor Author

@ameukam I see you worked on something similar here - kubernetes/k8s.io#4259. Any insight into how this should work?

@ameukam
Copy link
Member

ameukam commented Apr 27, 2023

Do you have an example of job failing because of rate-limit ?

we have a proxy to cache Github tokens (ghproxy.default.cluster.local). Maybe we can use that ?

@killianmuldoon
Copy link
Contributor Author

Here's a link to a failing job - https://storage.googleapis.com/k8s-triage/index.html?job=.*-cluster-api-.*&xjob=.*-provider-.*#248b18d3604f56466fe3

I think the reason we were thinking about using a secret is that there's already an existing cluster-lifecycle, but we don't know what permissions it has or where it's configured. More info on that in this slack thread: https://kubernetes.slack.com/archives/C09QZ4DQB/p1681481907191729

How would we use the ghproxy?

@sbueringer
Copy link
Member

sbueringer commented Apr 27, 2023

I was wondering if using ghproxy is an option. To me it felt like depending on an implementation detail of Prow.

Basically clusterctl - our Cluster API cli tool - is pulling files from GitHub releases. But there are more places where we call the GitHub API.

Ideally we would prefer testing clusterctl like our users are using it, by interacting with the normal GitHub API.

I think we would probably also have to make a bunch of adjustments to clusterctl to be able to talk to ghproxy instead of the normal GitHub endpoints.

@sbueringer
Copy link
Member

We just saw that a bunch of jobs across Prow are using their own GitHub tokens, so that felt easier.

@killianmuldoon killianmuldoon changed the title Create a Github token for Cluster API jobs REQUEST: Create a Github token for Cluster API jobs May 3, 2023
@ameukam
Copy link
Member

ameukam commented May 10, 2023

I'm not sure about the usage of a GitHub token in order to solve the rate limit issue. The prowjob will pull github repos using anonymous auth through HTTPS to run tests. A GitHub token is only used when there is an authentication step needed to the Github API.

@chrischdi
Copy link
Member

In CAPI's case it is not the prow part, pulling the github repo which fails.

The test itself (after Prow pulled the repo and started the test) has code (in cluster API's CLI) which fetches artifacts from Github (e.g. yaml's from cert-manager releases). And there is the known issue that this may hit the rate-limit of Github.

The code in CAPI is able to leverage a github token (via the GITHUB_TOKEN environment variable) and by that bypass the anonymous rate-limit.

@killianmuldoon
Copy link
Contributor Author

After moving to the EKS community cluster this issue has become much more disruptive, and means CAPI's tests can't be moved to the community cluster as previously planned see kubernetes-sigs/cluster-api#8426 (comment)

@ameukam - what's the process for providing a Github token on the EKS community cluster?

@richardcase
Copy link
Contributor

We are running into rate limiting issues with the e2e tests for CAPA as well for the same reasons.

It would be great if there was a GitHub token available to the tests via a secret (or boskos) in the prow cluster.

We're using the EKS Prow cluster for all our e2e tests.

@upodroid
Copy link
Member

upodroid commented Aug 16, 2023

If you were using anonymous auth to download release blobs or any other operation, the problem would be magnified in the EKS Build Cluster, given that it uses NAT Gateways for egress(as in a smaller pool of egress IPs that hit the rates much quicker).

Also, clones don't count against API rate limits. https://github.com/orgs/community/discussions/44515

https://github.com/kubernetes/k8s.io/blob/main/infra/aws/terraform/prow-build-cluster/vpc.tf

@sbueringer
Copy link
Member

sbueringer commented Aug 16, 2023

Makes sense. We already had some flakes on the other Prow cluster. I think the only good solution for us is a GitHub token

@AverageMarcus
Copy link
Member

We're also seeing this issue a lot in the tests for image-builder.

I've noticed that the cluster-api-operator jobs seem to have a GITHUB_TOKEN defined - https://github.com/kubernetes/test-infra/blob/5cebb74d87a113a8234e87029fc320ca46c15489/config/jobs/kubernetes-sigs/cluster-api-operator/cluster-api-operator-presubmits-main.yaml#L160-L164

Is this something we could reuse? Or should we set up per-project tokens?

@killianmuldoon
Copy link
Contributor Author

Is this something we could reuse? Or should we set up per-project tokens?

I'm not sure we can reuse this token in the EKS cluster as I'm not sure it even exists there. In any case I think it would be good to get a new token created per project that individual projects can control.

@chrischdi
Copy link
Member

chrischdi commented Aug 25, 2023

xref: kubernetes/test-infra#30501 (comment)

with regards to getting a token:

@richardcase
Copy link
Contributor

Having a new read-only token we could use would really help with the e2e tests. Without it, I think we'd have to consider moving the CAPA jobs back to the GCP Prow cluster.

@killianmuldoon
Copy link
Contributor Author

I think we can close this as I'm no longer seeing these flakes in the CAPI e2e tests. If someone wants to reopen though, please do.

/close

@k8s-ci-robot
Copy link
Contributor

@killianmuldoon: Closing this issue.

In response to this:

I think we can close this as I'm no longer seeing these flakes in the CAPI e2e tests. If someone wants to reopen though, please do.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants