✨ Reduce github api requests in clusterctl by querying go modules #7192

Merged

chrischdi
Member

What this PR does / why we need it:

Reduce github api requests in clusterctl

  • Removes additional cert-manager latest version detection because it always gets overwritten afterwards anyway.
  • Uses goproxy instead of github api for listing repository versions.
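For readers unfamiliar with the mechanism: a Go module proxy exposes the known versions of a module as plain text, one version per line, under `<proxy>/<module>/@v/list`. The following is a minimal sketch of that lookup, not the code from this PR. The helper names `listURL` and `parseVersionList` are hypothetical, and the sketch assumes lower-case module paths, so it skips the proxy protocol's case escaping.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// listURL builds the proxy endpoint that lists the known versions of a module.
// Assumption: modulePath is lower-case, so no "!"-escaping is applied.
func listURL(proxyBase, modulePath string) string {
	return strings.TrimSuffix(proxyBase, "/") + "/" + modulePath + "/@v/list"
}

// parseVersionList splits the plain-text response body into versions,
// dropping blank lines and surrounding whitespace.
func parseVersionList(body string) []string {
	var versions []string
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		if v := strings.TrimSpace(sc.Text()); v != "" {
			versions = append(versions, v)
		}
	}
	return versions
}

func main() {
	fmt.Println(listURL("https://proxy.golang.org", "sigs.k8s.io/cluster-api"))
	// The body here is a hand-written sample, not a real proxy response.
	fmt.Println(parseVersionList("v1.2.1\nv1.2.2\n"))
}
```

The proxy does not guarantee any ordering of the list, so a caller that wants the latest release still has to sort the versions itself.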

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Part of #3982

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 8, 2022
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 8, 2022
@chrischdi chrischdi force-pushed the pr-optimize-gh-requests branch 3 times, most recently from 5fd4b37 to cc65b30 Compare September 9, 2022 08:36
@chrischdi chrischdi changed the title 🌱 [WIP] Reduce github api requests in clusterctl 🌱 Reduce github api requests in clusterctl Sep 9, 2022
@sbueringer
Member

cc @ykakarap

Contributor

@killianmuldoon killianmuldoon left a comment

Looks awesome - do we have any idea how much this helps? Are there many fewer calls in a clusterctl init, for example?

Contributor

@ykakarap ykakarap left a comment

Just took a high-level look at this and this looks great.

I think technically a provider can be written in any language and not necessarily in Go. This would enforce that the provider be a Go project for clusterctl to be able to use the GitHub repo to perform any actions.
I don't know if there are any providers that are actually not written in Go (I don't imagine there are), but I wanted to call out the dependency this will establish.

How about adding a fallback to list the versions using the old github method? This would also cover the case if the provider's go module name does not match the github owner name (like "kubernetes-sigs" -> "sigs.k8s.io" and "kubernetes" -> "k8s.io").
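To illustrate the naming mismatch mentioned here, a minimal sketch of how a GitHub owner/repository pair could be translated into a Go module path. The lookup table and the `goModulePath` helper are hypothetical and only cover the two owners named above; every other owner is assumed to use the plain `github.com/<owner>/<repo>` path, and major-version suffixes such as `/v2` are not handled.

```go
package main

import "fmt"

// ownerToModuleHost maps GitHub owners to the vanity import host they use.
// Hypothetical table: it only contains the two examples from the discussion.
var ownerToModuleHost = map[string]string{
	"kubernetes-sigs": "sigs.k8s.io",
	"kubernetes":      "k8s.io",
}

// goModulePath guesses the Go module path for a GitHub repository.
// Unknown owners fall through to the github.com path.
func goModulePath(owner, repo string) string {
	if host, ok := ownerToModuleHost[owner]; ok {
		return host + "/" + repo
	}
	return "github.com/" + owner + "/" + repo
}

func main() {
	fmt.Println(goModulePath("kubernetes-sigs", "cluster-api"))
	fmt.Println(goModulePath("cert-manager", "cert-manager"))
}
```

A guess like this can be wrong for any provider whose module path differs from its repository location, which is the case the suggested fallback to the GitHub API would cover.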

@chrischdi
Member Author

Just took a high-level look at this and this looks great.

I think technically a provider can be written in any language and not necessarily in Go. This would enforce that the provider be a Go project for clusterctl to be able to use the GitHub repo to perform any actions. I don't know if there are any providers that are actually not written in Go (I don't imagine there are), but I wanted to call out the dependency this will establish.

How about adding a fallback to list the versions using the old github method? This would also cover the case if the provider's go module name does not match the github owner name (like "kubernetes-sigs" -> "sigs.k8s.io" and "kubernetes" -> "k8s.io").

I agree with that. It is one more 👍 for falling back to the old method, as in #7192 (comment) :-)

@chrischdi
Member Author

chrischdi commented Sep 16, 2022

Looks awesome - do we have any idea how much this helps? Are there many fewer calls in a clusterctl init, for example?

Yes:

  • The cert-manager part removes 1 request to detect the latest version, which was always made without any outcome because we pin the version anyway.
  • The list releases part: this replaces 1 request per provider by using the goproxy (if no fallback is done)

I analyzed it by adding print statements in the code.

Prior to this change for installing the AWS provider:

❯ AWS_B64ENCODED_CREDENTIALS=$(echo foo | base64) go run ./cmd/clusterctl init --infrastructure aws
Fetching providers
>> client.Repositories.ListReleases g.owner kubernetes-sigs g.repository cluster-api
>> client.Repositories.GetReleaseByTag g.owner kubernetes-sigs g.repository cluster-api tag v1.2.2
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api *assetID 77826385
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api *assetID 77826397
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api *assetID 77826387
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api *assetID 77826389
>> client.Repositories.ListReleases g.owner kubernetes-sigs g.repository cluster-api-provider-aws
>> client.Repositories.GetReleaseByTag g.owner kubernetes-sigs g.repository cluster-api-provider-aws tag v1.5.0
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api-provider-aws *assetID 74081055
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api-provider-aws *assetID 74081043
Installing cert-manager Version="v1.9.1"
>> client.Repositories.ListReleases g.owner cert-manager g.repository cert-manager
>> client.Repositories.GetReleaseByTag g.owner cert-manager g.repository cert-manager tag v1.9.1
>> client.Repositories.DownloadReleaseAsset g.owner cert-manager g.repository cert-manager *assetID 72738262
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v1.2.2" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v1.2.2" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v1.2.2" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-aws" Version="v1.5.0" TargetNamespace="capa-system"
  • 3 calls to client.Repositories.ListReleases
    • 1 for each provider (core/aws): gets replaced by goproxy request
    • 1 for cert-manager: gets removed because it's not required
  • 3 calls to client.Repositories.GetReleaseByTag
    • 1 for each provider (core / aws)
    • 1 for cert-manager
  • 7 calls to client.Repositories.DownloadReleaseAsset
    • 4 for core provider currently
    • 2 for aws provider
    • 1 for cert-manager

In total: 13 API calls to GitHub (3 + 3 + 7).
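The tally can also be reproduced mechanically from the debug output. A minimal sketch, assuming the `>> client.Repositories.<Method> ...` line format produced by the temporary print statements; `countGitHubCalls` is a hypothetical helper and not part of clusterctl:

```go
package main

import (
	"fmt"
	"strings"
)

// countGitHubCalls counts debug lines that start with
// ">> client.Repositories." and groups them by method name.
// Lines with any other prefix (for example the goproxy marker) are ignored.
func countGitHubCalls(output string) (map[string]int, int) {
	const prefix = ">> client.Repositories."
	perMethod := map[string]int{}
	total := 0
	for _, line := range strings.Split(output, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, prefix) {
			continue
		}
		// The method name runs up to the first space after the prefix.
		method := strings.TrimPrefix(line, prefix)
		if i := strings.IndexByte(method, ' '); i >= 0 {
			method = method[:i]
		}
		perMethod[method]++
		total++
	}
	return perMethod, total
}

func main() {
	// Shortened sample; paste the full init output to get the complete tally.
	output := `Fetching providers
>> client.Repositories.ListReleases g.owner kubernetes-sigs g.repository cluster-api
>> client.Repositories.GetReleaseByTag g.owner kubernetes-sigs g.repository cluster-api tag v1.2.2
>> !!goproxy!! versionClient.List g.owner kubernetes-sigs g.repository cluster-api`
	perMethod, total := countGitHubCalls(output)
	fmt.Println(perMethod, total)
}
```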

With this PR (if the providers are Go-based and supported via goproxy) we reduce the number of API calls by 3 (to 10) in this example:

❯ AWS_B64ENCODED_CREDENTIALS=$(echo foo | base64) go run ./cmd/clusterctl init --infrastructure aws
Fetching providers
>> !!goproxy!! versionClient.List g.owner kubernetes-sigs g.repository cluster-api
>> client.Repositories.GetReleaseByTag g.owner kubernetes-sigs g.repository cluster-api tag v1.2.2
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api *assetID 77826385
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api *assetID 77826397
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api *assetID 77826387
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api *assetID 77826389
>> !!goproxy!! versionClient.List g.owner kubernetes-sigs g.repository cluster-api-provider-aws
>> client.Repositories.GetReleaseByTag g.owner kubernetes-sigs g.repository cluster-api-provider-aws tag v1.5.0
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api-provider-aws *assetID 74081055
>> client.Repositories.DownloadReleaseAsset g.owner kubernetes-sigs g.repository cluster-api-provider-aws *assetID 74081043
Installing cert-manager Version="v1.9.1"
>> client.Repositories.GetReleaseByTag g.owner cert-manager g.repository cert-manager tag v1.9.1
>> client.Repositories.DownloadReleaseAsset g.owner cert-manager g.repository cert-manager *assetID 72738262
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v1.2.2" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v1.2.2" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v1.2.2" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-aws" Version="v1.5.0" TargetNamespace="capa-system"

I will go ahead and implement the fallback mechanism.

@chrischdi
Member Author

Refactored the code and moved the caching + goproxy call up by one func so the fallback mechanism seems to be cleaner.
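The shape of that refactoring can be sketched roughly as follows. This is an illustration under assumptions, not the merged code: `versionLister`, `versionCache`, and `errProxyUnavailable` are hypothetical names, and the cache is a plain in-memory map without locking.

```go
package main

import (
	"errors"
	"fmt"
)

// errProxyUnavailable is a hypothetical sentinel error used by the example.
var errProxyUnavailable = errors.New("goproxy unavailable")

// versionLister is any function that returns the versions of a repository.
type versionLister func() ([]string, error)

// versionCache remembers the versions found for a repository key.
type versionCache struct {
	versions map[string][]string
}

func newVersionCache() *versionCache {
	return &versionCache{versions: map[string][]string{}}
}

// get returns cached versions if present. Otherwise it tries the primary
// lister (goproxy) and, if that fails, the fallback lister (GitHub API).
// Only a successful result is cached.
func (c *versionCache) get(key string, primary, fallback versionLister) ([]string, error) {
	if v, ok := c.versions[key]; ok {
		return v, nil
	}
	v, err := primary()
	if err != nil {
		// The primary error is dropped on purpose: the fallback is silent.
		v, err = fallback()
		if err != nil {
			return nil, err
		}
	}
	c.versions[key] = v
	return v, nil
}

func main() {
	cache := newVersionCache()
	viaProxy := func() ([]string, error) { return nil, errProxyUnavailable }
	viaGitHub := func() ([]string, error) { return []string{"v1.5.0"}, nil }
	versions, err := cache.get("kubernetes-sigs/cluster-api-provider-aws", viaProxy, viaGitHub)
	fmt.Println(versions, err)
}
```

Keeping the cache and the fallback decision in one function means callers never need to know which source answered.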

@chrischdi
Member Author

/test help

@k8s-ci-robot
Contributor

@chrischdi: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test pull-cluster-api-build-main
  • /test pull-cluster-api-e2e-main
  • /test pull-cluster-api-test-main
  • /test pull-cluster-api-test-mink8s-main
  • /test pull-cluster-api-verify-main

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-apidiff-main
  • /test pull-cluster-api-e2e-full-main
  • /test pull-cluster-api-e2e-informing-ipv6-main
  • /test pull-cluster-api-e2e-informing-main
  • /test pull-cluster-api-e2e-workload-upgrade-1-25-latest-main

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-apidiff-main
  • pull-cluster-api-build-main
  • pull-cluster-api-e2e-informing-ipv6-main
  • pull-cluster-api-e2e-informing-main
  • pull-cluster-api-e2e-main
  • pull-cluster-api-test-main
  • pull-cluster-api-test-mink8s-main
  • pull-cluster-api-verify-main

In response to this:

/test help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@chrischdi
Member Author

/test pull-cluster-api-apidiff-main
/test pull-cluster-api-e2e-full-main
/test pull-cluster-api-e2e-informing-ipv6-main
/test pull-cluster-api-e2e-informing-main
/test pull-cluster-api-e2e-workload-upgrade-1-25-latest-main

@chrischdi
Member Author

/test pull-cluster-api-e2e-full-main

Member

@fabriziopandini fabriziopandini left a comment

This looks great!
I will handle GOPROXY=off or GOPROXY=direct by silently falling back to the GitHub API. Also, I will document that we rely on goproxy to reduce API calls and that we respect the GOPROXY env var in https://cluster-api.sigs.k8s.io/clusterctl/overview.html?highlight=github#avoiding-github-rate-limiting, and reference the same paragraph from https://cluster-api.sigs.k8s.io/clusterctl/provider-contract.html?highlight=github#creating-a-provider-repository-on-github.
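A minimal sketch of how such handling could look, assuming only the first GOPROXY entry is inspected. `proxyURL` is a hypothetical helper; the default used for an unset variable mirrors the first entry of Go's own default, `https://proxy.golang.org,direct`.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// defaultProxy is used when GOPROXY is unset.
const defaultProxy = "https://proxy.golang.org"

// proxyURL returns the proxy to query, or ok=false when the caller should
// fall back to the GitHub API. Assumption: only the first entry matters.
func proxyURL(goproxy string) (url string, ok bool) {
	if goproxy == "" {
		return defaultProxy, true
	}
	// GOPROXY entries are separated by "," or "|".
	entries := strings.FieldsFunc(goproxy, func(r rune) bool {
		return r == ',' || r == '|'
	})
	if len(entries) == 0 {
		return "", false
	}
	first := strings.TrimSpace(entries[0])
	if first == "off" || first == "direct" {
		return "", false
	}
	return first, true
}

func main() {
	if url, ok := proxyURL(os.Getenv("GOPROXY")); ok {
		fmt.Println("listing versions via", url)
	} else {
		fmt.Println("falling back to the GitHub API")
	}
}
```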

@fabriziopandini
Member

/retitle ✨ Reduce github api requests in clusterctl by querying go modules

@k8s-ci-robot k8s-ci-robot changed the title 🌱 Reduce github api requests in clusterctl ✨ Reduce github api requests in clusterctl by querying go modules Sep 22, 2022
@chrischdi chrischdi force-pushed the pr-optimize-gh-requests branch 2 times, most recently from 0101c0b to c88740e Compare September 23, 2022 12:06
@chrischdi
Member Author

/test pull-cluster-api-apidiff-main
/test pull-cluster-api-e2e-full-main
/test pull-cluster-api-e2e-informing-ipv6-main
/test pull-cluster-api-e2e-informing-main
/test pull-cluster-api-e2e-workload-upgrade-1-25-latest-main

@chrischdi chrischdi force-pushed the pr-optimize-gh-requests branch 2 times, most recently from 0210a05 to 982bb18 Compare September 23, 2022 12:15
@chrischdi
Member Author

/test pull-cluster-api-apidiff-main
/test pull-cluster-api-e2e-full-main
/test pull-cluster-api-e2e-informing-ipv6-main
/test pull-cluster-api-e2e-informing-main
/test pull-cluster-api-e2e-workload-upgrade-1-25-latest-main

@chrischdi chrischdi force-pushed the pr-optimize-gh-requests branch from 982bb18 to 87bcb54 Compare September 26, 2022 06:18
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 26, 2022
@chrischdi chrischdi force-pushed the pr-optimize-gh-requests branch from 87bcb54 to ea2e4e2 Compare September 27, 2022 06:02
@chrischdi
Member Author

@fabriziopandini / @sbueringer : friendly reminder, this one is still around :-) It may be a nice small improvement, although it may already be too late for v1.3.

@sbueringer
Member

/test pull-cluster-api-e2e-full-main
/test pull-cluster-api-e2e-workload-upgrade-1-25-latest-main

Member

@sbueringer sbueringer left a comment

A few nits

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 15, 2022
@chrischdi chrischdi force-pushed the pr-optimize-gh-requests branch from 1fe0b50 to f54300e Compare November 15, 2022 08:25
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 15, 2022
@chrischdi chrischdi force-pushed the pr-optimize-gh-requests branch from f54300e to 7c1df3a Compare November 15, 2022 08:26
@fabriziopandini
Member

Only one nit from my side.
Can we document in https://cluster-api.sigs.k8s.io/clusterctl/commands/init.html#provider-repositories that clusterctl uses the Go module proxy to detect available versions without calling the GitHub API, and that users can control or disable this behavior by setting the GOPROXY environment variable?

A similar note should go into https://cluster-api.sigs.k8s.io/clusterctl/provider-contract.html#creating-a-provider-repository-on-github and probably in https://cluster-api.sigs.k8s.io/clusterctl/overview.html#avoiding-github-rate-limiting

* Removes additional cert-manager latest version detection because it always gets overwritten.
* Uses goproxy instead of github api for listing repository versions.
@chrischdi chrischdi force-pushed the pr-optimize-gh-requests branch from 7c1df3a to f7db0c7 Compare November 15, 2022 11:22
@k8s-ci-robot
Contributor

@chrischdi: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
pull-cluster-api-apidiff-main | f7db0c7 | link | false | /test pull-cluster-api-apidiff-main

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@sbueringer
Member

Thx!!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 15, 2022
@sbueringer
Member

/assign @fabriziopandini

@fabriziopandini
Member

Great work!
/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fabriziopandini

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 15, 2022
@k8s-ci-robot k8s-ci-robot merged commit 5ff76fb into kubernetes-sigs:main Nov 15, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.3 milestone Nov 15, 2022
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.