
Fix Cluster labels in OOT cloud provider templates #2462

Merged

Conversation

@CecileRobertMichon (Contributor, Author):

Co-Authored-By: Mark Rossetti [email protected]

What type of PR is this?
/kind failing-test

What this PR does / why we need it: The ClusterResourceSets for the external cloud provider (CCM) were removed as part of #2209. This PR removes the leftover cluster labels, which are no longer needed to match Clusters. It also fixes a bug in the external cloud provider CI version template where the calico CNI label was being overwritten by the external cloud provider patch, causing Calico not to be installed.
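For context, a ClusterResourceSet applies its resources only to Clusters whose labels match its clusterSelector. A minimal sketch of the pattern that was removed in #2209 (resource names here are hypothetical):

apiVersion: addons.cluster.x-k8s.io/v1beta1
kind: ClusterResourceSet
metadata:
  name: crs-ccm                # hypothetical name
  namespace: default
spec:
  clusterSelector:
    matchLabels:
      ccm: external            # only Clusters carrying this label receive the resources
  resources:
    - name: cloud-provider-azure-addon   # hypothetical ConfigMap holding the CCM manifests
      kind: ConfigMap

Once the ClusterResourceSet is gone, a ccm: external label on a Cluster no longer selects anything, which is why it can safely be dropped.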

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2341

Special notes for your reviewer:

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

Fix Cluster labels in OOT cloud provider templates

@k8s-ci-robot added labels on Jul 7, 2022: release-note (Denotes a PR that will be considered when it comes time to generate release notes), kind/failing-test (Categorizes issue or PR as related to a consistently or frequently failing test), cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA), size/S (Denotes a PR that changes 10-29 lines, ignoring generated files).
@@ -2,8 +2,7 @@ apiVersion: cluster.x-k8s.io/v1beta1
 kind: Cluster
 metadata:
   labels:
-    ccm: external
-    cni: calico
+    cni: ${CLUSTER_NAME}-calico
@CecileRobertMichon (Contributor, Author) commented on the diff:

this is what fixes #2341

A Contributor replied:

So the patch above was overriding these values?

I searched for the label ccm: external and don't see it used anywhere. Are we sure the cloud-provider test code doesn't use it?

@CecileRobertMichon (Contributor, Author) replied on Jul 7, 2022:

Usage was removed in #2209. cc @jackfrancis to confirm.

So the patch above was overriding these values?

Yes, the external cloud provider patch was modifying the labels and overwriting the cni one.
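To make the overwrite concrete, here is a hypothetical sketch (not the actual CAPZ flavor files) of how a kustomize-style label patch clobbers a base label: for metadata.labels, keys set in the patch replace the same keys in the base Cluster object.

# base template (excerpt)
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  labels:
    cni: ${CLUSTER_NAME}-calico

# external cloud provider patch (hypothetical excerpt)
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  labels:
    ccm: external
    cni: calico              # re-declares cni, so the rendered template loses ${CLUSTER_NAME}-calico

Removing the stale keys from the patch leaves the base cni label intact, so whatever installs Calico by selecting on that label matches again.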

@CecileRobertMichon (Contributor, Author):

/assign @jsturtevant @jackfrancis

@marosset (Contributor) commented on Jul 7, 2022:

/test ?

@k8s-ci-robot replied:

@marosset: The following commands are available to trigger required jobs:

  • /test pull-cluster-api-provider-azure-build
  • /test pull-cluster-api-provider-azure-ci-entrypoint
  • /test pull-cluster-api-provider-azure-e2e
  • /test pull-cluster-api-provider-azure-test
  • /test pull-cluster-api-provider-azure-verify

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-provider-azure-apidiff
  • /test pull-cluster-api-provider-azure-apiversion-upgrade
  • /test pull-cluster-api-provider-azure-capi-e2e
  • /test pull-cluster-api-provider-azure-conformance
  • /test pull-cluster-api-provider-azure-conformance-with-ci-artifacts
  • /test pull-cluster-api-provider-azure-coverage
  • /test pull-cluster-api-provider-azure-e2e-exp
  • /test pull-cluster-api-provider-azure-e2e-optional
  • /test pull-cluster-api-provider-azure-e2e-workload-upgrade
  • /test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts
  • /test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts-serial-slow

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-provider-azure-apidiff
  • pull-cluster-api-provider-azure-build
  • pull-cluster-api-provider-azure-ci-entrypoint
  • pull-cluster-api-provider-azure-coverage
  • pull-cluster-api-provider-azure-e2e
  • pull-cluster-api-provider-azure-test
  • pull-cluster-api-provider-azure-verify

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@marosset (Contributor) commented on Jul 7, 2022:

/test pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts

@CecileRobertMichon (Contributor, Author):

I don't know that we have a test that uses the out-of-tree template + CI version specifically, outside of https://testgrid.k8s.io/provider-azure-cloud-provider-azure#cloud-provider-azure-conformance-windows-capz; I will run a local validation.

@CecileRobertMichon (Contributor, Author):

Not sure why cloud-provider failed to install (maybe it was a transient failure? @lzhecheng @jackfrancis) but in any case calico install was successful:

Installing cloud-provider-azure components via helm
Error: INSTALLATION FAILED: failed to download "https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo/cloud-provider-azure-1.24.2.tgz"
NAME                              STATUS   ROLES           AGE     VERSION                              INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION     CONTAINER-RUNTIME
capz-c31s-6qlf9                   Ready    <none>          2m44s   v1.25.0-alpha.2.117+e9b96b167fbe5b   10.1.0.6      <none>        Windows Server 2019 Datacenter   10.0.17763.3046    containerd://1.6.2
capz-c31s-vfqtj                   Ready    <none>          2m44s   v1.25.0-alpha.2.117+e9b96b167fbe5b   10.1.0.7      <none>        Windows Server 2019 Datacenter   10.0.17763.3046    containerd://1.6.2
capz-c31sw3-control-plane-trh7k   Ready    control-plane   5m41s   v1.25.0-alpha.2.117+e9b96b167fbe5b   10.0.0.4      <none>        Ubuntu 18.04.6 LTS               5.4.0-1085-azure   containerd://1.6.2
capz-c31sw3-md-0-cjzhl            Ready    <none>          4m7s    v1.25.0-alpha.2.117+e9b96b167fbe5b   10.1.0.4      <none>        Ubuntu 18.04.6 LTS               5.4.0-1085-azure   containerd://1.6.2
capz-c31sw3-md-0-kwqv7            Ready    <none>          4m7s    v1.25.0-alpha.2.117+e9b96b167fbe5b   10.1.0.5      <none>        Ubuntu 18.04.6 LTS               5.4.0-1085-azure   containerd://1.6.2
NAMESPACE     NAME                                                      READY   STATUS    RESTARTS      AGE     IP         NODE                              NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-57cb778775-98pkx                  0/1     Pending   0             5m37s   <none>     <none>                            <none>           <none>
kube-system   calico-node-gcpz7                                         1/1     Running   0             4m8s    10.1.0.4   capz-c31sw3-md-0-cjzhl            <none>           <none>
kube-system   calico-node-nj2js                                         1/1     Running   0             5m36s   10.0.0.4   capz-c31sw3-control-plane-trh7k   <none>           <none>
kube-system   calico-node-windows-hs52t                                 2/2     Running   1 (55s ago)   2m45s   10.1.0.7   capz-c31s-vfqtj                   <none>           <none>
kube-system   calico-node-windows-z57bs                                 2/2     Running   1 (42s ago)   2m45s   10.1.0.6   capz-c31s-6qlf9                   <none>           <none>
kube-system   calico-node-zhxqx                                         1/1     Running   0             4m8s    10.1.0.5   capz-c31sw3-md-0-kwqv7            <none>           <none>
kube-system   containerd-logger-chrv6                                   1/1     Running   0             2m45s   10.1.0.7   capz-c31s-vfqtj                   <none>           <none>
kube-system   containerd-logger-mfmwc                                   1/1     Running   0             2m45s   10.1.0.6   capz-c31s-6qlf9                   <none>           <none>
kube-system   coredns-6bd5b8bf54-2kmbw                                  0/1     Pending   0             5m40s   <none>     <none>                            <none>           <none>
kube-system   coredns-6bd5b8bf54-mc7vr                                  0/1     Pending   0             5m40s   <none>     <none>                            <none>           <none>
kube-system   csi-proxy-8b5j2                                           1/1     Running   0             94s     10.1.0.6   capz-c31s-6qlf9                   <none>           <none>
kube-system   csi-proxy-gqccj                                           1/1     Running   0             93s     10.1.0.7   capz-c31s-vfqtj                   <none>           <none>
kube-system   etcd-capz-c31sw3-control-plane-trh7k                      1/1     Running   0             5m41s   10.0.0.4   capz-c31sw3-control-plane-trh7k   <none>           <none>
kube-system   kube-apiserver-capz-c31sw3-control-plane-trh7k            1/1     Running   0             5m39s   10.0.0.4   capz-c31sw3-control-plane-trh7k   <none>           <none>
kube-system   kube-controller-manager-capz-c31sw3-control-plane-trh7k   1/1     Running   0             5m39s   10.0.0.4   capz-c31sw3-control-plane-trh7k   <none>           <none>
kube-system   kube-proxy-2p4xc                                          1/1     Running   0             4m8s    10.1.0.5   capz-c31sw3-md-0-kwqv7            <none>           <none>
kube-system   kube-proxy-5gjht                                          1/1     Running   0             4m8s    10.1.0.4   capz-c31sw3-md-0-cjzhl            <none>           <none>
kube-system   kube-proxy-5l4jf                                          1/1     Running   0             5m40s   10.0.0.4   capz-c31sw3-control-plane-trh7k   <none>           <none>
kube-system   kube-proxy-windows-df8xl                                  1/1     Running   0             2m45s   10.1.0.7   capz-c31s-vfqtj                   <none>           <none>
kube-system   kube-proxy-windows-nvhf4                                  1/1     Running   0             2m45s   10.1.0.6   capz-c31s-6qlf9                   <none>           <none>
kube-system   kube-scheduler-capz-c31sw3-control-plane-trh7k            1/1     Running   0             5m39s   10.0.0.4   capz-c31sw3-control-plane-trh7k   <none>           <none>
kube-system   metrics-server-74557696d7-wl6sk                           0/1     Pending   0             5m40s   <none>     <none>                            <none>           <none>

@CecileRobertMichon (Contributor, Author) commented on Jul 7, 2022:

New flake, CSI driver deployment failed:

Timed out after 900.037s.
Deployment default/csi-azuredisk-controller failed

https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_cluster-api-provider-azure/2462/pull-cluster-api-provider-azure-e2e/1545146418140811264

cc @jackfrancis @sonasingh46

/retest

@lzhecheng (Contributor):

Not sure why cloud-provider failed to install (maybe it was a transient failure? @lzhecheng @jackfrancis) but in any case calico install was successful: [helm error and node/pod listing quoted above]

Yes, it seems to be transient.

@jackfrancis (Contributor) left a review:

/lgtm
/approve

@k8s-ci-robot added the lgtm label ("Looks good to me"; indicates that a PR is ready to be merged) on Jul 8, 2022.
@k8s-ci-robot:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Jul 8, 2022.
@k8s-ci-robot merged commit 7e69da8 into kubernetes-sigs:main on Jul 8, 2022.
@lzhecheng (Contributor):

/cherrypick release-1.3

@k8s-infra-cherrypick-robot

@lzhecheng: new pull request created: #2464

In response to this:

/cherrypick release-1.3


@CecileRobertMichon (Contributor, Author):

/cherrypick release-1.4

This merged after we cut the tag for v1.4.0.

@CecileRobertMichon (Contributor, Author):

/cherry-pick release-1.4

@k8s-infra-cherrypick-robot

@CecileRobertMichon: new pull request created: #2465

In response to this:

/cherrypick release-1.4

this merged after we cut the tag for v1.4.0


@k8s-infra-cherrypick-robot

@CecileRobertMichon: new pull request could not be created: failed to create pull request against kubernetes-sigs/cluster-api-provider-azure#release-1.4 from head k8s-infra-cherrypick-robot:cherry-pick-2462-to-release-1.4: status code 422 not one of [201], body: {"message":"Validation Failed","errors":[{"resource":"PullRequest","code":"custom","message":"A pull request already exists for k8s-infra-cherrypick-robot:cherry-pick-2462-to-release-1.4."}],"documentation_url":"https://docs.github.com/rest/reference/pulls#create-a-pull-request"}

In response to this:

/cherry-pick release-1.4


Successfully merging this pull request may close issue #2341: ci-entrypoint.sh creates Windows clusters without calico installed on Windows Nodes.