Templates and E2E tests use external cloud-provider-azure by default #1994

Closed
4 changes: 2 additions & 2 deletions docs/book/src/developers/kubernetes-developers.md
@@ -37,11 +37,11 @@ export CLUSTER_TEMPLATE="test/dev/cluster-template-custom-builds.yaml"

To test changes made to the [Azure cloud provider](https://github.com/kubernetes-sigs/cloud-provider-azure), first build and push images for cloud-controller-manager and/or cloud-node-manager from the root of the cloud-provider-azure repo.

-Then, use the `external-cloud-provider` flavor to create a cluster:
+The default reference template uses the external cloud-provider, so simply update them to include references to your custom images. E.g.:

```bash
AZURE_CLOUD_CONTROLLER_MANAGER_IMG=myrepo/my-ccm:v0.0.1 \
AZURE_CLOUD_NODE_MANAGER_IMG=myrepo/my-cnm:v0.0.1 \
-CLUSTER_TEMPLATE=cluster-template-external-cloud-provider.yaml \
+CLUSTER_TEMPLATE=cluster-template.yaml \
make create-workload-cluster
```
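
A minimal sketch of a pre-flight check for the invocation above, assuming you want the command to fail fast when either custom image variable is unset. The function name `require_custom_images` is hypothetical and is not part of the CAPZ repository; only the two variable names come from the docs change.

```bash
# Hypothetical helper (not in the CAPZ repo): returns non-zero when either
# custom image variable is empty or unset, and names the missing one on stderr.
require_custom_images() {
  missing=0
  if [ -z "${AZURE_CLOUD_CONTROLLER_MANAGER_IMG:-}" ]; then
    echo "AZURE_CLOUD_CONTROLLER_MANAGER_IMG is not set" >&2
    missing=1
  fi
  if [ -z "${AZURE_CLOUD_NODE_MANAGER_IMG:-}" ]; then
    echo "AZURE_CLOUD_NODE_MANAGER_IMG is not set" >&2
    missing=1
  fi
  return "$missing"
}
```

One way to use it is `require_custom_images && CLUSTER_TEMPLATE=cluster-template.yaml make create-workload-cluster`, with both image variables exported beforehand so that the `make` target never starts with a missing image reference.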
6 changes: 4 additions & 2 deletions docs/book/src/topics/cloud-provider-config.md
@@ -67,9 +67,9 @@ All cloud provider config values can be customized by creating the `${RESOURCE}-
</aside>


-# External Cloud Provider
+# External Cloud Provider components

-To deploy a cluster using [external cloud provider](https://github.com/kubernetes-sigs/cloud-provider-azure), create a cluster configuration with the [external cloud provider template](https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-azure/main/templates/cluster-template-external-cloud-provider.yaml).
+The recommended cloud provider configuration is to use the external cloud-provider-azure. The [default reference template](https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-azure/main/templates/cluster-template.yaml) specifies external cloud-provider-azure. Note: you must enable the `ClusterResourceSet` feature flag on your cluster-api management cluster in order to use the reference template. See [here](https://github.com/kubernetes-sigs/cluster-api/blob/v1.1.3/docs/book/src/tasks/experimental-features/experimental-features.md#enabling-experimental-features-on-existing-management-clusters) for more information on how to do that.

After deploying the cluster, you should eventually see a set of pods like the following in a `Running` state:

@@ -81,6 +81,8 @@ kube-system cloud-node-manager-mfsdg
kube-system cloud-node-manager-qrz74 1/1 Running 0 24s
```

+The `cloud-controller-manager` component will be scheduled to run on one (or more, if you have more than one replica) of your control plane nodes, and is responsible for the bulk of the cloud-provider-specific communication with Azure. The `cloud-node-manager` component is a DaemonSet pod that runs on each node and ensures that each node running in Azure is healthy and ready for work.

## Storage Drivers

### Azure File CSI Driver
12 changes: 9 additions & 3 deletions scripts/ci-entrypoint.sh
@@ -82,11 +82,12 @@ select_cluster_template() {
export CI_VERSION="${CI_VERSION:-$(curl -sSL ${CI_VERSION_URL})}"
export KUBERNETES_VERSION="${CI_VERSION}"
else
-export CLUSTER_TEMPLATE="test/ci/cluster-template-prow.yaml"
+export CLUSTER_TEMPLATE="test/ci/cluster-template-prow-in-tree-cloud-provider.yaml"
Contributor

In addition to the in-tree tests relying on this script, there are also conformance jobs running with in-tree, e.g. https://testgrid.k8s.io/provider-azure-1.23-signal#capz-conformance, which run /scripts/ci-conformance.sh, and there are presubmit jobs that rely on the custom builds templates to build and install a custom version of k8s.

It might be worth doing a full audit of the cloud-provider-azure in-tree/out-of-tree jobs that rely on CAPZ to make sure all template references are covered.
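
As a rough illustration of what such an audit could start from, the sketch below lists every line under a directory that names a cluster template file. It assumes the job specs reference templates by a file name beginning with `cluster-template` and ending in `.yaml`; the function name `find_template_refs` is hypothetical.

```bash
# Hypothetical audit helper: print file, line number, and matching line for
# every reference to a file named cluster-template*.yaml under the given directory.
find_template_refs() {
  search_dir="$1"
  grep -rn -E 'cluster-template[A-Za-z0-9._-]*\.yaml' "$search_dir"
}
```

Run against a test-infra checkout, for example `find_template_refs config/jobs/kubernetes/sig-cloud-provider/azure`, this would only catch explicit file references; jobs that select a template indirectly (through a flavor name or an environment variable) would still need a manual look.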

Contributor Author

Thanks for that call-out. I wonder if something as simple as this would work.

Changes to capz:

$ git diff test/e2e/conformance_test.go
diff --git a/test/e2e/conformance_test.go b/test/e2e/conformance_test.go
index 5106d834..a4f15eba 100644
--- a/test/e2e/conformance_test.go
+++ b/test/e2e/conformance_test.go
@@ -104,6 +104,9 @@ var _ = Describe("Conformance Tests", func() {
 
                kubernetesVersion := e2eConfig.GetVariable(capi_e2e.KubernetesVersion)
                flavor := clusterctl.DefaultFlavor
+               if os.Getenv("IN_TREE_CLOUDPROVIDER") == "true" {
+                       flavor = "in-tree-cloud-provider"
+               }
                if isWindows(kubetestConfigFilePath) {
                        flavor = getWindowsFlavor()
                }

Changes to test-infra specs:

$ git diff config/jobs/kubernetes/sig-cloud-provider/azure/release-master.yaml
diff --git a/config/jobs/kubernetes/sig-cloud-provider/azure/release-master.yaml b/config/jobs/kubernetes/sig-cloud-provider/azure/release-master.yaml
index 4af65012d0..f383e8c056 100644
--- a/config/jobs/kubernetes/sig-cloud-provider/azure/release-master.yaml
+++ b/config/jobs/kubernetes/sig-cloud-provider/azure/release-master.yaml
@@ -322,6 +322,8 @@ periodics:
         value: "latest"
       - name: CONFORMANCE_WORKER_MACHINE_COUNT
         value: "2"
+      - name: IN_TREE_CLOUDPROVIDER
+        value: "true"
       securityContext:
         privileged: true
       resources:

(We would also make the same env var addition to the remaining "release-*" test-infra test specs under config/jobs/kubernetes/sig-cloud-provider/azure/)

Contributor Author

O.K., this change was made. Here's the test-infra equivalent that we'll want to land as part of this effort:

kubernetes/test-infra#25598

fi

if [[ -n "${TEST_CCM:-}" ]]; then
-export CLUSTER_TEMPLATE="test/ci/cluster-template-prow-external-cloud-provider.yaml"
+export CLUSTER_TEMPLATE="test/ci/cluster-template-prow.yaml"
+K8S_FEATURE_GATES="MixedProtocolLBService=true"
# shellcheck source=scripts/ci-build-azure-ccm.sh
source "${REPO_ROOT}/scripts/ci-build-azure-ccm.sh"
echo "Using CCM image ${AZURE_CLOUD_CONTROLLER_MANAGER_IMG} and CNM image ${AZURE_CLOUD_NODE_MANAGER_IMG} to build external cloud provider cluster"
@@ -103,8 +104,13 @@ select_cluster_template() {
# this requires k8s 1.22+
if [[ -n "${TEST_WINDOWS:-}" ]]; then
export WINDOWS_WORKER_MACHINE_COUNT="${WINDOWS_WORKER_MACHINE_COUNT:-2}"
-export K8S_FEATURE_GATES="WindowsHostProcessContainers=true"
+if [ "${K8S_FEATURE_GATES:-}" != "" ]; then
+K8S_FEATURE_GATES+=",WindowsHostProcessContainers=true"
+else
+K8S_FEATURE_GATES="WindowsHostProcessContainers=true"
+fi
fi
+export K8S_FEATURE_GATES
}

create_cluster() {
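
The feature-gate handling in this hunk appends to `K8S_FEATURE_GATES` when it is already set and assigns it otherwise. The sketch below shows the same idea factored into a helper. It is illustrative only: `append_feature_gate` is a hypothetical name, not a function in `scripts/ci-entrypoint.sh`, and it uses portable shell rather than bash's `+=`.

```bash
# Hypothetical helper: print the comma-separated gate list that results from
# adding one gate to the current list (which may be empty).
append_feature_gate() {
  current="$1"
  gate="$2"
  if [ -n "$current" ]; then
    printf '%s,%s\n' "$current" "$gate"
  else
    printf '%s\n' "$gate"
  fi
}

# Mirrors the TEST_CCM + TEST_WINDOWS combination from the hunk above.
K8S_FEATURE_GATES="MixedProtocolLBService=true"
K8S_FEATURE_GATES="$(append_feature_gate "${K8S_FEATURE_GATES:-}" "WindowsHostProcessContainers=true")"
export K8S_FEATURE_GATES
echo "${K8S_FEATURE_GATES}"
# prints: MixedProtocolLBService=true,WindowsHostProcessContainers=true
```

Keeping the join logic in one place avoids a leading comma when the variable starts out empty, which is the case the `if`/`else` in the script change guards against.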
210 changes: 210 additions & 0 deletions templates/cloud-provider-azure-ipv6/cloud-controller-manager.yaml
@@ -0,0 +1,210 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloud-controller-manager
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:cloud-controller-manager
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    k8s-app: cloud-controller-manager
rules:
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
      - update
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - "*"
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
  - apiGroups:
      - ""
    resources:
      - services
    verbs:
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - services/status
    verbs:
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - serviceaccounts
    verbs:
      - create
      - get
      - list
      - watch
      - update
  - apiGroups:
      - ""
    resources:
      - persistentvolumes
    verbs:
      - get
      - list
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - endpoints
    verbs:
      - create
      - get
      - list
      - watch
      - update
  - apiGroups:
      - ""
    resources:
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - get
      - create
      - update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:cloud-controller-manager
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:cloud-controller-manager
subjects:
  - kind: ServiceAccount
    name: cloud-controller-manager
    namespace: kube-system
  - kind: User
    name: cloud-controller-manager
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: system:cloud-controller-manager:extension-apiserver-authentication-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
  - kind: ServiceAccount
    name: cloud-controller-manager
    namespace: kube-system
  - apiGroup: ""
    kind: User
    name: cloud-controller-manager
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloud-controller-manager
  namespace: kube-system
  labels:
    component: cloud-controller-manager
spec:
  selector:
    matchLabels:
      tier: control-plane
      component: cloud-controller-manager
  replicas: 1
  template:
    metadata:
      labels:
        component: cloud-controller-manager
        tier: control-plane
    spec:
      priorityClassName: system-node-critical
      hostNetwork: true
      nodeSelector:
        node-role.kubernetes.io/master: ""
      serviceAccountName: cloud-controller-manager
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      containers:
        - name: cloud-controller-manager
          image: ${AZURE_CLOUD_CONTROLLER_MANAGER_IMG:=mcr.microsoft.com/oss/kubernetes/azure-cloud-controller-manager:v1.1.5}
          imagePullPolicy: IfNotPresent
          command: ["cloud-controller-manager"]
          args:
            - "--allocate-node-cidrs=true"
            - "--cloud-config=/etc/kubernetes/azure.json"
            - "--cloud-provider=azure"
            - "--cluster-cidr=2001:1234:5678:9a40::/58"
            - "--bind-address=::"
            - "--cluster-name=${CLUSTER_NAME}"
            - "--controllers=*,-cloud-node" # disable cloud-node controller
            - "--configure-cloud-routes=true" # "false" for Azure CNI and "true" for other network plugins
            - "--leader-elect=true"
            - "--node-cidr-mask-size=0"
            - "--route-reconciliation-period=10s"
            - "--v=2"
            - "--port=10267"
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: "4"
              memory: 2Gi
          livenessProbe:
            httpGet:
              path: /healthz
              port: 10267
            initialDelaySeconds: 20
            periodSeconds: 10
            timeoutSeconds: 5
          volumeMounts:
            - name: etc-kubernetes
              mountPath: /etc/kubernetes
            - name: etc-ssl
              mountPath: /etc/ssl
              readOnly: true
            - name: msi
              mountPath: /var/lib/waagent/ManagedIdentity-Settings
              readOnly: true
      volumes:
        - name: etc-kubernetes
          hostPath:
            path: /etc/kubernetes
        - name: etc-ssl
          hostPath:
            path: /etc/ssl
        - name: msi
          hostPath:
            path: /var/lib/waagent/ManagedIdentity-Settings
@@ -1,11 +1,7 @@
namespace: default
resources:
- ../default
- ccm-resource-set.yaml

patchesStrategicMerge:
- patches/external-cloud-provider.yaml

configMapGenerator:
- name: cloud-controller-manager-addon
files:
27 changes: 27 additions & 0 deletions templates/cloud-provider-azure/ccm-resource-set.yaml
@@ -0,0 +1,27 @@
apiVersion: addons.cluster.x-k8s.io/v1beta1
kind: ClusterResourceSet
metadata:
  name: crs-ccm
  namespace: default
spec:
  strategy: "ApplyOnce"
  clusterSelector:
    matchLabels:
      ccm: external
  resources:
    - name: cloud-controller-manager-addon
      kind: ConfigMap
---
apiVersion: addons.cluster.x-k8s.io/v1beta1
kind: ClusterResourceSet
metadata:
  name: crs-node-manager
  namespace: default
spec:
  strategy: "ApplyOnce"
  clusterSelector:
    matchLabels:
      ccm: external
  resources:
    - name: cloud-node-manager-addon
      kind: ConfigMap