💎 provision vnet for aks clusters #1009

Merged (1 commit) on Oct 29, 2020

Conversation

@alexeldeib alexeldeib commented Oct 22, 2020

What type of PR is this?

/kind api-change
/kind cleanup

What this PR does / why we need it:

Pre-provisions vnets for AKS clusters. Replacement for #929, which was taking too long to land.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

This is pretty breaking...but I'm going to vote we iterate while in exp rather than avoid breaking users (i'm skeptical this has many users yet).

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

- AKS clusters provisioned via CAPZ now have predictably named virtual networks. Prior to this, virtual networks for AKS clusters were not predictably named. This also means backfilling the corresponding cluster specs is impossible. Action required: upgrading a cluster to this version requires manually updating `spec.virtualNetwork.Name` and `spec.virtualNetwork.Subnet.Name` to the values generated by AKS. Otherwise the cluster will fail to reconcile as the network and subnet names will not match.
- Changed `spec.resourceGroup` to `spec.resourceGroupName`. Action required: update your specs accordingly.
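
The two manual steps in the release note can be sketched as follows. This is a rough illustration, not part of the PR: it works on a plain map rather than the real CAPZ types, `migrateSpec` is a hypothetical helper, and the lower-case `virtualNetwork.name` / `virtualNetwork.subnet.name` keys are an assumption about the serialized form of the fields named above.

```go
package main

import "fmt"

// migrateSpec is a hypothetical helper. It returns a copy of an
// AzureManagedControlPlane-style spec (held as a plain map, NOT the real
// CAPZ type) with both release-note actions applied.
func migrateSpec(old map[string]interface{}, vnetName, subnetName string) map[string]interface{} {
	// Copy the top-level keys so the caller's map is left untouched.
	out := make(map[string]interface{}, len(old)+1)
	for k, v := range old {
		out[k] = v
	}

	// Action 1: spec.resourceGroup was renamed to spec.resourceGroupName.
	if rg, ok := out["resourceGroup"]; ok {
		out["resourceGroupName"] = rg
		delete(out, "resourceGroup")
	}

	// Action 2: record the vnet and subnet names that AKS generated.
	// The key casing here is an assumption, see the note above.
	out["virtualNetwork"] = map[string]interface{}{
		"name":   vnetName,
		"subnet": map[string]interface{}{"name": subnetName},
	}
	return out
}

func main() {
	// All values below are placeholders, not names AKS would generate.
	old := map[string]interface{}{"resourceGroup": "my-rg"}
	fmt.Println(migrateSpec(old, "example-vnet", "example-subnet"))
}
```

The vnet and subnet names have to be looked up on the existing AKS cluster; per the release note they cannot be backfilled automatically.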

@k8s-ci-robot added labels on Oct 22, 2020: release-note-action-required, kind/api-change, kind/cleanup, cncf-cla: yes, area/provider/azure, sig/cluster-lifecycle, size/L
@k8s-ci-robot

@alexeldeib: The following test failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-cluster-api-provider-azure-apidiff | 59bc01a | link | /test pull-cluster-api-provider-azure-apidiff |

@alexeldeib

notable:

sigs.k8s.io/cluster-api-provider-azure/exp/api/v1alpha3
  Incompatible changes:
  - (*AzureManagedControlPlane).SetDefaultSSHPublicKey: removed
  - (*AzureManagedControlPlane).ValidateSSHKey: removed
  - AzureManagedControlPlaneSpec.ResourceGroup: removed
  Compatible changes:
  - AzureManagedControlPlaneSpec.NodeResourceGroupName: added
  - AzureManagedControlPlaneSpec.ResourceGroupName: added
  - AzureManagedControlPlaneSpec.VirtualNetwork: added
  - ManagedControlPlaneSubnet: added
  - ManagedControlPlaneVirtualNetwork: added

@CecileRobertMichon

This is pretty breaking...but I'm going to vote we iterate while in exp rather than avoid breaking users (i'm skeptical this has many users yet).

+1

I haven't code reviewed in depth yet, but overall these changes look sane to me.

@alexeldeib

alexeldeib commented Oct 23, 2020

n.b.: we only deploy AKS with system MSI right now. AFAIK this avoids several permissions issues with AKS service principals / user-assigned managed identities. In the future, we may need to either create appropriate RBAC assignments for the cluster service principal or update docs to reflect required permissions: https://docs.microsoft.com/en-us/azure/aks/kubernetes-service-principal#networking

would follow up separately for that

	azure "sigs.k8s.io/cluster-api-provider-azure/cloud"
)

// VNetScope defines the scope interface for a virtual network service.
type VNetScope interface {
	logr.Logger
	azure.ClusterDescriber
	azure.NetworkDescriber
	Vnet() *infrav1.VnetSpec

@CecileRobertMichon CecileRobertMichon Oct 23, 2020

This highlights a potential bug in the code. The vnet service is supposed to be unaware of the AzureCluster spec and just create / update / delete vnets from the specs it gets. The fact that it needs Vnet(), which doesn't have any info that VNetSpecs() doesn't have, is a little strange. Looking at the code, it uses Vnet() to write the info back into the cluster's vnet, which makes sense, but it would also cause a bug if we ever decided to have VNetSpecs() return multiple specs, since they would overwrite each other. Not specific to this PR (in fact this PR makes the bug easy to see), just reflecting 🤔

This is another case where it would make more sense if VNetSpecs() didn't return an array @alexeldeib
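
A minimal sketch of the overwrite problem described in this thread. Everything here is a hypothetical stand-in rather than the actual CAPZ code: `vnetSpec`, `clusterVnet`, `scope`, and `reconcile` only mimic the shape of "many specs in, one write-back target out".

```go
package main

import "fmt"

// vnetSpec is a hypothetical stand-in for one desired virtual network.
type vnetSpec struct {
	Name string
}

// clusterVnet is a hypothetical stand-in for the single vnet recorded on
// the cluster spec.
type clusterVnet struct {
	Name string
	ID   string
}

// scope exposes a slice of specs but only one write-back target.
type scope struct {
	specs []vnetSpec
	vnet  clusterVnet
}

func (s *scope) VNetSpecs() []vnetSpec { return s.specs }

func (s *scope) Vnet() *clusterVnet { return &s.vnet }

// reconcile pretends to create each vnet and then writes the result back
// through Vnet(). With more than one spec, each iteration clobbers the
// previous one, so only the last spec survives.
func reconcile(s *scope) {
	for _, spec := range s.VNetSpecs() {
		v := s.Vnet()
		v.Name = spec.Name
		v.ID = "/virtualNetworks/" + spec.Name // fake resource ID for illustration
	}
}

func main() {
	s := &scope{specs: []vnetSpec{{Name: "vnet-a"}, {Name: "vnet-b"}}}
	reconcile(s)
	fmt.Println(s.Vnet().Name) // prints "vnet-b"; vnet-a's result is lost
}
```

Returning a single spec instead of a slice, as suggested above, would make the one-to-one relationship with Vnet() explicit.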

@@ -36,6 +36,8 @@ import (
"sigs.k8s.io/cluster-api-provider-azure/cloud/scope"
"sigs.k8s.io/cluster-api-provider-azure/cloud/services/groups"
"sigs.k8s.io/cluster-api-provider-azure/cloud/services/managedclusters"
"sigs.k8s.io/cluster-api-provider-azure/cloud/services/subnets"
"sigs.k8s.io/cluster-api-provider-azure/cloud/services/virtualnetworks"

let's add an e2e test for managed control plane clusters as part of moving it out of exp/ (can be optional on PRs). Is there already an issue tracking that?

yeah, I should probably do this sooner rather than later

@alexeldeib

small bump for review if anyone has some cycles 🙂

@CecileRobertMichon CecileRobertMichon left a comment

/lgtm
/approve

@k8s-ci-robot added the lgtm label on Oct 29, 2020
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CecileRobertMichon

@k8s-ci-robot added the approved label on Oct 29, 2020
@k8s-ci-robot merged commit 2bcdc7b into kubernetes-sigs:master on Oct 29, 2020
@k8s-ci-robot added this to the v0.4.10 milestone on Oct 29, 2020