Enable configurable API Server Load Balancer #974

Merged

Conversation

CecileRobertMichon
Contributor

@CecileRobertMichon CecileRobertMichon commented Oct 2, 2020

What type of PR is this?
/kind feature
/kind api-change

What this PR does / why we need it: This PR makes the API server load balancer configurable, which enables scenarios such as private clusters, e.g.:

  networkSpec:
    apiServerLB:
      type: Internal

It adds defaulting and validating webhooks for the new load balancer spec. It also supports bringing your own (BYO) API server public IP by specifying a pre-existing IP.
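
To make the defaulting and validation idea concrete, here is a minimal Go sketch. It is not the code from this PR: the type names (`LBType`, `APIServerLB`), the `Default` and `Validate` methods, the choice of `Public` as the default, and the specific validation rules are all assumptions made for illustration. Only the `Internal` type value and the private IP / public IP fields come from the description and diff in this conversation.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// LBType is a hypothetical enum mirroring the `type` field in the YAML above.
type LBType string

const (
	Public   LBType = "Public" // assumed default, for backward compatibility
	Internal LBType = "Internal"
)

// PublicIPSpec is a hypothetical stand-in for a pre-existing (BYO) public IP.
type PublicIPSpec struct {
	Name string
}

// APIServerLB is an illustrative struct, not the real CAPZ API type.
type APIServerLB struct {
	Type             LBType
	PrivateIPAddress string
	PublicIP         *PublicIPSpec
}

// Default fills in the assumed default type when none is set.
func (lb *APIServerLB) Default() {
	if lb.Type == "" {
		lb.Type = Public
	}
}

// Validate applies a few assumed consistency rules between the LB type
// and the frontend IP fields.
func (lb *APIServerLB) Validate() error {
	switch lb.Type {
	case Public:
		if lb.PrivateIPAddress != "" {
			return errors.New("public load balancer must not set a private IP")
		}
	case Internal:
		if lb.PublicIP != nil {
			return errors.New("internal load balancer must not set a public IP")
		}
		if lb.PrivateIPAddress != "" && net.ParseIP(lb.PrivateIPAddress) == nil {
			return fmt.Errorf("invalid private IP %q", lb.PrivateIPAddress)
		}
	default:
		return fmt.Errorf("unsupported load balancer type %q", lb.Type)
	}
	return nil
}

func main() {
	lb := APIServerLB{Type: Internal}
	lb.Default()
	fmt.Println(lb.Type, lb.Validate())
}
```

In a real webhook, rules like these would run on create and update so that an inconsistent spec is rejected before any Azure resources are reconciled.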

Check out the docs to see what this exposes to the user: https://deploy-preview-974--kubernetes-sigs-cluster-api-provider-azure.netlify.app/topics/api-server-endpoint.html

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #486, #696, #952, #892

Special notes for your reviewer:

Still work in progress, opening to get early feedback.

Changes are currently breaking.

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squash commits
  • documentation
  • unit tests
  • back compat for public API server
  • private cluster needs management cluster in the same vnet
  • BYO API Server IP delete
  • make sure that private control plane VMs have outbound access

Release note:

Allow configuration of the API Server Load Balancer, including support for private API Server endpoint.
Allow BYO API Server IP

@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Oct 2, 2020
@k8s-ci-robot k8s-ci-robot added area/provider/azure Issues or PRs related to azure provider sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 2, 2020
@fejta-bot

Unknown CLA label state. Rechecking for CLA labels.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/check-cla

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 3, 2020
// +optional
PrivateIPAddress string `json:"privateIP,omitempty"`
// +optional
PublicIP *PublicIPSpec `json:"publicIP,omitempty"`

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 7, 2020
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Oct 8, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 9, 2020
@CecileRobertMichon CecileRobertMichon marked this pull request as ready for review October 9, 2020 22:47
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Oct 9, 2020
@CecileRobertMichon
Contributor Author

I still need to fix unit tests and find a way to test private clusters, but otherwise most functionality should be there, so this is ready for initial review.

@CecileRobertMichon CecileRobertMichon changed the title [WIP] Enable configurable API Server Load Balancer Enable configurable API Server Load Balancer Oct 13, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 13, 2020
@CecileRobertMichon
Contributor Author

CecileRobertMichon commented Oct 13, 2020

/hold for private cluster test

@devigned @alexeldeib @nader-ziada @mboersma this is ready for initial review

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 13, 2020
Contributor

@devigned devigned left a comment


/lgtm

/assign @alexeldeib

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 2, 2020
@nader-ziada
Contributor

🎉 great work Cecile
lgtm

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 2, 2020
@alexeldeib
Contributor

is this the hairpin issue? looks like a timeout hitting the api server with 3 master setup? didn't look too deeply I admit

/test pull-cluster-api-provider-azure-e2e

 Unexpected error:
      <*url.Error | 0xc0004d8090>: {
          Op: "Delete",
          URL: "https://capz-e2e-b3o4h8-a7437409.westus2.cloudapp.azure.com:6443/api/v1/namespaces/default/services/httpd-ilb",
          Err: {
              Op: "read",
              Net: "tcp",
              Source: {IP: [10, 60, 5, 156], Port: 44610, Zone: ""},
              Addr: {IP: [20, 69, 81, 80], Port: 6443, Zone: ""},
              Err: {Syscall: "read", Err: 0x6e},
          },
      }
      Delete https://capz-e2e-b3o4h8-a7437409.westus2.cloudapp.azure.com:6443/api/v1/namespaces/default/services/httpd-ilb: read tcp 10.60.5.156:44610->20.69.81.80:6443: read: connection timed out
  occurred

@CecileRobertMichon
Contributor Author

is this the hairpin issue? looks like a timeout hitting the api server with 3 master setup? didn't look too deeply I admit

No, it was the HA public cluster spec that failed ("capz-e2e: Workload cluster creation With 3 control-plane nodes and 2 worker nodes"). Looks like a flake.

@CecileRobertMichon
Contributor Author

@alexeldeib let me know when you're okay with me rebasing

@CecileRobertMichon
Contributor Author

Rebased + squashed

@alexeldeib @devigned @nader-ziada ready for a final look

Cecile Robert-Michon added 6 commits November 4, 2020 11:16
Update azure_test.go
Update cluster.go

attach VMs to outbound LB

fix PublicLBAddressPoolName

address comments

Update networkinterfaces_test.go

address comments

cleanup OutboundPoolName/APIServerLBName

fix tests
@k8s-ci-robot
Contributor

k8s-ci-robot commented Nov 4, 2020

@CecileRobertMichon: The following test failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-cluster-api-provider-azure-apidiff | fd5c6ad | link | /test pull-cluster-api-provider-azure-apidiff |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Contributor

@devigned devigned left a comment


/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 4, 2020
@devigned
Contributor

devigned commented Nov 5, 2020

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: devigned

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 5, 2020
@k8s-ci-robot k8s-ci-robot merged commit 2dfc5ea into kubernetes-sigs:master Nov 5, 2020
@k8s-ci-robot k8s-ci-robot added this to the v0.4.10 milestone Nov 5, 2020
@CecileRobertMichon CecileRobertMichon deleted the private-cluster branch March 19, 2021 17:56
Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/provider/azure: Issues or PRs related to azure provider
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • kind/api-change: Categorizes issue or PR as related to adding, removing, or otherwise changing an API
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • sig/cluster-lifecycle: Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.
  • size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Private clusters
9 participants