Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

⚠️ Add strict validation for CIDR ranges specified in Clusters #7420

Merged

Conversation

killianmuldoon
Copy link
Contributor

@killianmuldoon killianmuldoon commented Oct 17, 2022

Signed-off-by: killianmuldoon [email protected]

Add a check in the Cluster webhook to ensure each CIDR block only contains valid CIDR blocks.

/kind bug

Fixes: #7358

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 17, 2022
@killianmuldoon killianmuldoon changed the title 🐛 Add validation for number of CIDR ranges specified in Clusters 🐛 [WIP] Add validation for number of CIDR ranges specified in Clusters Oct 17, 2022
@killianmuldoon
Copy link
Contributor Author

@marvinWolff - it would be great to get your input on this.

@killianmuldoon
Copy link
Contributor Author

flake?

/retest

@fabriziopandini
Copy link
Member

@jayunit100 PTAL if you have some time. We are discussing validation rules for CIDR in ClusterNetwork (ServiceCIDRs and Node/PodsCIDR)

@killianmuldoon killianmuldoon force-pushed the webhook/add-cidr-validation branch 3 times, most recently from 19fa536 to 03f6846 Compare October 18, 2022 10:22
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 18, 2022
@killianmuldoon killianmuldoon force-pushed the webhook/add-cidr-validation branch from 03f6846 to cfe6ea5 Compare October 18, 2022 10:29
@killianmuldoon
Copy link
Contributor Author

I've updated the implementation to match that done by kubeadm:

The following is validated in the webhook:

  1. No more than two CIDR blocks are specified under Pods or Services
  2. If two are specified the blocks need to be from different IP families i.e. one IPv4 and one IPv6
  3. The IPFamily for pods and services must be compatible
  4. The CIDR ranges are valid CIDR ranges

This is done using the Cluster's GetIPFamily method. This is currently used in CAPI in the topology controller and in CAPD.

This change could be breaking for some providers if they were using these values for something other than passing them directly to Kubernetes e.g. informing an IP pool for some IPAM CNI, so I think we might need to be careful with including this change. That may be the case in #7358

The base problem is that validation on this field is done differently depending on whether or not users are using ClusterClass. If it's better to leave webhook validation as is, then validation should no longer be done when computing variables on the ClusterClass to make it consistent.

@killianmuldoon killianmuldoon changed the title 🐛 [WIP] Add validation for number of CIDR ranges specified in Clusters 🐛 Add validation for number of CIDR ranges specified in Clusters Oct 18, 2022
@killianmuldoon
Copy link
Contributor Author

/hold

For more input from providers on the blast radius of this change

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 18, 2022
Copy link
Member

@vincepri vincepri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, rest lgtm

api/v1beta1/cluster_types.go Outdated Show resolved Hide resolved
api/v1beta1/cluster_types.go Outdated Show resolved Hide resolved
@killianmuldoon killianmuldoon force-pushed the webhook/add-cidr-validation branch 2 times, most recently from b05d675 to 6849902 Compare October 18, 2022 14:59
@vincepri
Copy link
Member

vincepri commented Oct 19, 2022

We've discussed during the Cluster API office hours meeting on Oct 19th and decided by consensus that we should upgrade this PR to a ⚠️ (breaking change) and go ahead and merge the stricter validations, given that's what ultimately Kubernetes expects.

@killianmuldoon will send an email to the mailing list.

@killianmuldoon killianmuldoon changed the title 🐛 Add validation for number of CIDR ranges specified in Clusters ⚠️ Add validation for number of CIDR ranges specified in Clusters Oct 19, 2022
@marvinWolff
Copy link

marvinWolff commented Oct 20, 2022

Hello @killianmuldoon,

sorry for the late reply.

This looks good to me. Validating what kubernetes expects is a good idea to prevent creating clusters with a basically invalid configuration like we did.

We also solved our issue in the meantime by specifiying the whole ip-range and using calico ip-pools.

@killianmuldoon
Copy link
Contributor Author

Thanks @marvinWolff can you share which providers you were using with the original config? Just wondering how the original version you had was working initially.

@killianmuldoon
Copy link
Contributor Author

I did some digging into how Kubernetes validates these IP ranges (and which components use which) across the core components.

The conclusion is that these ranges are assumed to be either a single entry or dualstack if there are two entries. Supplying more than two CIDR ranges is not supported. With this in mind I think the validation approach in this PR is correct.

Details:

Kube Proxy

ClusterCIDR from flag cluster-cidr. CIDR range of pods in cluster.
Validated so there can not be more than two entries. If there are two entries they must be dualstack.
Ref: https://github.com/kubernetes/kubernetes/blob/7df6c02288833c7be66f425e03c1e858d2343f1f/pkg/proxy/apis/config/validation/validation.go#L73

Kube APIServer

ServiceClusterIPRanges from flag service-cluster-ip-range. CIDR range of services in the cluster.
Validated so there can not be more than two entries. If there are two entries they must be dualstack.
Ref: https://github.com/kubernetes/kubernetes/blob/6dc81c3280996a9a4267bf8ef30d4bda36f8a293/cmd/kube-apiserver/app/options/validation.go#L45

Kube Controller Manager

ClusterCIDR from flag cluster-cidr. CIDR range of pods in cluster. Validates so there can not be more than two entries. If there are two entries they must be dualstack.
Ref: https://github.com/kubernetes/kubernetes/blob/08749750a96fe7f236befa40076a3117e7b36733/staging/src/k8s.io/cloud-provider/app/core.go#L102

ServiceCIDR from flag service-cluster-ip-range. CIDR range for Services in Cluster. Takes the first two values given, but doesn't validate if more than two are given, instead just ignoring additional values. If there are two entries they must be dualstack.

Ref: https://github.com/kubernetes/kubernetes/blob/7b478d15d5838e14b8286c9b541c8cbd626cbe3c/cmd/kube-controller-manager/app/options/nodeipamcontroller.go#L51

Kubelet

PodCIDR from flag pod-cidr string. Only used in standalone mode, not relevant here.

Kube Scheduler

No CIDR related flags.

@killianmuldoon
Copy link
Contributor Author

/hold

Want to bring this up one more time at office hours today - we can merge after that.

@elmiko
Copy link
Contributor

elmiko commented Nov 9, 2022

as discussed in the community meeting on 9 November, we'd like to get more comments on this before merging. please review the defaults as described in #7521 (comment) , these will be encoded by this PR.

if there are no additional comments or objections by 11 November, we will merge this.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 10, 2022
@sbueringer
Copy link
Member

@killianmuldoon Can you rebase when you have some time?

@killianmuldoon killianmuldoon force-pushed the webhook/add-cidr-validation branch from 8808d54 to a1e0ee5 Compare November 10, 2022 19:04
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2022
@killianmuldoon killianmuldoon force-pushed the webhook/add-cidr-validation branch from a1e0ee5 to ed74c64 Compare November 10, 2022 19:08
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 10, 2022
@sbueringer
Copy link
Member

Thank you very much!!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 11, 2022
@sbueringer
Copy link
Member

/assign @fabriziopandini

Copy link
Member

@fabriziopandini fabriziopandini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the fact that we are not making IpFamily a first-class concern given that it is CAPD specific.
I have one last question on validation to make sure we are not blocking use cases allowed upstream (like the one in the attached issue).

@@ -443,3 +470,40 @@ func machineDeploymentClassOfName(clusterClass *clusterv1.ClusterClass, name str
}
return nil
}

// validateCIDRBlocks validates that:
// 1) No more than two CIDR blocks are specified.
Copy link
Member

@fabriziopandini fabriziopandini Nov 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a Kubernetes limitation that applies both to service and pod cidr?
same for 2

Copy link
Member

@fabriziopandini fabriziopandini Nov 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not we are not sure I will simply validate all the CIDR are valid, without making assumptions on the combination of them in this first iteration

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@killianmuldoon
Copy link
Contributor Author

/remove-hold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 14, 2022
@killianmuldoon killianmuldoon force-pushed the webhook/add-cidr-validation branch from ed74c64 to 84ee36d Compare November 14, 2022 12:58
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 14, 2022
@killianmuldoon killianmuldoon force-pushed the webhook/add-cidr-validation branch from 84ee36d to 0d1bed7 Compare November 14, 2022 13:04
@fabriziopandini
Copy link
Member

/lgtm
Thanks for opening the follow-up issue, appreciated

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 14, 2022
Copy link
Member

@sbueringer sbueringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one nit, otherwise lgtm

internal/webhooks/cluster.go Outdated Show resolved Hide resolved
@killianmuldoon killianmuldoon force-pushed the webhook/add-cidr-validation branch from 0d1bed7 to bef7d72 Compare November 14, 2022 15:05
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 14, 2022
@sbueringer
Copy link
Member

Thx!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 14, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbueringer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 14, 2022
@k8s-ci-robot k8s-ci-robot merged commit f9fcf3f into kubernetes-sigs:main Nov 14, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.3 milestone Nov 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error when using multiple cidrBlocks
8 participants