KEP-1880 Multiple Service CIDRs #3365
Conversation
aojea commented on Jun 8, 2022
- One-line PR description: Services IP Ranges API: Initial implementation
- Issue link: Multiple Service CIDRs #1880
- Other comments:
// IPRangeSpec describes the IPRange's specification.
type IPRangeSpec struct {
	// An IPv4 block in CIDR notation, e.g. "10.0.0.0/8"
	IPv4 string `json:"ipv4"`
Why bother having multiple families? What benefit does that add? To me, it seems like a single family per object would be easier to reason about.
I can't find the discussion in the multiple clusterCidr API KEP #2594, but I made exactly the same point you are making now, and then changed my opinion.
Having separate objects per IP family makes reconciliation and validation harder, and you can always leave one family empty for single stack.
I also could not find the exact discussion, but IIRC there were a couple of main arguments:
- for pod CIDRs we have to specify a per-node mask, and it was weird to have potentially different masks per family
- configuration races - having 1 family per resource means it is possible to observe the config "half complete" and fail to allocate IPs of the intended family.
(1) does not apply here, exactly, but it is kind of similar - one could argue that you usually want service CIDRs to be the same size across families.
(2) I'm not super worried about this, personally - if someone asks for v6 specifically, and v6 was not configured yet, it will fail synchronously. If they said preferDualStack, then they clearly don't NEED it.
That said, I see some value in forming this the same as the CCC KEP.
Based on this new feedback and the desire to resize and scale ranges up and down, I think we should go with single family #3365 (comment)
To be realistic, once we remove the size limitation on IPv6 I don't think people will need multiple ranges on IPv6; a /64 is huge. The actual problem is with IPv4 and scaling the Service network up or down.
The biggest arguments in favor of supporting both are:
- Consistency with CCC
- A single range object for apiserver default ranges
- No race if the user ever changes the config by adding a new range
I don't want to block on this - can we add a beta criterion to revisit this and see if we are still happy?
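To make the trade-off being debated concrete, here is a hedged sketch of the two API shapes; the type and field names (`SingleFamilySpec`, `DualFamilySpec`) are hypothetical illustrations, not the KEP's actual API:

```go
// Hypothetical sketch of the two shapes under discussion;
// neither is the final KEP API, and all names are illustrative only.

// Option A: one object per IP family. The family is implied by the
// CIDR itself, so a dual-stack cluster simply has two objects.
type SingleFamilySpec struct {
	// CIDR is an IPv4 or IPv6 block in CIDR notation,
	// e.g. "10.0.0.0/16" or "fd00::/64".
	CIDR string `json:"cidr"`
}

// Option B: one object carrying both families, where an empty field
// means "this family is not configured" (single stack).
type DualFamilySpec struct {
	IPv4 string `json:"ipv4,omitempty"`
	IPv6 string `json:"ipv6,omitempty"`
}
```

Option A keeps each object trivially valid on its own; Option B avoids the "half complete" configuration race described above because both families arrive in one write.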
I'm renaming everything to be consistent with the multi-cluster-cidr KEP
All the PRR sections need love, too :)
For further changes, please use new commits. We have confused the crap out of GitHub's UI (again).
During this time, there is a chance that an apiserver tries to allocate this IPAddress, creating a situation where
two Services have the same IPAddress. To avoid this, the Allocator will not delete an IP from its local cache
until it verifies that the consumer associated with that IP has been deleted too.
I forget whether the apiservers run the repair loop at startup? If they do, that's enough to rebuild the cache and (at least for Services) prevent duplicate allocation, I think.
They do; there is a bootstrap controller that runs in a post-start hook, and this controller runs the repair loops.
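For illustration, a minimal sketch of the cache-eviction guard the KEP text describes, assuming a simple map-based cache; `ipCache`, `releaseIP`, and `serviceExists` are hypothetical names, not the real allocator's API:

```go
import "sync"

// Hypothetical sketch of the cache-eviction guard described in the
// KEP text above; names do not match the real allocator code.
type ipCache struct {
	mu        sync.Mutex
	allocated map[string]string // IP -> name of the Service consuming it
}

// releaseIP evicts ip from the local cache only once the Service that
// held it is confirmed gone, closing the window in which two Services
// could be handed the same IPAddress.
func (c *ipCache) releaseIP(ip string, serviceExists func(name string) bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	consumer, ok := c.allocated[ip]
	if !ok {
		return // nothing allocated for this IP
	}
	if serviceExists(consumer) {
		return // consumer still present: keep the IP reserved
	}
	delete(c.allocated, ip)
}
```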
<<[UNRESOLVED keps/sig-network/3070-reserved-service-ip-range]>>
Option 1: Maintain the same formula and behavior per ServiceCIDRConfig
FWIW I think we should do option 1 (no API) and see if we really need option 3. I suspect the impl will be very similar.
"less is more" thing, right?
I'm pretty LGTM on this. Do you want to manually squash (with a good message) or just let the bot do it?
/lgtm
the former 😄
Kubernetes networking is based fundamentally on two kinds of networks, the Pod network (aka Cluster network) and the Service network. These networks are configured statically, using flags, causing many limitations and pain points for users and administrators. The Pod network is consumed by the CNI plugins; Kubernetes already provides a simple IPAM for it that will be extended by KEP-2593 "Multiple Cluster CIDRs", allowing users to use an API to configure it. This KEP provides a similar API for the Service network, allowing users to dynamically add more networks or resize the existing ones. It also removes some of the limitations of the current implementation.
Ok, I think all my questions are answered, so once it has SIG approval I can give PRR approval
Thanks! /lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: aojea, johnbelamaric, thockin. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
Commits:
* KEP-1880 Multiple Service CIDRs
* clarify serviceCIDRConfig IPAddress relation