Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Dual-Stack Capabilities to NSC #1544

Merged
merged 15 commits into from
Sep 23, 2023
Merged

Add Dual-Stack Capabilities to NSC #1544

merged 15 commits into from
Sep 23, 2023

Conversation

aauren
Copy link
Collaborator

@aauren aauren commented Sep 14, 2023

This is an initial stab at adding dual-stack capabilities to NSC which is the last controller needed to merge into master and do a release of v2.0.0.

I still need to do some additional testing, but I figured I would put this up in case anyone else in the community wanted to test it. It has so far been stable for me, but there are a LOT of changes contained in this MR and it was rough work getting all of the logic in line.

One of the things to specifically call out is that we now use EndpointSlices instead of Endpoints in order to get ALL the possible IP endpoints for a given service as Endpoints only returns what it considers the best IP for a given pod endpoint rather than all of the IPs it might have on it (especially those from another family).

@aauren
Copy link
Collaborator Author

aauren commented Sep 14, 2023

Fixes #1432

During our initial run, fail fatally when we encounter problems rather
than just continuing on and causing subsequent problems and potentially
burying the real error.
There is absolutely no reason that we should ever assume netmasks, and
even if we did, we shouldn't modify them as a side-effect of a
completely different operation. No idea was this was ever coded this
way. Netmask is now set upstream to the appropriate mask for the IP
family.
With the advent of IPv6 integrated into the NSC we no longer get all IPs
from endpoints, but rather just the primary IP of the pod (which is
often, but not always the IPv4 address).

In order to get all possible endpoint addresses for a given service we
need to switch to using EndpointSlice which also nicely groups addresses
into IPv4 and IPv6 by AddressType and also gives us more information
about the endpoint status by giving us attributes for serving and
terminating, instead of just ready or not ready.

This does mean that users will need to add another permission to their
RBAC in order for kube-router to access these objects.
Adds more logging information (in the form of warnings) when we come
across common errors that are not big enough to stop processing, but
will still confuse users when the error gets bubbled up to NSC.
Before this, we had 2 different ways to interact with ipsets, through
the handler interface which had the best handling for IPv6 because NPC
heavily utilizes it, and through the ipset struct which mostly repeated
the handler logic, but didn't handle some key things.

NPC utilized the handler functions and NSC / NRC mostly utilized the old
ipset struct functions. This caused a lot of duplication between the two
groups of functions and also caused issues with proper IPv6 handling.

This commit consolidates the two sets of usage into just the handler
interface. This greatly simplifies how the controllers interact with
ipsets and it also reduces the logic complexity on the ipset side.

This also fixes up some inconsistency with how we handled IPv6 ipset
names. ipset likes them to be prefixed with inet6:, but we weren't
always doing this in a way that made sense and was consistent across all
functions in the ipset struct.
@aauren aauren force-pushed the add_dualstack_to_nsc branch from 49b11d1 to c51b814 Compare September 19, 2023 20:33
Don't just compare the primary IP according to k8s, but all IPs that the
pod contains.
For IPv6 we need to have family specific links inside the pod to receive
the ip6ip6 and ipip traffic that we are sending.
With advertiseService set to false by default, it means that it won't
ever get re-evaluated if the service isn't a local host and will ALWAYS
result in withdrawing the VIPs which is incorrect. It needs to default
to true, and only override the boolean if serviceLocal is set to true.
@aauren aauren changed the title Draft: Add Dual-Stack Capabilities to NSC Add Dual-Stack Capabilities to NSC Sep 23, 2023
@aauren aauren merged commit 65ac87d into prep-v2.0 Sep 23, 2023
@aauren aauren deleted the add_dualstack_to_nsc branch September 23, 2023 20:15
@aauren aauren mentioned this pull request Jan 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant