Egress Policy v1alpha1 implementation #1924
Comments
Thanks for the details of version 1. A question about the controlplane API: could we reuse AppliedToGroup instead of adding a new EgressGroup?
And for AppliedTo, why not have ClusterGroup and Service references there? Is that for simplification of the first version?
cc @ceclinux to take a look at the work breakdown.
I thought about this but didn't find real benefits in doing so, so I switched to another way that could reduce code redundancy and grouping calculation across all kinds of groups, including ClusterGroups, AppliedToGroups, AddressGroups and EgressGroups.
I copied the struct from your PR. Supporting ClusterGroup and Service references should be OK; I don't see a big effort introduced by them. Feel free to add them to the design if you think they should be in the first version.
But from an understanding/troubleshooting perspective, it is much better to use a single type, and map a single ClusterGroup to a single AddressGroup or AppliedToGroup. I think it is better to support ClusterGroup and Service references too. I can update my PR with my ideas.
I think you mean having another API path but using the same struct, e.g. "/v1alpha1/egressgroups" would return the new set of AppliedToGroups. However, clientset code is generated based on the name of the struct or the "resourceName" tag of the struct. I think it won't work if we use the same struct in the same API group, as the paths in the generated clientset would be exactly the same. And what do you think about the first and second problems I mentioned above, especially the second? I think the group for the egress policy differs more from the AppliedToGroup for NetworkPolicy: it needs to include podIP information and be dispatched to all egress Nodes, which makes it more like an AddressGroup for the Egress Node but an AppliedToGroup for non-Egress Nodes.
@jianjuns Given that all agents need to watch all Egresses and there shouldn't be overlapping groups for Egress, I found there is not much value in having a controlplane Egress API, as we could just create an EgressGroup with the same name as the Egress resource (just like Service and Endpoints), then use the Egress's name to get its group on the agent side to save a lot of code (the controller in antrea-controller can focus on syncing EgressGroup, and antrea-agent can leverage the Egress Informer). Let me know if you have any concerns about this. This is the code on the antrea-controller side: 178405b
I am fine with watching Egresses directly from the K8s API for now. We can decide what to do later (when we have another solution to discover/assign SNAT IPs).
All code changes have been merged, closing |
Describe what you are trying to solve
This proposal summarizes the first alpha version of the Egress feature. Please see #667 for the complete proposal.
In v1alpha1, we require users to manually configure SNAT IPs on the Nodes. In an Egress, a particular SNAT IP can be specified for the selected Pods, and antrea-controller will publish the selected Pods of Egresses to the Nodes on which the selected Pods run.
There will be some limitations in the first version: encap mode is the only supported traffic mode, and some features and scenarios, e.g. HA, dual-stack and Windows, are not supported.
Describe how your solution impacts user flows
Describe the main design/architecture of your solution
API change
A user-facing API will be introduced. The object schema will be like below:
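A rough sketch of the shape, assuming the fields discussed in this issue (AppliedTo for Pod selection and EgressIP for the SNAT IP); the selector details and JSON tags are illustrative assumptions rather than the final schema:

```go
// Sketch of the user-facing Egress type; fields other than AppliedTo and
// EgressIP, and the exact selector shape, are illustrative assumptions.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Egress applies a SNAT IP (EgressIP) to traffic from the Pods selected by AppliedTo.
type Egress struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec EgressSpec `json:"spec"`
}

// EgressSpec describes which Pods are selected and which SNAT IP to use.
type EgressSpec struct {
	// AppliedTo selects the Pods to which this Egress applies.
	AppliedTo AppliedTo `json:"appliedTo"`
	// EgressIP is the SNAT IP for traffic from the selected Pods. In v1alpha1
	// it must be manually configured on one of the Nodes.
	EgressIP string `json:"egressIP"`
}

// AppliedTo selects Pods by Pod and/or Namespace label selectors.
type AppliedTo struct {
	PodSelector       *metav1.LabelSelector `json:"podSelector,omitempty"`
	NamespaceSelector *metav1.LabelSelector `json:"namespaceSelector,omitempty"`
}
```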
An Egress's Pod selection is calculated by antrea-controller and transmitted to antrea-agent via a controlplane API, EgressGroup. This is mainly to avoid redundant Pod watching and group calculation when resolving "AppliedTo".
An Egress's corresponding EgressGroup will use the same name so that the agent can identify it, similar to the Service and Endpoints resources.
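For illustration, the controlplane EgressGroup could look roughly like the sketch below. The member field names are assumptions; the points grounded in this proposal are that the group carries the Pod IP information agents need and shares its name with the Egress.

```go
// Sketch of the controlplane EgressGroup type; member field names are
// assumptions modeled on the existing AppliedToGroup/AddressGroup types.
package controlplane

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// EgressGroup is the controlplane view of the Pods selected by an Egress.
// Its name is the same as the name of the corresponding Egress.
type EgressGroup struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// GroupMembers lists the Pods selected by the Egress.
	GroupMembers []GroupMember `json:"groupMembers,omitempty"`
}

// GroupMember carries a selected Pod's identity and IPs, so an agent can tell
// whether the Pod is local and forward its traffic accordingly.
type GroupMember struct {
	Pod PodReference `json:"pod"`
	IPs []string     `json:"ips,omitempty"`
}

// PodReference identifies a Pod by namespace and name.
type PodReference struct {
	Namespace string `json:"namespace"`
	Name      string `json:"name"`
}
```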
Control Plane
antrea-controller
antrea-controller watches the Egress resources from the Kubernetes API and creates the EgressGroup resources. The EgressGroup API in the controlplane API group will provide list, get, and watch interfaces for agents to consume.
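Conceptually, the controller maps each Egress to an EgressGroup of the same name and fills it with the Pods selected by AppliedTo. The sketch below is hypothetical and self-contained: the types and the selectPods helper are simplified stand-ins, not Antrea's real APIs.

```go
// Hypothetical, self-contained sketch of the controller-side mapping from an
// Egress to an EgressGroup of the same name. Types and helpers here are
// simplified stand-ins, not Antrea's real types.
package main

import "fmt"

type Egress struct {
	Name      string
	AppliedTo map[string]string // simplified: a Pod label selector
}

type GroupMember struct {
	Namespace, Name string
	IPs             []string
}

type EgressGroup struct {
	Name         string // same as the Egress's name
	GroupMembers []GroupMember
}

// syncEgress computes the selected Pods (via a caller-provided selector
// function) and returns the EgressGroup the controller would publish.
func syncEgress(e Egress, selectPods func(map[string]string) []GroupMember) EgressGroup {
	return EgressGroup{
		Name:         e.Name, // agents look the group up by the Egress's name
		GroupMembers: selectPods(e.AppliedTo),
	}
}

func main() {
	selectPods := func(sel map[string]string) []GroupMember {
		// Stand-in for the controller's real grouping calculation.
		return []GroupMember{{Namespace: "default", Name: "web-0", IPs: []string{"10.10.1.2"}}}
	}
	g := syncEgress(Egress{Name: "egress-web", AppliedTo: map[string]string{"app": "web"}}, selectPods)
	fmt.Printf("EgressGroup %s has %d member(s)\n", g.Name, len(g.GroupMembers))
}
```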
antrea-agent
antrea-agent watches the above EgressGroup API and Egress API, then:
- For each Egress, it checks whether the EgressIP is configured on the Node it runs on. If yes, it allocates a locally-unique ID (usage mentioned in the "Data Plane" section below) for this IP and configures corresponding openflow rules and iptables rules to enforce SNAT for specific traffic. Otherwise it does nothing.
- For each Pod in the EgressGroup, it checks whether the associated EgressIP is local or not. If local, it configures specific openflow rules to forward the traffic coming from the Pod to the gateway interface with a specific mark set. If remote, it configures specific openflow rules to forward the traffic to the tunnel interface with a specific tunnel destination set.
Data Plane
(Copied from #667 (comment))
On the Node, antrea-agent will realize the SNATPolicy with OVS flows and iptables rules. If the SNAT IP is not present on the local Node, the packets to be SNAT'd will be tunneled to the SNAT Node using the SNAT IP as the tunnel destination IP. On the SNAT Node, the tunnel destination IP will be directly used as the SNAT IP.
On the SNAT Node, an iptables rule will be added to perform the SNAT with the specified SNAT IP, but which SNAT IP to use for a given packet is controlled by the OVS flows. The OVS flows will mark a packet that needs to be SNAT'd with the integer ID corresponding to its SNAT IP, and the matching iptables SNAT rule matches on that packet mark.
The OVS flow changes include:
iptables rules:
iptables -t nat -A POSTROUTING -m mark --mark snat_id -j SNAT --to-source snat_ip
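To make the mark/ID relationship concrete: each Node keeps a small local mapping from SNAT IP to a locally-unique integer ID, which is used both as the OVS packet mark and as the snat_id matched by the iptables rule above. A minimal, hypothetical sketch of such an allocator (not the actual implementation):

```go
// Hypothetical sketch of a per-Node allocator that maps a SNAT IP to a
// locally-unique integer ID, used as the OVS packet mark and as the mark
// matched by the iptables SNAT rule. Not the actual implementation.
package main

import "fmt"

type snatIDAllocator struct {
	next uint32
	ids  map[string]uint32 // SNAT IP -> locally-unique ID
}

func newSNATIDAllocator() *snatIDAllocator {
	return &snatIDAllocator{next: 1, ids: map[string]uint32{}}
}

// Allocate returns the ID for snatIP, assigning a new one if needed.
func (a *snatIDAllocator) Allocate(snatIP string) uint32 {
	if id, ok := a.ids[snatIP]; ok {
		return id
	}
	id := a.next
	a.next++
	a.ids[snatIP] = id
	return id
}

func main() {
	a := newSNATIDAllocator()
	id := a.Allocate("10.10.0.100") // hypothetical SNAT IP
	// The same ID would be set as the packet mark in OVS and matched by:
	// iptables -t nat -A POSTROUTING -m mark --mark <id> -j SNAT --to-source 10.10.0.100
	fmt.Println(id)
}
```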
Work breakdown
Alternative solutions that you considered
NONE
Test plan
Add E2E tests to verify that traffic from specific Pods is translated to the specified IP when accessing an HTTP server deployed "outside" the cluster (it could be a host-network Pod running on a Node that is different from the Egress Node).
Additional context
Any other relevant information.