-
Notifications
You must be signed in to change notification settings - Fork 457
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add BGP capabilities to the NAT GW #4285
Conversation
This is in progress, I'll fix conflicts in a few hours, but you can at least see how I got the thing to work for now |
Great job! 🎉 |
06a8772
to
68b7c3c
Compare
Please add Signed-off-by in the commit message and rebase the master branch. <COMMIT MESSAGE>
Signed-off-by: Author Name <[email protected]> |
Yeah I've done that but not pushed it yet, i'm seeing an awful lot of messages such as in the controller logs, idk if my code broke something or not ever since i rebased the changes you've done yesterday on the nat gateway or if it's something else, I need to investigate |
Ah, you modified a bunch of stuff in the shell script of the nat-gateway image and the pullPolicy is IfNotPresent, I had an old version running. This is now fixed, everything seems to be working. I'll add a few lines to make the image used as a speaker a parameter, and find a way to configure the BGP neighbors. Do you think we should make this a parameter of the VpcNatGateway CRD or just something adjusted from the nat-gw configmap ? @zhangzujian |
68b7c3c
to
cabf32b
Compare
CRD would be better. |
Edited a CRD to enable the bgp speaker per gw and to override the parameters sent to kube-ovn-speakers. You guys can look at it and tell me if it's worthy of a merge! |
1d4d4ab
to
f54cf9c
Compare
Hi, thanks for the review, I'll make the commits. |
f54cf9c
to
667b31f
Compare
Is the last test a flake? |
576b3f8
to
f4ab271
Compare
@zhangzujian Hi, PR is now fully green! |
E0720 08:07:53.667742 13 vpc_nat_gateway.go:294] failed to create statefulset 'vpc-nat-gw-gw1', err: StatefulSet.apps "vpc-nat-gw-gw1" is invalid: spec.template.annotations: Invalid value: ".kubernetes.io/routes": prefix part a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*') Please fix it. |
The error is caused by your configuration lacking the NAD Provider (same setup as the one needed by the VPC DNS), I was relying on the same configmap as the VPC-DNS one by mistake. Now the VPC NATGW configmap has a field to provide a networkattachementdefinition so that the speaker can actually talk with the API server. See the PR's first message for the complete setup of the NAD. You need the field I added an error message to make the problem more explicit. |
7259c02
to
23d5273
Compare
Signed-off-by: SkalaNetworks <[email protected]>
Signed-off-by: SkalaNetworks <[email protected]>
Signed-off-by: SkalaNetworks <[email protected]>
Signed-off-by: SkalaNetworks <[email protected]>
Signed-off-by: SkalaNetworks <[email protected]>
Signed-off-by: SkalaNetworks <[email protected]>
Signed-off-by: SkalaNetworks <[email protected]>
Signed-off-by: SkalaNetworks <[email protected]>
Signed-off-by: SkalaNetworks <[email protected]>
0bf9d68
to
f6fbee5
Compare
I fixed the conflict introduced by the latest commit on main |
Well apparently there's also a new linter now, so I need to do some changes on the fmt part.. |
Signed-off-by: SkalaNetworks <[email protected]>
I'm kind of busy recently. Will review the commits later. |
k8s.v1.cni.cncf.io/networks: kube-system/ovn-vpc-external-network, default/ Still fails. Please fix it. |
@@ -747,6 +790,18 @@ func (c *Controller) genNatGwStatefulSet(gw *kubeovnv1.VpcNatGateway, oldSts *v1 | |||
util.LogicalSwitchAnnotation: gw.Spec.Subnet, | |||
util.IPAddressAnnotation: gw.Spec.LanIP, | |||
} | |||
|
|||
if gw.Spec.BgpSpeaker.Enabled { // Add an interface that can reach the API server | |||
defaultSubnet, err := c.subnetsLister.Get(c.config.DefaultLogicalSwitch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have introduced configMap option apiNadProvider
, we should use the nad instead of hard-coded default subnet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use function findSubnetByNetworkAttachmentDefinition
instead of using the default subnet directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Kubernetes apiserver runs in the default subnet doesn't it? The NAD just happens to be connected to that subnet. Do you want me to:
- Lookup every subnet in the cluster
- Determine which one has a provider equal to our NAD
- Get its gateway
That can be an option
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Kubernetes apiserver runs in the default subnet doesn't it?
K8s apiserver runs in control plane nodes with host network. It can be accessed from subnets in the default vpc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The NAD can be pointed to ANY subnet which is running in the default vpc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, should kube-ovn have a default NAD installed in the default VPC for kube-dns and the NAT GW? That would be extremely handy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is an example I'm using:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: ovn-nad
namespace: default
spec:
config: '{
"cniVersion": "0.3.0",
"type": "kube-ovn",
"server_socket": "/run/openvswitch/kube-ovn-daemon.sock",
"provider": "ovn-nad.default.ovn"
}'
---
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
name: vpc-apiserver-subnet
spec:
protocol: IPv4
cidrBlock: 100.100.100.0/24
provider: ovn-nad.default.ovn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should kube-ovn have a default NAD installed in the default VPC for kube-dns and the NAT GW?
NAD resources can be created only when the CRD is installed in the cluster, while the CRD should be installed by users.
Signed-off-by: SkalaNetworks <[email protected]>
New ConfigMap option: apiVersion: v1
data:
apiNadName: ovn-nad
apiNadProvider: ovn-nad.default.ovn
bgpSpeakerImage: docker.io/kubeovn/vpc-nat-gateway:v1.13.0
image: docker.io/kubeovn/vpc-nat-gateway:v1.13.0
kind: ConfigMap
metadata:
name: ovn-vpc-nat-config
namespace: kube-system This should fix it. |
return errors.New("no NetworkAttachmentDefinition provided to access apiserver, check configmap ovn-vpc-nat-config and field 'apiNadName'") | ||
} | ||
|
||
nad := fmt.Sprintf("%s/%s, %s/%s", c.config.PodNamespace, externalNetwork, corev1.NamespaceDefault, vpcNatAPINadName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the NAD namspace is always default
, there is no need to set both apiNadName
and apiNadProvider
in the configMap since <apiNadProvider>
equals <apiNadName>.default.ovn
or <apiNadName>.default
.
How about getting NAD name and namespace from the NAD provider?
I am not very clear what you mean. What's the best implement in your mind? |
I probably don't understand what providers are from the point of view of Kube-OVN.
But my NAD has a provider called "ovn-nad.default.ovn" not ovn-nad.default, which makes me think Kube-OVN actually finds the NAD based on the name of the provider, not by decomposing it into name + namespace? Or am I mistaken? |
Ok, seems like the "dot ovn" in the provider is a special case handled by "findSubnetByNetworkAttachmentDefinition" |
NAD with a suffix |
@SkalaNetworks can you add a doc to describe this feature? |
I'll be doing it ASAP, I'm also adding the feature mentionned by @zhangzujian to make the NAD name/provider one parameter. |
Related PR: #4352 |
Signed-off-by: SkalaNetworks <[email protected]>
Signed-off-by: SkalaNetworks <[email protected]> Signed-off-by: bobz965 <[email protected]>
Pull Request
What type of this PR
Examples of user facing changes:
Which issue(s) this PR fixes
Fixes 4217
This PR is highly experimental, do not merge in master.
To test:
We need to have a NAD for the interface that the speaker will use to talk to the apiserver.
For that we follow the exact same pattern as for the vpc-dns
Create a NAD with access to kube-ovn default network:
Edit the ovn-default subnet to use the NAD as a provider.
Create new permissions for the gateway to poll the apiserver:
Launch a NAT GW and mark its EIP with the BGP annotation:
ovn.kubernetes.io/bgp: "true"