Too many BGP routing entries and neighbors between kube-router server and connected network devices #923
You can try the following two methods:
Each kubernetes cluster in our production environment has 4000 nodes, and the whole network is interconnected by BGP; it has been running stably for more than a year. kube-router has a number of problems in large kubernetes clusters, and we have done a lot of optimization, so I want to contribute some of that back to the community. I have contributed an enhancement for large kubernetes cluster networks to kube-router, as well as several practical documents about large kubernetes cluster networking.
I think your changes are reasonable; we have the same network topology and will likely suffer from the same problem.
Just to clarify, there is nothing implicit in the kube-router design that causes these challenges when routing the pod network CIDR. Users have to carefully choose the knobs provided by kube-router that suit them. You could use iBGP, peer with just external routers, use route reflectors, etc. These are standard BGP configuration choices network engineers deal with. In this example, #923 (comment), these are the kinds of choices (e.g. --enable-ibgp=false) one has to make at the network design stage.
Again, I would not design a large-scale network where the service VIPs for all services are advertised. A prescribed operations guide for designing a network topology with kube-router would be good. Hopefully the documentation in #920 will evolve in this direction.
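As a rough illustration of those knobs (a minimal sketch, not an official recommendation; the peer addresses and ASNs are placeholder values), the kube-router container args in a DaemonSet might look like this when the node-to-node iBGP mesh is disabled and each node peers only with its upstream routers:

```yaml
# Fragment of a kube-router DaemonSet pod spec (illustrative sketch only;
# the peer addresses and ASNs below are placeholder values).
containers:
  - name: kube-router
    image: cloudnativelabs/kube-router
    args:
      - --run-router=true
      - --run-firewall=true
      - --run-service-proxy=true
      - --enable-ibgp=false                  # do not form the node-to-node iBGP full mesh
      - --cluster-asn=64512                  # ASN used by the nodes (placeholder)
      - --peer-router-ips=10.0.0.1,10.0.0.2  # leaf/top-of-rack routers (placeholder)
      - --peer-router-asns=64513,64513       # their ASNs (placeholder)
```

With iBGP disabled, each node maintains BGP sessions only with the listed routers rather than with every other node, which is what keeps the per-node neighbor count flat as the cluster grows.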
Yes, when I talk with R&D staff at many other companies, I find that they have the same problem. As the kubernetes cluster network grows larger, the problem becomes more serious.
If you set "--advertise-cluster-ip=false", our kubernetes services can no longer be routed from outside the cluster. However, in a large-scale kubernetes cluster network, when we have the requirements described below at the same time, we set the "--enable-ibgp=false", "--advertise-cluster-ip=true" and "--advertise-cluster-subnet=" parameters together. Please see the solution documentation at https://github.com/cloudnativer/kube-router-cnlabs/blob/advertise-cluster-subnet/docs/large-networks01.md . A related yaml file can be found at https://github.com/cloudnativer/kube-router-cnlabs/blob/advertise-cluster-subnet/daemonset/kube-router-daemonset-advertise-cluster-subnet.yaml .
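As a sketch of that combination (hedged: --advertise-cluster-subnet is the flag added in the linked fork, not a stock kube-router option, and 172.30.0.0/16 is just the example service range used in the tests further down), the relevant args would look something like this:

```yaml
# Illustrative kube-router args for the approach described above.
# --advertise-cluster-subnet is the flag proposed in the linked fork/PR,
# not an upstream kube-router option; the CIDR is the example service range.
args:
  - --run-router=true
  - --enable-ibgp=false
  - --advertise-cluster-ip=true
  - --advertise-cluster-subnet=172.30.0.0/16
```

My reading of the proposal is that advertising the aggregate service subnet this way avoids having to install one /32 route per service VIP on the upstream devices.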
Let me add that I will further improve the document in the near future, based on what you said.
@cloudnativer Have you tried the "kube-router.io/service.advertise.clusterip" service annotation?
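For reference, that annotation is applied per Service; a minimal sketch (the service name, namespace, and ports are placeholders) could look like:

```yaml
# Minimal Service using kube-router's per-service advertisement annotation.
# Name and ports are placeholder values for illustration.
apiVersion: v1
kind: Service
metadata:
  name: demo-app
  annotations:
    # advertise (or, when set to "false", suppress) this service's cluster IP
    kube-router.io/service.advertise.clusterip: "true"
spec:
  selector:
    app: demo-app
  ports:
    - port: 80
      targetPort: 8080
```

One reading of the suggestion (an assumption on my part) is to run with --advertise-cluster-ip=false globally and annotate only the services whose VIPs actually need to be reachable from outside, so the /32 route count tracks those services rather than every service in the cluster.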
[Requirements and test instructions]
Suppose we have a kubernetes service network segment with a range of 172.30.0.0/16. There are 100 running services in the cluster.
[Test 1]
Routing table description on the uplink network device:
[Test 2]
Routing table description on the uplink network device:
[Test 3]
Routing table description on the uplink network device:
[Test 4]
Args is set to:
Routing table description on the uplink network device:
[Test 5]
Routing table description on the uplink network device:
Attached is my yaml template file for testing. With "kube-router.io/service.advertise.clusterip" I could not achieve the effect you described. Did I test it incorrectly? Or can "kube-router.io/service.advertise.clusterip" not meet my earlier requirements?
Please note that I've changed "advertise-cluster-subnet" to "advertise-service-cluster-ip-range". |
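Under the renamed flag, the earlier sketch (still an assumption based on the fork/PR, not an upstream option) would become:

```yaml
# Same illustrative args, using the renamed flag from the PR.
args:
  - --run-router=true
  - --enable-ibgp=false
  - --advertise-cluster-ip=true
  - --advertise-service-cluster-ip-range=172.30.0.0/16  # renamed from --advertise-cluster-subnet
```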
@cloudnativer Apologies for the delay in getting back to you. I am focusing on getting the 1.0 release out, hence the delay. Will leave comments in the PR.
OK.
Adding some context to the problem. kube-router's implementation of a network load balancer is based on Ananta and Maglev. In both models there is a set of dedicated load balancer nodes (Mux in Ananta, Maglev in Maglev) which are BGP speakers and advertise the service VIPs. In the case of Kubernetes, each node is a load balancer/service proxy as well, so essentially every node in the cluster is part of a distributed load balancer. If each of them is a BGP speaker, then advertising /32 routes for service VIPs can bloat the routing table as described above. Perhaps this is something that can be addressed at the leaf routers by advertising the service IP range. Nevertheless, it is good to weigh the pros and cons and prescribe when to use what.
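To put rough numbers on the bloat (a back-of-the-envelope estimate, assuming the 4000-node, 100-service figures from earlier in the thread and an upstream device that peers with every node): 4000 nodes each advertising 100 service VIPs as /32 routes is on the order of 400,000 BGP paths on that device, whereas advertising a single 172.30.0.0/16 aggregate per speaker yields 4000 paths, or just a handful if summarization is done once per leaf.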
Yes, I agree with that.
Yes, we can advertise the service IP range on the leaf router to reduce the number of routes on the spine routers. But in a large-scale kubernetes cluster network, if all kube-routers advertise 32-bit host routes, the number of routes on the leaf router itself will also multiply; advertising the service IP range only at the leaf router does not solve that. Therefore, we need to be able to advertise the service IP range from the kube-router on the server, which reduces the route count on both the leaf routers and the uplink routers.
According to murali-reddy's requirements, we have split the document and code:
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
This issue was closed because it has been stale for 5 days with no activity. |
By default, using kube-router in a large-scale kubernetes cluster leads to too many BGP neighbors and BGP routing entries on both the kube-router servers and the connected network devices, which seriously affects the network performance of the cluster. Is there a good way to reduce the routing entries on both sides and the resulting performance loss, so that larger cluster networks can be supported?
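For the neighbor-count half of this question, one option kube-router supports is keeping the iBGP mesh but concentrating sessions on a few route reflector nodes via node annotations; a minimal sketch (the cluster ID 42 and node names are placeholder values) might look like:

```yaml
# Illustrative Node metadata fragments for kube-router BGP route reflection.
# The cluster ID (42) and node names are placeholder values.
apiVersion: v1
kind: Node
metadata:
  name: rr-node-1
  annotations:
    kube-router.io/rr.server: "42"   # this node acts as a route reflector server
---
apiVersion: v1
kind: Node
metadata:
  name: worker-node-1
  annotations:
    kube-router.io/rr.client: "42"   # client nodes peer only with the RR servers
```

With this layout, ordinary nodes hold BGP sessions only toward the route reflector servers instead of toward every other node, which keeps the neighbor count per node small even as the cluster grows.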