---
title: multiplexing cloud-edge traffic
authors:
- "@rambohe-ch"
- "@Peeknut"
reviewers:
- "@Fei-Guo"
- "@huangyuqi"
creation-date: 2022-04-14
last-updated: 2022-04-14
status: provisional
---

# Proposal for multiplexing cloud-edge traffic

## Glossary
Refer to the [OpenYurt Glossary](https://github.com/openyurtio/openyurt/blob/master/docs/proposals/00_openyurt-glossary.md).

## Summary
In an OpenYurt cluster, traffic between the cloud and the edge goes through the public network, so reducing it is very valuable. In proposal [#772](https://github.com/openyurtio/openyurt/pull/772), a pool-coordinator pod is deployed in each node pool to store metadata as a kv storage. In this proposal, we describe a way to reduce cloud-edge traffic based on the `pool-coordinator` while ensuring the consistency of edge-side metadata.

## Motivation
In a large-scale OpenYurt cluster (e.g. nodes > 1k, pods > 20k, services > 1k), since coredns/kube-proxy on the edge nodes lists and watches all endpoints/endpointslices, cloud-edge traffic increases rapidly when pods are deleted and rebuilt during business upgrades. In addition, frequent node state switching (Ready/NotReady) causes NodePool updates, which also leads to a huge increase in cloud-edge traffic.
Based on the traffic statistics of Yurthub metrics (http://127.0.0.1:10267/metrics), the top traffic-consuming requests on the edge nodes are as follows (a small scrape sketch is shown after the list):
- flannel list/watch nodes
- coredns/kube-proxy list/watch endpoints/endpointslices
- yurthub list/watch nodepools
- kubelet get node
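
For reference, collecting these statistics is a matter of scraping the Prometheus-format endpoint above. Below is a minimal sketch in Go; filtering on the substring `traffic` is an illustrative assumption, not the actual Yurthub metric naming.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

// Scrape the local Yurthub metrics endpoint and print traffic-related
// counters. Matching on the substring "traffic" is an assumption made
// for illustration; check the real metric names on your cluster.
func main() {
	resp, err := http.Get("http://127.0.0.1:10267/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		// Skip comment lines ("# HELP", "# TYPE") and keep traffic counters.
		if !strings.HasPrefix(line, "#") && strings.Contains(line, "traffic") {
			fmt.Println(line)
		}
	}
}
```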

This proposal solves the above problems through the `multiplexing cloud-edge traffic` ability.

### Goals
For pool scope data (such as endpoints/endpointslices), only one copy of the data travels between the cloud and a NodePool. Each node in the NodePool reuses this pool scope data, which reduces cloud-edge traffic.

### Non-Goals/Future Work
Node scope data is not covered by this solution; every node in a NodePool keeps the original way of getting node scope data.

## Proposal
### Definition

- pool scope data
  - Metadata from the cloud that is identical for every node in the NodePool, such as the endpoints/endpointslices list/watch responses consumed by the coredns and kube-proxy components.
- node scope data
  - Data that is unique to each node in the NodePool, such as pods, secrets, configmaps, etc.
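
To make this split concrete, here is a minimal classification sketch in Go; the helper name and the resource set are illustrative assumptions, not the actual OpenYurt implementation.

```go
package main

import "fmt"

// poolScopeResources lists the resources whose list/watch responses are
// identical for every node in a NodePool (an assumption for this sketch).
var poolScopeResources = map[string]bool{
	"endpoints":      true,
	"endpointslices": true,
}

// isPoolScopeData reports whether a list/watch request for the given
// resource could be served from the pool-coordinator.
func isPoolScopeData(resource string) bool {
	return poolScopeResources[resource]
}

func main() {
	for _, r := range []string{"endpoints", "endpointslices", "pods", "secrets"} {
		fmt.Printf("%s: pool scope = %v\n", r, isPoolScopeData(r))
	}
}
```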

### Architecture
Pool scope data (endpoints, endpointslices) is written to the pool-coordinator by the leader YurtHub.
Since the pool scope data in the pool-coordinator is guaranteed to be the latest version, list/watch requests for pool scope data from the standby YurtHubs can be served by the pool-coordinator instead of the cloud, which greatly reduces cloud-edge traffic.
![img.png](../img/multiplexing-traffic/img1.png)
The process can be described as follows (a routing sketch is shown after the steps):

- **Step1**: At the beginning, all YurtHubs send list/watch requests of pool scope data to the cloud.
- **Step2**: When the pool-coordinator is started and a leader YurtHub is elected, the leader YurtHub creates new list/watch requests for pool scope data and writes the response data to the pool-coordinator. The leader YurtHub then writes a completion flag (a configmap) into the pool-coordinator to indicate that pool scope data has synced, notifying all YurtHubs that it is time to list/watch pool scope data from the pool-coordinator. Until then, each YurtHub still sends list/watch requests to the cloud.
- **Step3**: After the leader YurtHub finishes writing pool scope data, each YurtHub stops sending list/watch requests of pool scope data to the cloud and redirects them to the pool-coordinator. From this point on, only the leader YurtHub maintains list/watch requests with the cloud and keeps writing data to the pool-coordinator.
- **Step4**: If the pool-coordinator goes offline, that is, the NodePool governance capability is disabled, each YurtHub redirects its list/watch requests of pool scope data to the cloud again.
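
The per-request routing decision implied by these steps can be sketched as below; the type and field names (`clusterState`, `hasCompletionFlag`, etc.) are hypothetical.

```go
package main

import "fmt"

// clusterState captures the signals a YurtHub consults when routing a
// list/watch request for pool scope data. All names are illustrative.
type clusterState struct {
	poolCoordinatorHealthy bool // pool-coordinator reachable (Step4 checks this)
	hasCompletionFlag      bool // completion-flag configmap written (Step2)
	isLeader               bool // this YurtHub won the leader election
}

// routePoolScopeRequest returns the backend for a pool scope list/watch
// request, following Step1-Step4 above.
func routePoolScopeRequest(s clusterState) string {
	// The leader always keeps its list/watch with the cloud so it can
	// feed fresh data into the pool-coordinator (Step3).
	if s.isLeader {
		return "cloud kube-apiserver"
	}
	// Other YurtHubs switch to the pool-coordinator only after the
	// completion flag signals that pool scope data has synced (Step2-3).
	if s.poolCoordinatorHealthy && s.hasCompletionFlag {
		return "pool-coordinator"
	}
	// Otherwise fall back to the cloud (Step1 and Step4).
	return "cloud kube-apiserver"
}

func main() {
	fmt.Println(routePoolScopeRequest(clusterState{poolCoordinatorHealthy: true, hasCompletionFlag: true}))
}
```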

The following special cases may occur:

**Condition1**: If the leader YurtHub changes, the new leader takes over all the work: sending list/watch requests of pool scope data to the cloud and writing the data to the pool-coordinator. The former leader YurtHub stops its list/watch requests to the cloud.

**Condition2**: If the pool-coordinator restarts/rebuilds, each YurtHub keeps trying to connect to the pool-coordinator for a certain period of time during the restart. After the pool-coordinator starts, Step2-3 are executed again (a bounded-retry sketch is shown below).
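
One possible shape for the bounded retry in Condition2, sketched in Go; the health endpoint URL, probe interval, and grace period are all illustrative assumptions, not OpenYurt defaults.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// waitForPoolCoordinator probes a health endpoint until the
// pool-coordinator answers or the grace period expires.
func waitForPoolCoordinator(ctx context.Context, healthzURL string) bool {
	ticker := time.NewTicker(5 * time.Second)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return false // grace period expired; fall back to the cloud
		case <-ticker.C:
			resp, err := http.Get(healthzURL)
			if err != nil {
				continue // pool-coordinator still restarting
			}
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return true // pool-coordinator is back; rerun Step2-3
			}
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
	defer cancel()
	fmt.Println("recovered:", waitForPoolCoordinator(ctx, "http://pool-coordinator:10268/healthz"))
}
```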

### Write Metadata into pool-coordinator
Both pool scope data and node scope data are stored in the pool-coordinator. Node scope data is written by each YurtHub, while pool scope data is written only by the leader YurtHub.
The leader YurtHub is elected by all YurtHubs in the node pool (like kube-controller-manager). Because the leader YurtHub must be able to get real-time pool scope data from kube-apiserver, it has to be connected to the cloud. When the leader is disconnected from the cloud, another YurtHub that is still connected to the cloud becomes the leader.
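
The proposal only states that the election works like kube-controller-manager's, so as a sketch, the standard client-go lease-based election could be used; the kubeconfig path, lock namespace, and lock name below are assumptions.

```go
package main

import (
	"context"
	"os"
	"time"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
	"k8s.io/klog/v2"
)

func main() {
	nodeName := os.Getenv("NODE_NAME")

	// Kubeconfig pointing at the pool-coordinator's kube-apiserver;
	// the path is an illustrative assumption.
	cfg, err := clientcmd.BuildConfigFromFlags("", "/var/lib/yurthub/pool-coordinator.kubeconfig")
	if err != nil {
		klog.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	lock, err := resourcelock.New(
		resourcelock.LeasesResourceLock,
		"kube-system",    // lock namespace (assumption)
		"yurthub-leader", // lock name (assumption)
		client.CoreV1(),
		client.CoordinationV1(),
		resourcelock.ResourceLockConfig{Identity: nodeName},
	)
	if err != nil {
		klog.Fatal(err)
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:          lock,
		LeaseDuration: 15 * time.Second,
		RenewDeadline: 10 * time.Second,
		RetryPeriod:   2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				// Became leader: list/watch pool scope data from the
				// cloud and write it into the pool-coordinator.
				klog.Infof("%s is now the leader YurtHub", nodeName)
			},
			OnStoppedLeading: func() {
				// Lost leadership (e.g. cloud connection dropped);
				// another cloud-connected YurtHub takes over.
				klog.Infof("%s stopped leading", nodeName)
			},
		},
	})
}
```

A YurtHub that detects it has lost its cloud connection would simply stop campaigning (or cancel its election context), which matches the rule that only cloud-connected YurtHubs can become leader.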

![img.png](../img/multiplexing-traffic/img2.png)

### User Stories

1. As a user, I would like to reduce traffic between cloud and edge as much as possible to reduce costs.

### Other tech points
#### Data Traffic Conclusion
| | node scope data | pool scope data |
| --- | --- | --- |
| cloud-edge network on | cloud kube-apiserver --> yurthub --> kubelet/kube-proxy | cloud kube-apiserver --> leader yurthub --> pool-coordinator --> every yurthub --> kubelet/kube-proxy |
| cloud-edge network off | edge node local storage --> yurthub --> kubelet/kube-proxy | 1. a completion flag exists in pool-coordinator: pool-coordinator --> yurthub --> kubelet/kube-proxy <br/>2. a completion flag does not exist in pool-coordinator: edge node local storage --> yurthub --> kubelet/kube-proxy |
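
The table can be read as a small decision function; the sketch below mirrors it directly, with descriptive strings only.

```go
package main

import "fmt"

// dataPath mirrors the table above. poolScope selects the column; cloudOn
// and hasCompletionFlag select the row and sub-case.
func dataPath(poolScope, cloudOn, hasCompletionFlag bool) string {
	switch {
	case !poolScope && cloudOn:
		return "cloud kube-apiserver --> yurthub --> kubelet/kube-proxy"
	case !poolScope:
		return "edge node local storage --> yurthub --> kubelet/kube-proxy"
	case cloudOn:
		return "cloud kube-apiserver --> leader yurthub --> pool-coordinator --> every yurthub --> kubelet/kube-proxy"
	case hasCompletionFlag:
		return "pool-coordinator --> yurthub --> kubelet/kube-proxy"
	default:
		return "edge node local storage --> yurthub --> kubelet/kube-proxy"
	}
}

func main() {
	fmt.Println(dataPath(true, false, true))
}
```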

#### Pool Scope Data Protection
If, for some unknown reason, the pool-coordinator returns an empty list for Endpoints/EndpointSlices resources, YurtHub intercepts the response of the list request and returns an error to the clients, preventing the data on edge nodes from being cleaned up.
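
A minimal sketch of that guard for EndpointSlice lists; the function name and error are assumptions, and real code would distinguish a legitimately empty list from a corrupted one.

```go
package main

import (
	"errors"
	"fmt"

	discoveryv1 "k8s.io/api/discovery/v1"
)

// errSuspiciousEmptyList signals that the pool-coordinator returned an
// empty pool scope list that would wipe cached data on edge nodes.
var errSuspiciousEmptyList = errors.New("empty pool scope list from pool-coordinator, rejecting response")

// guardEndpointSliceList rejects an unexpectedly empty EndpointSlice list
// so clients keep their existing data instead of cleaning it up.
func guardEndpointSliceList(list *discoveryv1.EndpointSliceList) error {
	if list == nil || len(list.Items) == 0 {
		return errSuspiciousEmptyList
	}
	return nil
}

func main() {
	if err := guardEndpointSliceList(&discoveryv1.EndpointSliceList{}); err != nil {
		fmt.Println("blocked:", err)
	}
}
```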

#### Service Topology Notification
With the traffic reuse of pool scope data, nodes in the NodePool that are disconnected from the cloud network can still obtain the latest Endpoints/EndpointSlices data (because it is pool scope data) through the pool-coordinator, so service topology changes in the NodePool can still be propagated even when the cloud network is disconnected.

## Implementation History
- [ ] 04/14/2022: Present proposal at a community meeting and collect feedback.
- [ ] xx/xx/2022: Finalize the proposal.
