Merge pull request #128 from zzguang/master
Update English docs for edge-pod-network and node-pool-management
rambohe-ch authored Jul 5, 2022
2 parents e3f80de + 6025e0d commit 6557424
Showing 8 changed files with 187 additions and 152 deletions.
63 changes: 35 additions & 28 deletions docs/user-manuals/network/edge-pod-network.md
@@ -2,79 +2,86 @@
title: Edge Pod Network
---

## Background

In cloud-edge scenarios, OpenYurt provides edge node autonomy so that the workloads on edge nodes keep running stably even when the cloud-edge network is disconnected. With this capability, abnormal pods are restarted automatically and pods are brought back up after a node reboot, even while the network is down. However, to make sure that pod and node restarts during a disconnection do not break the edge-to-edge container network, some network configurations need to be adapted.

## Flannel: Keep the VTEP MAC Address
### Use case
If we adopt Flannel as the CNI plugin with VXLAN as the backend, a VTEP device (generally named flannel.1) is created on each node, and the VNI and VTEP information is recorded in the node annotations so that other nodes can create the corresponding routing and forwarding rules.
The Flannel architecture is shown below:

![flannel-architecture](../../../static/img/docs/user-manuals/network/flannel-architecture.png)

Let's take two edge nodes as an example:
- node2 creates the flannel.1 device with MAC address "9e:c9:07:f9:b3:8b" and IP address "172.30.133.0"; the related info is recorded in node2's annotations:
```yaml
# node2 annotations with vtep info.
flannel.alpha.coreos.com/backend-data: '{"VtepMAC":"9e:c9:07:f9:b3:8b"}'
flannel.alpha.coreos.com/public-ip: 10.0.0.20
```
- node1 uses this info from node2 to configure the fdb/arp/routing entries in its host network namespace:
```shell script
# node1 host network namespace.
fdb: 9e:c9:07:f9:b3:8b dev flannel.1 dst 10.0.0.20 self permanent
arp: ? (172.30.133.0) at 9e:c9:07:f9:b3:8b [ether] PERM on flannel.1
route: 172.30.133.0/26 via 172.30.133.0 dev flannel.1 onlink
```
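On node1 these entries can be inspected with standard iproute2 tools; a quick sketch (the device and subnet names follow the example above):
```shell script
# forwarding entries towards remote VTEPs
bridge fdb show dev flannel.1
# ARP entries for the remote flannel.1 addresses
ip neigh show dev flannel.1
# routes to the remote pod subnets
ip route show | grep flannel.1
```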
Every time node2 restarts, Flannel re-creates the VTEP device with a new MAC address and updates it in node2's annotations. However, if node2 or node1 is disconnected from the cloud network at that moment, node1 is not notified of the change of node2's MAC address, so the pods on node1 and node2 can no longer communicate.
### Solution: Keep the MAC address in Flannel
Every time an edge node restarts, Flannel first reads the MAC address recorded in the node's annotations (from the apiserver, or from the local cache of yurthub). If it exists, that MAC address is reused as the VTEP MAC address.
To implement this capability, the Flannel source code needs to be modified as follows:
```shell script
git clone https://github.com/flannel-io/flannel.git;
cd flannel;
git reset --hard e634dabe0af446b765db3b729085b32f97ff6fe6;
wget https://raw.githubusercontent.com/openyurtio/openyurt/master/docs/tutorial/0001-flannel-keep-vtep-mac.patch;
git am 0001-flannel-keep-vtep-mac.patch;
```
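After deploying the patched Flannel, a simple way to check that the MAC address is really kept across a reboot is to compare the annotation with the device on the node (a sketch; "node2" is the example node above):
```shell script
# the MAC address recorded in the node annotation
kubectl get node node2 -o yaml | grep flannel.alpha.coreos.com/backend-data
# on node2 itself, flannel.1 should carry the same MAC address after a restart
ip -d link show flannel.1 | grep link/ether
```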
The pre-built flannel-edge image is available from:
```
Docker Hub: docker.io/openyurt/flannel-edge:v0.14.0-1
Aliyun registry: registry.cn-hangzhou.aliyuncs.com/openyurt/flannel-edge:v0.14.0-1
```
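To use the pre-built image, point your Flannel DaemonSet at it. A sketch, assuming the DaemonSet is named kube-flannel-ds with a container named kube-flannel as in the upstream kube-flannel.yml (names may differ in your cluster):
```shell script
kubectl -n kube-system set image daemonset/kube-flannel-ds \
  kube-flannel=docker.io/openyurt/flannel-edge:v0.14.0-1
```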
## IPAM: Keep the Pod IP Address
### Use case
In most scenarios, pods are assigned IP addresses through host-local. host-local picks an idle IP address from the node CIDR, assigns it to the new pod, and records the allocated IP address in a local file.

For example, if the IPAM data directory is /var/lib/cni/networks/cbr0, its records look like this:

```shell script
$ ls /var/lib/cni/networks/cbr0
172.30.132.194 172.30.132.198 172.30.132.201
```
When the cloud-edge network is disconnected, a pod restart causes host-local to assign a new IP address, and the change of the pod IP cannot be synchronized to the cloud. Components such as kube-proxy on other edge nodes are never notified of the new pod IP, so the pod can no longer be reached through its cluster IP.

### Solution: Keep the pod IP address
To solve this problem, the host-local code needs to be adjusted so that IP addresses are recorded in the format {ip}_{pod namespace}_{pod name}. When a pod restarts, host-local first reuses the IP address recorded for the pod with the same name.

The records of assigned pod IPs then look like this:
```shell script
$ ls /var/lib/cni/networks/cbr0
172.30.132.194_kube-system_coredns-vstxh 172.30.132.198_kube-system_nginx-76df748b9-4cz95 172.30.132.201_kube-system_nginx-76df748b9-nf5l9
```
Reference for the host-local source code modifications:
```shell script
git clone https://github.com/containernetworking/plugins.git;
cd plugins;
git reset --hard 9ebe139e77e82afb122e335328007bca86905ae4;
wget https://raw.githubusercontent.com/openyurtio/openyurt/master/docs/tutorial/0002-ipam-keep-pod-ip.patch;
git am 0002-ipam-keep-pod-ip.patch;
```
Deploy the patched host-local CNI plugin via the rpm package:
```
rpm -ivh https://github.com/openyurtio/openyurt/releases/download/v0.6.0/openyurt-cni-0.8.7-0.x86_64.rpm
```
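A quick sanity check after installing the rpm (the package name comes from the rpm file above, and /opt/cni/bin is the usual CNI binary directory; adjust if yours differs):
```shell script
rpm -q openyurt-cni
ls -l /opt/cni/bin/host-local
```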

## Add the "get node" Permission for Flannel
After Flannel is adjusted as above, the "get node" permission needs to be added to its original RBAC rules. Reference for the modification:
```diff
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
```
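Once the RBAC change is applied, you can verify the permission with an impersonated access check (assuming Flannel runs with the "flannel" service account in kube-system, as in the upstream kube-flannel.yml):
```shell script
kubectl auth can-i get nodes --as=system:serviceaccount:kube-system:flannel
```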
20 changes: 10 additions & 10 deletions docs/user-manuals/workload/node-pool-management.md
@@ -4,24 +4,24 @@ title: Node Pool Management

###

### 1) Install the Yurt-App-Manager Components

```shell
$ cd yurt-app-manager
$ kubectl apply -f config/setup/all_in_one.yaml
```

Wait for the Yurt-App-Manager components to be installed, then check that they are running:

```shell
$ kubectl get pod -n kube-system |grep yurt-app-manager
```
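If the installation succeeded, the output looks roughly like this (the pod name suffix and the number of replicas depend on your installation):

```shell
yurt-app-manager-75fbc77646-pmj8w   1/1     Running   0          2m
```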



### 2) Example of NodePool Usage

- Create a NodePool

```shell
$ cat <<EOF | kubectl apply -f -
@@ -51,7 +51,7 @@ spec:
EOF
```
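The NodePool manifest is collapsed in this diff view; a minimal example of creating an Edge NodePool (the field names follow the apps.openyurt.io/v1alpha1 NodePool API served by Yurt-App-Manager; adjust them to your version) looks roughly like this:

```shell
$ cat <<EOF | kubectl apply -f -
apiVersion: apps.openyurt.io/v1alpha1
kind: NodePool
metadata:
  name: hangzhou
spec:
  type: Edge
EOF
```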

- Get the NodePool information

```shell
$ kubectl get np
@@ -61,9 +61,9 @@
beijing    Cloud   35s
hangzhou   Edge    28s
```

- Add a node to a NodePool

To add a cloud node to the "beijing" NodePool, you only need to label the node as below:

```shell
$ kubectl label node {Your_Node_Name} apps.openyurt.io/desired-nodepool=beijing
@@ -78,7 +78,7 @@
$ kubectl label node master apps.openyurt.io/desired-nodepool=beijing
master labeled
```

Similarly, you can add edge nodes to the "hangzhou" NodePool:

```shell
$ kubectl label node {Your_Node_Name} apps.openyurt.io/desired-nodepool=hangzhou
@@ -92,9 +92,9 @@
$ kubectl label node k8s-node2 apps.openyurt.io/desired-nodepool=hangzhou
k8s-node2 labeled
```

- Verify that the node has been added to the NodePool:

When a node is successfully added to a NodePool, the annotations and labels defined in the NodePool spec are applied to the node, and the node also gets a new label: apps.openyurt.io/nodepool

```shell
$ kubectl get node {Your_Node_Name} -o yaml
```
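The full node YAML is collapsed here; a simpler way to check the new label is kubectl's label-columns flag:

```shell
$ kubectl get node {Your_Node_Name} -L apps.openyurt.io/nodepool
```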
@@ -2,70 +2,86 @@
title: Edge Pod Network
---

## Background

In cloud-edge scenarios, OpenYurt provides edge node autonomy so that the workloads on edge nodes keep running stably even when the cloud-edge network is disconnected. With this capability, abnormal pods are restarted automatically and pods are brought back up after a node reboot, even while the network is down. However, to make sure that pod and node restarts during a disconnection do not break the edge-to-edge container network, some network configurations need to be adapted.

## Flannel: Keep the VTEP MAC Address
### Use case
If we adopt Flannel as the CNI plugin with VXLAN as the backend, a VTEP device (generally named flannel.1) is created on each node, and the VNI and VTEP information is recorded in the node annotations so that other nodes can create the corresponding routing and forwarding rules.
The Flannel architecture is shown below:

![flannel-architecture](../../../../static/img/docs/user-manuals/network/flannel-architecture.png)

Let's take two edge nodes as an example:
- node2 creates the flannel.1 device with MAC address "9e:c9:07:f9:b3:8b" and IP address "172.30.133.0"; the related info is recorded in node2's annotations:
```yaml
# node2 annotations with vtep info.
flannel.alpha.coreos.com/backend-data: '{"VtepMAC":"9e:c9:07:f9:b3:8b"}'
flannel.alpha.coreos.com/public-ip: 10.0.0.20
```
- node1 uses this info from node2 to configure the fdb/arp/routing entries in its host network namespace:
```shell script
# node1 host network namespace.
fdb: 9e:c9:07:f9:b3:8b dev flannel.1 dst 10.0.0.20 self permanent
arp: ? (172.30.133.0) at 9e:c9:07:f9:b3:8b [ether] PERM on flannel.1
route: 172.30.133.0/26 via 172.30.133.0 dev flannel.1 onlink
```
Every time node2 restarts, Flannel re-creates the VTEP device with a new MAC address and updates it in node2's annotations. However, if node2 or node1 is disconnected from the cloud network at that moment, node1 is not notified of the change of node2's MAC address, so the pods on node1 and node2 can no longer communicate.
### Solution: Keep the MAC address in Flannel
Every time an edge node restarts, Flannel first reads the MAC address recorded in the node's annotations (from the apiserver, or from the local cache of yurthub). If it exists, that MAC address is reused as the VTEP MAC address.
To implement this capability, the Flannel source code needs to be modified as follows:
```shell script
git clone https://github.com/flannel-io/flannel.git;
cd flannel;
git reset --hard e634dabe0af446b765db3b729085b32f97ff6fe6;
wget https://raw.githubusercontent.com/openyurtio/openyurt/master/docs/tutorial/0001-flannel-keep-vtep-mac.patch;
git am 0001-flannel-keep-vtep-mac.patch;
```
The pre-built flannel-edge image is available from:
```
Docker Hub: docker.io/openyurt/flannel-edge:v0.14.0-1
Aliyun registry: registry.cn-hangzhou.aliyuncs.com/openyurt/flannel-edge:v0.14.0-1
```
## IPAM: Keep the Pod IP Address
### Use case
In most scenarios, pods are assigned IP addresses through host-local. host-local picks an idle IP address from the node CIDR, assigns it to the new pod, and records the allocated IP address in a local file.

For example, if the IPAM data directory is /var/lib/cni/networks/cbr0, its records look like this:
```shell script
$ ls /var/lib/cni/networks/cbr0
172.30.132.194 172.30.132.198 172.30.132.201
```
When the cloud-edge network is disconnected, a pod restart causes host-local to assign a new IP address, and the change of the pod IP cannot be synchronized to the cloud. Components such as kube-proxy on other edge nodes are never notified of the new pod IP, so the pod can no longer be reached through its cluster IP.

### Solution: Keep the pod IP address
To solve this problem, the host-local code needs to be adjusted so that IP addresses are recorded in the format {ip}_{pod namespace}_{pod name}. When a pod restarts, host-local first reuses the IP address recorded for the pod with the same name.

The records of assigned pod IPs then look like this:
```shell script
$ ls /var/lib/cni/networks/cbr0
172.30.132.194_kube-system_coredns-vstxh 172.30.132.198_kube-system_nginx-76df748b9-4cz95 172.30.132.201_kube-system_nginx-76df748b9-nf5l9
```
Reference for the host-local source code modifications:
```shell script
git clone https://github.com/containernetworking/plugins.git;
cd plugins;
git reset --hard 9ebe139e77e82afb122e335328007bca86905ae4;
wget https://raw.githubusercontent.com/openyurtio/openyurt/master/docs/tutorial/0002-ipam-keep-pod-ip.patch;
git am 0002-ipam-keep-pod-ip.patch;
```
Deploy the patched host-local CNI plugin via the rpm package:
```
rpm -ivh https://github.com/openyurtio/openyurt/releases/download/v0.6.0/openyurt-cni-0.8.7-0.x86_64.rpm
```

## Add the "get node" Permission for Flannel
After Flannel is adjusted as above, the "get node" permission needs to be added to its original RBAC rules. Reference for the modification:
```diff
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
```
@@ -4,24 +4,24 @@ title: Node Pool Management

###

### 1) Install the Yurt-App-Manager Components

```shell
$ cd yurt-app-manager
$ kubectl apply -f config/setup/all_in_one.yaml
```

Wait for the Yurt-App-Manager components to be installed, then check that they are running:

```shell
$ kubectl get pod -n kube-system |grep yurt-app-manager
```



### 2) Example of NodePool Usage

- Create a NodePool

```shell
$ cat <<EOF | kubectl apply -f -
@@ -51,7 +51,7 @@ spec:
EOF
```

- Get the NodePool information

```shell
$ kubectl get np
@@ -61,9 +61,9 @@
beijing    Cloud   35s
hangzhou   Edge    28s
```

- Add a node to a NodePool

To add a cloud node to the "beijing" NodePool, you only need to label the node as below:

```shell
$ kubectl label node {Your_Node_Name} apps.openyurt.io/desired-nodepool=beijing
@@ -78,7 +78,7 @@
$ kubectl label node master apps.openyurt.io/desired-nodepool=beijing
master labeled
```

Similarly, you can add edge nodes to the "hangzhou" NodePool:

```shell
$ kubectl label node {Your_Node_Name} apps.openyurt.io/desired-nodepool=hangzhou
@@ -92,9 +92,9 @@
$ kubectl label node k8s-node2 apps.openyurt.io/desired-nodepool=hangzhou
k8s-node2 labeled
```

- Verify that the node has been added to the NodePool:

When a node is successfully added to a NodePool, the annotations and labels defined in the NodePool spec are applied to the node, and the node also gets a new label: apps.openyurt.io/nodepool

```shell
$ kubectl get node {Your_Node_Name} -o yaml
```