Sharing: problems and solutions when adapting the node pool serviceTopology (traffic topology) feature to cilium-cni #2269

Open
rayne-Li opened this issue Jan 14, 2025 · 0 comments

What would you like to be added:

Share a traffic closed-loop solution that is compatible with cilium-cni

Why is this needed:

Cilium provides NetworkPolicy and traffic-control capabilities that flannel does not have

others
/kind feature

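Features

annotation Key             annotation Value         Description
openyurt.io/topologyKeys   kubernetes.io/hostname   Traffic is routed only to endpoints on the same node
openyurt.io/topologyKeys   openyurt.io/nodepool     Traffic is routed only to endpoints in the same node pool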
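
The annotation from the table can also be set on an existing Service with kubectl annotate. The following is a minimal sketch, not an official OpenYurt tool: the set_topology_key function and the KUBECTL override variable are hypothetical helpers introduced here for illustration, and only the two values listed in the table are accepted. Setting KUBECTL="echo kubectl" prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenYurt): annotate a Service with a
# topology key. Override KUBECTL (e.g. KUBECTL="echo kubectl") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

set_topology_key() {
    svc="$1"
    key="$2"
    # Accept only the two values documented in the table above.
    case "$key" in
        kubernetes.io/hostname|openyurt.io/nodepool) ;;
        *)
            echo "unsupported topology key: $key" >&2
            return 1
            ;;
    esac
    # --overwrite replaces an existing openyurt.io/topologyKeys annotation.
    $KUBECTL annotate service "$svc" "openyurt.io/topologyKeys=$key" --overwrite
}
```

For example, set_topology_key busy-box-svc openyurt.io/nodepool keeps traffic for busy-box-svc inside the caller's node pool.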

References:
https://openyurt.io/zh/docs/user-manuals/network/service-topology

https://kubeedge.io/blog/enable-cilium/#kubeedge-edgecore-setup

OpenYurt version: 1.5.0

OS: Debian 12

Kubernetes version: 1.31

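Prerequisites

  • Required: Kubernetes version > 1.18. Since 1.21 EndpointSlice is GA and no longer has to be enabled through a feature gate, so no special handling is needed
  • Configure kube-proxy to use its in-cluster configuration so that it connects to yurt-hub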
$ kubectl edit cm -n kube-system kube-proxy
apiVersion: v1
data:
  config.conf: |-
    clientConnection:
      #kubeconfig: /var/lib/kube-proxy/kubeconfig.conf # 2. comment this line.
      qps: 0
    clusterCIDR: 10.244.0.0/16
    configSyncPeriod: 0s
  • Required: confirm that yurt-hub is running normally
  • Required: the yurthub component depends on yurt-manager to approve its CSRs
  • Required: create the node pools
$ cat << EOF | kubectl apply -f -
apiVersion: apps.openyurt.io/v1alpha1
kind: NodePool
metadata:
  name: fujian
spec:
  type: Cloud

---

apiVersion: apps.openyurt.io/v1alpha1
kind: NodePool
metadata:
  name: wuhan
spec:
  type: Edge

---

apiVersion: apps.openyurt.io/v1alpha1
kind: NodePool
metadata:
  name: wuqing
spec:
  type: Edge
EOF
  • Required: add the nodes to the node pools (by labeling them)
# kubectl get nodepool
NAME     TYPE    READYNODES   NOTREADYNODES   AGE
fujian   Cloud   2            0               7d21h
wuhan    Edge    2            0               7d21h
wuqing   Edge    2            0               7d21h

# kubectl get nb
NAME            NUM-NODES   AGE
fujian-7rxsj8   2           7d21h
wuhan1          2           7d17h
wuqing-cb6rvn   2           7d21h
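
To double-check which nodes ended up in a given pool, the nodes can be listed by pool label. This is a sketch under one assumption that does not come from this issue: that nodes in a pool carry the apps.openyurt.io/nodepool label, which should be verified against your OpenYurt version. list_pool_nodes and KUBECTL are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: list the nodes that belong to one node pool.
# Assumption: pool membership is recorded in the apps.openyurt.io/nodepool label.
KUBECTL="${KUBECTL:-kubectl}"
POOL_LABEL="apps.openyurt.io/nodepool"

list_pool_nodes() {
    # -l filters nodes by label value, -L shows the label as an extra column.
    $KUBECTL get nodes -l "$POOL_LABEL=$1" -L "$POOL_LABEL"
}
```

For example, list_pool_nodes wuhan should show the two nodes reported as ready for the wuhan pool.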

Create the test workloads

  • svc
$ cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  annotations:
    openyurt.io/topologyKeys: openyurt.io/nodepool
  labels:
    app: busy-box
  name: busy-box-svc
spec:
  ports:
  - port: 3000
    protocol: TCP
    targetPort: 3000
  selector:
    app: busy-box
  type: ClusterIP
EOF
  • yas
apiVersion: apps.openyurt.io/v1beta1
kind: YurtAppSet
metadata:
  name: example
  namespace: default
  resourceVersion: "4501951"
  uid: 16c5e569-366c-4fd9-b2e2-379cf8ce8317
spec:
  nodepoolSelector:
    matchLabels:
      yurtappset.openyurt.io/type: nginx
  pools:
  - wuhan
  - wuqing
  - fujian
  workload:
    workloadTemplate:
      deploymentTemplate:
        metadata:
          labels:
            app: busy-box
        spec:
          replicas: 2
          selector:
            matchLabels:
              app: busy-box
          template:
            metadata:
              labels:
                app: busy-box
            spec:
              containers:
              - command:
                - nc
                - -lk
                - -p
                - "3000"
                - -e
                - /bin/hostname
                - -i
                image: busybox
                imagePullPolicy: Always
                name: busy-box
                ports:
                - containerPort: 3000
                resources: {}

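Test results (the traffic closed loop is not achieved)

  • Confirm that the yurt-hub cache and the iptables rules on the host are both correct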
cat /etc/kubernetes/cache/kube-proxy/endpointslices.v1.discovery.k8s.io/default/busy-box-svc-7cgbp
... 
"endpoints":[
{"addresses":["192.168.3.45"],"conditions":{"ready":true,"serving":true,"terminating":false},"targetRef":{"kind":"Pod","namespace":"default","name":"example-wuqing-pd9fn-88589f6bf-hqdqd","uid":"2c1388df-847e-4087-ab56-a4a8351d8ab8"},"nodeName":"tj-wq2-lzytest-0002"},
{"addresses":["192.168.4.192"],"conditions":{"ready":true,"serving":true,"terminating":false},"targetRef":{"kind":"Pod","namespace":"default","name":"example-wuqing-pd9fn-88589f6bf-gr9g2","uid":"a0b07a2a-fe14-4068-b299-483c8a7a477f"},"nodeName":"tj-wq2-lzytest-0001"}]

KUBE-SVC-PZIRA6MO24RJXLWV  tcp  --  anywhere             10.103.185.179       /* default/busy-box-svc cluster IP */ tcp dpt:3000

Chain KUBE-SVC-PZIRA6MO24RJXLWV (1 references)
target     prot opt source               destination
KUBE-MARK-MASQ  tcp  -- !192.168.0.0/16       10.103.185.179       /* default/busy-box-svc cluster IP */ tcp dpt:3000
KUBE-SEP-IQNXXFW6DAPHIZPB  all  --  anywhere             anywhere             /* default/busy-box-svc -> 192.168.3.45:3000 */ statistic mode random probability 0.50000000000
KUBE-SEP-3AAZEJEAEB3WZLEG  all  --  anywhere             anywhere             /* default/busy-box-svc -> 192.168.4.192:3000 */
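  • Telnet to the clusterIP from the host works as expected and only ever reaches pods inside the node pool, but the same request from inside a container does not stay within the pool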
telnet 10.103.185.179 3000
Trying 10.103.185.179...
Connected to 10.103.185.179.
Escape character is '^]'.
192.168.3.45

telnet 10.103.185.179 3000
Trying 10.103.185.179...
Connected to 10.103.185.179.
Escape character is '^]'.
192.168.4.192
Connection closed by foreign host.
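  • My guess is that Cilium is the cause: it intercepts clusterIP traffic and forwards it directly via eBPF instead of going through the iptables rules, so the closed loop fails even with kube-proxy enabled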
https://github.com/cilium/cilium/issues/28904#issuecomment-1804545547
  Services:
  - ClusterIP:      Enabled
  - NodePort:       Disabled
  - LoadBalancer:   Disabled
  - externalIPs:    Disabled
  - HostPort:       Disabled

kubectl -n kube-system exec ds/cilium -- cilium-dbg service list
21   10.103.185.179:3000    ClusterIP      1 => 192.168.5.235:3000 (active)
                                           2 => 192.168.3.45:3000 (active)
                                           3 => 192.168.0.170:3000 (active)
                                           4 => 192.168.2.181:3000 (active)
                                           5 => 192.168.1.160:3000 (active)
                                           6 => 192.168.4.192:3000 (active)
  • In testing, changing the Cilium setting with --set loadBalancer.serviceTopology=true had no effect either, because underneath it still works from the EndpointSlices it receives
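
To compare what Cilium has programmed for a Service against the endpoints expected for a node pool, the backends of a single frontend can be extracted from the service list output. This is a minimal sketch that assumes the column layout shown above (ID, Frontend, Service Type, Backend); backends_for is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the backends that `cilium-dbg service list`
# (read from stdin) reports for one frontend, one "ip:port" per line.
backends_for() {
    awk -v fe="$1" '
        # Continuation line: "<n> => <backend> (state)"
        $2 == "=>" { if (hit) print $3; next }
        # First line of a service: "<id> <frontend> <type> <n> => <backend> (state)"
        $5 == "=>" { hit = ($2 == fe); if (hit) print $6; next }
        # Anything else (header, blank line) ends the current service.
        { hit = 0 }
    '
}
```

Usage: kubectl -n kube-system exec ds/cilium -- cilium-dbg service list | backends_for 10.103.185.179:3000. With the output above this prints all six backends, which shows that the agent is not seeing a pool-filtered endpoint list.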

Solutions

  • The KubeEdge community supports Cilium, so I looked at their approach: cilium and cilium-edge are deployed separately, and cilium-edge connects to yurt-hub
https://kubeedge.io/blog/enable-cilium/#kubeedge-edgecore-setup

### Dump original Cilium DaemonSet configuration  
> kubectl get ds -n kube-system cilium -o yaml > cilium-edgecore.yaml  
  
### Edit and apply the following patch  
> vi cilium-edgecore.yaml  
  
### Deploy cilium-agent aligns with edgecore  
> kubectl apply -f cilium-edgecore.yaml


diff --git a/cilium-edgecore.yaml b/cilium-edgecore.yaml
index bff0f0b..3d941d1 100644
--- a/cilium-edgecore.yaml
+++ b/cilium-edgecore.yaml
@@ -8,7 +8,7 @@ metadata:
     app.kubernetes.io/name: cilium-agent
     app.kubernetes.io/part-of: cilium
     k8s-app: cilium
-  name: cilium
+  name: cilium-kubeedge
   namespace: kube-system
 spec:
   revisionHistoryLimit: 10
@@ -29,6 +29,12 @@ spec:
         k8s-app: cilium
     spec:
       affinity:
+        nodeAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+              - matchExpressions:
+                - key: node-role.kubernetes.io/edge
+                  operator: Exists
         podAntiAffinity:
           requiredDuringSchedulingIgnoredDuringExecution:
           - labelSelector:
@@ -39,6 +45,8 @@ spec:
       containers:
       - args:
         - --config-dir=/tmp/cilium/config-map
+        - --k8s-api-server=127.0.0.1:10550
+        - --auto-create-cilium-node-resource=true
         - --debug
         command:
         - cilium-agent
@@ -178,7 +186,9 @@ spec:
       dnsPolicy: ClusterFirst
       hostNetwork: true
       initContainers:
-      - command:
+      - args:
+        - --k8s-api-server=127.0.0.1:10550
+        command:
         - cilium
         - build-config
         env:
  • Following the changes above, turn a copy of cilium into cilium-edge
1. Change cilium's env so that it connects to yurt-hub (10268 is yurt-hub's HTTPS port)
# from
        env:
        - name: KUBERNETES_SERVICE_HOST
          value: {{APISERVER_EXTERNAL_IP}}
        - name: KUBERNETES_SERVICE_PORT
          value: "6443"
# to (change this in every container)
        env:
        - name: KUBERNETES_SERVICE_HOST
          value: 169.254.2.1
        - name: KUBERNETES_SERVICE_PORT
          value: "10268"
2. Change the name of the DaemonSet
+ name: cilium-edge
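3. Change the startup arguments of some containers. I am not sure how the k8s-api-server argument and the env rank against each other; the env probably takes precedence, so changing either one of the two should be enough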
    containers:
    - args:
      - --auto-create-cilium-node-resource=true
    initContainers:
    - command:
      - cilium-dbg
      - build-config
      - --k8s-api-server=http://127.0.0.1:10261
4. Set the scheduling constraints: cilium-edge is scheduled only onto edge nodes, cilium only onto cloud nodes
# edge
      nodeSelector:
        kubernetes.io/os: linux
+       openyurt.io/is-edge-worker: "true"
# cloud
      nodeSelector:
        kubernetes.io/os: linux
+       openyurt.io/is-edge-worker: "false"

  • ranbom-ch from the community suggested another approach: use the data filter feature of yurt-hub
Take a look at this document: [https://openyurt.io/zh/docs/user-manuals/resource-access-control/](https://openyurt.io/zh/docs/user-manuals/resource-access-control/)  
Once it is configured, restart cilium and it will work.

kubectl -n kube-system get cm yurt-hub-cfg -o yaml
apiVersion: v1
data:
  cache_agents: ""
  discardcloudservice: ""
  masterservice: ""
+  servicetopology: cilium,cilium-agent
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: yurthub
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-12-27T02:20:08Z"
  labels:
    app.kubernetes.io/instance: yurthub
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: yurthub
    app.kubernetes.io/version: v1.5.0
    helm.sh/chart: yurthub-1.5.0
  name: yurt-hub-cfg
  namespace: kube-system
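
The ConfigMap change and the restart can be scripted instead of edited by hand. This is a sketch under stated assumptions: the ConfigMap name, namespace, key and value are the ones shown above, the DaemonSet is assumed to be named cilium (as in the ds/cilium command earlier), and enable_servicetopology_filter and KUBECTL are hypothetical helper names. The merge patch replaces any existing servicetopology value, so extend the string if other components are already listed there.

```shell
#!/bin/sh
# Hypothetical helper: register the Cilium agent with yurt-hub's
# servicetopology filter, then restart the agents.
# Override KUBECTL (e.g. KUBECTL="echo kubectl") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

enable_servicetopology_filter() {
    # Merge-patch only the servicetopology key; other keys stay untouched.
    $KUBECTL -n kube-system patch configmap yurt-hub-cfg --type merge \
        -p '{"data":{"servicetopology":"cilium,cilium-agent"}}'
    # Restart the agents so they re-list their endpoints through yurt-hub.
    $KUBECTL -n kube-system rollout restart daemonset/cilium
}
```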
  • After the change, restart cilium; each agent now receives only the endpoints of its own node pool
17   10.100.140.155:3000    ClusterIP      1 => 192.168.3.37:3000 (active)
                                           2 => 192.168.4.244:3000 (active)
17   10.100.140.155:3000    ClusterIP      1 => 192.168.0.189:3000 (active)
                                           2 => 192.168.5.46:3000 (active)
# yurt-hub is not installed on the master node, so it still sees all endpoints
ID   Frontend               Service Type   Backend
1    10.100.140.155:3000    ClusterIP      1 => 192.168.3.251:3000 (active)
                                           2 => 192.168.2.196:3000 (active)
                                           3 => 192.168.5.85:3000 (active)
                                           4 => 192.168.4.202:3000 (active)
                                           5 => 192.168.0.119:3000 (active)
                                           6 => 192.168.1.102:3000 (active)
  • Telnet from inside a container now also reaches only the pods of its own node pool
kubectl exec -it example-wuqing-pd9fn-88589f6bf-58x7b -- telnet  10.100.140.155 3000
Connected to 10.100.140.155
192.168.4.244
Connection closed by foreign host
command terminated with exit code 1

kubectl exec -it example-wuqing-pd9fn-88589f6bf-58x7b -- telnet  10.100.140.155 3000
Connected to 10.100.140.155
192.168.3.37
Connection closed by foreign host
command terminated with exit code 1
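
Repeating the telnet check by hand gets tedious, so the replies can be validated against the expected backends of the pool. This is a minimal sketch, assuming the reply IPs have already been collected one per line (the "Connected to" and "Connection closed" lines must be stripped first); all_in_pool is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every reply IP read from stdin
# is one of the allowed backend IPs passed as arguments.
all_in_pool() {
    allowed=" $* "
    while read -r ip; do
        # Skip empty lines.
        [ -z "$ip" ] && continue
        case "$allowed" in
            *" $ip "*) ;;
            *)
                echo "reply from outside the pool: $ip" >&2
                return 1
                ;;
        esac
    done
    return 0
}
```

For the run above, printf '192.168.4.244\n192.168.3.37\n' | all_in_pool 192.168.3.37 192.168.4.244 succeeds, while a reply such as 192.168.5.85 from another pool would make it fail.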
rayne-Li added the kind/feature label Jan 14, 2025