[k8s cluster convert to openyurt fail] #572

Closed
moyuduo opened this issue Nov 10, 2021 · 8 comments
Labels
kind/question

Comments


moyuduo commented Nov 10, 2021

What happened:
I have a k8s cluster with one master node and one worker node. When I run yurtctl convert -c master -p kubeadm, I get the following output:

I1110 16:36:28.211216    7661 util.go:540] servant job(yurtctl-disable-node-controller-k8s-node1) has succeeded
I1110 16:36:28.211689    7661 convert.go:343] complete disabling node-controller
I1110 16:36:28.285515    7661 convert.go:354] yurt-tunnel-server is deployed
I1110 16:36:28.303301    7661 convert.go:362] yurt-tunnel-agent is deployed
I1110 16:36:28.312226    7661 convert.go:443] kube-public/cluster-info configmap already exists, skip to prepare it
I1110 16:36:28.342314    7661 convert.go:408] deploying the yurt-hub and resetting the kubelet service on edge nodes...
E1110 16:38:28.348762    7661 util.go:537] fail to run servant job(yurtctl-servant-convert-k8s-node2): wait for job to be complete timeout
E1110 16:38:28.363391    7661 util.go:537] fail to run servant job(yurtctl-servant-convert-k8s-node1): wait for job to be complete timeout
I1110 16:38:28.363469    7661 convert.go:414] complete deploying yurt-hub on edge nodes
I1110 16:38:28.363487    7661 convert.go:417] deploying the yurt-hub and resetting the kubelet service on cloud nodes
E1110 16:40:28.374265    7661 util.go:537] fail to run servant job(yurtctl-servant-convert-master-node): wait for job to be complete timeout
I1110 16:40:28.374392    7661 convert.go:423] complete deploying yurt-hub on cloud nodes

When I run kubectl get pod -A -o wide | grep master | grep yurt to check, I get:

kube-system   yurtctl-servant-convert-master-node-rzg7b   0/1     Pending     0          24s    <none>            master-node   <none>           <none>

It seems the k8s cluster was not converted to OpenYurt.
I ran yurtctl revert and retried, but the environment does not seem to have been cleaned up. I got the following output:

[root@master bin]# ./yurtctl revert
I1110 16:50:20.965055   24090 revert.go:172] yurt controller manager is removed
I1110 16:50:20.988437   24090 revert.go:182] serviceaccount for yurt controller manager is removed
I1110 16:50:20.999411   24090 revert.go:192] clusterrole for yurt controller manager is removed
I1110 16:50:21.003689   24090 revert.go:202] clusterrolebinding for yurt controller manager is removed
I1110 16:50:21.068529   24090 revert.go:353] deployment for yurt app manager is removed
I1110 16:50:21.070492   24090 revert.go:363] Role for yurt app manager is removed
I1110 16:50:21.072227   24090 revert.go:372] ClusterRole for yurt app manager is removed
I1110 16:50:21.074965   24090 revert.go:381] ClusterRoleBinding for yurt app manager is removed
I1110 16:50:21.076781   24090 revert.go:391] RoleBinding for yurt app manager is removed
I1110 16:50:21.125261   24090 revert.go:401] secret for yurt app manager is removed
I1110 16:50:21.326234   24090 revert.go:411] Service for yurt app manager is removed
I1110 16:50:21.328131   24090 revert.go:421] MutatingWebhookConfiguration for yurt app manager is removed
I1110 16:50:21.329725   24090 revert.go:431] ValidatingWebhookConfiguration for yurt app manager is removed
E1110 16:50:21.353065   24090 revert.go:218] fail to remove the yurt app manager: fail to delete the NodePoolCRD/nodepoolcrd: customresourcedefinitions.apiextensions.k8s.io "nodepools.apps.openyurt.io" not found
F1110 16:50:21.533413   24090 revert.go:65] fail to revert yurt to kubernetes: fail to delete the NodePoolCRD/nodepoolcrd: customresourcedefinitions.apiextensions.k8s.io "nodepools.apps.openyurt.io" not found

[root@k8s-node1 bin]#./yurtctl convert  -c master -p kubeadm
I1110 17:17:24.380265   18852 convert.go:318] mark k8s-node1 as the cloud-node
E1110 17:17:24.441631   18852 util.go:537] fail to run servant job(yurtctl-disable-node-controller-k8s-node1): jobs.batch "yurtctl-disable-node-controller-k8s-node1" already exists
I1110 17:17:24.441792   18852 convert.go:343] complete disabling node-controller
I1110 17:17:24.449137   18852 convert.go:443] kube-public/cluster-info configmap already exists, skip to prepare it
F1110 17:17:24.487241   18852 convert.go:103] fail to convert kubernetes to yurt: fail to create the clusterrole/yurt-hub: clusterroles.rbac.authorization.k8s.io "yurt-hub" already exists

What can I do?

moyuduo added the kind/question label Nov 10, 2021
Member

Peeknut commented Nov 10, 2021

If a convert/revert job fails, yurtctl does not delete it, so you need to delete the job manually.
Run kubectl get job -A to find the convert/revert jobs and delete them. Then you can run convert again.
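
A minimal sketch of that cleanup, assuming the leftover jobs live in kube-system and have names starting with yurtctl- (as the job names in the logs above suggest). The helper names filter_yurtctl_jobs and cleanup_yurtctl_jobs are illustrative and not part of yurtctl:

```shell
#!/usr/bin/env bash
# Sketch only: remove leftover yurtctl servant jobs before retrying a convert.
# Assumption: the jobs are in kube-system and their names start with "yurtctl-".

# Keep only yurtctl job resource names (e.g. "job.batch/yurtctl-...") read from stdin.
filter_yurtctl_jobs() {
  grep -E '^job\.batch/yurtctl-' || true
}

# Hypothetical helper: delete every leftover yurtctl job in kube-system.
# "xargs -r" (GNU xargs) skips the delete when no job matches.
cleanup_yurtctl_jobs() {
  kubectl get job -n kube-system -o name \
    | filter_yurtctl_jobs \
    | xargs -r kubectl delete -n kube-system
}

# Usage (run where kubectl can reach the cluster):
#   cleanup_yurtctl_jobs
```

Reviewing the output of kubectl get job -n kube-system -o name | filter_yurtctl_jobs before deleting is a cautious first step, since the name prefix is only an assumption.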

Author

moyuduo commented Nov 10, 2021

I did as you suggested, but there still seem to be problems: the servant jobs time out and the yurthub pod does not appear to be created.

[root@k8s-node1 bin]# ./yurtctl convert -c k8s-node1 -p kubeadm
I1110 19:57:33.547551    2811 convert.go:318] mark k8s-node1 as the cloud-node
I1110 19:58:23.642373    2811 util.go:540] servant job(yurtctl-disable-node-controller-k8s-node1) has succeeded
I1110 19:58:23.642422    2811 convert.go:343] complete disabling node-controller
I1110 19:58:23.651814    2811 convert.go:443] kube-public/cluster-info configmap already exists, skip to prepare it
I1110 19:58:23.677696    2811 convert.go:408] deploying the yurt-hub and resetting the kubelet service on edge nodes...
E1110 20:00:23.682522    2811 util.go:537] fail to run servant job(yurtctl-servant-convert-k8s-node2): wait for job to be complete timeout
I1110 20:00:23.682720    2811 convert.go:414] complete deploying yurt-hub on edge nodes
I1110 20:00:23.682740    2811 convert.go:417] deploying the yurt-hub and resetting the kubelet service on cloud nodes
E1110 20:02:23.693251    2811 util.go:537] fail to run servant job(yurtctl-servant-convert-k8s-node1): wait for job to be complete timeout
I1110 20:02:23.693412    2811 convert.go:423] complete deploying yurt-hub on cloud nodes

kubectl get pod -A shows that only yurt-controller-manager-77b97fd47b-p2fmd was created:

kube-system   calico-kube-controllers-659bd7879c-pxppj   1/1     Running     1          2d
kube-system   calico-node-p6mcn                          1/1     Running     1          2d
kube-system   calico-node-tzqnl                          1/1     Running     0          2d
kube-system   coredns-5897cd56c4-8pb5p                   1/1     Running     0          2d5h
kube-system   coredns-5897cd56c4-tvm4n                   1/1     Running     0          2d5h
kube-system   etcd-k8s-node1                             1/1     Running     1          2d5h
kube-system   kube-apiserver-k8s-node1                   1/1     Running     1          2d5h
kube-system   kube-controller-manager-k8s-node1          1/1     Running     0          12m
kube-system   kube-proxy-h2lfs                           1/1     Running     1          46h
kube-system   kube-proxy-rzfc2                           1/1     Running     0          46h
kube-system   kube-scheduler-k8s-node1                   1/1     Running     1          2d5h
kube-system   yurt-controller-manager-77b97fd47b-p2fmd   1/1     Running     0          12m

Checking the node labels with kubectl describe node <node> | grep Labels:

[root@k8s-node1 bin]# kubectl describe node k8s-node1 | grep Labels
Labels:             beta.kubernetes.io/arch=amd64
[root@k8s-node1 bin]# kubectl describe node k8s-node2 | grep Labels
Labels:             beta.kubernetes.io/arch=amd64

I am sure my k8s cluster works fine. How can I solve this problem?

Member

Peeknut commented Nov 10, 2021

Can you get the log of the convert job?

Author

moyuduo commented Nov 10, 2021

The job description is as follows.

[root@k8s-node1 pod]# kubectl describe job yurtctl-servant-convert-k8s-node2 -n kube-system
Name:           yurtctl-servant-convert-k8s-node2
Namespace:      kube-system
Selector:       controller-uid=1fa2ddf4-edec-4567-9189-3bced626ad46
Labels:         controller-uid=1fa2ddf4-edec-4567-9189-3bced626ad46
                job-name=yurtctl-servant-convert-k8s-node2
Annotations:    <none>
Parallelism:    1
Completions:    1
Start Time:     Wed, 10 Nov 2021 19:58:23 +0800
Pods Statuses:  0 Running / 0 Succeeded / 1 Failed
Pod Template:
  Labels:  controller-uid=1fa2ddf4-edec-4567-9189-3bced626ad46
           job-name=yurtctl-servant-convert-k8s-node2
  Containers:
   yurtctl-servant:
    Image:      openyurt/yurtctl-servant:latest
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/sh
      -c
    Args:
      cp /usr/local/bin/yurtctl /tmp && nsenter -t 1 -m -u -n -i -- /var/tmp/yurtctl convert edgenode --yurthub-image openyurt/yurthub:latest --join-token p3gb13.4b59h9emaxaofz7x && rm /tmp/yurtctl
    Environment:
      NODE_NAME:     (v1:spec.nodeName)
      KUBELET_SVC:  /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
    Mounts:
      /tmp from host-var-tmp (rw)
  Volumes:
   host-var-tmp:
    Type:          HostPath (bare host directory volume)
    Path:          /var/tmp
    HostPathType:  Directory
Events:
  Type     Reason                Age   From            Message
  ----     ------                ----  ----            -------
  Normal   SuccessfulCreate      26m   job-controller  Created pod: yurtctl-servant-convert-k8s-node2-pzj77
  Normal   SuccessfulDelete      21m   job-controller  Deleted pod: yurtctl-servant-convert-k8s-node2-pzj77
  Warning  BackoffLimitExceeded  21m   job-controller  Job has reached the specified backoff limit

Member

Peeknut commented Nov 10, 2021

There is little information because the pod yurtctl-servant-convert-XXX has been deleted. Could you try converting again and run kubectl logs -n kube-system yurtctl-servant-convert-XXX to get the details?
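
Since the pod name carries a random suffix, one option is to select the pod by its job-name label, which appears in the kubectl describe job output earlier in this thread. This is a sketch, assuming the servant job for a node is named yurtctl-servant-convert-<node> as in the logs above; servant_job_name and servant_job_logs are illustrative helper names:

```shell
#!/usr/bin/env bash
# Sketch only: fetch logs of a servant convert job by label instead of pod name.
# Assumption: the job for a node is named "yurtctl-servant-convert-<node>".

# Print the assumed servant job name for a node.
servant_job_name() {
  printf 'yurtctl-servant-convert-%s\n' "$1"
}

# Print the full logs of the pod(s) created by that job.
# "--tail=-1" asks for all lines; with a label selector kubectl otherwise
# prints only the last few lines per pod.
servant_job_logs() {
  kubectl logs -n kube-system -l "job-name=$(servant_job_name "$1")" --tail=-1
}

# Usage, while the convert is still running or right after it fails:
#   servant_job_logs k8s-node2
```

The logs are only available while the pod still exists, so this has to run before the job controller deletes the failed pod.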

Author

moyuduo commented Nov 11, 2021

[root@k8s-node1 bin]# kubectl logs -n kube-system yurtctl-servant-convert-k8s-node2-4j647
F1111 09:42:50.795808   20927 edgenode.go:50] fail to convert the kubernetes node to a yurt node: stat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf: no such file or directory

[root@k8s-node1 system]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Wed 2021-11-10 22:53:50 CST; 10h ago
     Docs: https://kubernetes.io/docs/
 Main PID: 797 (kubelet)
    Tasks: 17
   Memory: 132.9M
   CGroup: /system.slice/kubelet.service
           └─797 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/lfy_k8s_...

It seems the kubelet config file was not found, even though I created the k8s cluster with kubeadm. systemctl status kubelet shows that the kubelet drop-in config file is in /usr/lib/systemd/system/kubelet.service.d.

Member

Peeknut commented Nov 11, 2021

Yes, the failure occurs because the kubelet config file is not found at the default path. You should add the parameter --kubeadm-conf-path /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf when you run convert.
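
To check which path applies before converting, a small helper can look for 10-kubeadm.conf in the two drop-in directories mentioned in this thread. This is a sketch, not part of yurtctl: find_kubeadm_conf is an illustrative name, only these two locations are checked, and it has to run on the node being converted.

```shell
#!/usr/bin/env bash
# Sketch only: locate the kubeadm drop-in file for the kubelet service.
# Assumption: the file is in one of the two directories seen in this thread.

# Print the path of the first 10-kubeadm.conf found; return 1 if none exists.
# The optional argument is a root prefix (useful for testing); default is "/".
find_kubeadm_conf() {
  fkc_root="${1:-}"
  for fkc_dir in /etc/systemd/system /usr/lib/systemd/system; do
    fkc_path="${fkc_dir}/kubelet.service.d/10-kubeadm.conf"
    if [ -f "${fkc_root}${fkc_path}" ]; then
      printf '%s\n' "$fkc_path"
      return 0
    fi
  done
  return 1
}

# Usage, with the cloud node name from this thread:
#   conf="$(find_kubeadm_conf)" && \
#     ./yurtctl convert -c k8s-node1 -p kubeadm --kubeadm-conf-path "$conf"
```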

Author

moyuduo commented Nov 11, 2021

Yes! Converted to OpenYurt successfully, thanks.
