Containerd change #2919

Merged
merged 22 commits into from
Feb 17, 2022

Contributor

@rafzei rafzei commented Jan 24, 2022

To test:

  • single-machine installation/upgrade
  • custom repository mapping where k8s master is on the same host as repo
  • k8s ha install/upgrade
  • k8s backup
  • k8s certs renewal
  • keep an eye on filebeat logs gathering from containers
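
For the install/upgrade items above, one check that could be scripted is whether every node reports containerd as its runtime. This is a minimal sketch, not part of the PR: the helper name `non_containerd_nodes` is hypothetical, and it only filters tab-separated `NAME<TAB>RUNTIME` lines, which the kubectl jsonpath query in the trailing comment is assumed to produce.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of nodes whose runtime is not containerd.
# Input (stdin): one "NAME<TAB>RUNTIME" line per node.
non_containerd_nodes() {
  awk -F'\t' '$2 !~ /^containerd:\/\// { print $1 }'
}

# Assumed usage against a live cluster (not executed here):
# kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}' \
#   | non_containerd_nodes
```

An empty result would suggest that all nodes have switched to containerd.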

Contributor

@plirglo plirglo left a comment

I know it's outside the scope of this task, but can we try to use the imperative style, which is shorter and better suited to manuals? For example, instead of "To check status of Grafana you can use the command", use "To check status of Grafana use the command".

@rafzei
Contributor Author

rafzei commented Jan 25, 2022

I like this; I will change that after more reviews.

plirglo previously approved these changes Feb 3, 2022
atsikham previously approved these changes Feb 9, 2022
atsikham previously approved these changes Feb 10, 2022
@przemyslavic
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@przemyslavic
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@przemyslavic
Collaborator

przemyslavic commented Feb 15, 2022

❌ HA upgrade fails:

2022-02-15T16:55:56.2176599Z 16:55:56 INFO cli.src.ansible.AnsibleCommand - TASK [upgrade : k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0] ***
2022-02-15T16:55:56.9095619Z 16:55:56 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (20 retries left).
2022-02-15T16:56:27.6105342Z 16:56:27 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (19 retries left).
2022-02-15T16:56:58.3295568Z 16:56:58 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (18 retries left).
2022-02-15T16:57:29.0190866Z 16:57:29 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (17 retries left).
2022-02-15T16:57:59.7226247Z 16:57:59 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (16 retries left).
2022-02-15T16:58:30.4145436Z 16:58:30 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (15 retries left).
2022-02-15T16:59:01.1184037Z 16:59:01 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (14 retries left).
2022-02-15T16:59:31.8290035Z 16:59:31 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (13 retries left).
2022-02-15T17:00:02.5730129Z 17:00:02 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (12 retries left).
2022-02-15T17:00:33.3052430Z 17:00:33 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (11 retries left).
2022-02-15T17:01:04.0086743Z 17:01:04 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (10 retries left).
2022-02-15T17:01:34.7118245Z 17:01:34 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (9 retries left).
2022-02-15T17:02:05.4081510Z 17:02:05 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (8 retries left).
2022-02-15T17:02:36.1348692Z 17:02:36 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (7 retries left).
2022-02-15T17:03:06.8328911Z 17:03:06 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (6 retries left).
2022-02-15T17:03:37.5133882Z 17:03:37 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (5 retries left).
2022-02-15T17:04:08.2124231Z 17:04:08 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (4 retries left).
2022-02-15T17:04:38.9116667Z 17:04:38 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (3 retries left).
2022-02-15T17:05:09.6283168Z 17:05:09 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (2 retries left).
2022-02-15T17:05:40.3292718Z 17:05:40 ERROR cli.src.ansible.AnsibleCommand - FAILED - RETRYING: [ci-10todevazubcanal-kubernetes-master-vm-0]: k8s/masterN | Upgrade master ci-10todevazubcanal-kubernetes-master-vm-0 (1 retries left).
2022-02-15T17:06:11.0555087Z 17:06:11 ERROR cli.src.ansible.AnsibleCommand - fatal: [ci-10todevazubcanal-kubernetes-master-vm-0]: FAILED! => {"attempts": 20, "changed": false, "cmd": ["kubeadm", "upgrade", "node"], "delta": "0:00:00.134915", "end": "2022-02-15 17:06:10.987267", "msg": "non-zero return code", "rc": 1, "start": "2022-02-15 17:06:10.852352", "stderr": "W0215 17:06:10.978771   35797 kubelet.go:200] cannot automatically set CgroupDriver when starting the Kubelet: cannot execute 'docker info -f {{.CgroupDriver}}': executable file not found in $PATH
error execution phase preflight: docker is required for container runtime: exec: \"docker\": executable file not found in $PATH
To see the stack trace of this error execute with --v=5 or higher", "stderr_lines": ["W0215 17:06:10.978771   35797 kubelet.go:200] cannot automatically set CgroupDriver when starting the Kubelet: cannot execute 'docker info -f {{.CgroupDriver}}': executable file not found in $PATH", "error execution phase preflight: docker is required for container runtime: exec: \"docker\": executable file not found in $PATH", "To see the stack trace of this error execute with --v=5 or higher"], "stdout": "[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'", "stdout_lines": ["[upgrade] Reading configuration from the cluster...", "[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'", "[preflight] Running pre-flight checks", "[preflight] Pulling images required for setting up a Kubernetes cluster", "[preflight] This might take a minute or two, depending on the speed of your internet connection", "[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'"]}

It looks like it fails on the kubeadm upgrade node command when trying to upgrade the 2nd node.

[operations@ci-10todevazubcanal-kubernetes-master-vm-0 ~]$ systemctl status docker
● docker.service
     Loaded: masked (Reason: Unit docker.service is masked.)
     Active: inactive (dead) since Tue 2022-02-15 16:49:41 UTC; 3h 3min ago
   Main PID: 1022 (code=exited, status=0/SUCCESS)

Feb 15 16:49:31 ci-10todevazubcanal-kubernetes-master-vm-0 dockerd[1022]: time="2022-02-15T16:49:31.614293532Z" level=info msg="ignoring event" module=libc>
Feb 15 16:49:32 ci-10todevazubcanal-kubernetes-master-vm-0 dockerd[1022]: time="2022-02-15T16:49:32.469302137Z" level=info msg="ignoring event" module=libc>
Feb 15 16:49:41 ci-10todevazubcanal-kubernetes-master-vm-0 dockerd[1022]: time="2022-02-15T16:49:41.221824376Z" level=info msg="Container de2afb232dd19d3bd>
Feb 15 16:49:41 ci-10todevazubcanal-kubernetes-master-vm-0 dockerd[1022]: time="2022-02-15T16:49:41.251972178Z" level=info msg="Container 7a249e057613b4a3e>
Feb 15 16:49:41 ci-10todevazubcanal-kubernetes-master-vm-0 dockerd[1022]: time="2022-02-15T16:49:41.307236066Z" level=info msg="ignoring event" module=libc>
Feb 15 16:49:41 ci-10todevazubcanal-kubernetes-master-vm-0 dockerd[1022]: time="2022-02-15T16:49:41.339790976Z" level=info msg="ignoring event" module=libc>
Feb 15 16:49:41 ci-10todevazubcanal-kubernetes-master-vm-0 dockerd[1022]: time="2022-02-15T16:49:41.430463884Z" level=info msg="stopping event stream follo>
Feb 15 16:49:41 ci-10todevazubcanal-kubernetes-master-vm-0 dockerd[1022]: time="2022-02-15T16:49:41.431206387Z" level=info msg="Daemon shutdown complete"
Feb 15 16:49:41 ci-10todevazubcanal-kubernetes-master-vm-0 systemd[1]: docker.service: Succeeded.
Feb 15 16:49:41 ci-10todevazubcanal-kubernetes-master-vm-0 systemd[1]: Stopped Docker Application Container Engine.
[operations@ci-10todevazubcanal-kubernetes-master-vm-0 ~]$ systemctl status containerd
● containerd.service - containerd container runtime
     Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2022-02-15 16:50:02 UTC; 3h 2min ago
       Docs: https://containerd.io
   Main PID: 25619 (containerd)
      Tasks: 96
     Memory: 2.0G
     CGroup: /system.slice/containerd.service
             ├─25619 /usr/bin/containerd
             ├─25724 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id e79e98a30842dcd2136ff5cab0bf8a0e2bf14a203210abfae77c7662335de60b -address /run/>
             ├─29754 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 449d0ec64b8690997b25ec9a1276f779a595be43742fce88d6027d797d801cb3 -address /run/>
             ├─31204 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id b390cf864a4a146b948b980f46c0f408a174a01c6212f6c2f40438479603c7b4 -address /run/>
             ├─31231 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 801cfbb8b63fcec330a0929b6222783758962ff42be491847bdde4738d68ffe3 -address /run/>
             ├─31291 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 255c840cbcc678c6bbec5c85e4e605c0a8a0257391c29b7315e280c9f802d06e -address /run/>
             └─31345 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 40daaaec220afe517e6ac8bb381a48bc1b2cf7c5e80edb942d0d1a028eec6b63 -address /run/>

Feb 15 19:52:41 ci-10todevazubcanal-kubernetes-master-vm-0 containerd[25619]: time="2022-02-15T19:52:41.187290346Z" level=info msg="ExecSync for \"47eb96b0>
Feb 15 19:52:41 ci-10todevazubcanal-kubernetes-master-vm-0 containerd[25619]: time="2022-02-15T19:52:41.280712937Z" level=info msg="Finish piping \"stdout\>
Feb 15 19:52:41 ci-10todevazubcanal-kubernetes-master-vm-0 containerd[25619]: time="2022-02-15T19:52:41.281024939Z" level=info msg="Finish piping \"stderr\>
Feb 15 19:52:41 ci-10todevazubcanal-kubernetes-master-vm-0 containerd[25619]: time="2022-02-15T19:52:41.281170939Z" level=info msg="Exec process \"40b9f227>
Feb 15 19:52:41 ci-10todevazubcanal-kubernetes-master-vm-0 containerd[25619]: time="2022-02-15T19:52:41.283891251Z" level=info msg="ExecSync for \"47eb96b0>
Feb 15 19:52:51 ci-10todevazubcanal-kubernetes-master-vm-0 containerd[25619]: time="2022-02-15T19:52:51.187330175Z" level=info msg="ExecSync for \"47eb96b0>
Feb 15 19:52:51 ci-10todevazubcanal-kubernetes-master-vm-0 containerd[25619]: time="2022-02-15T19:52:51.274699142Z" level=info msg="Finish piping \"stdout\>
Feb 15 19:52:51 ci-10todevazubcanal-kubernetes-master-vm-0 containerd[25619]: time="2022-02-15T19:52:51.274815042Z" level=info msg="Exec process \"9b662698>
Feb 15 19:52:51 ci-10todevazubcanal-kubernetes-master-vm-0 containerd[25619]: time="2022-02-15T19:52:51.275080143Z" level=info msg="Finish piping \"stderr\>
Feb 15 19:52:51 ci-10todevazubcanal-kubernetes-master-vm-0 containerd[25619]: time="2022-02-15T19:52:51.277508254Z" level=info msg="ExecSync for \"47eb96b0>
[operations@ci-10todevazubcanal-kubernetes-master-vm-0 ~]$ kubectl get nodes
NAME                                         STATUS                     ROLES    AGE     VERSION
ci-10todevazubcanal-kubernetes-master-vm-0   Ready,SchedulingDisabled   master   6h10m   v1.19.15
ci-10todevazubcanal-kubernetes-master-vm-1   Ready                      master   6h13m   v1.18.6
ci-10todevazubcanal-kubernetes-master-vm-2   Ready                      master   6h8m    v1.19.15
ci-10todevazubcanal-kubernetes-node-vm-0     Ready                      <none>   6h6m    v1.18.6
ci-10todevazubcanal-kubernetes-node-vm-1     Ready                      <none>   6h6m    v1.18.6
[kubernetes_master]
ci-10todevazubcanal-kubernetes-master-vm-2 ansible_host=20.x.x.x
ci-10todevazubcanal-kubernetes-master-vm-0 ansible_host=20.x.x.y
ci-10todevazubcanal-kubernetes-master-vm-1 ansible_host=20.x.x.z
[operations@ci-10todevazubcanal-kubernetes-master-vm-0 ~]$ kubectl get nodes -o jsonpath='{.items[].status.nodeInfo.containerRuntimeVersion}'
containerd://1.4.12
[operations@ci-10todevazubcanal-kubernetes-master-vm-0 ~]$ sudo kubeadm upgrade node
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
W0215 19:54:53.510605   92242 kubelet.go:200] cannot automatically set CgroupDriver when starting the Kubelet: cannot execute 'docker info -f {{.CgroupDriver}}': executable file not found in $PATH
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
error execution phase preflight: docker is required for container runtime: exec: "docker": executable file not found in $PATH
To see the stack trace of this error execute with --v=5 or higher
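
The node listing above shows a cluster with mixed kubelet versions. As a minimal sketch for spotting which nodes are not yet at the target version, the hypothetical helper below (`nodes_not_at_version` is not part of the PR) reads `kubectl get nodes --no-headers` output and assumes the version is the last column, as in the listing above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "NAME VERSION" for nodes not at the target version.
# Input (stdin): output of `kubectl get nodes --no-headers`.
# Assumption: the kubelet version is the last whitespace-separated column.
nodes_not_at_version() {
  awk -v target="$1" 'NF && $NF != target { print $1, $NF }'
}

# Assumed usage against a live cluster (not executed here):
# kubectl get nodes --no-headers | nodes_not_at_version v1.19.15
```

Fed the listing above with a target of v1.19.15, it would print master-vm-1 and both worker nodes, each at v1.18.6.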

@rafzei
Contributor Author

rafzei commented Feb 16, 2022

So the kubelet knows about the new configuration, but an annotation was missing on the masterN nodes. I will push a fix soon.
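
The comment does not name the annotation. A minimal sketch, assuming it is kubeadm's `kubeadm.alpha.kubernetes.io/cri-socket` node annotation, which kubeadm consults to pick the CRI socket, and assuming containerd listens on `/run/containerd/containerd.sock`. The helper name `cri_socket_needs_fix` and the `NODE` variable are hypothetical, and this is not necessarily the fix that was pushed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) when a cri-socket annotation value is
# empty or still points at dockershim, i.e. the node likely needs re-annotating.
cri_socket_needs_fix() {
  case "$1" in
    ''|*dockershim*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed usage against a live cluster (not executed here); NODE is a placeholder:
# sock=$(kubectl get node "$NODE" -o jsonpath='{.metadata.annotations.kubeadm\.alpha\.kubernetes\.io/cri-socket}')
# if cri_socket_needs_fix "$sock"; then
#   kubectl annotate node "$NODE" --overwrite \
#     kubeadm.alpha.kubernetes.io/cri-socket=unix:///run/containerd/containerd.sock
# fi
```

Keeping the decision in a pure function makes it easy to check without a cluster; the kubectl calls stay in comments because they depend on the environment.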

@przemyslavic
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@przemyslavic
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

przemyslavic previously approved these changes Feb 17, 2022
plirglo previously approved these changes Feb 17, 2022
@przemyslavic
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@przemyslavic
Collaborator

❤️

@rafzei rafzei merged commit d2ac0d9 into hitachienergy:develop Feb 17, 2022
5 participants