node-servant: add preflight-convert command #686

Merged
merged 1 commit into from
Dec 21, 2021

Peeknut (Member) commented Dec 15, 2021

What type of PR is this?

/kind feature

What this PR does / why we need it:

Split the preflight check out of convert into its own command. yurtctl convert calls the preflight check before converting, which reduces conversion failures.

Which issue(s) this PR fixes:

Fixes #619

Special notes for your reviewer:

@DrmagicE @adamzhoul @rambohe-ch

Does this PR introduce a user-facing change?


other Note

@openyurt-bot
Collaborator

@Peeknut: GitHub didn't allow me to assign the following users: your_reviewer.

Note that only openyurtio members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

@openyurt-bot added the kind/feature and size/XL (500-999) labels Dec 15, 2021
Peeknut (Member, Author) commented Dec 15, 2021

Test cases:

Success:

[root@node1 node-servant]# ./yurt-node-servant preflight-convert
I1215 16:44:57.780194   26543 preflight.go:22] [preflight-convert] Running node-servant pre-flight checks
[preflight-convert] Pulling images required for converting a Kubernetes cluster to an OpenYurt cluster
[preflight-convert] This might take a minute or two, depending on the speed of your internet connection
I1215 16:44:57.805648   26543 preflight.go:31] convert pre-flight checks success
[root@node1 node-servant]#

Failure (run on a node that is already an OpenYurt node):

[root@node1 node-servant]# ./yurt-node-servant preflight-convert --yurthub-image=openyurt/yurthub:v0.5.0
I1215 17:07:35.262807    3171 preflight.go:22] [preflight-convert] Running node-servant pre-flight checks
F1215 17:07:35.263045    3171 preflight.go:29] Fail to run pre-flight checks: [preflight] Some fatal errors occurred:
	[ERROR Port-10268]: Port 10268 is in use
	[ERROR Port-10261]: Port 10261 is in use
	[ERROR Port-10267]: Port 10267 is in use
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0xc000010001, 0xc0006c2000, 0x164, 0x2a9)
	/root/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:1026 +0xb9
k8s.io/klog/v2.(*loggingT).output(0x1f1f000, 0xc000000003, 0x0, 0x0, 0xc0004e4070, 0x0, 0x1e8c1cc, 0xc, 0x1d, 0x0)
	/root/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:975 +0x1f1
k8s.io/klog/v2.(*loggingT).printf(0x1f1f000, 0xc000000003, 0x0, 0x0, 0x0, 0x0, 0x151b798, 0x21, 0xc0004e2360, 0x1, ...)
	/root/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:753 +0x19a
k8s.io/klog/v2.Fatalf(...)
	/root/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:1514
github.com/openyurtio/openyurt/cmd/yurt-node-servant/preflight-convert.NewxPreflightConvertCmd.func1(0xc0006a82c0, 0xc00068ee40, 0x0, 0x1)
	/root/test/node-servant/openyurt/cmd/yurt-node-servant/preflight-convert/preflight.go:29 +0x24f
github.com/spf13/cobra.(*Command).execute(0xc0006a82c0, 0xc00068ee30, 0x1, 0x1, 0xc0006a82c0, 0xc00068ee30)
	/root/go/pkg/mod/github.com/spf13/[email protected]/command.go:846 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0xc00068b8c0, 0xc0005bff00, 0x1, 0x1)
	/root/go/pkg/mod/github.com/spf13/[email protected]/command.go:950 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
	/root/go/pkg/mod/github.com/spf13/[email protected]/command.go:887
main.main()
	/root/test/node-servant/openyurt/cmd/yurt-node-servant/node-servant.go:50 +0x307

goroutine 6 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0x1f1f000)
	/root/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:1169 +0x8b
created by k8s.io/klog/v2.init.0
	/root/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:420 +0xdf

goroutine 7 [chan receive]:
k8s.io/klog.(*loggingT).flushDaemon(0x1f1ef20)
	/root/go/pkg/mod/k8s.io/[email protected]/klog.go:1010 +0x8b
created by k8s.io/klog.init.0
	/root/go/pkg/mod/k8s.io/[email protected]/klog.go:411 +0xd8
[root@node1 node-servant]#
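
The "Port ... is in use" errors above indicate that the preflight check found the ports already bound, as expected on a node that has already been converted. A minimal sketch of such a check in Go, assuming that a plain TCP bind attempt is a sufficient test; `checkPortFree` is a hypothetical helper and not the function used in this PR:

```go
package main

import (
	"fmt"
	"net"
)

// checkPortFree (hypothetical helper) tries to bind the TCP port on all
// interfaces. If the bind fails, the port is most likely held by another
// process, or the port number itself is invalid.
func checkPortFree(port int) error {
	ln, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
	if err != nil {
		return fmt.Errorf("cannot bind port %d: %w", port, err)
	}
	// Release the port again; this is only a probe.
	return ln.Close()
}

func main() {
	// 10261, 10267 and 10268 are the ports reported in the failure log above.
	for _, p := range []int{10261, 10267, 10268} {
		if err := checkPortFree(p); err != nil {
			fmt.Println("[ERROR]", err)
			continue
		}
		fmt.Printf("port %d is free\n", p)
	}
}
```

A probe like this is inherently racy (the port can be taken between the check and the real bind), which is acceptable for a preflight check whose goal is only to catch obvious conflicts early.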

@Peeknut Peeknut force-pushed the nodeservant-precheck branch 2 times, most recently from 1324b0c to ecdf6dc Compare December 15, 2021 09:40
@rambohe-ch
Member

@adamzhoul Do you have any comments about the new command, node-servant preflight-check?

@Peeknut Peeknut force-pushed the nodeservant-precheck branch from ecdf6dc to d329680 Compare December 17, 2021 06:47
@openyurt-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Peeknut, rambohe-ch

@openyurt-bot added the approved label Dec 17, 2021
@@ -64,15 +60,6 @@ func (n *nodeConverter) validateOptions() error {
return nil
}

func (n *nodeConverter) preflightCheck() error {
Member

Why are these deleted?
And how can we make sure users run the preflight check before converting?

Or, what if we put the core preflight capability in convert?

Member Author

There are two reasons for the deletion:
(1) node-servant mainly serves yurtctl convert; users rarely run node-servant to convert nodes individually.
(2) To increase the conversion success rate of yurtctl convert, yurtctl convert performs the preflight check by issuing a node-servant preflight check job. If the check succeeds, the cloud side proceeds with the conversion and issues a node-servant convert job. If node-servant convert also contained a check, the check would be executed twice, which seems cumbersome.
So I removed the check from node-servant convert.
How about noting this in the usage document: run the node-servant preflight check before running node-servant convert?

Member

(1) The node servant mainly serves for yurtctl convert, it is rare for users to use node-servant to convert nodes individually.

I think it's not about whether we use it individually or not.
I only think node-servant should include the preflight check capability
and export that capability to the upper level (yurtctl convert).

(2) In order to increase the conversion success rate of yurtctl convert,

Will this be noticed by users?
I think the end user should not notice that we have a preflight command if everything goes right.
I don't think the end user should need to execute it manually.

What do you think?

Member Author

I think it's not about whether we use it individually or not...

Yes, for usability it is better for convert to include the preflight check, and yurtctl convert does just that.
But in node-servant we split it into two parts, one for the preflight check and one for the conversion, to fit how yurtctl convert uses them.
If node-servant convert contained the check and also exposed the check capability, yurtctl convert would run the same checks twice during conversion (as mentioned above).

Will this be noticed by users?

If the user uses yurtctl convert, the preflight check is part of convert, so the usage is the same as before.
If the user uses node-servant convert, the preflight check and convert are two commands, so the user needs to run the preflight check first and then run convert, which needs to be stated in the documentation.
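
The two-step flow discussed in this thread (run the preflight check once, convert only if it passes) can be sketched in Go as follows. This is an illustration under stated assumptions, not the code in this PR: `step`, `convertNode`, and `errPreflightDemo` are hypothetical names, and the real yurtctl convert issues node-servant jobs rather than calling functions directly.

```go
package main

import (
	"errors"
	"fmt"
)

// step is a hypothetical stand-in for one node-servant job
// (for example preflight-convert or convert).
type step func() error

// errPreflightDemo is a made-up error used only by this example.
var errPreflightDemo = errors.New("Port 10267 is in use")

// convertNode runs the preflight step exactly once and starts the
// conversion only when that check passed.
func convertNode(preflight, convert step) error {
	if err := preflight(); err != nil {
		return fmt.Errorf("preflight check failed, conversion skipped: %w", err)
	}
	if err := convert(); err != nil {
		return fmt.Errorf("convert failed: %w", err)
	}
	return nil
}

func main() {
	convert := func() error {
		fmt.Println("converting node")
		return nil
	}

	// Preflight passes, so convert runs.
	if err := convertNode(func() error { return nil }, convert); err != nil {
		fmt.Println("unexpected:", err)
	}

	// Preflight fails, so convert is never started.
	err := convertNode(func() error { return errPreflightDemo }, convert)
	fmt.Println(err)
}
```

Keeping the check out of the convert step itself is what avoids running the same checks twice when the caller already orchestrates both steps.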

@@ -17,7 +17,7 @@ limitations under the License.
package edgenode

const (
-	KubeletSvcPath = "/etc/systemd/system/kubelet.service.d/10-kubeadm.conf"
+	KubeletSvcPath = "/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf"
Member

why is this updated?

Member Author

According to issues reported by the community, KubeletSvcPath is usually "/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf", so the default value was changed to it.

In addition, if the default value is "/etc/systemd/system/kubelet.service.d/10-kubeadm.conf", the conversion often fails the first time a user tries it, and the parameter --kubeadm-conf-path must be passed when converting, which is troublesome.

Member

Can you share the issue link?
I'd like to understand the background in more detail.

Member Author

Issues: #572, #203
This may need further consideration, since the path differs between systems.
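
Because the kubeadm drop-in file lives in different places on different systems, one option is to accept a comma-separated list of candidate paths and use the first one that exists, which is how --kubeadm-conf-path is passed in the test runs in this conversation. A minimal sketch in Go, assuming that behaviour; `firstExistingFile` is a hypothetical helper, not the function added by this PR:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// firstExistingFile (hypothetical helper) takes a comma-separated list of
// candidate paths and returns the first one that exists as a regular,
// non-directory entry.
func firstExistingFile(list string) (string, error) {
	paths := strings.Split(list, ",")
	for _, p := range paths {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		if info, err := os.Stat(p); err == nil && !info.IsDir() {
			return p, nil
		}
	}
	return "", fmt.Errorf("no file in list %v exists", paths)
}

func main() {
	// The two candidate locations discussed in this review thread.
	candidates := "/etc/systemd/system/kubelet.service.d/10-kubeadm.conf," +
		"/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf"

	path, err := firstExistingFile(candidates)
	if err != nil {
		fmt.Println("[ERROR]", err)
		return
	}
	fmt.Println("using kubeadm conf:", path)
}
```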

@Peeknut Peeknut force-pushed the nodeservant-precheck branch from d329680 to 40fc6e4 Compare December 20, 2021 12:48
Peeknut (Member, Author) commented Dec 20, 2021

@adamzhoul @rambohe-ch
Currently, only the kubeadm-conf-path parameter of node-servant convert and the preflight check has been modified. This does not affect yurtctl convert/revert or node-servant revert.
Because yurtctl convert/revert and node-servant revert need different modifications, I will submit those in another PR.

Peeknut (Member, Author) commented Dec 20, 2021

kubeadm-conf-path Test:

convert:

[root@master bin]# ./yurt-node-servant convert --join-token 30zrv0.2h7ivu4561gre3fx --working-mode cloud --kubeadm-conf-path=/etc/systemd/system/kubelet.service.d/10-kubeadm.conf,/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
I1220 20:44:46.779194   16227 yurthub.go:67] setting up yurthub on node
I1220 20:44:46.779231   16227 yurthub.go:70] setting up yurthub apiServer addr
I1220 20:44:46.779330   16227 yurthub.go:88] create the /etc/kubernetes/manifests/yurt-hub.yaml
I1220 20:44:46.781013   16227 yurthub.go:179] yurt-hub is not ready, ping cluster healthz with result: Get "http://127.0.0.1:10267/v1/healthz": dial tcp 127.0.0.1:10267: connect: connection refused
I1220 20:44:56.781285   16227 yurthub.go:179] yurt-hub is not ready, ping cluster healthz with result: Get "http://127.0.0.1:10267/v1/healthz": dial tcp 127.0.0.1:10267: connect: connection refused
I1220 20:45:06.781257   16227 yurthub.go:179] yurt-hub is not ready, ping cluster healthz with result: Get "http://127.0.0.1:10267/v1/healthz": dial tcp 127.0.0.1:10267: connect: connection refused
I1220 20:45:16.781258   16227 yurthub.go:179] yurt-hub is not ready, ping cluster healthz with result: Get "http://127.0.0.1:10267/v1/healthz": dial tcp 127.0.0.1:10267: connect: connection refused
I1220 20:45:26.781560   16227 yurthub.go:182] yurt-hub healthz is OK after 40.002206 seconds
I1220 20:45:26.781631   16227 kubelet.go:100] revised kubeconfig /var/lib/openyurt/kubelet.conf is generated
I1220 20:45:26.781773   16227 kubelet.go:173] restartKubelet: systemctl daemon-reload
I1220 20:45:26.839596   16227 kubelet.go:178] restartKubelet: systemctl restart kubelet
I1220 20:45:26.859147   16227 kubelet.go:183] restartKubelet: kubelet has been restarted
I1220 20:45:26.859161   16227 convert.go:48] convert success


[root@master bin]# ./yurt-node-servant revert
I1220 20:20:24.097646    5506 kubelet.go:161] revertKubelet: undoAppendConfig finished
I1220 20:20:24.097677    5506 kubelet.go:176] restartKubelet: systemctl daemon-reload
I1220 20:20:24.162094    5506 kubelet.go:181] restartKubelet: systemctl restart kubelet
I1220 20:20:24.184615    5506 kubelet.go:186] restartKubelet: kubelet has been restarted
I1220 20:20:24.184664    5506 kubelet.go:88] revertKubelet: undoWriteYurthubKubeletConfig finished
I1220 20:20:24.184708    5506 yurthub.go:106] UnInstallYurthub: /etc/kubernetes/manifests/yurt-hub.yaml has been removed
I1220 20:20:24.184894    5506 yurthub.go:119] UnInstallYurthub: config dir /var/lib/yurthub  has been removed
I1220 20:20:24.184902    5506 yurthub.go:152] wait for yurt-hub exit
I1220 20:20:24.185566    5506 yurthub.go:162] yurt-hub is still running
I1220 20:20:25.185909    5506 yurthub.go:162] yurt-hub is still running
I1220 20:20:26.185881    5506 yurthub.go:162] yurt-hub is still running
I1220 20:20:27.185886    5506 yurthub.go:162] yurt-hub is still running
I1220 20:20:28.185882    5506 yurthub.go:162] yurt-hub is still running
I1220 20:20:29.185887    5506 yurthub.go:162] yurt-hub is still running
I1220 20:20:30.185894    5506 yurthub.go:162] yurt-hub is still running
I1220 20:20:31.185936    5506 yurthub.go:162] yurt-hub is still running
I1220 20:20:32.186965    5506 yurthub.go:162] yurt-hub is still running
I1220 20:20:33.185805    5506 yurthub.go:159] yurt-hub is not running, with ping result: Get "http://127.0.0.1:10267/v1/healthz": dial tcp 127.0.0.1:10267: connect: connection refused
I1220 20:20:33.185871    5506 yurthub.go:133] UnInstallYurthub: cache dir /etc/kubernetes/cache/  has been removed
I1220 20:20:33.185889    5506 revert.go:41] revert success



[root@master bin]# ./yurt-node-servant convert --join-token 30zrv0.2h7ivu4561gre3fx --working-mode cloud
I1220 20:19:11.003462    4514 yurthub.go:67] setting up yurthub on node
I1220 20:19:11.003493    4514 yurthub.go:70] setting up yurthub apiServer addr
I1220 20:19:11.003592    4514 yurthub.go:88] create the /etc/kubernetes/manifests/yurt-hub.yaml
I1220 20:19:11.003790    4514 yurthub.go:179] yurt-hub is not ready, ping cluster healthz with result: Get "http://127.0.0.1:10267/v1/healthz": dial tcp 127.0.0.1:10267: connect: connection refused
I1220 20:19:21.004018    4514 yurthub.go:179] yurt-hub is not ready, ping cluster healthz with result: Get "http://127.0.0.1:10267/v1/healthz": dial tcp 127.0.0.1:10267: connect: connection refused
I1220 20:19:31.004014    4514 yurthub.go:179] yurt-hub is not ready, ping cluster healthz with result: Get "http://127.0.0.1:10267/v1/healthz": dial tcp 127.0.0.1:10267: connect: connection refused
I1220 20:19:41.004008    4514 yurthub.go:179] yurt-hub is not ready, ping cluster healthz with result: Get "http://127.0.0.1:10267/v1/healthz": dial tcp 127.0.0.1:10267: connect: connection refused
I1220 20:19:51.004338    4514 yurthub.go:182] yurt-hub healthz is OK after 40.000724 seconds
I1220 20:19:51.004406    4514 kubelet.go:103] revised kubeconfig /var/lib/openyurt/kubelet.conf is generated
I1220 20:19:51.004500    4514 kubelet.go:176] restartKubelet: systemctl daemon-reload
I1220 20:19:51.065718    4514 kubelet.go:181] restartKubelet: systemctl restart kubelet
I1220 20:19:51.089076    4514 kubelet.go:186] restartKubelet: kubelet has been restarted
I1220 20:19:51.089092    4514 convert.go:48] convert success
[root@master bin]#
[root@master bin]#

preflight check:

[root@master bin]# ./yurt-node-servant preflight-convert 
I1220 20:07:28.351996   31541 preflight.go:40] [preflight-convert] Running node-servant pre-flight checks
[preflight-convert] Pulling images required for converting a Kubernetes cluster to an OpenYurt cluster
[preflight-convert] This might take a minute or two, depending on the speed of your internet connection
I1220 20:07:45.537409   31541 preflight.go:49] convert pre-flight checks success


[root@master bin]# ./yurt-node-servant preflight-convert --kubeadm-conf-path=/etc/systemd/system/kubelet.service.d/10-kubeadm.conf,/usr/lib/kubelet.service.d/10-kubeadm.conf --ignore-preflight-errors=fileatleastoneexistingcheck-/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
I1220 20:43:45.756054   15824 preflight.go:40] [preflight-convert] Running node-servant pre-flight checks
E1220 20:43:45.756277   15824 preflight.go:46] Fail to run pre-flight checks: [preflight] Some fatal errors occurred:
	[ERROR KubeadmConfig]: no file in list [/etc/systemd/system/kubelet.service.d/10-kubeadm.conf /usr/lib/kubelet.service.d/10-kubeadm.conf] exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
[root@master bin]#
[root@master bin]#
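
The log message above mentions that a check can be made non-fatal with `--ignore-preflight-errors`. A minimal sketch in Go of how such an ignore list might be applied, assuming (as in kubeadm-style preflight runners) that the list is matched against the lowercased check name and that `all` ignores every check. The `check` type, `runChecks`, and `errDemo` are hypothetical names for illustration, not the implementation in this PR:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// check is a hypothetical preflight check: a label plus a function.
type check struct {
	name string
	run  func() error
}

// errDemo is a made-up failure used only by this example.
var errDemo = errors.New("Port 10267 is in use")

// runChecks runs every check and sorts failures into warnings (ignored
// by the caller) and fatal errors. Matching on the lowercased name is an
// assumption modelled on kubeadm's behaviour.
func runChecks(checks []check, ignore map[string]bool) (warnings, fatals []string) {
	for _, c := range checks {
		err := c.run()
		if err == nil {
			continue
		}
		msg := fmt.Sprintf("[%s]: %v", c.name, err)
		if ignore["all"] || ignore[strings.ToLower(c.name)] {
			warnings = append(warnings, msg)
		} else {
			fatals = append(fatals, msg)
		}
	}
	return warnings, fatals
}

func main() {
	checks := []check{
		{name: "Port-10267", run: func() error { return errDemo }},
		{name: "KubeadmConfig", run: func() error { return nil }},
	}
	warnings, fatals := runChecks(checks, map[string]bool{"port-10267": true})
	fmt.Println("warnings:", warnings)
	fmt.Println("fatal errors:", fatals)
}
```

Under this assumption, an ignore entry only takes effect when it matches the label of the failing check, so the value passed to the flag has to correspond to the name printed in the `[ERROR ...]` line.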

@adamzhoul
Member

/lgtm

@openyurt-bot added the lgtm label Dec 21, 2021
@openyurt-bot openyurt-bot merged commit 6cc8fbd into openyurtio:master Dec 21, 2021
MrGirl pushed a commit to MrGirl/openyurt that referenced this pull request Mar 29, 2022
Labels
approved, kind/feature, lgtm, size/XL (500-999)
Development

Successfully merging this pull request may close these issues.

[feature request]add yurtctl precheck command for reducing yurtctl convert failure.
4 participants