deploy kwok in kind, it will cause kindnet CrashLoopBackOff #819

Closed
1 of 5 tasks
mrhello369 opened this issue Oct 21, 2023 · 3 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@mrhello369

mrhello369 commented Oct 21, 2023

How to use it?

  • kwok
  • kwokctl --runtime=docker (default runtime)
  • kwokctl --runtime=binary
  • kwokctl --runtime=nerdctl
  • kwokctl --runtime=kind

What happened?

  1. Create a cluster with kind.
  2. Deploy kwok (v0.4.0).
  3. Create a node (https://kwok.sigs.k8s.io/docs/user/kwok-manage-nodes-and-pods/).

Then the kindnet pod crashes continually:

-> % k get pod -nkube-system

NAME                                          READY   STATUS             RESTARTS       AGE
coredns-5d78c9869d-65mx5                      1/1     Running            0              3d13h
coredns-5d78c9869d-8kb9d                      1/1     Running            0              3d13h
etcd-learn-control-plane                      1/1     Running            0              3d13h
kindnet-cb8qp                                 0/1     CrashLoopBackOff   5 (107s ago)   3d13h
kindnet-pcq7k                                 1/1     Running            0              65m
kube-apiserver-learn-control-plane            1/1     Running            0              3d13h
kube-controller-manager-learn-control-plane   1/1     Running            0              3d13h
kube-proxy-l55zl                              1/1     Running            0              65m
kube-proxy-nr5ww                              1/1     Running            0              3d13h
kube-scheduler-learn-control-plane            1/1     Running            0              3d13h
kwok-controller-64bfb797f9-hp7qd              1/1     Running            0              5m52s
metrics-server-7fccf7886-ktxql                1/1     Running            0              2d17h

The log:

-> % k logs -nkube-system kindnet-cb8qp
I1021 04:48:35.224148       1 main.go:316] probe TCP address learn-control-plane:6443
I1021 04:48:35.225080       1 main.go:102] connected to apiserver: https://learn-control-plane:6443
I1021 04:48:35.225096       1 main.go:107] hostIP = 172.18.0.2
podIP = 172.18.0.2
I1021 04:48:35.317630       1 main.go:116] setting mtu 1500 for CNI
I1021 04:48:35.317666       1 main.go:146] kindnetd IP family: "ipv4"
I1021 04:48:35.317694       1 main.go:150] noMask IPv4 subnets: [10.244.0.0/16]
I1021 04:48:35.618195       1 main.go:223] Handling node with IPs: map[10.244.0.43:{}]
I1021 04:48:35.618231       1 main.go:250] Node kwok-node-0 has CIDR [10.244.1.0/24]
I1021 04:48:35.618385       1 routes.go:62] Adding route {Ifindex: 0 Dst: 10.244.1.0/24 Src: <nil> Gw: 10.244.0.43 Flags: [] Table: 0}
I1021 04:48:35.618453       1 main.go:204] Failed to reconcile routes, retrying after error: network is unreachable
I1021 04:48:35.618474       1 main.go:223] Handling node with IPs: map[10.244.0.43:{}]
I1021 04:48:35.618481       1 main.go:250] Node kwok-node-0 has CIDR [10.244.1.0/24]
I1021 04:48:35.618546       1 routes.go:62] Adding route {Ifindex: 0 Dst: 10.244.1.0/24 Src: <nil> Gw: 10.244.0.43 Flags: [] Table: 0}
I1021 04:48:35.618586       1 main.go:204] Failed to reconcile routes, retrying after error: network is unreachable
I1021 04:48:36.618917       1 main.go:223] Handling node with IPs: map[10.244.0.43:{}]
I1021 04:48:36.618955       1 main.go:250] Node kwok-node-0 has CIDR [10.244.1.0/24]
I1021 04:48:36.619091       1 routes.go:62] Adding route {Ifindex: 0 Dst: 10.244.1.0/24 Src: <nil> Gw: 10.244.0.43 Flags: [] Table: 0}
I1021 04:48:36.619158       1 main.go:204] Failed to reconcile routes, retrying after error: network is unreachable
I1021 04:48:38.619296       1 main.go:223] Handling node with IPs: map[10.244.0.43:{}]
I1021 04:48:38.619327       1 main.go:250] Node kwok-node-0 has CIDR [10.244.1.0/24]
I1021 04:48:38.619466       1 routes.go:62] Adding route {Ifindex: 0 Dst: 10.244.1.0/24 Src: <nil> Gw: 10.244.0.43 Flags: [] Table: 0}
I1021 04:48:38.619526       1 main.go:204] Failed to reconcile routes, retrying after error: network is unreachable
I1021 04:48:41.620810       1 main.go:223] Handling node with IPs: map[10.244.0.43:{}]
I1021 04:48:41.620840       1 main.go:250] Node kwok-node-0 has CIDR [10.244.1.0/24]
I1021 04:48:41.620981       1 routes.go:62] Adding route {Ifindex: 0 Dst: 10.244.1.0/24 Src: <nil> Gw: 10.244.0.43 Flags: [] Table: 0}
I1021 04:48:41.621034       1 main.go:204] Failed to reconcile routes, retrying after error: network is unreachable
panic: Maximum retries reconciling node routes: network is unreachable

goroutine 1 [running]:
main.main()
        /go/src/cmd/kindnetd/main.go:208 +0xd07
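
Editor's note: the "network is unreachable" error in the log above is consistent with the kernel refusing a route whose gateway is not on a directly connected network. The sketch below is a minimal illustration of that check, not kindnet's actual code. The helper name `gateway_is_on_link` is hypothetical, and the connected subnet `172.18.0.0/16` is an assumption based on the logged hostIP `172.18.0.2`; the real node may have other interfaces and routes.

```python
import ipaddress
from typing import Iterable


def gateway_is_on_link(gateway: str, connected_subnets: Iterable[str]) -> bool:
    """Return True if the gateway address lies inside any directly connected subnet.

    Hypothetical helper: it only mimics the reachability rule for a route's
    next hop, it does not inspect the real routing table.
    """
    gw = ipaddress.ip_address(gateway)
    return any(gw in ipaddress.ip_network(subnet) for subnet in connected_subnets)


# Assumption: the kind node's only directly connected network is the Docker
# bridge that holds hostIP 172.18.0.2 from the log.
connected = ["172.18.0.0/16"]

# Gateway taken from the log line "Gw: 10.244.0.43" (a pod IP, not a node IP).
print(gateway_is_on_link("10.244.0.43", connected))  # False

# A real node address on the same bridge would pass the check.
print(gateway_is_on_link("172.18.0.3", connected))  # True
```

Under these assumptions, the route for `kwok-node-0` fails because its reported address is a pod IP rather than an address on the node network.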

What did you expect to happen?

Everything works fine

How can we reproduce it (as minimally and precisely as possible)?

As described above

Anything else we need to know?

-> % kind version
kind v0.20.0 go1.20.4 linux/amd64

Kwok version

As described above (kwok v0.4.0).

OS version

-> % uname -a
Linux pve 6.2.0-33-generic #33~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Sep 7 10:33:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

@mrhello369 mrhello369 added the kind/bug Categorizes issue or PR as related to a bug. label Oct 21, 2023
@mrhello369 mrhello369 changed the title from "deploy kwok in kind, it will case kindnet CrashLoopBackOff" to "deploy kwok in kind, it will cause kindnet CrashLoopBackOff" Oct 21, 2023
@wzshiming
Member

This is a known problem: kindnet fails because the pod's hostIP is not an actual node.
To work around it, either change the kwok deployment to use `hostNetwork: true`, or create the cluster with `kwokctl create cluster --runtime kind`.
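
Editor's note: the first workaround can be sketched as a patch to the kwok deployment. The snippet below only builds and prints a `kubectl patch` command; it does not run it. The deployment name `kwok-controller` and namespace `kube-system` are inferred from the pod listing earlier in this issue and are assumptions, as are the helper names. The patch is untested against a real cluster.

```python
import json
import shlex


def host_network_patch() -> dict:
    """Strategic-merge patch that sets hostNetwork on a Deployment's pod template."""
    return {"spec": {"template": {"spec": {"hostNetwork": True}}}}


def kubectl_patch_command(name: str = "kwok-controller",
                          namespace: str = "kube-system") -> list:
    """Build the kubectl argument list (names are assumptions, see the note above)."""
    return [
        "kubectl", "-n", namespace,
        "patch", "deployment", name,
        "--type", "strategic",
        "-p", json.dumps(host_network_patch()),
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(kubectl_patch_command()))
```

With `hostNetwork: true` the controller pod shares the node's IP, so the address kindnet sees for the fake node should be a real node address; whether any further settings are needed depends on the deployment.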

@wzshiming
Member

/close

@k8s-ci-robot
Contributor

@wzshiming: Closing this issue.

In response to this:

/close


swimablefish added a commit to swimablefish/kwok that referenced this issue Nov 22, 2024
…es-sigs#819

Update the README.md in the kwok helm chart to pass the job pull-kwok-verify-main

update the comment
k8s-ci-robot added a commit that referenced this issue Dec 31, 2024
Make the kwok chart support `hostNetwork` to avoid the issue #819