Super slow access to service IP from host (& host-networked pods) with Flannel CNI #1245
Flannel has compatibility issues with k8s-1.17 (flannel-io/flannel#1245). Deploy the Calico plugin instead, also for better performance. Signed-off-by: Or Mergi <[email protected]>
Flannel has compatibility issues with k8s-1.17 (flannel-io/flannel#1245). Deploy the Calico plugin instead, also for better performance. The calico.yaml file is copied from Calico's documentation and should not be changed. Signed-off-by: Or Mergi <[email protected]>
* Deploy Calico pod network plugin on k8s-1.17. Flannel has compatibility issues with k8s-1.17 (flannel-io/flannel#1245). Deploy the Calico plugin instead, also for better performance. The calico.yaml file is copied from Calico's documentation and should not be changed. Signed-off-by: Or Mergi <[email protected]>
* CNI manifest file names and Kubernetes versions map. This map correlates a k8s version with the plugin we would like to deploy. Signed-off-by: Or Mergi <[email protected]>
* Separate CNI selection logic from provision scripts. cli.sh: create the /tmp/scripts directory in the VM and copy cni-map.sh. cnis-map.sh: map a k8s version to the CNI manifest file name to use. node01.sh, provision.sh: use cnis-map.sh to resolve the right CNI manifest. Signed-off-by: Or Mergi <[email protected]>
We are seeing multiple reports that flannel + kube 1.17 don't play well:
@tomdee can you look at these? |
I think I've been hitting this issue yesterday and today. Some tests I was doing, from one host (not in a container)
I've just swapped to the flannel: 0.11.0 |
Something we noticed is that the number of conntrack insert_failed was dramatically higher while running kube 1.17. |
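For anyone who wants to compare the conntrack counter mentioned above across Kubernetes versions, here is a minimal sketch. It assumes the `conntrack` tool (conntrack-tools) is installed on the node and that `conntrack -S` prints one line per CPU with `key=value` fields, including `insert_failed=`. The helper name `sum_insert_failed` is made up for this example and is not part of any tool.

```shell
# Hypothetical helper: sum the insert_failed counter over all CPUs.
# Reads `conntrack -S`-style output (one line of key=value fields per CPU) on stdin.
sum_insert_failed() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^insert_failed=/) {
        split($i, kv, "=")      # kv[1] = "insert_failed", kv[2] = the count
        total += kv[2]
      }
    }
  }
  END { print total + 0 }'      # prints 0 when the field never appears
}

# Usage on a node (needs root or CAP_NET_ADMIN):
#   conntrack -S | sum_insert_failed
```

Running it before and after a burst of service-IP requests shows how fast the counter grows, rather than only its absolute value.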
We experienced the same issue today. Fixed this by using the solution of @mikebryant. Is there any permanent solution on the way? |
@tomdee as you are the last remaining maintainer, who should I ping/tag to get this looked at? |
Just FYI, this is not related only to 1.17. Because of the issues here, I tried to downgrade from 1.17.3 to 1.16.8, but got the same result
And after that, even traceroute is super slow
|
Just curious, how many folks experiencing this issue are using hyperkube? |
I'm having this issue with the vxlan backend, with both flannel version 0.11 and 0.12. Finally, setting up a static route on my nodes to the service network through the cni0 interface helped me instantly: os: CentOS 7 |
Fixed this problem by using the solution of @mengmann in kubernetes version v1.17.2 . |
Exactly the same issue here |
Not sure if it's the same issue, but we noticed an additional delay of 1 second when upgrading from Kubernetes 1.15.3 to 1.18.1. We seem to have traced the problem to the |
I'm currently working with Kubernetes 1.17.3 (some nodes 1.17.4). Fortunately there are not many apps running on my newly built cluster, so I migrated them this week and changed the network fabric to Calico according to this article. Now everything works perfectly. 😄 |
@rbrtbnfgl What do you think about this issue? I am experiencing the same slowness when accessing service external to the cluster (flannel cni).
but it occasionally succeeds after 15 seconds... I am almost at the point of switching to Calico, as it claims to solve the problem... |
Which version of Flannel are you using? This is a very old issue; I think it would be better to create a new one with your setup config. It could be a problem with the UDP checksum #1679 |
@rbrtbnfgl
Because my cluster is a mix of Windows (worker only) and Linux nodes, the port numbers need to change. I did try running
on all the master and worker nodes (no reboot and no service restarts, just the command itself), but that still does not resolve the issue. The tricky bit is that the problem is not consistent. I'd say 80% of the time ping will either give "bad address" or take > 15s to resolve, and 20% of the time it works reasonably well. NB: pinging an IP always works; it is the DNS resolution, or the communication from the pod to the DNS resolver via flannel, that seems to be causing the issue. |
Is the issue only on the pods on the windows nodes or also with the ones on linux? |
I have resolved my issue, although I am not sure how or why it caused problems in my cluster. Root cause:
Solution: kubectl -n kube-system rollout restart deployment coredns Perhaps after step 3, coredns is meant to be refreshed/restarted? Anyway, thanks @rbrtbnfgl |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Ref: kubernetes/kubernetes#87233 (comment)
The k/k folks believe this is a Flannel issue, so I am re-posting it here.