ServiceLB cannot be accessed via loopback when service ExternalTrafficPolicy=Local #7561
Comments
I suspect that it's related to the flannel version bump. I see that @rbrtbnfgl has assigned this to himself, so I'll let him reply once he's had a chance to do some investigation.
Yes, I want to do further investigation on it. The only difference between the Kube-router and flannel versions should be the iptables rules order, but those should only affect pod traffic, and it seems that the issue is related to traffic from the node to localhost.
I think the issue is related to
I don't know if this is the right fix or if this behavior is the expected one. It could be related to this: https://github.com/k3s-io/k3s/blob/master/pkg/cloudprovider/servicelb.go#L556
We haven't changed anything about how traffic gets into the svclb pods; that still relies on NodePort pods - the kubelet and portmap CNI plugin handle all of that.
The behaviour of the CNI is the same on both versions: the packets directed to localhost:80 are masqueraded with the IP of the
On 1.25.7:
Yes, the difference comes from the ExternalTrafficPolicy, which was added in https://github.com/k3s-io/k3s/pull/6726/files#diff-38e1c51632f2d12566706b6d7f22cf82e1441ea19e27f787ff15e4b7b92dc197R566-R592. If the ExternalTrafficPolicy is set to local, the svclb pods target the host IP and NodePort to ensure that traffic only goes to local pods. If it is set to anything else, they target the cluster service address and port. Why would the drop rule only match if the connection is to the loopback address, instead of the node IP? Is it bypassing a port rewrite rule on the way in?
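For reference, a minimal sketch of a Service that exercises the local traffic policy path described above; the name, selector, and ports are hypothetical and not taken from this issue:

```bash
# Hypothetical LoadBalancer Service. With externalTrafficPolicy: Local, the
# svclb pods forward to the host IP and the Service's NodePort; with the
# default (Cluster) they forward to the cluster service address and port.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: demo-lb            # hypothetical name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: demo              # hypothetical selector
  ports:
  - port: 80
    targetPort: 8080
EOF
```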
When the IP is the node IP, the kube-proxy rules masquerade the packets with the ingress pod IP and not the svclb one.
Hmm. The intent of the FORWARD rules is to enforce LoadBalancerSourceRanges; I guess that ends up blocking local access that doesn't go through kube-proxy. What's the source of the traffic in this case? Is it from the loopback address?
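For anyone following along, generic iptables commands (not k3s-specific) that can show which FORWARD rule the dropped loopback traffic is hitting:

```bash
# Show the FORWARD chain with packet/byte counters; the counter of the rule
# dropping the loopback traffic increments while the request hangs.
sudo iptables -L FORWARD -n -v --line-numbers

# Dump everything in a grep/diff-friendly format; 80 is a hypothetical
# service port used only for illustration.
sudo iptables-save | grep -- '--dport 80'
```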
The source address is the IP of
Hmm OK. So in the case of targeting the local pod's node port for local traffic policy, the forward rules we put in place to enforce source ranges end up blocking access via localhost. I'll have to do some thinking on how to handle that. Is it always the cni0 IP as the source, across all CNIs? Or is this a flannel-specific thing?
It should be related to the portmap CNI. It changes the destination IP to the IP of the pod, and the Linux routing table decides to use the IP of
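A quick way to confirm which source address the kernel picks for traffic toward a pod IP; the address below is a hypothetical example on the default flannel subnet, not taken from this issue:

```bash
# Ask the kernel which route (and therefore which source address) it would
# use to reach a hypothetical pod IP. On a default k3s/flannel node the
# output typically shows "dev cni0 ... src <cni0 address>", which is why the
# pod sees the bridge IP rather than 127.0.0.1 as the client address.
ip route get 10.42.0.15
```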
I just realized that I was overcomplicating the path here, and the core issue is just that the source and destination port are not the same when the ExternalTrafficPolicy is set to local, and I used the wrong port in the allow rule. There appears to be a community PR to fix this at k3s-io/klipper-lb#54.
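A purely illustrative sketch of that port mismatch, with hypothetical rules and port numbers; these are not the actual klipper-lb rules, which are changed by the PR linked above:

```bash
# With externalTrafficPolicy: Local the connection arrives on the LB port
# (e.g. 80) but is rewritten to the NodePort (e.g. 31234), so the two ports
# no longer match. An ACCEPT rule written against the wrong one of those two
# ports never matches, and the later DROP rule meant to enforce
# LoadBalancerSourceRanges catches the traffic instead.

# Never matches the post-rewrite flow in this scenario:
iptables -A FORWARD -p tcp --dport 80 -j ACCEPT

# Matches the port the flow actually uses after the rewrite:
iptables -A FORWARD -p tcp --dport 31234 -j ACCEPT
```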
Validated on version: k3s version v1.27.2+k3s-55db9b18 (55db9b18)
Environment Details
Infrastructure
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
Config.yaml:
Steps to Repro the issue:
Steps to Validate the fix:
Can this be backported?
Backported to what? It was already backported to all branches that were active at the time the issue was closed. You can find the backport issues linked above.
Sorry, I see that now. I read the last comment about it being verified on 1.27 and assumed that was the only place there was a fix.
I'm on 1.25.15 and still facing this issue, however
Please open a new issue and fill out the issue template, describing specifically what you're running into. The issue described here was resolved in #7718 - I suspect you're running into something different.
K3s Version:
Migrating from v1.25.6+k3s1 to v1.25.7+k3s1
Node(s) CPU architecture, OS, and Version:
1 node, x86; reproducible on different OSes such as Ubuntu 22.04 and Ubuntu on WSL
Cluster Configuration:
1 node, local development environment
Describe the bug:
We're using k3s as our local development environment platform, routing FQDNs to our dev machines using `/etc/hosts` entries pointing to `127.0.0.1`. This worked perfectly fine for basically years now, until recently. I had to dig quite a while before I was able to pinpoint it to upgrading k3s from v1.25.6+k3s1 to v1.25.7+k3s1. It stops working on 1.25.7 and works again after downgrading to 1.25.6. The problem also exists on the most current 1.27.1.
Steps To Reproduce:
Install k3s and an ingress controller like so, and access it using `127.0.0.1`.
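A minimal sketch of such a setup, assuming ingress-nginx installed via Helm with externalTrafficPolicy: Local; the original commands from this report were not captured here, and the Helm value and hostname below are assumptions:

```bash
# Install k3s; Traefik is disabled here only so ingress-nginx can own ports
# 80/443 via ServiceLB (adjust to taste).
curl -sfL https://get.k3s.io | sh -s - --disable traefik
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml

# Install ingress-nginx with externalTrafficPolicy: Local on its Service
# (assumed chart value; verify against your chart version).
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --set controller.service.externalTrafficPolicy=Local

# Optionally map a dev FQDN to loopback, as described above (hypothetical name).
echo '127.0.0.1 myapp.dev.local' | sudo tee -a /etc/hosts

# Expected: a 404 from nginx; observed on the affected versions: a timeout.
curl -v http://127.0.0.1/
```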
Expected behavior:
A 404 Message returned by nginx
Actual behavior:
Network timeout
However, the ports/services are in fact working perfectly fine using the node's LAN address like 192.168.... They're just not available via `127.0.0.1` anymore, and I can't figure out why.
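For reference, the kind of check that shows the symptom; the node IP below is a hypothetical example:

```bash
# Works: reaching the ingress through the node's LAN address.
curl -I http://192.168.1.10/

# Times out on the affected versions: the same service via loopback.
curl -I --max-time 5 http://127.0.0.1/
```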
Additional info
I checked the corresponding changelog at https://github.com/k3s-io/k3s/releases/tag/v1.25.7%2Bk3s1 and found some changes regarding the servicelb but at least to me none of them explained the behavior I'm seeing.
I exported the iptables rules while running 1.25.6 and while running 1.25.7 and tried to compare them, but I'm afraid my knowledge of iptables is not sufficient to assess whether the cause of the problem is to be found there or not.
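One way to make that comparison more tractable, using generic iptables tooling:

```bash
# Snapshot the full rule set on each version, then diff the two dumps.
sudo iptables-save > /tmp/iptables-v1.25.6.txt   # while running v1.25.6+k3s1
# ... upgrade to v1.25.7+k3s1 and reproduce the timeout, then:
sudo iptables-save > /tmp/iptables-v1.25.7.txt
diff -u /tmp/iptables-v1.25.6.txt /tmp/iptables-v1.25.7.txt
```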
Thanks a lot for any help in advance - Max