
If external IP is defined K3s API NAT rule use it. #2963

Closed
dragon2611 opened this issue Feb 19, 2021 · 6 comments

@dragon2611

Environmental Info:
K3s Version:
k3s version v1.20.2+k3s1 (1d4adb0)

Node(s) CPU architecture, OS, and Version:
Linux k3s-pmx 5.4.0-65-generic #73-Ubuntu SMP Mon Jan 18 17:25:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:
1 Master (Used to be 2 masters) , external postgresDB

Describe the bug:
If both node-ip and node-external-ip are set, K3s uses the external IP in the iptables rules for internal communications, which causes CoreDNS to fail if the external IP is not present on the server (e.g. if it lives on a firewall in front of it).

trace[1568160156]: [30.000435088s] [30.000435088s] END
E0219 23:23:37.842661       1 reflector.go:127] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:156: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: Get "https://10.43.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0": dial tcp 10.43.0.1:443: i/o timeout
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"

Steps To Reproduce:
Define both node-ip and node-external-ip.

The external IP should not be present on the node itself, but should exist on some other device (e.g. a network firewall).
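
For reference, a minimal reproduction sketch (the flag names are the real k3s server flags, but the addresses and invocation here are assumptions, not taken from the report; the external address lives on the firewall, not on the node):

# node only carries the internal address; 203.0.113.10 stands in for the firewall's public IP
k3s server \
  --node-ip 10.10.10.2 \
  --node-external-ip 203.0.113.10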

Expected behavior:
Internal cluster communications should use the internal IP, particularly for traffic being DNATed by iptables on the machine itself, given that 10.43.0.1 is the VIP for the kube-apiserver.

Actual behavior:
Services trying to reach the kube-apiserver fail.
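
A quick way to confirm the behaviour described above (a hedged diagnostic sketch using standard kubectl/iptables commands): kube-proxy builds the 10.43.0.1 DNAT rules from the default/kubernetes endpoint, so if that endpoint lists the external IP, the service VIP will DNAT to it.

# address the apiserver advertises as its endpoint (this is what the NAT rules point at)
kubectl get endpoints kubernetes -o wide
# the DNAT target kube-proxy generated for the kubernetes service VIP
iptables-save -t nat | grep 'default/kubernetes'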

@brandond
Member

brandond commented Feb 19, 2021

We frequently deploy nodes to places like EC2 where the node-external-ip is not actually bound to the interface and do not experience the issue you're describing. Can you confirm that you're not seeing another issue, like your firewall blocking access to traffic from nodes to the external IP? If you define an external IP, nodes need to be able to communicate with it.
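
A hedged sketch of the connectivity check being suggested here (203.0.113.10 is a placeholder for the configured node-external-ip): an anonymous request should at least complete the TLS handshake and come back with a 401, which shows the address is reachable from the node and from pods.

# from the node itself
curl -vk https://203.0.113.10:6443/
# from inside the cluster network (hypothetical throwaway pod)
kubectl run curltest --rm -it --restart=Never --image=curlimages/curl -- \
  curl -vk --max-time 5 https://10.43.0.1/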

@dragon2611
Author

dragon2611 commented Feb 20, 2021

The node in question had no firewall rules on it at the time other than the ones K3s generated. I'd also tried switching to iptables-legacy in case it was an nftables issue.

I tried a hairpin NAT on the router/firewall, but it didn't seem to help, even though I could see hits on the rule.

There was only the one node online/available, so no, it wouldn't have been traffic from another node.

Dumping the nat table on the node showed the external IP, and the rule wasn't catching the traffic (I think you'd need a rule in POSTROUTING on egress from the node to catch such a packet).

Annoyingly I didn't think to save the iptables rules to post them here.

@brandond
Member

brandond commented Feb 21, 2021

Can you provide any more info on your configuration? Our QA environments are pretty much in the exact configuration you describe (except with EC2 instead of a traditional on-prem environment and edge firewall) and we don't see any issues with it. I am going to guess that it has something to do with how hairpin NAT is configured in your environment.

@dragon2611
Author

It's an Ubuntu 20.04 VM running on Proxmox behind a MikroTik CHR, with an address in the 10.10.10.0/24 range; the external IP is on the MikroTik rather than the VM itself.

I wonder if EC2 implements some kind of NAT loopback.

@dragon2611
Author

dragon2611 commented Feb 21, 2021

❯ iptables-save | grep KUBE-SVC-NPX
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
-A KUBE-SERVICES -d 10.43.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-A2RYYULAFKOCTQ6C
❯ iptables-save | grep KUBE-SEP-A2RYYULAFKOCTQ6C
:KUBE-SEP-A2RYYULAFKOCTQ6C - [0:0]
-A KUBE-SEP-A2RYYULAFKOCTQ6C -s 46.x.xx.xxx/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-A2RYYULAFKOCTQ6C -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 46.x.xx.xxx:6443
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-A2RYYULAFKOCTQ6C

I think it's now working with a hairpin NAT, although it would be better if the traffic didn't have to leave the VM only to be NATed back at it. I suppose the problem is that the node itself is a master and thus is where the kube-apiserver is.
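
For anyone landing here with the same setup, the hairpin NAT referred to above is roughly this pattern. It is shown in iptables terms purely for illustration (the actual device in this thread is a MikroTik CHR, which expresses the same rules in its own firewall syntax), and the addresses are placeholders:

# on the edge firewall: DNAT the public apiserver address back to the node's internal IP
iptables -t nat -A PREROUTING -d 203.0.113.10 -p tcp --dport 6443 -j DNAT --to-destination 10.10.10.2:6443
# masquerade hairpinned LAN-to-LAN traffic so the replies return via the firewall
iptables -t nat -A POSTROUTING -s 10.10.10.0/24 -d 10.10.10.2 -p tcp --dport 6443 -j MASQUERADE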

@stale

stale bot commented Aug 21, 2021

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@stale stale bot added the status/stale label Aug 21, 2021
@stale stale bot closed this as completed Sep 4, 2021