This document explains how to troubleshoot issues that you may encounter while using the Citrix Kubernetes node controller (CNC). Using this document, you can collect logs to determine the causes of common CNC configuration issues and apply workarounds for them.
To validate Citrix ADC and the basic node configurations, refer to the image on the deployment page.
If the kube-cnc-router pods are not starting, it may be due to cluster restrictions that prevent non-admin users from deploying privileged pods.
As a workaround, there are two options:
- Deploy CNC in the "kube-system" namespace.
- Assign the "cluster-admin" role to the CNC clusterrolebinding (see the example after the following note).
Note: If the first option is chosen, you cannot create multiple instances of CNC in a single cluster because only one kube-system namespace is available per cluster.
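For example, one way to apply the second option is to create a clusterrolebinding that grants the cluster-admin role to the CNC service account. The binding name cnc-cluster-admin is a placeholder, and the namespace and service account must match the ones used by your CNC deployment:
kubectl create clusterrolebinding cnc-cluster-admin --clusterrole=cluster-admin --serviceaccount=<namespace>:<cnc-serviceaccount>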
To debug issues when the service is in the DOWN state, perform the following steps:
- Verify the logs of the CNC pod using the following command:
kubectl logs <cnc-pod> -n <namespace>
Check for any 'permission' errors in the logs. CNC creates kube-cnc-router pods, which require the NET_ADMIN privilege to perform the configurations on the nodes. So, the CNC service account must have the NET_ADMIN privilege and the ability to create host mode kube-cnc-router pods.
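If you suspect a permission issue, one quick check (assuming your CNC deployment uses a dedicated service account; the name here is a placeholder) is to verify that the service account is allowed to create pods:
kubectl auth can-i create pods -n <namespace> --as=system:serviceaccount:<namespace>:<cnc-serviceaccount>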
- Verify the logs of the kube-cnc-router pod using the following command:
kubectl logs <kube-cnc-pod> -n <namespace>
Check for any errors in the node configuration. The following is a sample of a typical router pod log:
- Verify the kube-cnc-router configmap output using the following command:
kubectl get configmaps -n <namespace> kube-cnc-router -o yaml
Check for any empty fields in the data section of the configmap. The following is a sample of a typical two-node data section:
- Verify the node configuration and make sure of the following (the commands after this list show one way to check these items on the node):
- The CNC interface cncvxlan<md5_of_namespace> should be created.
- The assigned VTEP IP address should be the same as the corresponding router gateway entry in Citrix ADC.
- The status of the interface should be up (functioning).
- The iptables rule for the port should be created.
- The port should be the same as that of the VXLAN created on Citrix ADC.
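One way to check these items on the node is with standard Linux commands such as the following; the interface name and <vxlan_port> are placeholders, and the port must match the VXLAN port configured on Citrix ADC:
ip -d link show cncvxlan<md5_of_namespace>
ip addr show cncvxlan<md5_of_namespace>
iptables -L -n | grep <vxlan_port>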
If you are not able to ping the service IP address from Citrix ADC even though the services are in the operational state, one reason may be the presence of a PBR entry which directs the packets from Citrix ADC with the source IP as NSIP to the default gateway.
This does not impact any functionality. You can use the VTEP IP address of Citrix ADC as the source IP address by using the -S option of the ping command in the Citrix ADC command line interface. For example:
ping <serviceIP> -S <vtepIP>
Note: If it is necessary to ping with the NSIP itself, you must remove the PBR entry or add a new PBR entry for the endpoint with a higher priority.
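For example, an existing entry can be listed with show ns pbrs and removed as follows; the entry name is a placeholder taken from that output, and apply ns pbrs is needed for the change to take effect. A higher-priority entry for the endpoint can instead be added with the add ns pbr command; verify the exact parameters for your Citrix ADC release.
rm ns pbr <pbr_name>
apply ns pbrs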
If the services are in the UP state but you still cannot cURL to the pod endpoint, it means that the stateful TCP session to the endpoint is failing. One reason may be that the ns mode 'MBF' is enabled. This issue depends on the deployment and might occur only on certain versions of Citrix ADC. To resolve this issue, disable the MBF ns mode or bind a net profile with MBF disabled to the servicegroup. Note: If disabling MBF resolves the issue, then it should be kept disabled.
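For example, MBF can be checked and disabled globally from the Citrix ADC command line as shown below. The net profile alternative is sketched with placeholder names; verify the -MBF parameter and the servicegroup name for your deployment and Citrix ADC release.
show ns mode
disable ns mode mbf
add netProfile cnc_netprofile -MBF DISABLED
set serviceGroup <servicegroup_name> -netProfile cnc_netprofile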
For general support, when you raise issues, provide the following details to help with faster debugging. Perform a cURL or ping from Citrix ADC to the endpoint and collect the details for the following:
For the node, provide the details for the following commands:
- tcpdump capture on the CNC interface on the nodes
tcpdump -i cncvxlan<hash_of_namespace> -w cncvxlan.pcap
- tcpdump capture on the node management interface, let's say "eth0"
tcpdump -i eth0 -w mgmt.pcap
- tcpdump capture on the CNI interface, let's say "vxlan.calico"
tcpdump -i vxlan.calico -w cni.pcap
- output of "ifconfig -a" on the node.
- output of "iptables -L" on the node.
For ADC, provide the details for the following show commands:
- show ip
- show vxlan <vxlan_id>
- show route
- show arp
- show bridgetable
- show ns pbrs
- show ns bridgetable
- show ns mode
- Try to capture an nstrace while performing the ping/cURL:
start nstrace -size 0 -mode RX NEW_RX TXB TX -capsslkeys ENABLED
stop nstrace