
The network speed of the pod cannot reach the speed of the baremetal, only half the speed of the baremetal server #7926

Closed
ming12713 opened this issue Aug 14, 2023 · 13 comments

Comments

@ming12713

ming12713 commented Aug 14, 2023

Calico version: v3.26.1
Kubernetes version: v1.26.6
Calico installation spec:

spec:
  calicoNetwork:
    bgp: Enabled
    hostPorts: Enabled
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16
      disableBGPExport: false
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
    linuxDataplane: Iptables
    multiInterfaceMode: None
    nodeAddressAutodetectionV4:
      firstFound: true
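
For context, the encapsulation and MTU that this spec actually produces can be checked on the cluster and on a node. A minimal sketch, assuming calicoctl is installed and the default VXLAN device name vxlan.calico; the NIC name is a placeholder:

  # Show the IP pool with its IPIP/VXLAN modes
  calicoctl get ippool -o wide

  # On a node: compare the MTU of the VXLAN device Calico created with the physical NIC
  ip -br link show vxlan.calico
  ip -br link show eth0    # replace eth0 with the real 10G NIC name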

iperf3 testing between pods

  1. iperf3 server pod running on airflow01 node with 10G NIC
  2. iperf3 client pod running on airflow02 node with 10G NIC
    (iperf3 screenshot attached; per the issue title, roughly half of the 10G line rate)

iperf3 testing between baremetal nodes

  1. iperf3 server running on airflow01 node with 10G NIC
  2. iperf3 client running on airflow02 node with 10G NIC
    (iperf3 screenshot attached; roughly the full 10G line rate, as confirmed below; the test commands are sketched after this list)
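
The exact iperf3 invocation is not shown in the report; a minimal sketch matching steps 1 and 2 above (the server address and options are assumptions, not taken from the issue):

  # Server side (pod on airflow01, or directly on the airflow01 host for the baremetal case)
  iperf3 -s

  # Client side (pod on airflow02, or directly on the airflow02 host), 30-second run
  iperf3 -c <server-pod-or-node-IP> -t 30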
@ming12713 ming12713 changed the title The network speed of the pod cannot reach the speed of the bare metalserver, only half the speed of the bare metalserver The network speed of the pod cannot reach the speed of the baremetal, only half the speed of the baremetal server Aug 14, 2023
@sridhartigera
Member

@ming12713 I am not sure if I understand this correctly. Baremetal has 10G NICs and iperf output is ~10Gbps. Am I missing something?

@ming12713
Author

@sridhartigera
Yes, the baremetal servers have 10G network cards, and when testing with iperf directly between the baremetal servers, the speed reaches 10G. However, when pods run on these servers using the Calico plugin, pod-to-pod network testing does not reach 10G. The two pods under test are located on different baremetal servers.

@lwr20
Member

lwr20 commented Aug 24, 2023

Halving of throughput generally indicates MTU issues. Throughput is often limited by the maximum packets per second, and an MTU problem in the path leads to fragmentation, which doubles the number of packets and therefore halves throughput.

See https://docs.tigera.io/calico/latest/networking/configuring/mtu
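
For VXLAN over a 9000-byte underlay, the Calico MTU would typically be the NIC MTU minus the 50-byte VXLAN overhead, i.e. 8950. A sketch of setting it explicitly through the operator Installation resource shown above (note that already-running pods keep their old veth MTU until they are restarted):

  # 9000 - 50 (VXLAN overhead) = 8950
  kubectl patch installation default --type merge \
    -p '{"spec":{"calicoNetwork":{"mtu":8950}}}'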

@ming12713
Author

@lwr20 thanks
I've set the MTU to 8950 and then used iperf to test bandwidth between pods on different nodes. However, the bandwidth still doesn't reach that of the baremetal servers.

(two iperf3 screenshots attached)

@lwr20
Member

lwr20 commented Aug 25, 2023

OK, and what's the MTU of the network interface between the nodes (eth0 or whatever)? And the MTU set on any routers between the nodes?
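
A quick way to answer that on each node (a sketch; the interface name is a placeholder):

  # MTU of all interfaces in brief form; check the 10G NIC and vxlan.calico
  ip -br link

  # Or for a single interface
  cat /sys/class/net/eth0/mtu    # replace eth0 with the real NIC name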

@ming12713
Author

@lwr20 thanks. The baremetal server network interface MTU is 9000; the nodes are connected through a 10G switch, and the switch is at its default MTU of 1500.
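
With the switch at its 1500 default, it is worth verifying whether 9000-byte frames really pass between the nodes. A sketch using ping with the don't-fragment flag (node IPs are placeholders):

  # 8972 = 9000 - 20 (IP header) - 8 (ICMP header), sent with DF set
  ping -M do -s 8972 -c 3 <other-node-IP>

  # For comparison, a probe that fits in a standard 1500 MTU
  ping -M do -s 1472 -c 3 <other-node-IP>

If the large probe fails while the small one succeeds, jumbo frames are not making it through the switch.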

@lwr20
Member

lwr20 commented Aug 29, 2023

That doesn't sound good - if the server has a 9000 MTU, the switch should also have MTU 9000.

But on the other hand, that's the same for both the baremetal and pod-to-pod cases, so it's clearly not the cause of this issue.

It sounds like you're using VXLAN encapsulation (since you mentioned the 8950 MTU setting in Calico). Do you need VXLAN encap at all in this scenario? ISTR there was a recent Linux kernel bug with VXLAN checksum offloading. Can you try without VXLAN to establish whether the problem is related to VXLAN or not?
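
One way to run that experiment with the operator-managed install shown above is to change the pool encapsulation in the Installation resource (a sketch; whether an existing pool can be edited in place depends on the Calico version, so treat this as an illustration of where the knob lives rather than an exact procedure):

  # Change spec.calicoNetwork.ipPools[0].encapsulation, e.g. from
  # VXLANCrossSubnet to IPIPCrossSubnet or None (unencapsulated BGP routing)
  kubectl edit installation default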

@ming12713
Author

ming12713 commented Aug 30, 2023

Yes, I use VXLANCrossSubnet encap; I think I've encountered the same bug you mentioned.
Related issue: #7974

@lwr20
Member

lwr20 commented Aug 30, 2023

I don't think that's the VXLAN checksum offload issue - that's a kernel crash, isn't it? The VXLAN checksum offload issue "just" causes dropped packets (I think).

Based on #4727 (comment), can you try setting featureDetectOverride: "ChecksumOffloadBroken=true" in the default FelixConfiguration and see if that fixes the issue, please?
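
A sketch of applying that override (assuming kubectl can reach the FelixConfiguration resource; calicoctl patch is the equivalent if not):

  kubectl patch felixconfiguration default --type merge \
    -p '{"spec":{"featureDetectOverride":"ChecksumOffloadBroken=true"}}'

  # Verify the setting took effect
  kubectl get felixconfiguration default -o jsonpath='{.spec.featureDetectOverride}'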

@ming12713
Author

@lwr20
I changed the encapsulation from VXLANCrossSubnet to IPIPCrossSubnet, and after testing I found that the bandwidth still couldn't be fully utilized; IPIP performed slightly worse than VXLAN.
(two iperf3 screenshots attached)

@onesb23

onesb23 commented Sep 11, 2023

I don't think that's the VXLAN checksum offload issue, that's a kernel crash, isn't it? The VXLAN checksum offload issue "just" causes dropped packets (I think)

Based on #4727 (comment) Can you try setting featureDetectOverride: "ChecksumOffloadBroken=true" in the default FelixConfiguration and see if that fixes the issue please?

This fixed the Calico VXLAN underspeed issue for me.

@mazdakn
Member

mazdakn commented Sep 19, 2023

@ming12713 have you tried the fix that @lwr20 mentioned above?

@ming12713
Author

@ming12713 have you tried the fix that @lwr20 mentioned above?
No.
