Issue with delay between HTTP Get and response more than ~26 second #2364

Closed
surfinlemex opened this issue Mar 29, 2023 · 60 comments

@surfinlemex

In our two-cluster deployment we can observe several identical Submariner connections.
All of them are Active and diagnose did not detect any issues.
However, we have an issue with TCP session delay for connections between two pods in different clusters.
Can you please let us know whether the configuration is correct when we see more than one connection, as shown on the screenshot in the attachment?

Can we somehow delete the extra connections?

image

@skitt
Member

skitt commented Mar 29, 2023

These are all the same connection (same gateway, same cluster, same remote IP etc.) — I’m guessing you have multiple contexts defined in your kubeconfig; subctl show connections tries all available contexts to show the connections.
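For example, you can list the contexts subctl iterates over and then limit the output by passing a single one explicitly (cluster1 below is a placeholder for one of your context names):

kubectl config get-contexts
subctl show connections --context cluster1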

@surfinlemex
Author

Yes, you're right, the number of connections matches the number of contexts.
In turn, I would like to ask whether these duplicate connections can affect TCP session packet delay.
We have an issue with a ~26 second delay between HTTP GET and response when we work via Submariner.
We also see DUP and out-of-order packets in that session.

image

@skitt
Member

skitt commented Mar 29, 2023

There is only one connection, you’re seeing it reported multiple times through the different contexts.

Since your traffic goes through this connection (an IPsec tunnel), there can be an impact on the transmission, although since your average RTT is very low (0.5ms) I wouldn’t expect it to be noticeable. I also wouldn’t expect packets to be delivered out-of-order or duplicated, since the IPsec protocol goes to great lengths to avoid that. Perhaps there’s something going wrong between the nodes and the gateway; @sridhargaddam, @aswinsuryan, what do you think?

@surfinlemex
Author

There is a video showing what our issue looks like; see the attachment.
We have captured traffic in pcap files from the pods on both cluster sides.
I can provide them if needed.

submariner.mp4

@surfinlemex
Author

Can you please have a look at the captured traffic from both sides?
There is a delay of around 26 seconds between the 3-way handshake and the HTTP GET request.
At the same time ICMP traffic works fine without any retransmissions or packet loss; only TCP traffic is affected.

image
image

@sridhargaddam
Member

This is weird. Can you share the following details?

Environment:

  • Diagnose information (use subctl diagnose all):
  • Gather information (use subctl gather):
  • Cloud provider or hardware configuration:

Also, please let us know the CNI that you are using in your setup.

Note: While running the above commands, you can pass the admin context; otherwise the same information is captured for ALL the contexts. For example: subctl show all --context ...
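For instance, assuming the admin context is literally named admin, the requested outputs could be collected per context with:

subctl diagnose all --context admin
subctl gather --context admin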

@surfinlemex
Author

You can find the requested information in the text files in the attachment.
As for the CNI, it's the OpenShiftSDN network plugin.

We use physical hardware for our OpenShift deployment; it's shown in the diagram.
Diagram

amsoce02dr-is01.txt
amsoce02dr-is01-gather.txt
ams-oce2-ocp-is01.txt
ams-oce2-ocp-is01-gather.txt

@sridhargaddam
Member

Thank you for sharing the details. The output of "subctl diagnose all" does not show any errors or warnings.
So, we have to look elsewhere. In order to understand whether there are any errors in the various Submariner pods, we would need the logs from the pods, which are stored in a directory shown in the last line of the "subctl gather ..." output.

For example:

<SNIP>
...
...
 ✓ Found 0 deployments by label selector "app=submariner-networkplugin-syncer" in namespace "submariner-operator"
 ✓ Found 1 deployments by label selector "app=submariner-lighthouse-agent" in namespace "submariner-operator"
 ✓ Found 1 deployments by label selector "app=submariner-lighthouse-coredns" in namespace "submariner-operator"
Files are stored under directory "submariner-20230331084001/oce02dr"

For each cluster, please tar the contents of the directory shown in the output of "subctl gather" and attach it to this issue.
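For example, using the directory name from the sample output above (adjust it to whatever subctl gather printed on your run):

tar -czf oce02dr-gather.tar.gz submariner-20230331084001/oce02dr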

Also, may I know the OpenShift version you are using in your setup?

@yboaron
Contributor

yboaron commented Apr 3, 2023

+1 to what @sridhargaddam suggested.

In addition, from the pcap files attached above, it seems that the TCP MSS values are asymmetric (1410 and 1160). I would also recommend making sure that you are not running into any MTU-related issue. For this you can run subctl verify with a small packet size (check [1]); this option is available in the latest subctl version.

[1]
subctl verify --context cluster1 --tocontext cluster2 --only connectivity --packet-size 500
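Independently of subctl, a quick way to probe the path MTU between two nodes is a ping with the don't-fragment bit set; the ICMP payload size plus 28 bytes of headers gives the packet size on the wire (the address and size below are placeholders):

ping -M do -s 1372 -c 3 <remote_node_ip>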

@surfinlemex
Author

Looks like we have an issue with the subctl verify connectivity test.
Please have a look at the screenshot below.
image

@sridhargaddam
Member

Following is the command you executed.

subctl verify --context admin --tocontext admin --only connectivity

As you can see, the context names are the same, hence you are getting the error. Please rename the context of the second cluster from admin to some other name and then re-run the command.
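One way to do that, assuming the second cluster has its own kubeconfig file, is kubectl's rename-context command, e.g.:

kubectl config rename-context admin cluster2-admin --kubeconfig /path/to/cluster2-kubeconfig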

Following is the sample output of contexts from a KIND setup.

[sgaddam@localhost submariner]$ export KUBECONFIG=output/kubeconfigs/kind-config-cluster1:output/kubeconfigs/kind-config-cluster2
[sgaddam@localhost submariner]$ kubectl config get-contexts
CURRENT   NAME       CLUSTER    AUTHINFO   NAMESPACE
*         cluster1   cluster1   cluster1   
          cluster2   cluster2   cluster2 

@surfinlemex
Author

Please have a look at the screenshot.
image

@surfinlemex
Author

Please let me know whether that test was informative for you.
Can we interrupt it now?
image

@surfinlemex
Author

This is a proper connectivity test between two contexts in different clusters.
verify connection.txt

@surfinlemex
Author

@sridhargaddam have you had a chance to look at the log file with the connection test?

@surfinlemex
Author

@sridhargaddam Please let me know whether we can arrange an online call to troubleshoot our Submariner issue.

@sridhargaddam
Member

This is a proper connectivity test between two contexts in different clusters verify connection.txt

From the attached logs, I can see 5 failures.

Summarizing 5 Failures:

[Fail] [dataplane] Basic TCP connectivity tests across clusters without discovery when a pod connects via TCP to a remote pod when the pod is not on a gateway and the remote pod is not on a gateway [It] should have sent the expected data from the pod to the other pod
github.com/submariner-io/[email protected]/test/e2e/framework/network_pods.go:187

[Fail] [dataplane] Basic TCP connectivity tests across clusters without discovery when a pod connects via TCP to a remote service when the pod is not on a gateway and the remote service is not on a gateway [It] should have sent the expected data from the pod to the other pod
github.com/submariner-io/[email protected]/test/e2e/framework/network_pods.go:187

[Fail] [dataplane] Basic TCP connectivity tests across clusters without discovery when a pod connects via TCP to a remote service when the pod is not on a gateway and the remote service is on a gateway [It] should have sent the expected data from the pod to the other pod
github.com/submariner-io/[email protected]/test/e2e/framework/network_pods.go:187

[Fail] [dataplane] Basic TCP connectivity tests across clusters without discovery when a pod with HostNetworking connects via TCP to a remote pod when the pod is not on a gateway and the remote pod is not on a gateway [It] should have sent the expected data from the pod to the other pod
github.com/submariner-io/[email protected]/test/e2e/framework/network_pods.go:188

[Fail] [dataplane] Basic TCP connectivity tests across clusters without discovery when a pod connects via TCP to a remote pod in reverse direction when the pod is not on a gateway and the remote pod is not on a gateway [It] should have sent the expected data from the pod to the other pod
github.com/submariner-io/[email protected]/test/e2e/tcp/connectivity.go:69

Gateway-to-Gateway connectivity is working fine and all the failures occur when the pod is scheduled on a non-Gateway node. So, I suspect that there is some MTU issue which is creating problems for TCP traffic. As @yboaron suggested, let's try with a lower packet size.

Note: The --packet-size option for subctl verify .... is only available in devel, as it was added as part of the current Submariner release (0.15.0). Just for testing, we can use the 0.15.0-rc0 image.

curl -Ls https://get.submariner.io | VERSION=0.15.0-rc0 bash
export PATH=$PATH:~/.local/bin
echo export PATH=\$PATH:~/.local/bin >> ~/.profile

subctl verify --context oce02dr-gg/api-oce02dr-okd:6443/system:admin  --tocontext amsoce02-gg/api-amsoce02-okd:6443/system:admin --only connectivity --packet-size 500

If the output of the above command is success and all the tests PASS, then it implies that there is indeed an MTU issue.

@sridhargaddam
Member

Looks like you are not using the right version of subctl. Please follow the instructions I shared above on how to download the subctl version 0.15.0-rc0 and use the same version to run the verify command.

@surfinlemex
Author

With the new subctl, all the connectivity tests were successful.
I ran them with different packet sizes. See the screenshots.
image
image

@sridhargaddam
Member

Okay, so it's indeed an MTU issue. It looks like the underlay network connecting the clusters is adding some protocol overhead and the default MTU configured on the interfaces is not sufficient.

Using standard tools, please check the proper MTU between a non-Gateway node and a remote cluster non-Gateway node, and once you derive the value, you can force the MTU as shown below.

Add the following annotation on the Gateway nodes of both the clusters.
e.g. kubectl annotate node <node_name> submariner.io/tcp-clamp-mss=

After adding the annotation, restart the submariner-routeagent pods.

kubectl delete pod -n submariner-operator -l app=submariner-routeagent

This should fix the latency issue you are seeing with TCP packets.
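As a rough, illustrative calculation (the numbers are hypothetical): if the largest ICMP payload that passes between the non-Gateway nodes with the DF bit set is 1172 bytes, the path MTU is 1172 + 28 = 1200, and subtracting 40 bytes of IPv4 + TCP headers gives a clamp value of 1200 - 40 = 1160, i.e.:

kubectl annotate node <gw_node_name> submariner.io/tcp-clamp-mss=1160
kubectl delete pod -n submariner-operator -l app=submariner-routeagent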

@surfinlemex
Author

@sridhargaddam I suppose the issue is on the Gateway nodes or in Libreswan. Please have a look at the traceroute screenshots.
image
image

@surfinlemex
Author

non-Gateway node MTU
image

Gateway node MTU
image

non-Gateway node DR cluster
image

Gateway node DR cluster
image

Both clusters are connected to the same L3 switch, and the MTU on all switch ports is 9032.

@surfinlemex
Author

Routing table on Gateway node
image

Routing table on DR cluster Gateway node
image

@surfinlemex surfinlemex changed the title How to remove extra submariner connections? Issue with delay between HTTP Get and responce more than ~26 second Apr 13, 2023
@surfinlemex surfinlemex changed the title Issue with delay between HTTP Get and responce more than ~26 second Issue with delay between HTTP Get and response more than ~26 second Apr 13, 2023
@surfinlemex
Author

I just redeployed all route agents in both clusters; however, the issue wasn't resolved.
image

@surfinlemex
Author

surfinlemex commented Apr 13, 2023

This is an MTU size test with fragmentation denied between Gateway nodes in different clusters.
image

@yboaron
Contributor

yboaron commented Apr 13, 2023

@surfinlemex Did you annotate the GW nodes on both clusters with the desired TCP_MSS value as Sridhar mentioned?

you should apply

kubectl annotate node <gw_node_name> submariner.io/tcp-clamp-mss=1200

On GWs in both clusters.

and then restart all route_agent pods by running:
kubectl delete pod -n submariner-operator -l app=submariner-routeagent

If the above steps don't help, please attach the latest Submariner logs (subctl gather).
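Assuming the clamp is implemented as an iptables TCPMSS rule (the usual way MSS clamping is done), you could also confirm from the GW node or a route agent pod that a rule with your value was actually programmed, e.g.:

iptables -t mangle -S | grep -i tcpmss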

@yboaron
Contributor

yboaron commented Apr 27, 2023

Thanks for uploading the pcap files @surfinlemex, I'll try to check them in the next few days.

@yboaron
Contributor

yboaron commented Apr 30, 2023

Could you please add the iptables rules in [1] on the non-GW and GW nodes in both clusters and rerun the curl test?

You can use the following steps to install the iptables rules on a node:
A. Find the submariner-routeagent-xxxxx pod that runs on the relevant node; you can use [2] for that
B. Exec into the submariner-routeagent pod, see [3]
C. Apply the iptables rules

[1]

iptables -t raw -I OUTPUT -s <local_cidr> -d <remote_cidr> -j NOTRACK
iptables -t raw -I OUTPUT -s <remote_cidr> -d <local_cidr> -j NOTRACK
iptables -t raw -I PREROUTING -s <local_cidr> -d <remote_cidr> -j NOTRACK
iptables -t raw -I PREROUTING -s <remote_cidr> -d <local_cidr> -j NOTRACK

[2]

$ kubectl get pods -n submariner-operator -o wide
NAME                                             READY   STATUS    RESTARTS   AGE    IP           NODE                     NOMINATED NODE   READINESS GATES
submariner-gateway-c7qm2                         1/1     Running   0          7d1h   172.23.0.6   cluster2-worker          <none>           <none>
submariner-globalnet-sljvr                       1/1     Running   0          7d1h   172.23.0.6   cluster2-worker          <none>           <none>
submariner-lighthouse-agent-55655c9b6-7nqc7      1/1     Running   0          7d1h   10.129.1.5   cluster2-worker          <none>           <none>
submariner-lighthouse-coredns-749b7cf9c4-7jx2r   1/1     Running   0          7d1h   10.129.1.6   cluster2-worker          <none>           <none>
submariner-lighthouse-coredns-749b7cf9c4-gzp68   1/1     Running   0          7d1h   10.129.1.7   cluster2-worker          <none>           <none>
submariner-metrics-proxy-qsnll                   2/2     Running   0          7d1h   10.129.1.4   cluster2-worker          <none>           <none>
submariner-operator-56ccb54ccf-fg6bs             1/1     Running   0          7d1h   10.129.1.3   cluster2-worker          <none>           <none>
submariner-routeagent-kxf76                      1/1     Running   0          7d1h   172.23.0.5   cluster2-control-plane   <none>           <none>
submariner-routeagent-ztzsf                      1/1     Running   0          7d1h   172.23.0.6   cluster2-worker          <none>           <none>

[3]
$ kubectl exec -it -n submariner-operator submariner-routeagent-kxf76 /bin/sh
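To confirm the rules are in place afterwards, listing the raw table from inside the same pod should show them together with their packet counters:

$ iptables -t raw -nvL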

@surfinlemex
Author

You can see that we already have all those rules. Unfortunately, the result is the same:
image

iptables on the GW node and the non-GW node (with the pod) in both clusters:
image
image

@yboaron
Contributor

yboaron commented May 2, 2023

Thanks @surfinlemex .

Trying to summarize the case:

What's the problem?

  • Inter-cluster ICMP (ping) is fine, while the TCP connectivity test shows a delay of ~25 seconds
  • subctl verify e2e tests pass successfully

What have you tried so far?

  • Setting the MSS clamping value (submariner.io/tcp-clamp-mss) - didn't help
  • Disabling connection tracking (using iptables NOTRACK) for Submariner traffic - didn't help

cc @sridhargaddam

@surfinlemex
Author

@yboaron nothing has helped... neither TCP-MSS 1000 nor NOTRACK for iptables...

I have only one theory left... this issue is related to the CNI = OpenShift SDN = OVS. If you check the captured traffic you can see inter-cluster traffic packets on the external NIC.

@sridhargaddam
Member

@surfinlemex I strongly suspect this is some platform configuration issue.

Let's try to narrow down the problem.
Please run the following command on the following Baremetal nodes:

  1. The Gateway node of BOTH the clusters
  2. The node on which the nginx server pod is scheduled.
  3. The node on which the client pod is scheduled from where the curl command is issued to the server.

Command:

ethtool --offload vx-submariner rx off tx off

After executing the above command, you can try the curl command once again and let us know the behavior.
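If you want to record the current offload settings before changing them (handy for reverting later), ethtool can display them, e.g.:

ethtool -k vx-submariner | grep -i checksum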

@surfinlemex
Author

@sridhargaddam Looks like the change you suggested resolved the issue.
See the screenshot below:
image

@sridhargaddam
Member

@surfinlemex that's great to hear!!

Can you let us know the following details from the BareMetal platform?

  1. OS details
  2. Linux kernel version
  3. OpenShift version
  4. Iptables version.

@surfinlemex
Author

surfinlemex commented May 3, 2023

1. OS details:
cat system-release
Red Hat Enterprise Linux CoreOS release 4.10

2. Linux kernel:
Linux ams-oce2-ocp-is01 4.18.0-305.3.1.el8.x86_64 #1 SMP Tue Jun 1 16:14:33 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

3. OpenShift version:
oc version
Client Version: 4.10.0-202209291217.p0.g442535c.assembly.stream-442535c

4. Iptables version:
iptables --version
iptables v1.8.4 (nf_tables)

image

@maayanf24 maayanf24 moved this to Todo in Submariner 0.16 May 3, 2023
@yboaron yboaron self-assigned this May 3, 2023
@sridhargaddam sridhargaddam self-assigned this May 3, 2023
@surfinlemex
Author

@yboaron Can you please provide an RCA of the issue we have?
It looks like the command ethtool --offload vx-submariner rx off tx off is just a workaround.

@yboaron
Contributor

yboaron commented May 9, 2023

Well, I can see multiple UDP packets with bad checksums captured in the pcap files you attached [1].

I assume that the root cause is some infrastructure issue (kernel or NIC firmware) causing bad checksum calculation for UDP packets, as multiple similar issues have also been reported in other projects [2].

[1]
image

[2]
kubernetes-sigs/kubespray#8992
projectcalico/calico#3145
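For reference, one way to spot such packets on a node is to let tcpdump verify checksums in verbose mode and grep for the failures, for example:

tcpdump -i any -vv udp 2>/dev/null | grep -i 'bad udp cksum'

Keep in mind that on the sending host, offloaded packets can show up as "bad" in a local capture simply because the NIC fills in the checksum after the capture point.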

@stale

stale bot commented Sep 17, 2023

This issue has been automatically marked as stale because it has not had activity for 60 days. It will be closed if no further activity occurs. Please make a comment if this issue/pr is still valid. Thank you for your contributions.

@stale stale bot added the wontfix (This will not be worked on) label Sep 17, 2023
@dfarrell07
Member

I think @yboaron said he hopes to take a look at this when he has cycles

@stale stale bot removed the wontfix (This will not be worked on) label Oct 17, 2023
@skitt skitt removed the wontfix (This will not be worked on) label Oct 17, 2023
@skitt skitt removed this from Submariner 0.16 Oct 17, 2023
@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Feb 15, 2024
@github-actions github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Feb 23, 2024
@github-project-automation github-project-automation bot moved this from Todo to Done in Submariner 0.17 Feb 23, 2024
7 participants