TCP libp2p port not responding at times #5004

Closed
jakubgs opened this issue May 30, 2023 · 6 comments

@jakubgs
Member

jakubgs commented May 30, 2023

Describe the bug
On our test fleets, the libp2p TCP port intermittently appears not to respond:

[email protected]:~ % PORTS=$(awk -F'[= ]+' '/tcp-port/{printf "%d,", $3}' /etc/systemd/system/beacon-node-mainnet-*.service)
[email protected]:~ % sudo nmap -Pn -p$PORTS localhost
Starting Nmap 7.80 ( https://nmap.org ) at 2023-05-30 12:57 UTC
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000042s latency).

PORT     STATE    SERVICE
9000/tcp filtered cslistener
9001/tcp filtered tor-orport
9002/tcp filtered dynamid
9003/tcp open     unknown
9004/tcp open     unknown
9005/tcp open     golem

Nmap done: 1 IP address (1 host up) scanned in 1.34 seconds
[email protected]:~ % sudo nmap -Pn -p$PORTS localhost
Starting Nmap 7.80 ( https://nmap.org ) at 2023-05-30 12:57 UTC
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000020s latency).

PORT     STATE    SERVICE
9000/tcp open     cslistener
9001/tcp filtered tor-orport
9002/tcp filtered dynamid
9003/tcp open     unknown
9004/tcp open     unknown
9005/tcp open     golem

Nmap done: 1 IP address (1 host up) scanned in 2.41 seconds

But it's not consistent: sometimes they do respond, and then they do not.

Additional context
No firewall issues have been found; this appears to be an issue with the nodes themselves.
I have observed the behavior in both stable and testing, but not in unstable so far.

@Menduist
Contributor

Menduist commented Jun 8, 2023

The node stops accepting new connections when it is full.

@jakubgs
Member Author

jakubgs commented Jun 12, 2023

Okay:

  1. What do you mean by "full"? Are you referring to the max peers limit? Because I've seen this on nodes that are not at the limit.
  2. What do you mean by "stops accepting new connections"? Do you mean they reject connections or just ignore them?

@Menduist
Contributor

Menduist commented Jun 12, 2023

Yes, once they reach --max-peers, they will ignore connections. To be more precise, they won't call accept, so each new connection stays queued in the kernel for as long as the other side is happy to wait; most probably it will time out after a while.
Though, if you've seen it on nodes which are not at their limit, that might be an actual issue.
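
For illustration, here is a minimal Python sketch of what "not calling accept" looks like from the client side. It is not nim-libp2p code: the function name `connect_without_accept`, the backlog of 1, and the timeout are all assumptions chosen to make the effect easy to see. The listener never calls accept(), so the kernel completes handshakes only until its queue is full.

```python
import socket


def connect_without_accept(attempts: int = 4, timeout: float = 0.3) -> list:
    """Connect repeatedly to a listener that never calls accept().

    Returns one result per attempt: "open", "no-response" or "refused".
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)               # tiny backlog so the queue fills quickly
    port = listener.getsockname()[1]

    held = []     # keep client sockets open so they stay in the kernel queue
    results = []
    try:
        for _ in range(attempts):
            client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            client.settimeout(timeout)
            try:
                client.connect(("127.0.0.1", port))
            except (socket.timeout, TimeoutError):
                # Handshake never completed: the listener's queue is full.
                results.append("no-response")
                client.close()
            except ConnectionError:
                # Some platforms refuse or reset instead of staying silent.
                results.append("refused")
                client.close()
            else:
                # The kernel completed the handshake without any accept().
                results.append("open")
                held.append(client)
    finally:
        for client in held:
            client.close()
        listener.close()
    return results


if __name__ == "__main__":
    print(connect_without_accept())
```

The first attempt succeeds even though accept() is never called. How many further attempts succeed before they start timing out depends on the operating system and kernel version, so the exact sequence is not guaranteed. On Linux a full queue typically leaves the SYN unanswered, which would be consistent with nmap reporting the port as filtered.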

@jakubgs
Member Author

jakubgs commented Jun 12, 2023

Doesn't seem related to max peers limit to me:

[email protected]:~ % s cat beacon-node-mainnet-stable-01 | grep -e max-peers -e metrics-port -e tcp-port   
    --tcp-port=9000 \
    --max-peers=320 \
    --metrics-port=9200 \
[email protected]:~ % sudo nmap -Pn -p9000 localhost                                                        
Starting Nmap 7.80 ( https://nmap.org ) at 2023-06-12 08:54 UTC
Nmap scan report for localhost (127.0.0.1)
Host is up.

PORT     STATE    SERVICE
9000/tcp filtered cslistener

Nmap done: 1 IP address (1 host up) scanned in 2.12 seconds
[email protected]:~ % curl -s 0:9200/metrics | grep '^libp2p_peers '                                        
libp2p_peers 25.0

Clearly not at the limit, and the port seems unresponsive to TCP handshakes.
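
The same comparison can be scripted. This is a minimal Python sketch, assuming the metrics endpoint returns Prometheus text format with an unlabelled `libp2p_peers` gauge, as in the output above. The helper names `read_gauge` and `at_peer_limit` are hypothetical, and the sample values are the ones from this comment.

```python
def read_gauge(metrics_text: str, name: str):
    """Return the value of an unlabelled Prometheus gauge, or None if absent."""
    for line in metrics_text.splitlines():
        if line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        parts = line.split()
        # Require an exact name match, like grep '^libp2p_peers ' does.
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None


def at_peer_limit(metrics_text: str, max_peers: int) -> bool:
    """True if the reported peer count has reached the configured limit."""
    peers = read_gauge(metrics_text, "libp2p_peers")
    return peers is not None and peers >= max_peers


if __name__ == "__main__":
    sample = "libp2p_peers 25.0\n"     # value from the curl output above
    print(at_peer_limit(sample, 320))  # --max-peers=320 from the service file
```

With 25 peers against a limit of 320 this prints `False`, matching the conclusion that the node is not full.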

@Menduist
Contributor

Menduist commented Jun 14, 2023

Probable fix: vacp2p/nim-libp2p#916

@Menduist
Contributor

#5102 seems to have fixed this for good; please re-open if it happens again.
