
Do not reject peers if --p2p-max-peers is reached #5588

Closed
q9f opened this issue Apr 23, 2020 · 5 comments · Fixed by #6184
Labels
Blocked Blocked by research or external factors Networking P2P related items

Comments

q9f (Contributor) commented Apr 23, 2020

💎 Issue

Background

I would suggest that --p2p-max-peers should be a soft limit, not a hard limit.

Description

In ETH1 clients, we do something along these lines:

  • if peerCount < maxPeers: do active peer-discovery and accept incoming peers
  • if peerCount >= maxPeers: don't do active peer-discovery but still accept incoming peers

These limits could be implemented as --p2p-min-peers and --p2p-max-peers, where min-peers would be the soft limit (e.g., a target of 30 peers) and max-peers would be the hard limit.

So, the client would try to maintain 30 peers but not reject incoming peers up to a more strictly defined hard cap. Additionally, once the hard limit is reached, it would be nice to send a Goodbye to inform the peer why they were disconnected.
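The proposed split between a soft and a hard limit could be sketched like this (a minimal illustration; `minPeers`, `maxPeers`, `shouldDiscover`, and `acceptInbound` are hypothetical names for this sketch, not Prysm APIs):

```go
package main

import "fmt"

// Hypothetical knobs mirroring the flags proposed in this issue.
const (
	minPeers = 30 // soft limit: stop active discovery here
	maxPeers = 60 // hard limit: reject inbound peers beyond this
)

// shouldDiscover reports whether the node should keep dialing out.
func shouldDiscover(peerCount int) bool {
	return peerCount < minPeers
}

// acceptInbound reports whether an incoming connection should be kept;
// beyond maxPeers the peer would instead receive a Goodbye message.
func acceptInbound(peerCount int) bool {
	return peerCount < maxPeers
}

func main() {
	for _, n := range []int{10, 30, 59, 60} {
		fmt.Printf("peers=%d discover=%v accept=%v\n",
			n, shouldDiscover(n), acceptInbound(n))
	}
}
```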

q9f (Contributor, Author) commented Apr 23, 2020

This is what a Lighthouse peer observes once a Prysm node reached the max peers limit:

Apr 23 11:07:56.026 WARN RPC Error                               Error: Custom("Protocol error: I/O error: unexpected end of file"), request_id: 0, Peer: PeerId("16Uiu2HAm5RX4gAQtwqArBmuuGugUXAViKaKBx6ugDJb1L1RFcpfK"), service: router, service: network
Apr 23 11:07:56.405 WARN RPC Error                               Error: Custom("Protocol error: I/O error: i/o error: Connection reset by peer (os error 104)"), request_id: 0, Peer: PeerId("16Uiu2HAmSu8PFJfvCj2WYdEpikMRYLiNQHR3aXHa1D9dRx5vVnyU"), service: router, service: network

prestonvanloon (Member) commented

Wouldn’t that open up an attack vector? I could create thousands of nodes, and if they all dialed you, your node would peer with all of them. Your node would be overwhelmed by resource consumption.

With regards to disabling active discovery, I don’t think this functionality is available in libp2p configuration so we haven’t taken the time to add it upstream yet. cc: @nisdas

q9f (Contributor, Author) commented Apr 24, 2020

I could create thousands of nodes and your node would peer with all of them if they all dialed you?

No, we discussed this on Discord and I'm sorry for using the terms in a confusing way. Actually, max-peers would not change; it would remain a hard limit, e.g., 60 peers. Every incoming request beyond that limit would kindly be sent home with a Goodbye. #5589

But internally, there could be a soft limit in place, let's call it min-peers. You could decide whether to make it configurable or simply derive the value from max-peers, e.g., by using 50% or 70% of max-peers as the soft limit. This would mean the node stops actively looking for new peers at the soft limit of 30 but still accepts incoming connections.
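A minimal sketch of this policy, under the 50% derivation mentioned above (`deriveSoftLimit` and `handleInbound` are hypothetical helpers for illustration, not actual Prysm code):

```go
package main

import "fmt"

// deriveSoftLimit computes the soft peer target as a percentage of the
// hard cap, per the 50%/70% suggestion in this comment.
func deriveSoftLimit(maxPeers, percent int) int {
	return maxPeers * percent / 100
}

// handleInbound sketches the proposed inbound policy: accept until the
// hard cap is reached, then disconnect with a Goodbye (see #5589).
func handleInbound(peerCount, maxPeers int) string {
	if peerCount >= maxPeers {
		return "goodbye"
	}
	return "accept"
}

func main() {
	soft := deriveSoftLimit(60, 50)
	// At the soft limit the node still accepts; at the hard limit it says goodbye.
	fmt.Println(soft, handleInbound(soft, 60), handleInbound(60, 60)) // 30 accept goodbye
}
```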

The very theoretical worst-case scenario we discussed: imagine a network of 31 nodes, all connected to each other. How would a 32nd node be able to join the network if every node disconnected it immediately?

And the more practical scenario we encountered: trying to set up some local testnet nodes and manually connecting two of them kept failing because the Prysm node quickly filled up all 30 peer slots.

nisdas (Member) commented Apr 24, 2020

@prestonvanloon This would be possible with discoveryV5; it would actually be very simple, so it's not such a big issue. The only place we can't do this is when using the Kademlia DHT, but since that is not part of the mainnet discovery spec, we can ignore it.

I think this is possible to achieve using libp2p's connection manager. It's just that we disabled it before because our nodes were getting overwhelmed by a large number of peers. We needed a strict peer limit so that our nodes did not get exhausted, which the connection manager did not provide. However, Prysm has seen a lot of improvements since then and can now handle a larger number of peers much more easily.

What the connection manager does is manage peers between a low and a high watermark. The low watermark would be your max peer limit, whereas the high watermark would be a value of up to 125% of it. This allows inbound peers to connect to a node that is already at its max limit without being disconnected. The connection manager then slowly prunes excess peers to bring the count back down to the low watermark, essentially keeping the peer count between these two limits.
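The watermark behavior described here could be sketched without libp2p (the `peer` type, the scoring, and the `prune` function are illustrative stand-ins for this discussion, not the real connection manager):

```go
package main

import (
	"fmt"
	"sort"
)

// peer is a minimal stand-in for a libp2p connection entry.
type peer struct {
	id    string
	score int // illustrative usefulness score; lowest-scoring pruned first
}

// prune trims the peer set down to the low watermark once the high
// watermark is exceeded, mirroring how the connection manager is
// documented to behave between its two limits.
func prune(peers []peer, low, high int) []peer {
	if len(peers) <= high {
		return peers // between the watermarks: leave everyone connected
	}
	// Over the high watermark: drop the lowest-scoring peers until
	// only `low` remain.
	sort.Slice(peers, func(i, j int) bool { return peers[i].score > peers[j].score })
	return peers[:low]
}

func main() {
	peers := make([]peer, 0, 30)
	for i := 0; i < 30; i++ {
		peers = append(peers, peer{id: fmt.Sprintf("p%d", i), score: i})
	}
	fmt.Println(len(prune(peers, 20, 25))) // 20
}
```

With low=20 and high=25, a set of 30 peers would be pruned back to 20, while a set of 24 peers would be left untouched.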

This is, in theory, how the connection manager should work; however, we did face a lot of issues using libp2p's connection manager in Prysm, which is why we opened these issues:
libp2p/go-libp2p-core#99
libp2p/go-libp2p-swarm#146

Previously, we had no way to disable active discovery with the Kademlia DHT in Prysm, which was our main problem; with discoveryV5 it is much simpler to implement. Two potential solutions to resolve this:

  1. Use the connection manager again to primarily handle connections with other peers instead of enforcing a hard limit.
  2. If libp2p's connection manager again proves difficult for us to use, implement a simplified version of it specifically for Prysm.

@nisdas nisdas added the Networking P2P related items label Apr 24, 2020
nisdas (Member) commented Apr 24, 2020

To give an update: I tried running libp2p's connection manager with a max limit of 20 and a high watermark of 25, and the number of connected peers ended up ballooning to over 100...

So the connection manager is pretty much ineffective here and does not do active pruning very well, or at all for that matter. Any solution would have to be a custom connection manager that manages peers dynamically without any static hard limit.

@shayzluf shayzluf added the Blocked Blocked by research or external factors label Apr 27, 2020