
Add functionality for advanced control over peer set #6097

Open
whyrusleeping opened this issue Mar 18, 2019 · 38 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature topic/connection-manager Issues related to Swarm.ConnMgr (connection manager)

Comments

@whyrusleeping
Member

I'd really like the ability to specify peer IDs in my config file that my node will always try to remain connected to. Additionally, it would be nice to specify strategies for these peers: "always", which attempts to reconnect on any disconnect (respecting backoff rules); "preferred", which never closes a connection to that peer but doesn't necessarily try to hold one open; and maybe one more that just increases the likelihood the connection manager will hold the connection open, but still allows it to be closed as needed.

This will enable me to keep connections open to friends' machines, making transfers between us much more reliable.

@whyrusleeping whyrusleeping added kind/enhancement A net-new feature or improvement to an existing feature topic/connection-manager Issues related to Swarm.ConnMgr (connection manager) labels Mar 18, 2019
@whyrusleeping
Member Author

Additionally, the infra team wants this to hold connections between the gateways and our pinning servers open.

@raulk
Member

raulk commented Mar 18, 2019

This will percolate to the connection manager (current, interim and future).

@brianmcmichael

+1

@brianmcmichael

This feature would mean that I can set up a gateway node that could 'permanently' connect to a pin storage node and speed up propagation when the machine holding the content is known.

@obo20

obo20 commented Mar 27, 2019

We would also benefit greatly from this type of feature. We're currently running an automated "swarm connect {gatewayAddr}" on our nodes every 5 minutes or so to keep these connections open.

Having an official "supported" way of keeping nodes connected would be amazing.
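
For illustration, that kind of workaround might look like this as a cron entry (the gateway multiaddr is a placeholder):

# re-dial the gateway every 5 minutes so the connection survives trims and restarts
*/5 * * * * ipfs swarm connect /dns4/gateway.example.com/tcp/4001/p2p/QmExamplePeerID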

@whyrusleeping
Member Author

@Stebalien @raulk A pretty quick and non-invasive way of doing this would be to add a list of peers to the connection manager that lets it handle the ‘preferred’ case (not closing the connection). Then, in go-ipfs we can have a little process that is fed a list of ‘always’ peers that it dials, and listens for disconnects from.
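
A minimal sketch of that "little process" against today's go-libp2p interfaces (the import paths are the current ones; the tag name and the fixed re-dial delay are arbitrary):

package peering

import (
	"context"
	"time"

	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/network"
	"github.com/libp2p/go-libp2p/core/peer"
)

// keepConnected covers both cases sketched above: it protects pid in the
// connection manager (the 'preferred' case) and re-dials whenever the
// connection drops (the 'always' case).
func keepConnected(ctx context.Context, h host.Host, pid peer.ID) {
	h.ConnManager().Protect(pid, "always-peer")
	h.Network().Notify(&network.NotifyBundle{
		DisconnectedF: func(_ network.Network, c network.Conn) {
			if c.RemotePeer() != pid {
				return
			}
			go func() {
				// fixed delay for brevity; a real version would respect backoff rules
				time.Sleep(5 * time.Second)
				// addresses are assumed to still be in the peerstore
				_ = h.Connect(ctx, peer.AddrInfo{ID: pid})
			}()
		},
	})
}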

@brianmcmichael

Geth has the "static nodes" feature which may be useful as a development/design pattern for go-ipfs.

https://github.com/ethereum/go-ethereum/wiki/Connecting-to-the-network#static-nodes

@raulk
Member

raulk commented Mar 27, 2019

@whyrusleeping let's do that quickly. Connection manager v2 proposal is in the works, but there's no reason we cannot implement a protected set. I'll work on a patch.

@raulk
Member

raulk commented Mar 27, 2019

@whyrusleeping

like "always", which attempts to reconnect on any disconnect (respecting backoff rules),

Do you think libp2p should take care of reestablishing the connection? This would require changes in the host and/or the swarm, e.g. when you call host.Connect() you could specify a supervision policy for that connection.

"preferred" which never closed a connection to that peer, but doesnt necessarily try and hold open a connection,

See libp2p/go-libp2p-interface-connmgr#14 and libp2p/go-libp2p-connmgr#36.

and maybe one more that just increases the likelihood the conn manager will hold open the connection, but still allows it to be closed as needed.

In the current connection manager, this affinity can be achieved by setting a higher score on that connection via a separate tag, e.g. "peer_affinity".
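
A sketch of that tag-based affinity (the tag name and weight are arbitrary; TagPeer/UntagPeer are part of the connmgr interface):

package peering

import (
	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/peer"
)

// setAffinity biases the connection manager toward keeping pid connected:
// a higher aggregate tag score makes the connection less likely to be
// trimmed, but unlike Protect() it can still be closed under pressure.
func setAffinity(h host.Host, pid peer.ID) {
	h.ConnManager().TagPeer(pid, "peer_affinity", 100)
}

// clearAffinity removes that bias again.
func clearAffinity(h host.Host, pid peer.ID) {
	h.ConnManager().UntagPeer(pid, "peer_affinity")
}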

@obo20

obo20 commented Mar 28, 2019

@Stebalien The issue occurring in #6145 may or may not be relevant to this ticket.

@lanzafame
Contributor

It would be awesome if this could be exposed via an API as well as configuration, as this would allow Cluster to dynamically protect connections between IPFS nodes as they join a cluster.

@raulk
Member

raulk commented Mar 29, 2019

I've added the Protect()/Unprotect() API to the connection manager, available in gomod version v0.0.3.

Please take it out for a spin and report back.

You should be unblocked now to make progress with this; do shout out if you think otherwise.
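
Usage is roughly as follows (a sketch; the tag string is chosen by the caller, a peer stays protected until every tag is removed, and Unprotect reports whether the peer remains protected under other tags):

package peering

import (
	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/peer"
)

// protectPeer exempts pid from connection-manager trimming under the
// given tag.
func protectPeer(h host.Host, pid peer.ID, tag string) {
	h.ConnManager().Protect(pid, tag)
}

// unprotectPeer lifts that exemption and reports whether the peer is
// still protected by way of some other tag.
func unprotectPeer(h host.Host, pid peer.ID, tag string) bool {
	return h.ConnManager().Unprotect(pid, tag)
}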

@obo20

obo20 commented Apr 1, 2019

@raulk @whyrusleeping How does this work with regard to spam protection for high-profile nodes? For example, almost everybody would probably love to stay connected to the official ipfs.io gateway nodes given the chance. However, it's obviously infeasible for the official ipfs.io nodes to stay connected to that many nodes all the time, which could result in an overwhelming number of disconnects and reconnection attempts.

Would the backoff rules cover this edge case? I just wanted to double check that this doesn't accidentally bring your infrastructure to a standstill.

@raulk
Member

raulk commented Apr 1, 2019

@obo20 dialer backoff rules wouldn't cover that case, as presumably the dials would succeed.

While I think it's legitimate for everybody to want to stay connected to ipfs.io, that shouldn't be the case and it's not the desired architecture. In other words: IPFS is neither a hub-and-spoke nor a federated model.

Gateways are able to discover content across a decentralised network; if that proves dysfunctional, we should dig into that.

@raulk
Member

raulk commented Apr 1, 2019

@obo20 from the viewpoint of a libp2p node, it's legitimate to strive to keep a connection alive with peer A if you consider it high-value. Peer A also has resource management in place, and will eventually prune connections it considers low-value. If many peers deem peer A as high-value, they will eventually compete for its resources. If the protocol manages reputation/scoring well (e.g. bitswap), peer A will retain the highest performing peers.

@obo20

obo20 commented Apr 1, 2019

@raulk I may have miscommunicated my concern.

My worry is: if, say, the ipfs.io gateway nodes have a high-water mark of 2000 (I'm making this number up) and 3000 other nodes on the network want a "protected" swarm connection to those nodes, how would this be handled?

@obo20

obo20 commented Apr 1, 2019

@obo20 from the viewpoint of a libp2p node, it's legitimate to strive to keep a connection alive with peer A if you consider it high-value. Peer A also has resource management in place, and will eventually prune connections it considers low-value. If many peers deem peer A as high-value, they will eventually compete for its resources. If the protocol manages reputation/scoring well (e.g. bitswap), peer A will retain the highest performing peers.

So from my interpretation of this comment, the high-profile nodes would just prune excess peers even if every single one of those peers had marked them as "protected"? This sounds good from the perspective of the high-profile node.

How do the nodes that have deemed this connection "protected" act when they've been pruned? Do they attempt to frequently reconnect, or do they just accept that they've been pruned and move on? (This may be more of an IPFS implementation question than a libp2p question.)

@raulk
Member

raulk commented Apr 1, 2019

How do the nodes who have deemed this connection "protected" act when they've been pruned?

They just see the connection die. The application (e.g. IPFS) can then attempt to reconnect, and the other party will accept the connection and keep it alive until the connection manager prunes it again.

Note that Bitswap is not proactively managing reputation/scoring AFAIK. I'm sure a PR there would probably be well-received.

@Mikaela
Contributor

Mikaela commented Apr 16, 2019

I noticed that this is partially in the changelog, but are these important connections remembered anywhere yet, or is that still upcoming, as described in the original issue?

My use case is three nodes, one of which is online almost 24/7. To avoid killing routers they have small connection limits, and while they have each other as bootstrap nodes, after running for a while they forget all about each other.

When I pin something on one, I likely also want to pin it on the others, and that is slow unless I run ipfs swarm connect myself (I guess the changelog means I will have to run that less often). As they aren't all online 24/7, I think the suggested "preferred" flag would fit my use case, since the nodes connect to each other mainly through the Yggdrasil network and have static addresses within it.

@whyrusleeping
Member Author

@raulk what @obo20 is pointing out is that if everyone decides to add the ipfs gateways to their protected connection set, the gateways will get DoSed with connections. What we need to prevent this is the 'disconnect' protocol so the gateways can politely ask peers to disconnect, and have those peers not immediately try to reconnect.

Sure, malicious peers can always ignore that, but we want normal well behaved peers to not accidentally DoS things.

@Mikaela
Contributor

Mikaela commented Apr 16, 2019

if everyone decides to add the ipfs gateways to their protected connection set

Would there be any point in this, or is it just fear of users not getting it, or am I not getting it? I am not using the IPFS.io gateway (I use ipns.co), but if users request my content from IPFS.io a lot, won't it be fast due to caching anyway, regardless of whether my node is currently connected to the gateways?

@whyrusleeping
Member Author

is this just fear of users not getting it

Basically this, yeah. Network protections in systems like this shouldn't have to rely on clients behaving properly. Adding a disconnect protocol still relies on clients behaving properly, but it's an additional step (circumventing the disconnect protocol would be deliberate, while force-connecting to the gateway nodes is more of a configuration mistake).

@Mikaela This feature isn't complete yet; it's just the ephemeral important connections currently. Persistence should be coming soon (at this point I think all it takes is adding it to the config file and wiring that through).

@jbenet
Member

jbenet commented Jun 29, 2019

proposing new commands

Single Peer -- Keep connections to specific peers

Use a list of peers to stay connected to all the time.

command name options (options to seed ideas -- i don't love any of these names :D)

# choose one
ipfs swarm bind    [list | add | rm]
ipfs swarm peer    [list | add | rm]
ipfs swarm link    [list | add | rm]
ipfs swarm bond    [list | add | rm]
ipfs swarm friend  [list | add | rm]
ipfs swarm tie     [list | add | rm]
ipfs swarm relate  [list | add | rm]
ipfs swarm couple  [list | add | rm]

subcommands

<cmd> list
<cmd> add [--policy=([always]|protect|...)] [ <peer-id> | <multiaddr> ]
<cmd> rm [ <peer-id> | <multiaddr> ]

examples

# just w/ p2p. (use a libp2p peer-routing lookup to find addresses)
ipfs swarm bind add /p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr

# connect specifically to this address
ipfs swarm bind add /ip4/127.0.0.1/udp/4001/quic/p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr

# can combine both, to try the address but also lookup addresses in case this one doesn't work.
ipfs swarm bind add /p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr
ipfs swarm bind add /ip4/127.0.0.1/udp/4001/quic/p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr

# always keep a connection open (periodically check, dial/re-dial if disconnected)
ipfs swarm bind add /p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr
ipfs swarm bind add --policy=always /p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr

# once opened, keep a connection open (try to keep it open, but don't re-dial)
ipfs swarm bind add --policy=protect /p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr

Peer Group -- Keep connections to a (changing) group of peers

  • Use a group key to find each other and stay connected. Connect to every peer in the group. Keep a list of groups.
  • Maybe use a pre-shared key (PSK) to join the group and find out about each other (that way we can have private groups)

command name options

# choose one
ipfs swarm group   [list | add | rm]
ipfs swarm party   [list | add | rm]
ipfs swarm clique  [list | add | rm]
ipfs swarm flock   [list | add | rm]

subcommands

<cmd> list
<cmd> add [--mode=(all|any|number|...)] [ <group-key> ]
<cmd> rm [ <group-key> ]

examples

ipfs swarm group add --mode=all <secret-key-for-ipfs-gateways>
ipfs swarm group add --mode=all <secret-key-for-pinbot-cluster>
ipfs swarm group add --mode=any <secret-key-for-dtube-gateways>
ipfs swarm group add --mode=any <secret-key-for-pinata-peers>
ipfs swarm group add --mode=all <secret-key-for-textile-peers>

@obo20

obo20 commented Jul 3, 2019

I'm definitely a fan of the functionality that @jbenet is suggesting. While 'ipfs swarm connect' currently gets the job done, it would be nice to have more fine-tuned control over how to manage connections that we want to keep alive.

The swarm groups are an interesting concept. Instead of a secret key to manage access, I'd love it if there was a concept of "roles" to determine access to the groups. Essentially, the first node to set a group up is an admin role, and then can add either admins or members from there.

The benefit here is that the owner(s) of a group can add / revoke nodes if needed without needing to completely reform the entire group since there's no secret that acts as a master password.

Another thing that would be incredibly helpful is if this type of configuration could be made permanent, instead of only lasting until the node restarts. Currently we (and I believe the IPFS infrastructure team, according to @mburns) use the default 'ipfs swarm connect' functionality to keep our nodes connected, and we have to continually reconnect our nodes via a repeating cron task so that if a node restarts we can reconnect it. Having something like this persist between reboots would be incredibly valuable.

@hsanjuan
Contributor

A workaround is to add gateways to the bootstrap list. Bootstrap nodes are re-connected to frequently (I'm not sure if they are also "protected" or tagged with higher priority).

@Stebalien
Member

A workaround is to add gateways to the bootstrap list. Bootstrap nodes are re-connected to frequently (I'm not sure if they are also "protected" or tagged with higher priority).

Only if the number of open connections drops below 4. They also aren't tagged with any high priority (as a matter of fact, we've considered tagging them with a negative priority).

@olizilla
Member

olizilla commented Sep 2, 2019

Can we make ipfs swarm connect call Protect on the connection by default? If I explicitly ask my node to connect to another, I don't want the connection to be in the list of trimmable connections; I want it to stay connected. I can't guarantee the other side won't drop it, but I definitely don't want my side to drop it. I can ipfs swarm disconnect to signal that I'm done with it.

Adding a mechanism to "reconnect on close" requires us to solve the "don't DDoS popular nodes" problem, but exposing the existing libp2p logic to let users identify connections that they don't want their node to trim seems much less risky, and would allow users who control groups of nodes to maintain connections between them all by connecting from both sides. It doesn't solve the auto-reconnect problem, but that can be scripted for now.

@obo20

obo20 commented Sep 2, 2019

@olizilla Would there be a way to swarm connect without protecting the connection?

@Stebalien
Member

@olizilla we currently add a weight of 100 (we didn't have connection protection at the time). But yeah, we should probably protect those connections and add a --protect=false flag.

@obo20

obo20 commented Dec 4, 2019

Is this functionality being considered at all for the 0.5 release?

@Stebalien
Member

No. However, there are a few improvements already in master that may help:

  • The connection manager will no longer count connections in the grace period towards the connection limit.
    • Pro: Useful connections won't be trimmed in favor of new connections.
    • Con: You may end up with more connections and may need to reduce your limit.
  • Bitswap keeps track of historically useful peers and tells the connection manager to avoid disconnecting from these peers.

@obo20

obo20 commented Dec 4, 2019

@Stebalien Does this bitswap history persist through resets or does it live in memory?

If not, would it be difficult to have a separate bootstrap list (or just something in the config that we can set) consisting of peers whose connections we don't ever want to prune? Upon node initialization, the node would add all peers in that list to the "historically useful peers" list you mentioned.

For context, my main goal here is to avoid having to periodically run outside scripts to manage my node connections as this has been somewhat unreliable.

@Stebalien
Member

No, it lives in memory. The end goal is to also have something like this issue implemented, just not right now.

Stebalien added a commit that referenced this issue May 26, 2020
MVP for #6097

This feature will repeatedly reconnect (with a randomized exponential backoff)
to peers in a set of "peered" peers.

In the future, this should be extended to:

1. Include a CLI for modifying this list at runtime.
2. Include additional options for peers we want to _protect_ but not connect to.
3. Allow configuring timeouts, backoff, etc.
4. Allow groups? Possibly through textile threads.
5. Allow for runtime-only peering rules.
6. Different reconnect policies.

But this MVP should be a significant step forward.
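
Not the shipped go-ipfs code, but a minimal sketch of the reconnect-with-randomized-exponential-backoff technique the commit message describes (the starting delay and cap are made-up numbers):

package peering

import (
	"context"
	"math/rand"
	"time"

	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/peer"
)

// redialLoop retries a connection to ai with a randomized exponential
// backoff until it succeeds or ctx is cancelled. A disconnect
// notification would restart the loop.
func redialLoop(ctx context.Context, h host.Host, ai peer.AddrInfo) {
	delay := 2 * time.Second          // made-up starting delay
	const maxDelay = 10 * time.Minute // made-up cap
	for ctx.Err() == nil {
		if err := h.Connect(ctx, ai); err == nil {
			return // connected
		}
		// sleep for the current delay plus up to 50% random jitter
		jitter := time.Duration(rand.Int63n(int64(delay / 2)))
		select {
		case <-time.After(delay + jitter):
		case <-ctx.Done():
			return
		}
		if delay *= 2; delay > maxDelay {
			delay = maxDelay
		}
	}
}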
@Winterhuman
Contributor

@Stebalien Is there anything keeping this issue open now that Peering has been added to go-ipfs?

@lidel
Member

lidel commented Apr 5, 2022

I think Peering + #8680 (allows setting limits per peer) cover the technical gist of this issue.

Remaining work is to add some porcelain on top of it, like the commands proposed in #6097 (comment).
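
For reference, the shipped Peering feature is configured in the go-ipfs config file; a minimal example, with placeholder peer ID and address:

{
  "Peering": {
    "Peers": [
      {
        "ID": "12D3KooWExamplePeerID",
        "Addrs": ["/ip4/192.0.2.10/tcp/4001"]
      }
    ]
  }
}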

@sinkingsugar

sinkingsugar commented Jun 12, 2022

Peering barely works; even after explicitly doing swarm connect, peers are culled a few seconds later if they perform badly.
Why not just use https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md#explicit-peering-agreements behind the scenes? So many use cases; I'm surprised this is still so fragile 😄

Worth mentioning that with 0.13 nothing seems to work when trying to connect to a plain bitswap protocol (a substrate node), while 0.12 seems to keep the connection fine.
There might be some regression.
