Peer randomly disconnects when new peers join #1442

Closed
x48115 opened this issue Jul 27, 2023 · 9 comments
Labels
bug Something isn't working

Comments

@x48115

x48115 commented Jul 27, 2023

Problem

Description

When new peers connect, existing peers sometimes get disconnected

Steps to reproduce:

import { createLightNode } from "@waku/sdk";

const createNode = async (i) => {
  const node = await createLightNode({
    defaultBootstrap: true,
  });
  node.libp2p.addEventListener("peer:connect", async (evt) => {
    const peer = evt.detail;
    const target = peer.remotePeer.toString();
    const local = node.libp2p.peerId.toString();
    console.log(`Peer ${i} ${local} connected: ${target}`);
  });
  node.libp2p.addEventListener("peer:disconnect", async (evt) => {
    const peer = evt.detail;
    const target = peer.remotePeer.toString();
    const local = node.libp2p.peerId.toString();
    console.log(`Peer ${i} ${local} disconnected: ${target}`);
  });
  await node.start();
};
for (let i = 0; i < 100; i++) {
  createNode(i);
}

Expected results

Expected existing peer connections to stay connected when new peers join

Actual results

Peers are disconnected randomly

Peer 9 12D3KooWQLFZ64W9XbLstEuQJq2vCbe5hqQD9A69yMYqcatTDRa5 connected: 16Uiu2HAmVkKntsECaYfefR1V2yCR79CegLATuTPE6B9TxgxBiiiA
Peer 5 12D3KooWDAPFnL8EkbxUaDF3tAfhvv12tRAtNfE1jsuGvLDC16xW connected: 16Uiu2HAmVkKntsECaYfefR1V2yCR79CegLATuTPE6B9TxgxBiiiA
Peer 3 12D3KooWNi6WpYVhdgu8B6idDiFxfBkPkYQrQE2SWJXzhpwF1SdB connected: 16Uiu2HAmVkKntsECaYfefR1V2yCR79CegLATuTPE6B9TxgxBiiiA
Peer 7 12D3KooWFvawzXcRvY4aHRo4Wm34QzJfHKRzSZecNe9CHPtVkfv1 connected: 16Uiu2HAmVkKntsECaYfefR1V2yCR79CegLATuTPE6B9TxgxBiiiA
Peer 10 12D3KooWGEEdvFqZiMMVEGXr4kE9aHWULmPJWUiaomsWtT7PxV6Y connected: 16Uiu2HAmL5okWopX7NqZWBUKVqW8iUxCEmd5GMHLVPwCgzYzQv3e
Peer 2 12D3KooWSBg2xvz7yDgR9Ts255vTCTxdh92amY8iuFNiu6USbKTy connected: 16Uiu2HAmVkKntsECaYfefR1V2yCR79CegLATuTPE6B9TxgxBiiiA
Peer 6 12D3KooWRy9HCytXWmafSpp4mpmYxSPgTWZost5G6KMvxTJV4nXX connected: 16Uiu2HAmL5okWopX7NqZWBUKVqW8iUxCEmd5GMHLVPwCgzYzQv3e
Peer 14 12D3KooWMWNNPLMDukaLA9i5mxwVFuy7srxtqZeKELK8GdrVRxcg connected: 16Uiu2HAmL5okWopX7NqZWBUKVqW8iUxCEmd5GMHLVPwCgzYzQv3e
Peer 11 12D3KooWRujWUBgGgB7m7WRL76C49SRTpjLhNgCmxVzTkegHAzLh connected: 16Uiu2HAm4v86W3bmT1BiH6oSPzcsSr24iDQpSN5Qa992BCjjwgrD
Peer 1 12D3KooW9uzzFvpxqc53TWbzmJmLinXbsND1QKTRF5vhmr32o3BS connected: 16Uiu2HAmL5okWopX7NqZWBUKVqW8iUxCEmd5GMHLVPwCgzYzQv3e
Peer 4 12D3KooWBF8AGHoEHvf3XM4tbRz2Y1aKwvuJfRTXSnwmbcd9n4Yr connected: 16Uiu2HAmL5okWopX7NqZWBUKVqW8iUxCEmd5GMHLVPwCgzYzQv3e
Peer 9 12D3KooWQLFZ64W9XbLstEuQJq2vCbe5hqQD9A69yMYqcatTDRa5 disconnected: 16Uiu2HAmVkKntsECaYfefR1V2yCR79CegLATuTPE6B9TxgxBiiiA
Peer 8 12D3KooWAw45g6YGDwGTogHhXzhSG6VvxvNycKM54u9pmtxAnZzv connected: 16Uiu2HAm4v86W3bmT1BiH6oSPzcsSr24iDQpSN5Qa992BCjjwgrD
Peer 12 12D3KooWMCHg3adqRSXNNM68a4PKPcejarJ7J3GtBTv8qvVoJTGQ connected: 16Uiu2HAm4v86W3bmT1BiH6oSPzcsSr24iDQpSN5Qa992BCjjwgrD
Peer 0 12D3KooWFfgJrALMoLkp4mhAmqkrcoa1c1ivoWMwbc2jsVs2a9ma connected: 16Uiu2HAm4v86W3bmT1BiH6oSPzcsSr24iDQpSN5Qa992BCjjwgrD
Peer 13 12D3KooWQycfe9c4qRz4QxhTXsp9Lfh54QFgsiZ3CrMNAM8q74sT connected: 16Uiu2HAm4v86W3bmT1BiH6oSPzcsSr24iDQpSN5Qa992BCjjwgrD
Peer 9 12D3KooWQLFZ64W9XbLstEuQJq2vCbe5hqQD9A69yMYqcatTDRa5 connected: 16Uiu2HAmVkKntsECaYfefR1V2yCR79CegLATuTPE6B9TxgxBiiiA
Peer 5 12D3KooWDAPFnL8EkbxUaDF3tAfhvv12tRAtNfE1jsuGvLDC16xW disconnected: 16Uiu2HAmVkKntsECaYfefR1V2yCR79CegLATuTPE6B9TxgxBiiiA
Peer 11 12D3KooWRujWUBgGgB7m7WRL76C49SRTpjLhNgCmxVzTkegHAzLh disconnected: 16Uiu2HAm4v86W3bmT1BiH6oSPzcsSr24iDQpSN5Qa992BCjjwgrD
Peer 8 12D3KooWAw45g6YGDwGTogHhXzhSG6VvxvNycKM54u9pmtxAnZzv disconnected: 16Uiu2HAm4v86W3bmT1BiH6oSPzcsSr24iDQpSN5Qa992BCjjwgrD
Peer 5 12D3KooWDAPFnL8EkbxUaDF3tAfhvv12tRAtNfE1jsuGvLDC16xW connected: 16Uiu2HAmVkKntsECaYfefR1V2yCR79CegLATuTPE6B9TxgxBiiiA
Peer 3 12D3KooWNi6WpYVhdgu8B6idDiFxfBkPkYQrQE2SWJXzhpwF1SdB disconnected: 16Uiu2HAmVkKntsECaYfefR1V2yCR79CegLATuTPE6B9TxgxBiiiA
Peer 12 12D3KooWMCHg3adqRSXNNM68a4PKPcejarJ7J3GtBTv8qvVoJTGQ disconnected: 16Uiu2HAm4v86W3bmT1BiH6oSPzcsSr24iDQpSN5Qa992BCjjwgrD
Peer 8 12D3KooWAw45g6YGDwGTogHhXzhSG6VvxvNycKM54u9pmtxAnZzv connected: 16Uiu2HAm4v86W3bmT1BiH6oSPzcsSr24iDQpSN5Qa992BCjjwgrD
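A log like the one above is easier to read once it is tallied per remote peer. The following is only a minimal sketch: it assumes the exact line format printed by the reproduction script (`Peer <i> <local id> connected|disconnected: <remote id>`), and the helper name `tallyByRemotePeer` and the short placeholder IDs in the sample are hypothetical.

```javascript
// Count connect/disconnect events per remote peer from log lines shaped like
// "Peer <i> <localId> connected: <remoteId>" (the format printed by the repro script).
function tallyByRemotePeer(logText) {
  const pattern = /^Peer (\d+) (\S+) (connected|disconnected): (\S+)$/;
  const tally = new Map();
  for (const line of logText.split("\n")) {
    const match = pattern.exec(line.trim());
    if (!match) continue; // skip lines that are not connection events
    const [, , , kind, remote] = match;
    const counts = tally.get(remote) ?? { connected: 0, disconnected: 0 };
    counts[kind] += 1;
    tally.set(remote, counts);
  }
  return tally;
}

// Placeholder IDs for illustration only; paste the real log text instead.
const sample = [
  "Peer 9 localA connected: remoteX",
  "Peer 5 localB connected: remoteX",
  "Peer 9 localA disconnected: remoteX",
  "Peer 11 localC connected: remoteY",
].join("\n");

console.log(tallyByRemotePeer(sample));
```

If the disconnects cluster on a small number of remote peers, that suggests the remote side is closing the connections rather than the local nodes.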
@fryorcraken fryorcraken added this to Waku Jul 27, 2023
@x48115
Author

x48115 commented Jul 28, 2023

Tested with @waku/sdk version ^0.0.17 and it's still an issue

@x48115
Author

x48115 commented Jul 28, 2023

FYI, I tested this with libp2p directly, without waku, and I still see the issue, but only in some situations:

import { noise } from "@chainsafe/libp2p-noise";
import { yamux } from "@chainsafe/libp2p-yamux";
import { gossipsub } from "@chainsafe/libp2p-gossipsub";
import { mplex } from "@libp2p/mplex";
import { tcp } from "@libp2p/tcp";
import { createLibp2p } from "libp2p";
import { identifyService } from "libp2p/identify";
import { bootstrap } from "@libp2p/bootstrap";

const bootstrappers = [
  "/dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa",
];

const createNode = async (i) => {
  const node = await createLibp2p({
    addresses: {
      listen: ["/ip4/0.0.0.0/tcp/0"],
    },
    transports: [tcp()],
    streamMuxers: [yamux(), mplex()],
    connectionEncryption: [noise()],
    services: {
      pubsub: gossipsub(),
      identify: identifyService(),
    },
    peerDiscovery: [
      bootstrap({
        list: bootstrappers,
      }),
    ],
  });
  node.addEventListener("peer:connect", async (evt) => {
    console.log(`Peer ${i} connected`);
  });
  node.addEventListener("peer:disconnect", async (evt) => {
    console.log(`Peer ${i} disconnected`);
  });
  return node;
};
for (let i = 0; i < 100; i++) {
  createNode(i);
}

When I use this code, I get the same results: new connections cause existing connections to drop.

However, when I change the bootstrapper to "/dnsaddr/bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt", the issue goes away.

This leads me to believe that the issue has to do with specific peers.

However, in my actual use case using default js-waku nodes, I'm only performing 2-3 simultaneous connections and I see connection issues consistently.

@x48115
Author

x48115 commented Jul 28, 2023

Note: the peer disconnects would not necessarily be a problem for my application. However, waku connects to only one peer by default (I tried overriding this to connect to more peers), so if that peer goes offline, all communication in my system is lost. That is a blocker for development.
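One way to make the single-peer risk visible is to count connections from the same `peer:connect` and `peer:disconnect` events used in the reproduction script and react when none remain. This is a rough sketch, not js-waku functionality: `watchPeerCount` and `onIsolated` are hypothetical names, and it assumes each disconnect event is paired with an earlier connect event.

```javascript
// Count live peers from connect/disconnect events and invoke a callback
// when the count drops to zero.
// `emitter` is anything with addEventListener (for example node.libp2p
// from the reproduction script); this helper is an assumption, not a waku API.
function watchPeerCount(emitter, onIsolated) {
  let connected = 0;
  emitter.addEventListener("peer:connect", () => {
    connected += 1;
  });
  emitter.addEventListener("peer:disconnect", () => {
    connected = Math.max(0, connected - 1);
    if (connected === 0) onIsolated(); // no peers left: the app is cut off
  });
  return () => connected; // getter for the current count
}

// Usage with a plain EventTarget standing in for a libp2p node.
const fakeNode = new EventTarget();
let isolatedCount = 0;
const getCount = watchPeerCount(fakeNode, () => {
  isolatedCount += 1;
});
fakeNode.dispatchEvent(new Event("peer:connect"));
fakeNode.dispatchEvent(new Event("peer:disconnect"));
console.log(getCount(), isolatedCount); // 0 1
```

In a real application the callback could log a warning or trigger whatever reconnection logic the application already has.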

@fryorcraken
Collaborator

Cc @danisharora099

@danisharora099
Collaborator

This is very interesting. Thanks for opening an issue @x48115
I'll investigate this and get back with findings!

@danisharora099
Collaborator

danisharora099 commented Sep 6, 2023

Update: @vpavlin mentioned that he ran into something similar while testing a js-waku node against his nwaku node deployed on Akash Network, where he saw frequent disconnections of his js-waku node.

However, js-waku should still be able to reconnect:

@vpavlin
Member

vpavlin commented Sep 6, 2023

I am not sure this is caused by or related to js-waku. I see weird peer behaviour on Akash in general (the connected peers are somehow capped at around 30, although my other nodes get far more connected peers).

@danisharora099
Collaborator

> I am not sure this is caused by or related to js-waku. I see weird peer behaviour on Akash in general (the connected peers are somehow capped at around 30, although my other nodes get far more connected peers).

Can we check whether other, "stabler" nodes also see disconnections with js-waku?

@danisharora099 danisharora099 self-assigned this Oct 13, 2023
@danisharora099 danisharora099 moved this from Priority to In Progress in Waku Oct 13, 2023
@danisharora099
Collaborator

@x48115
Investigated this:

This happens because you're running all js-waku nodes on the same IP, and nwaku allows at most 5 connections from the same IP. ref: https://github.com/waku-org/nwaku/blob/master/waku/node/peer_manager/peer_manager.nim#L367-L372
This means that when a 6th connection arrives from the same IP at the same nwaku node, the node drops a connection (the oldest connected peer).

Also note: since this is a p2p network, there can be other, less likely reasons why a node might decide to drop a connection (bandwidth restrictions, a limit on the number of peers, etc.). That should not be worrying, as the node will keep finding more peers to connect to.
In the case you described, however, the cause is the IP colocation limit, which is set to 5 on nwaku.
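The behaviour described above can be modelled in a few lines. This is a simplified illustration only, not nwaku's implementation (which is written in Nim): the class name `ColocationLimiter`, the peer labels, and the example IP address are hypothetical, and the oldest-first eviction follows the description in this comment.

```javascript
// Hypothetical model of a per-IP connection cap; this is NOT nwaku's code.
class ColocationLimiter {
  constructor(maxPerIp = 5) {
    this.maxPerIp = maxPerIp;
    this.byIp = new Map(); // ip -> peer ids, oldest first
  }

  // Registers a connection and returns the peer ids evicted to stay within the cap.
  connect(ip, peerId) {
    const peers = this.byIp.get(ip) ?? [];
    peers.push(peerId);
    const evicted = [];
    while (peers.length > this.maxPerIp) {
      evicted.push(peers.shift()); // drop the oldest connection first
    }
    this.byIp.set(ip, peers);
    return evicted;
  }
}

// Seven nodes behind one IP connect to the same remote node.
const limiter = new ColocationLimiter(5);
for (let i = 0; i < 7; i++) {
  const evicted = limiter.connect("203.0.113.7", `peer-${i}`);
  if (evicted.length > 0) {
    console.log(`peer-${i} connected, evicted: ${evicted.join(", ")}`);
  }
}
// peer-5 connected, evicted: peer-0
// peer-6 connected, evicted: peer-1
```

With 100 local nodes, as in the reproduction script, every connection beyond the fifth to a given nwaku node would evict an earlier one under this model, which matches the connect/disconnect churn in the log.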

Please feel free to reopen this issue if you're still running into problems.

@github-project-automation github-project-automation bot moved this from In Progress to Done in Waku Oct 16, 2023