Spike: Refine discv5 peer discovery process #269

Closed
hopeyen opened this issue Aug 14, 2023 · 2 comments
Labels: meta:triaged, p0 (critical priority), size:medium, type:bug

hopeyen commented Aug 14, 2023

Describe the bug

We have not seen a radio connect to more than 2 peers, even though discv5 is enabled by default and certain deployments have more than 2 active indexers.

Expected behavior

The number of active peers should be within a small threshold of the number of active indexers.
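
A minimal sketch of how this expectation could be checked. The function name and the `tolerance` parameter are hypothetical and not part of the radio or the SDK; the issue does not define what "small threshold" means, so the caller has to choose it.

```rust
/// Returns true when the number of active peers is within `tolerance`
/// of the number of active indexers.
/// Hypothetical helper: neither the name nor the tolerance value comes
/// from subgraph-radio or graphcast-sdk.
pub fn peers_within_threshold(
    active_peers: usize,
    active_indexers: usize,
    tolerance: usize,
) -> bool {
    // abs_diff avoids underflow when there are fewer peers than indexers.
    active_peers.abs_diff(active_indexers) <= tolerance
}
```

With the numbers from the bug description (2 peers) and, say, 6 active indexers and a tolerance of 1, the check fails, which is the situation this issue describes.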

Notes for random walk discovery

Waku Discv5
Ethereum devp2p
When the node set is large or the requested property is rare, the random walk between two nodes takes longer. Waku therefore runs the protocol as a dedicated discovery network: nodes query this network for a random set of nodes, and all (well-behaving) returned nodes can serve as bootstrap nodes for other Waku2 protocols.

Each node is represented by a node record (ENR), which carries its public key and, in our case, an IP address and UDP port. Nodes are organized into a Kademlia-style DHT keyed by node ID (derived from the public key in the ENR), not by IP address. The distance between two nodes is the XOR of their node IDs, usually expressed as a log2 distance (the number of hops in a random walk can only be >= the number needed by a directed DHT lookup).
Each node maintains its own local node record and periodically refreshes its table of known nodes by querying other nodes in the DHT.
Nodes establish sessions with each other to exchange messages. A session is set up by a handshake, during which node records can be exchanged. Within a session, nodes can discover new nodes, exchange node records, and perform other tasks.
Search: nodes look up other nodes by node ID distance (FINDNODE). Topic-based search is only drafted in the discv5 specification and not finalized.
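
For reference, in discv5 as specified by Ethereum devp2p, the distance between two nodes is the XOR of their 256-bit node IDs, and lookups work on its log2. Below is a rough sketch of that metric; the `NodeId` alias and the function name are my own for illustration and are not taken from the Waku or graphcast-sdk code.

```rust
/// 256-bit node ID (hypothetical alias for illustration).
pub type NodeId = [u8; 32];

/// Log2 distance between two node IDs under the XOR metric.
/// Returns the 1-based position of the highest differing bit (1..=256),
/// or None when the IDs are identical.
pub fn log2_distance(a: &NodeId, b: &NodeId) -> Option<u32> {
    for (i, (x, y)) in a.iter().zip(b.iter()).enumerate() {
        let diff = x ^ y;
        if diff != 0 {
            // Bits contributed by all less significant bytes after this one.
            let bits_below = ((31 - i) as u32) * 8;
            // Position of the highest set bit inside this byte (1..=8).
            let bit_in_byte = 8 - diff.leading_zeros();
            return Some(bits_below + bit_in_byte);
        }
    }
    None
}
```

Two IDs that differ only in the lowest bit are at distance 1, and IDs that differ in the highest bit are at distance 256, so roughly half of all possible IDs sit in the farthest bucket from any given node.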

hopeyen commented Sep 21, 2023

Steps we took in exploring node connectivity:

  1. Receive an ENR for the standard bootnodes of the wider Waku discovery network
  2. Supply the ENR to the graphops bootnode fleet
  3. Restart the radio

We can then expect log output like the following:

  2023-09-21T23:24:24.510062Z TRACE graphcast_sdk::graphcast_agent::waku_handling: Network peers, peers: Ok([WakuPeerData { peer_id: "16Uiu2HAmFABznVXZJjWBRARcDV9NoEzuXcCBCo6FM59ErmvUQr4A", protocols: [], addresses: ["/ip4/83.22.232.108/tcp/64415"], connected: false }, WakuPeerData { peer_id: "16Uiu2HAm8vusJU2NtrDkvVfnMCaV91uB9h1oFy435Nx1MhQxwp9C", protocols: ["/libp2p/autonat/1.0.0", "/rendezvous/1.0.0", "/vac/waku/store/2.0.0-beta4", "/vac/waku/filter-subscribe/2.0.0-beta1", "/vac/waku/filter/2.0.0-beta1", "/ipfs/id/1.0.0", "/libp2p/circuit/relay/0.2.0/hop", "/vac/waku/relay/2.0.0", "/ipfs/ping/1.0.0", "/vac/waku/lightpush/2.0.0-beta1"], addresses: ["/dns4/nwaku.silent.sg/tcp/30304", "/dns4/nwaku.silent.sg/tcp/8000/wss"], connected: true }, ..., WakuPeerData { peer_id: "16Uiu2HAmLvXVP6m3XGxTdkDmvyfZE7RYGyViqK4RzuAkaRNDXCyC", protocols: [], addresses: ["/ip4/127.0.0.1/tcp/60000", "/ip4/192.168.64.1/tcp/60000", "/ip4/50.92.212.135/tcp/60000", "/ip4/5.78.81.99/tcp/31900/p2p/16Uiu2HAmNeFcGzNtmw8bezZN399rwYUzwKv5rzDkzguzkonyqqcH/p2p-circuit", "/ip4/5.78.81.99/tcp/8000/wss/p2p/16Uiu2HAmNeFcGzNtmw8bezZN399rwYUzwKv5rzDkzguzkonyqqcH/p2p-circuit", "/ip4/209.38.225.104/tcp/60000/p2p/16Uiu2HAmK6w5yVgmP6MAp4FSg5AsNrRLBaB4VjYpwfzjrcRTNpw7/p2p-circuit"], connected: false }, WakuPeerData { peer_id: "16Uiu2HAmVkKntsECaYfefR1V2yCR79CegLATuTPE6B9TxgxBiiiA", protocols: ["/libp2p/circuit/relay/0.2.0/hop", "/vac/waku/relay/2.0.0", "/rendezvous/1.0.0", "/vac/waku/store/2.0.0-beta4", "/vac/waku/lightpush/2.0.0-beta1", "/ipfs/id/1.0.0", "/libp2p/autonat/1.0.0", "/ipfs/ping/1.0.0", "/vac/waku/filter-subscribe/2.0.0-beta1", "/vac/waku/filter/2.0.0-beta1", "/vac/waku/peer-exchange/2.0.0-alpha1"], addresses: ["/dns4/node-01.gc-us-central1-a.wakuv2.prod.statusim.net/tcp/30303", "/dns4/node-01.gc-us-central1-a.wakuv2.prod.statusim.net/tcp/8000/wss"], connected: true }, WakuPeerData { peer_id: "16Uiu2HAkyjvXPmymR5eRnvxCufRGZdfRrgjME6bmn3Xo6aprE1eo", protocols: ["/ipfs/id/1.0.0", 
"/ipfs/id/push/1.0.0", "/ipfs/ping/1.0.0", "/libp2p/dcutr", "/meshsub/1.0.0", "/meshsub/1.1.0", "/vac/waku/relay/2.0.0", "/floodsub/1.0.0", "/libp2p/autonat/1.0.0", "/libp2p/circuit/relay/0.2.0/stop"], addresses: ["/ip4/165.232.90.54/tcp/30504", "/ip4/127.0.0.1/tcp/30504", "/ip4/172.17.3.2/tcp/30504"], connected: true }, WakuPeerData { peer_id: "16Uiu2HAmD8LVsJd1cKYM5sE1diuR38U2iKZTd982M5Res4zD9mk6", protocols: [], addresses: ["/ip4/156.223.53.132/tcp/41371"], connected: false }])
    at /Users/hopeyen/.cargo/registry/src/index.crates.io-6f17d22bba15001f/graphcast-sdk-0.4.3/src/graphcast_agent/waku_handling.rs:485

  ...

  2023-09-21T23:24:34.560746Z  INFO subgraph_radio::operator: Network statuses, chainhead: "{ goerli: 9736384 }", num_gossip_peers: 27, num_topics: 2
    at subgraph-radio/src/operator/mod.rs:291

This seems to be working, so let's also update the mainnet bootnodes to use the same standard ENR as their bootstrap nodes. The current radio should already be using our bootnodes for the discv5 mechanism.
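
To compare runs before and after a bootnode change, it may help to summarize the peer list instead of reading the raw TRACE line. The sketch below assumes a local `PeerInfo` struct that only mirrors the fields visible in the log output above (`peer_id`, `protocols`, `addresses`, `connected`); it is not the SDK's `WakuPeerData` type, and `summarize` is a hypothetical helper.

```rust
/// Local mirror of the fields printed in the log (not the SDK type).
#[derive(Debug, Clone)]
pub struct PeerInfo {
    pub peer_id: String,
    pub protocols: Vec<String>,
    pub addresses: Vec<String>,
    pub connected: bool,
}

/// Relay protocol ID as it appears in the log output.
pub const RELAY_PROTOCOL: &str = "/vac/waku/relay/2.0.0";

/// Counts derived from a peer list.
#[derive(Debug, PartialEq, Eq)]
pub struct PeerSummary {
    pub known: usize,
    pub connected: usize,
    pub connected_relay: usize,
}

/// Hypothetical helper: counts known peers, connected peers, and
/// connected peers that advertise the relay protocol.
pub fn summarize(peers: &[PeerInfo]) -> PeerSummary {
    let connected: Vec<&PeerInfo> = peers.iter().filter(|p| p.connected).collect();
    let connected_relay = connected
        .iter()
        .filter(|p| p.protocols.iter().any(|proto| proto.as_str() == RELAY_PROTOCOL))
        .count();
    PeerSummary {
        known: peers.len(),
        connected: connected.len(),
        connected_relay,
    }
}
```

In the log above, several peers are known but have `connected: false` and an empty protocol list, so the gap between `known` and `connected_relay` is likely the more useful number to track than the raw peer count.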

@pete-eiger

Next steps on message routing in this issue
