Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(kad): make Kademlia::add_address return Result #4280

Draft
wants to merge 11 commits into
base: master
Choose a base branch
from
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ libp2p-floodsub = { version = "0.43.0", path = "protocols/floodsub" }
libp2p-gossipsub = { version = "0.45.1", path = "protocols/gossipsub" }
libp2p-identify = { version = "0.43.0", path = "protocols/identify" }
libp2p-identity = { version = "0.2.3" }
libp2p-kad = { version = "0.44.5", path = "protocols/kad" }
libp2p-kad = { version = "0.45.0", path = "protocols/kad" }
libp2p-mdns = { version = "0.44.0", path = "protocols/mdns" }
libp2p-memory-connection-limits = { version = "0.1.0", path = "misc/memory-connection-limits" }
libp2p-metrics = { version = "0.13.1", path = "misc/metrics" }
Expand Down
2 changes: 1 addition & 1 deletion examples/distributed-key-value-store/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,7 @@ async fn main() -> Result<(), Box<dyn Error>> {
},
SwarmEvent::Behaviour(MyBehaviourEvent::Mdns(mdns::Event::Discovered(list))) => {
for (peer_id, multiaddr) in list {
swarm.behaviour_mut().kademlia.add_address(&peer_id, multiaddr);
swarm.behaviour_mut().kademlia.add_address(&peer_id, multiaddr)?;
}
}
SwarmEvent::Behaviour(MyBehaviourEvent::Kademlia(KademliaEvent::OutboundQueryProgressed { result, ..})) => {
Expand Down
3 changes: 2 additions & 1 deletion examples/file-sharing/src/network.rs
Original file line number Diff line number Diff line change
Expand Up @@ -355,7 +355,8 @@ impl EventLoop {
self.swarm
.behaviour_mut()
.kademlia
.add_address(&peer_id, peer_addr.clone());
.add_address(&peer_id, peer_addr.clone())
.expect("Adding a peer to the DHT");
match self.swarm.dial(peer_addr.with(Protocol::P2p(peer_id))) {
Ok(()) => {
e.insert(sender);
Expand Down
2 changes: 1 addition & 1 deletion examples/ipfs-kad/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@ async fn main() -> Result<(), Box<dyn Error>> {
// into the `transport` resolves the `dnsaddr` when Kademlia tries
// to dial these nodes.
for peer in &BOOTNODES {
behaviour.add_address(&peer.parse()?, "/dnsaddr/bootstrap.libp2p.io".parse()?);
behaviour.add_address(&peer.parse()?, "/dnsaddr/bootstrap.libp2p.io".parse()?)?;
}

SwarmBuilder::with_async_std_executor(transport, behaviour, local_peer_id).build()
Expand Down
4 changes: 3 additions & 1 deletion misc/server/src/behaviour.rs
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,9 @@ impl Behaviour {
);
let bootaddr = Multiaddr::from_str("/dnsaddr/bootstrap.libp2p.io").unwrap();
for peer in &BOOTNODES {
kademlia.add_address(&PeerId::from_str(peer).unwrap(), bootaddr.clone());
kademlia
.add_address(&PeerId::from_str(peer).unwrap(), bootaddr.clone())
.expect("to add bootnode to the DHT");
}
kademlia.bootstrap().unwrap();
Some(kademlia)
Expand Down
7 changes: 7 additions & 0 deletions protocols/kad/CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,10 @@
## 0.45.0 - unreleased

- make `Kademlia::add_address` return `Result`.
See [PR 4280].

[PR 4280]: https://github.com/libp2p/rust-libp2p/pull/4280

## 0.44.5 - unreleased
- Migrate to `quick-protobuf-codec` crate for codec logic.
See [PR 4501].
Expand Down
2 changes: 1 addition & 1 deletion protocols/kad/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ name = "libp2p-kad"
edition = "2021"
rust-version = { workspace = true }
description = "Kademlia protocol for libp2p"
version = "0.44.5"
version = "0.45.0"
authors = ["Parity Technologies <[email protected]>"]
license = "MIT"
repository = "https://github.com/libp2p/rust-libp2p"
Expand Down
42 changes: 26 additions & 16 deletions protocols/kad/src/behaviour.rs
Original file line number Diff line number Diff line change
Expand Up @@ -524,7 +524,11 @@ where
///
/// If the routing table has been updated as a result of this operation,
/// a [`KademliaEvent::RoutingUpdated`] event is emitted.
pub fn add_address(&mut self, peer: &PeerId, address: Multiaddr) -> RoutingUpdate {
pub fn add_address(
&mut self,
peer: &PeerId,
address: Multiaddr,
) -> Result<RoutingUpdateOk, RoutingUpdateError> {
Comment on lines +515 to +519
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that this is only coming up now but I just realized that we have a feature coming up that conflicts with this idea. It is here: #4371

This function will likely be deprecated in favor of a general add_address_of_peer function on Swarm. This means, we won't be able to return a specific error there!

In fact, we won't be able to return anything at all.

Thus, I think we should actually change this PR into preparing this API for the above change. Here is what I'd suggest but I am interested to hear your and @mxinden's opinion:

  1. Mark RoutingUpdate as deprecated but retain the current functionality.
  2. Always emit KademliaEvent::RoutingUpdated, even for the immediate RoutingUpdate::Success case
  3. Add a debug_assert to the SelfEntry case. If we attempt to add ourselves to the routing table, this is a programmer bug and should not happen. We can ignore that in production (but should probably emit a fat warn!), hence the debug_assert.
  4. We already log a warning for the bucket-full case so we can just leave it at that.

This makes this a backwards-compatible change and we can land it right away.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed this as well. I am sorry @arsenron.

Suggestion overall sounds good to me.

Add a debug_assert to the SelfEntry case. If we attempt to add ourselves to the routing table, this is a programmer bug and should not happen. We can ignore that in production (but should probably emit a fat warn!), hence the debug_assert.

Panicing on invalid user input seems rather drastic. Wouldn't a warn! suffice in all cases? Also, not as part of this pull request, we should handle the SelfEntry on the libp2p-swarm level. We should at least log when a NetworkBehaviour or user (via Swarm::add_address_of_peer) wants to report a an external address of a remote peer where that peer is actually the local peer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed this as well. I am sorry @arsenron.

Suggestion overall sounds good to me.

Add a debug_assert to the SelfEntry case. If we attempt to add ourselves to the routing table, this is a programmer bug and should not happen. We can ignore that in production (but should probably emit a fat warn!), hence the debug_assert.

Panicing on invalid user input seems rather drastic. Wouldn't a warn! suffice in all cases?

The warn will be there in all cases but pancking in debug builds is fine, no? You didn't validate your inputs correctly in that case.

Also, not as part of this pull request, we should handle the SelfEntry on the libp2p-swarm level. We should at least log when a NetworkBehaviour or user (via Swarm::add_address_of_peer) wants to report a an external address of a remote peer where that peer is actually the local peer.

True, as part of #4371, we should probably not forward any of these events! cc @StemCll

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arsenron I fixed the merge conflicts for you. Are you still interested in working on this given the above insight?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thomaseizinger sorry, little busy now, but I will check it a bit later

let key = kbucket::Key::from(*peer);
match self.kbuckets.entry(&key) {
kbucket::Entry::Present(mut entry, _) => {
Expand All @@ -543,11 +547,11 @@ where
},
))
}
RoutingUpdate::Success
Ok(RoutingUpdateOk::Success)
}
kbucket::Entry::Pending(mut entry, _) => {
entry.value().insert(address);
RoutingUpdate::Pending
Ok(RoutingUpdateOk::Pending)
}
kbucket::Entry::Absent(entry) => {
let addresses = Addresses::new(address);
Expand All @@ -571,23 +575,23 @@ where
.expect("Not kbucket::Entry::SelfEntry."),
},
));
RoutingUpdate::Success
Ok(RoutingUpdateOk::Success)
}
kbucket::InsertResult::Full => {
debug!("Bucket full. Peer not added to routing table: {}", peer);
RoutingUpdate::Failed
debug!("Bucket full. Peer not added to routing table: {peer}");
Err(RoutingUpdateError::BucketFull(*peer))
}
kbucket::InsertResult::Pending { disconnected } => {
self.queued_events.push_back(ToSwarm::Dial {
opts: DialOpts::peer_id(disconnected.into_preimage())
.condition(dial_opts::PeerCondition::NotDialing)
.build(),
});
RoutingUpdate::Pending
Ok(RoutingUpdateOk::Pending)
}
}
}
kbucket::Entry::SelfEntry => RoutingUpdate::Failed,
kbucket::Entry::SelfEntry => Err(RoutingUpdateError::SelfEntry),
}
}

Expand Down Expand Up @@ -3290,9 +3294,9 @@ impl fmt::Display for NoKnownPeers {

impl std::error::Error for NoKnownPeers {}

/// The possible outcomes of [`Kademlia::add_address`].
/// The possible success outcomes of [`Kademlia::add_address`].
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RoutingUpdate {
pub enum RoutingUpdateOk {
/// The given peer and address has been added to the routing
/// table.
Success,
Expand All @@ -3302,12 +3306,18 @@ pub enum RoutingUpdate {
/// in the routing table, [`KademliaEvent::RoutingUpdated`]
/// is eventually emitted.
Pending,
/// The routing table update failed, either because the
/// corresponding bucket for the peer is full and the
/// pending slot(s) are occupied, or because the given
/// peer ID is deemed invalid (e.g. refers to the local
/// peer ID).
Failed,
}

/// The routing table update failed, either because the
/// corresponding bucket for the peer is full and the
/// pending slot(s) are occupied, or because the given
/// peer ID refers to the local peer ID.
#[derive(Debug, Clone, Error)]
pub enum RoutingUpdateError {
#[error("peer `{0}` is not added to the DHT as the corresponding bucket is full")]
BucketFull(PeerId),
#[error("cannot add local peer to the DHT")]
SelfEntry,
}

#[derive(PartialEq, Copy, Clone, Debug)]
Expand Down
49 changes: 36 additions & 13 deletions protocols/kad/src/behaviour/test.rs
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,8 @@ fn build_connected_nodes_with_config(
swarms[i]
.1
.behaviour_mut()
.add_address(peer_id, addr.clone());
.add_address(peer_id, addr.clone())
.unwrap();
}
if j % step == 0 {
i += step;
Expand All @@ -131,7 +132,12 @@ fn build_fully_connected_nodes_with_config(

for (_addr, swarm) in swarms.iter_mut() {
for (addr, peer) in &swarm_addr_and_peer_id {
swarm.behaviour_mut().add_address(peer, addr.clone());
if swarm.local_peer_id() != peer {
swarm
.behaviour_mut()
.add_address(peer, addr.clone())
.unwrap();
}
}
}

Expand Down Expand Up @@ -327,7 +333,8 @@ fn unresponsive_not_returned_direct() {
for _ in 0..10 {
swarms[0]
.behaviour_mut()
.add_address(&PeerId::random(), Protocol::Udp(10u16).into());
.add_address(&PeerId::random(), Protocol::Udp(10u16).into())
.unwrap();
}

// Ask first to search a random value.
Expand Down Expand Up @@ -373,7 +380,8 @@ fn unresponsive_not_returned_indirect() {
swarms[0]
.1
.behaviour_mut()
.add_address(&PeerId::random(), multiaddr![Udp(10u16)]);
.add_address(&PeerId::random(), multiaddr![Udp(10u16)])
.unwrap();
}

// Connect second to first.
Expand All @@ -382,7 +390,8 @@ fn unresponsive_not_returned_indirect() {
swarms[1]
.1
.behaviour_mut()
.add_address(&first_peer_id, first_address);
.add_address(&first_peer_id, first_address)
.unwrap();

// Drop the swarm addresses.
let mut swarms = swarms
Expand Down Expand Up @@ -434,11 +443,13 @@ fn get_record_not_found() {
swarms[0]
.1
.behaviour_mut()
.add_address(&swarm_ids[1], second);
.add_address(&swarm_ids[1], second)
.unwrap();
swarms[1]
.1
.behaviour_mut()
.add_address(&swarm_ids[2], third);
.add_address(&swarm_ids[2], third)
.unwrap();

// Drop the swarm addresses.
let mut swarms = swarms
Expand Down Expand Up @@ -515,7 +526,8 @@ fn put_record() {
single_swarm
.1
.behaviour_mut()
.add_address(swarm.1.local_peer_id(), swarm.0.clone());
.add_address(swarm.1.local_peer_id(), swarm.0.clone())
.unwrap();
}

let mut swarms = vec![single_swarm];
Expand Down Expand Up @@ -747,7 +759,11 @@ fn get_record() {
*Swarm::local_peer_id(&swarms[i + 1].1),
swarms[i + 1].0.clone(),
);
swarms[i].1.behaviour_mut().add_address(&peer_id, address);
swarms[i]
.1
.behaviour_mut()
.add_address(&peer_id, address)
.unwrap();
}

// Drop the swarm addresses.
Expand Down Expand Up @@ -886,7 +902,8 @@ fn add_provider() {
single_swarm
.1
.behaviour_mut()
.add_address(swarm.1.local_peer_id(), swarm.0.clone());
.add_address(swarm.1.local_peer_id(), swarm.0.clone())
.unwrap();
}

let mut swarms = vec![single_swarm];
Expand Down Expand Up @@ -1117,11 +1134,13 @@ fn disjoint_query_does_not_finish_before_all_paths_did() {
alice
.1
.behaviour_mut()
.add_address(trudy.1.local_peer_id(), trudy.0.clone());
.add_address(trudy.1.local_peer_id(), trudy.0.clone())
.unwrap();
alice
.1
.behaviour_mut()
.add_address(bob.1.local_peer_id(), bob.0.clone());
.add_address(bob.1.local_peer_id(), bob.0.clone())
.unwrap();

// Drop the swarm addresses.
let (mut alice, mut bob, mut trudy) = (alice.1, bob.1, trudy.1);
Expand Down Expand Up @@ -1442,7 +1461,11 @@ fn get_providers_limit<const N: usize>() {
*Swarm::local_peer_id(&swarms[i + 1].1),
swarms[i + 1].0.clone(),
);
swarms[i].1.behaviour_mut().add_address(&peer_id, address);
swarms[i]
.1
.behaviour_mut()
.add_address(&peer_id, address)
.unwrap();
}

// Drop the swarm addresses.
Expand Down
2 changes: 1 addition & 1 deletion protocols/kad/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,7 @@ pub use behaviour::{
GetClosestPeersResult, GetProvidersError, GetProvidersOk, GetProvidersResult, GetRecordError,
GetRecordOk, GetRecordResult, InboundRequest, Mode, NoKnownPeers, PeerRecord, PutRecordContext,
PutRecordError, PutRecordOk, PutRecordPhase, PutRecordResult, QueryInfo, QueryMut, QueryRef,
QueryResult, QueryStats, RoutingUpdate,
QueryResult, QueryStats, RoutingUpdateError, RoutingUpdateOk,
};
pub use behaviour::{
Kademlia, KademliaBucketInserts, KademliaCaching, KademliaConfig, KademliaEvent,
Expand Down