Kademlia: Optimise iteration over closest keys / entries. #1117
Conversation
The branch was force-pushed from 7ca0e04 to 765033a.
The current implementation for finding the entries whose keys are closest to some target key in the Kademlia routing table involves copying the keys of all buckets into a new `Vec` which is then sorted based on the distances to the target and turned into an iterator from which only a small number of elements (by default 20) are drawn. This commit introduces an iterator over buckets for finding the closest keys to a target that visits the buckets in the optimal order, based on the information contained in the distance bit-string representing the distance between the local key and the target. Correctness is tested against full-table scans. Also included: * Updated documentation. * The `Entry` API was moved to the `kbucket::entry` sub-module for ease of maintenance. * The pending node handling has been slightly refactored in order to bring code and documentation in agreement and clarify the semantics a little.
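For context, the approach being replaced corresponds roughly to the following sketch, with `u32` keys and XOR distance standing in for the actual key and distance types (illustrative only, not the code in the repository):

```rust
// Naive closest-keys lookup: copy every key out of every bucket, sort the
// whole list by XOR distance to the target, and take only the first `k`.
fn closest_keys_naive(buckets: &[Vec<u32>], target: u32, k: usize) -> Vec<u32> {
    let mut keys: Vec<u32> = buckets.iter().flatten().copied().collect();
    keys.sort_by_key(|key| key ^ target); // XOR metric, as in Kademlia
    keys.into_iter().take(k).collect()
}

fn main() {
    let buckets = vec![vec![1], vec![2, 3], vec![4, 5, 6, 7]];
    // All seven keys are copied and sorted even though only two are requested.
    assert_eq!(closest_keys_naive(&buckets, 7, 2), vec![7, 6]);
}
```

The new iterator avoids this by only touching the buckets it actually needs, in an order derived from the distance between the local key and the target.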
As a heads up, let me know whether this is ready for review (otherwise I won't review it).
Now ready for review. The PR description has been updated and should be re-read if someone read an earlier version. I'm only polishing documentation here and there and may add one or two additional tests that come to mind, but otherwise I'm moving on to #146 from here.
A few nit-picks. I admit that I don't fully understand the logic of the closest iterator, because I'm having trouble wrapping my head around the distance thing. But the code looks solid.
As a general remark, I'm not a fan of having the data structure update itself over time. I think it would be preferable to explicitly call a method that accounts for timeouts in the k-buckets, instead of having, for example, `iter()` automatically account for that. However, since it was already like that before, that's out of the scope of this PR.
core/src/swarm/swarm.rs (Outdated)
@@ -377,7 +377,7 @@ impl<'a> PollParameters<'a> {
}

/// Returns the list of the addresses nodes can use to reach us.
- pub fn external_addresses(&self) -> impl ExactSizeIterator<Item = &Multiaddr> {
+ pub fn external_addresses(&self) -> impl ExactSizeIterator<Item = &Multiaddr> + Clone {
Seeing the way that you use `external_addresses`, I don't think that this additional `Clone` is needed.
You mean like this? It requires one more cloning of the addresses than is strictly necessary though, doesn't it? Do you prefer that or am I overlooking something?
Isn't it possible to directly write `multiaddrs: parameters.external_addresses().cloned().collect()` and remove that `let local_addrs =` altogether?
Totally, thanks!
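For illustration, the suggestion amounts to cloning and collecting the borrowed addresses in one expression rather than going through an intermediate binding. A self-contained sketch, with `String` standing in for `Multiaddr` since the surrounding `PollParameters` context (and the exact shape of the original `let local_addrs =` binding) is not shown in this thread:

```rust
fn main() {
    let external: Vec<String> = vec!["/ip4/1.2.3.4/tcp/4001".to_string()];

    // Collecting borrows first and cloning afterwards ...
    let local_addrs: Vec<&String> = external.iter().collect();
    let multiaddrs_two_steps: Vec<String> = local_addrs.into_iter().cloned().collect();

    // ... versus cloning and collecting directly from the iterator:
    let multiaddrs: Vec<String> = external.iter().cloned().collect();

    assert_eq!(multiaddrs, multiaddrs_two_steps);
}
```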
protocols/kad/src/behaviour.rs (Outdated)
@@ -203,39 +204,49 @@ impl<TSubstream> Kademlia<TSubstream> {
/// Adds a known address for the given `PeerId`. We are connected to this address.
// TODO: report if the address was inserted? also, semantics unclear
This is totally out of the scope of this PR, but the semantics of `add_connected_address` vs `add_not_connected_address` are extremely crappy and come from a time when Kademlia was even less correctly implemented.
What irritates me about these functions is that they seemingly allow user code to choose whether a node is considered to be connected, even though the Kademlia behaviour has no knowledge of that connection. I know that the old implementation actually ignored the `connected` argument, making the two functions equivalent, maybe exactly for that reason. How about then fusing these two functions into just `add_address`, giving it the semantics of adding a known address for a peer to the routing table, with no influence on the connection status (meaning disconnected if the peer associated with the address is not yet in the routing table)?
I'm fine with fusing methods.
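As a toy illustration of the fused semantics being discussed (hypothetical stand-in types; the real API operates on `PeerId` and `Multiaddr` within the `Kademlia` behaviour): adding an address records it for the peer but never changes the connection status by itself.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Status { Connected, NotConnected }

struct RoutingTable {
    // peer id -> (known addresses, connection status)
    entries: HashMap<String, (Vec<String>, Status)>,
}

impl RoutingTable {
    /// Records a known address for `peer`. If the peer is not yet in the
    /// table it is inserted as `NotConnected`; an existing status is left
    /// untouched, i.e. user code cannot flip it through this call.
    fn add_address(&mut self, peer: &str, addr: &str) {
        let entry = self.entries
            .entry(peer.to_string())
            .or_insert_with(|| (Vec::new(), Status::NotConnected));
        if !entry.0.iter().any(|a| a == addr) {
            entry.0.push(addr.to_string());
        }
    }
}

fn main() {
    let mut table = RoutingTable { entries: HashMap::new() };
    table.add_address("peer-a", "/ip4/1.2.3.4/tcp/4001");
    assert_eq!(table.entries["peer-a"].1, Status::NotConnected);
}
```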
protocols/kad/src/kbucket.rs (Outdated)
}

impl<T> Eq for Key<T> {}

/// A (safe) index into a `KBucketsTable`, i.e. a non-negative integer in the
Minor nit-pick: to me "safe" means "memory-safe".
Suggested change:
- /// A (safe) index into a `KBucketsTable`, i.e. a non-negative integer in the
+ /// A (type-safe) index into a `KBucketsTable`, i.e. a non-negative integer in the
/// `None` indicates that there are no connected entries in the bucket, i.e.
/// the bucket is either empty, or contains only entries for peers that are
/// considered disconnected.
first_connected_pos: Option<usize>,
Why use an `Option` instead of setting it to `nodes.len()`? For simplicity?
I find the semantics clearer if `first_connected_pos` is either not set (`None`) or is a valid index into `nodes`, instead of attaching special meaning to out-of-bounds values (w.r.t. `nodes`).
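To make the two representations concrete, here is a small self-contained sketch (simplified field and node types, not the actual bucket implementation): the `Option` variant makes the "no connected entries" case explicit, while the alternative encodes it as the sentinel value `nodes.len()`.

```rust
struct Bucket {
    nodes: Vec<&'static str>,
    /// `None` means the bucket has no connected entries.
    first_connected_pos: Option<usize>,
}

impl Bucket {
    fn connected(&self) -> &[&'static str] {
        match self.first_connected_pos {
            Some(p) => &self.nodes[p..],
            None => &[],
        }
        // The sentinel alternative would store `nodes.len()` instead of `None`
        // and always return `&self.nodes[self.first_connected_pos..]`, giving
        // an out-of-bounds-looking index a special meaning.
    }
}

fn main() {
    let bucket = Bucket { nodes: vec!["a", "b", "c"], first_connected_pos: Some(1) };
    assert_eq!(bucket.connected(), ["b", "c"]);
}
```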
It is certainly not straight-forward, but maybe a small example can help to illustrate the principle and provide an intuition for why this procedure works (for which the tests against full-table scans provide additional confidence): Let the keyspace be 4 bits and let the distance between the local key and the target be 7, i.e. 0111 in binary, so the target lies in bucket 2 (the bucket covering distances 4 through 7 from the local key).

The closest key to the target is the target itself, and within bucket 2 the distances to the target range from 0 to 3. So the next closest key to the target not in bucket 2 must have distance at least 4 to the target, i.e. must differ in the bit position for 2^2; such keys are found in bucket 1, which covers distances 4 and 5 to the target, and below.

So the next closest key to the original target not in bucket 1 must have distance at least 6, i.e. must differ in the bit positions for 2^2 and 2^1. Apart from the local key itself (at distance 7), that leaves bucket 0, which covers distance 6.

So the next closest key to the original target not in bucket 0 must have distance at least 8, i.e. must differ in the bit position for 2^3, and such keys are exactly those in bucket 3.

The order in which to visit the buckets to find the closest keys to the target is therefore 2, 1, 0, 3. Closer inspection of this procedure shows that it derives mechanically from the binary representation of the distance between the local key and the target (here 0111): start with the bucket indexed by the highest set bit, continue with the buckets indexed by the remaining set bits in decreasing order, and finish with the buckets indexed by the unset bits in increasing order.
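To spell out that mechanical derivation, here is a small self-contained sketch (an illustration of the idea only, not the iterator introduced in this PR): bucket `i` is assumed to hold the keys whose distance to the local key lies in `[2^i, 2^(i+1))`, and the visiting order is read off the bits of the distance `d = local_key XOR target`.

```rust
/// Returns the order in which to visit buckets 0..num_buckets for a target at
/// distance `d > 0` from the local key: first the bucket of the highest set
/// bit of `d`, then the remaining set bits from high to low ("zoom in"), then
/// the unset bits from low to high ("zoom out").
fn closest_bucket_order(d: u32, num_buckets: u32) -> Vec<u32> {
    assert!(d > 0, "the target must differ from the local key");
    let msb = 31 - d.leading_zeros(); // bucket containing the target itself
    let mut order = vec![msb];
    order.extend((0..msb).rev().filter(|i| d & (1 << i) != 0)); // zoom in
    order.extend((0..num_buckets).filter(|i| d & (1 << i) == 0)); // zoom out
    order
}

fn main() {
    // Distance 0b0111 = 7 in a 4-bit keyspace reproduces the example above.
    assert_eq!(closest_bucket_order(0b0111, 4), vec![2, 1, 0, 3]);
}
```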
I share the mixed feelings about the current approach. On the upside, a user of the API cannot forget to apply pending entries, since that happens automatically when accessing the routing table. The downside is that reads of the routing table can have side effects that are not obvious at the call site. I agree that the alternative of having an explicit API call for applying pending entries would make those effects explicit, though it is then easy to forget to call it. A second alternative could be to leave the concept of pending entries entirely outside of the `KBucketsTable`.
Thanks for the explanation!
Yes, that's what I had in mind.
I think that this is a viable option. I expect the last few buckets to be full and constantly have a pending node, but the first 245 buckets or so will probably never be full. But again, this is for future work.
Commits:
* Kademlia: Optimise iteration over closest entries.
* Rewrite pending node handling and add tests.
Based on #1108.
The current implementation for finding the entries whose keys are closest to some target key in the Kademlia routing table involves copying the keys of all buckets into a new `Vec`, which is then sorted based on the distances of the entries to the target and turned into an iterator from which only a small number of elements (by default 20) are drawn.
This commit introduces an iterator for finding the closest keys (or entries) to a target that visits the buckets in the optimal order, based on the information contained in the distance bit-string representing the distance between the local key and the target (introduced in #1108). That is, the contents of each successive bucket are cloned and sorted only as more elements are drawn from the iterator.
Correctness is tested against full-table scans.
Because there was some overlap between the handling of pending nodes and the newly introduced iterator(s) in the sense that any access to a bucket should check for applicability of pending nodes to bring the bucket up-to-date, and there were a few TODOs left around the pending node handling, I refactored that part and added more tests. In particular, I did the following:
1. The pending node handling is now encapsulated in the new `kbucket::bucket` sub-module, extracting large parts of code formerly directly embedded into the `Entry` API. That improves reuse and testability.
2. The `Entry` API was moved to the `kbucket::entry` sub-module and simplified due to the first point. The `Entry` API now just mediates access to the internal bucket API.
3. The nodes in a bucket are ordered from least-recently connected to most-recently connected, i.e. a "connection-oriented" variant of what is described in the paper.

Due to the refactorings the diffs are a bit large, so I suggest the following (bottom-up) order for an effective review:
1. The new `kbucket::bucket` module.
2. The new `kbucket::entry` module.
3. The rest of the `kbucket` module and the new top-level functions `closest` and `closest_keys`, implementing the optimised bucket iteration (the relevant part is in the `kbucket` module).
4. The remaining changes in the `kbucket` and `behaviour` modules, which adapt the rest of the code to the above changes. That should be quick after seeing all the previous changes.