Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kademlia: Address some TODOs - Refactoring - API updates. #1174

Merged
merged 7 commits into from
Jul 3, 2019

Conversation

romanb
Copy link
Contributor

@romanb romanb commented Jun 13, 2019

Sits on top of #1154, addressing some left-over TODOs, as well as some refactoring and preparation for further work on the (still incomplete) record and provider system.

The following left-over issues from earlier work have been addressed:

  • The key for FIND_NODE requests is generalised to any Multihash, instead of just peer IDs.
  • All queries get a (configurable) timeout.
  • Finishing queries as soon as enough results have been received is simplified, avoiding code duplication.
  • No more panics in provider-API-related code paths. The provider API is however still untested and (I think) still incomplete (e.g. expiration of provider records).
  • Numerous smaller TODOs encountered in the code.

The following public API changes / additions have been made:

  • Introduced a KademliaConfig with new configuration options for the replication factor and query timeouts.
  • Renamed find_node to get_closest_peers.
  • Renamed get_value to get_record and put_value to put_record, introducing a Quorum parameter for both functions, replacing the existing num_results parameter with clearer semantics.
  • Renamed add_providing to start_providing and remove_providing to stop_providing.
  • Added a bootstrap function that implements a (almost) standard Kademlia bootstrapping procedure.
  • Renamed KademliaOut to KademliaEvent with an updated list of constructors (some renaming). All events that report query results now report a Result to uniformly permit reporting of errors.

The following refactorings have been made:

  • Introduced some constants.
  • Consolidated query.rs and write.rs behind a common interface to reduce duplication and facilitate better code reuse, introducing the notion of a query peer "iterator". query/peers/closest.rs contains the code that was formerly in query.rs. query/peers/fixed.rs contains a modified variant of write.rs (which is removed). The new query.rs provides an interface for working with a collection of queries, taking over some code from behaviour.rs.
  • Reduced code duplication in tests and switched to the current_thread runtime for polling swarms to avoid spurious errors in the test output due to aborted connections when a test finishes prematurely (e.g. because a quorum of results has been collected).
  • Some additions / improvements to the existing tests.

Roman S. Borschel added 3 commits June 29, 2019 16:51
The following left-over issues are addressed:

  * The key for FIND_NODE requests is generalised to any Multihash,
    instead of just peer IDs.
  * All queries get a (configurable) timeout.
  * Finishing queries as soon as enough results have been received is simplified
    to avoid code duplication.
  * No more panics in provider-API-related code paths. The provider API is
    however still untested and (I think) still incomplete (e.g. expiration
    of provider records).
  * Numerous smaller TODOs encountered in the code.

The following public API changes / additions are made:

  * Introduce a `KademliaConfig` with new configuration options for
    the replication factor and query timeouts.
  * Rename `find_node` to `get_closest_peers`.
  * Rename `get_value` to `get_record` and `put_value` to `put_record`,
    introducing a `Quorum` parameter for both functions, replacing the
    existing `num_results` parameter with clearer semantics.
  * Rename `add_providing` to `start_providing` and `remove_providing`
    to `stop_providing`.
  * Add a `bootstrap` function that implements a (almost) standard
    Kademlia bootstrapping procedure.
  * Rename `KademliaOut` to `KademliaEvent` with an updated list of
    constructors (some renaming). All events that report query results
    now report a `Result` to uniformly permit reporting of errors.

The following refactorings are made:

  * Introduce some constants.
  * Consolidate `query.rs` and `write.rs` behind a common query interface
    to reduce duplication and facilitate better code reuse, introducing
    the notion of a query peer iterator. `query/peers/closest.rs`
    contains the code that was formerly in `query.rs`. `query/peers/fixed.rs` contains
    a modified variant of `write.rs` (which is removed). The new `query.rs`
    provides an interface for working with a collection of queries, taking
    over some code from `behaviour.rs`.
  * Reduce code duplication in tests and use the current_thread runtime for
    polling swarms to avoid spurious errors in the test output due to aborted
    connections when a test finishes prematurely (e.g. because a quorum of
    results has been collected).
  * Some additions / improvements to the existing tests.
behaviour.add_address(&"QmSoLV4Bbm51jM9C4gDYZQ9Cy3U6aXMJDAbzgu2fzaDs64".parse().unwrap(), "/ip6/2604:a880:800:10::4a:5001/tcp/4001".parse().unwrap());
behaviour.add_address(&"QmSoLer265NRgSp2LA3dPaeykiS1J6DifTC88f5uVQKNAd".parse().unwrap(), "/ip6/2a03:b0c0:0:1010::23:1001/tcp/4001".parse().unwrap());
libp2p::core::Swarm::new(transport, behaviour, local_peer_id)
/*behaviour.add_address(&"QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN".parse().unwrap(), "/dnsaddr/bootstrap.libp2p.io".parse().unwrap());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why comment out addresses just because they don't work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because they unnecessarily and permanently slow down running the example, which already runs into plenty of timeouts as it is.

protocols/kad/src/kbucket/key.rs Show resolved Hide resolved
protocols/kad/src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
protocols/kad/src/query/peers/closest.rs Show resolved Hide resolved
KademliaOut::PutValueResult(
PutValueResult::Err { key, cause: error }
)
/// Stores a record in the DHT.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Stores a record in the DHT.
/// Starts the process of storing a record in the DHT.

protocols/kad/src/behaviour.rs Show resolved Hide resolved
protocols/kad/src/behaviour.rs Show resolved Hide resolved
/// The successful result of [`Kademlia::bootstrap`].
#[derive(Debug, Clone)]
pub struct BootstrapOk {
pub peer: PeerId
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is that value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The peer ID that is the target of the bootstrapping query - that is, either the local peer ID (first query) or a random peer ID for a specific bucket (follow-up queries). I can add a doc-comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it sounded obvious that BootstrapOk was only produced once at the very end of the bootstrapping process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which, to me, shows that more documentation is needed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the existing documentation of Kademlia::bootstrap? I think it does explain it in sufficient detail, but let me know what you think is missing.


/// The error result of [`Kademlia::put_record`].
#[derive(Debug, Clone)]
pub enum PutRecordError {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A method into_key() would probably be appreciated by the user who wants to try again.

@tomaka
Copy link
Member

tomaka commented Jul 3, 2019

It looks like you don't want to put the mention in the docs that get_record and put_record only start the process instead of doing it instantly!

Other than that, looks good to me.

@romanb
Copy link
Contributor Author

romanb commented Jul 3, 2019

I'm still unsure about the best wording in the documentation, but we can always change it later.

Thanks for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants