Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kademlia: Refactor iterative queries. #1154

Merged
merged 2 commits into from
Jun 20, 2019
Merged

Commits on Jun 13, 2019

  1. Refactor iterative queries.

    Refactoring of iterative queries (`query.rs`) to improve both
    correctness and performance (for larger DHTs):
    
    Correctness:
    
      1. Queries no longer terminate prematurely due to counting results
         from peers farther from the target while results from closer
         peers are still pending. (libp2p#1105).
    
      2. Queries no longer ignore reported closer peers that are not duplicates
         just because they are currently not among the `num_results` closest.
         The currently `max_results` closest may contain peers marked as failed
         or pending / waiting. Hence all reported closer peers that are not
         duplicates must be considered candidates that may still end up
         among the `num_results` closest that successfully responded.
    
      3. Bounded parallelism based on the `active_counter` was not working
         correctly, as new (not yet contacted) peers closer to the target
         may be discovered at any time and thus appear in `closer_peers`
         before the already active / pending peers.
    
      4. The `Frozen` query mechanism allowed all remaining not-yet contacted
         peers to be contacted, but their results were discarded, because
         `inject_rpc_result` would only incorporate results while the
         query is `Iterating`. The `Frozen` state has been reworked into
         a `Stalled` state that implements a slightly more permissive
         variant of the following from the paper / specs: "If a round of
         FIND_NODEs fails to return a node any closer than the closest
         already seen, the initiator resends the FIND_NODE to all of the
         k closest nodes it has not already queried.". Importantly, though
         not explicitly mentioned, the query can move back to `Iterating`
         if it makes further progress again as a result of these requests.
         The `Stalled` state thus allows (temporarily) higher parallelism
         in an effort to make progress and bring the query to an end.
    
    Performance:
    
      1. Repeated distance calculations between the same peers and the
         target is avoided.
    
      2. Enabled by libp2p#1108, use of a more appropriate data structure (`BTreeMap`) for
         the incrementally updated list of closer peers. The data structure needs
         efficient lookups (to avoid duplicates) and insertions at any position,
         both of which large(r) vectors are not that good at. Unscientific benchmarks
         showed a ~40-60% improvement in somewhat pathological scenarios with at least
         20 healthy nodes, each possibly returning a distinct list of closer 20 peers
         to the requestor. A previous assumption may have been that the vector always
         stays very small, but that is not the case in larger clusters: Even if the
         lists of closer peers reported by the 20 contacted peers are heavily overlapping,
         typically a lot more than 20 peers have to be (at least temporarily) considered
         as closest peers until the query completes. See also issue (2) above.
    
    New tests are added for:
    
      * Query termination conditions.
      * Bounded parallelism.
      * Absence of duplicates.
    Roman S. Borschel committed Jun 13, 2019
    Configuration menu
    Copy the full SHA
    271b496 View commit details
    Browse the repository at this point in the history

Commits on Jun 20, 2019

  1. Configuration menu
    Copy the full SHA
    7fc00ad View commit details
    Browse the repository at this point in the history