protocols/kad: Why run caching procedure for queries with Quorum::One only? #1577

Closed
mxinden opened this issue May 15, 2020 · 7 comments · Fixed by #1988

Comments

mxinden commented May 15, 2020

When querying for a key, Kademlia remembers the closest node that did not have the record, in order to later store the record on that node.

From the paper, page 7:

> For caching purposes, once a lookup succeeds, the requesting node stores the <key, value> pair at the closest node it observed to the key that did not return the value.

This is implemented here:

```rust
} else if quorum.get() == 1 {
    // It is a "standard" Kademlia query, for which the
    // closest node to the key that did *not* return the
    // value is tracked in order to cache the record on
    // that node if the query turns out to be successful.
    let source_key = kbucket::Key::from(source.clone());
    if let Some(cache_key) = cache_at {
        let key = kbucket::Key::new(key.clone());
        if source_key.distance(&key) < cache_key.distance(&key) {
            *cache_at = Some(source_key)
        }
    } else {
        *cache_at = Some(source_key)
    }
}
```

Why is this procedure only triggered for queries with `Quorum::One`? Why not do it for all queries?

CC @romanb

romanb commented May 16, 2020

I have some difficulty remembering, but I think it really came down to the semantics just being a bit unclear. In standard Kademlia, the record is cached at the nearest peer to the key that did not return the value on a successful lookup, and a successful lookup in standard Kademlia has quorum 1. If this were done for quorums > 1 as well, the record would be cached even if, from the user's perspective, the lookup wasn't actually successful (e.g. the quorum not being satisfied). That said, I don't currently see any harm in doing that either, so if there was a very strong reason, I must have forgotten it by now.

romanb commented May 16, 2020

I think I remember now. The question is: with a quorum > 1, which of the returned records do you cache? They may not all be the same, and the reason someone uses a quorum > 1 in the first place is usually to make a choice among the obtained results. That is a choice we cannot make internally in libp2p.

romanb commented Jan 26, 2021

@mxinden Did my last comment answer your question and/or do you have specific suggestions for what we could do?

mxinden commented Feb 1, 2021

> I think I remember now. The question is: with a quorum > 1, which of the returned records do you cache? They may not all be the same, and the reason someone uses a quorum > 1 in the first place is usually to make a choice among the obtained results. That is a choice we cannot make internally in libp2p.

Thanks for the follow-up. I agree that this is problematic. I can think of two improvements (not mutually exclusive):

  • Once a query with quorum > 1 is finished AND was successful AND all records are the same, then cache the record at the closest node (see the sketch below).
  • Once a query with quorum > 1 is finished but is returned to the user with a diverging set of records, allow the user to choose one to be cached at the closest node, e.g. via a function like `cache_record_at_closest_node`.
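A minimal sketch of the first option (the helper is illustrative, not part of libp2p): only trigger the caching step when all returned record values are byte-identical.

```rust
/// Hypothetical helper: true iff all record values returned by a
/// lookup with quorum > 1 are byte-identical.
fn records_identical(records: &[Vec<u8>]) -> bool {
    match records.split_first() {
        Some((first, rest)) => rest.iter().all(|r| r == first),
        // An empty result set counts as "not identical", so nothing
        // would be cached for an unsuccessful lookup.
        None => false,
    }
}

fn main() {
    let records = vec![b"v1".to_vec(), b"v1".to_vec(), b"v1".to_vec()];
    // All peers agreed on one value, so caching the record at the
    // closest non-returning node would be unambiguous.
    assert!(records_identical(&records));
}
```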

What do you think? Do you have alternative ideas for how to make this integral part of Kademlia work for queries with quorum > 1?

romanb commented Feb 1, 2021

> Once a query with quorum > 1 is finished AND was successful AND all records are the same, then cache the record at the closest node.

I assume that comparing `Vec<u8>`s of equal length, which would be the common case, usually boils down to a byte-for-byte comparison, and since records can be large(r), I don't think this is something that should be done by default, at least. Though this is just my first reaction (I did think about this option in the past).

> Once a query with quorum > 1 is finished but is returned to the user with a diverging set of records, allow the user to choose one to be cached at the closest node, e.g. via a function like `cache_record_at_closest_node`.

That crossed my mind as well. The difficulty there would be to come up with a practical interface. It would seem simple enough to allow configuration of a referentially transparent function, i.e. a function pointer of the kind `fn(Vec<u8>, Vec<u8>) -> Vec<u8>`, but that would be quite limiting, as you cannot access any contextual data when making the decision. To overcome that, one could introduce a custom trait and type parameter. However, in order to make a decision between two records, you would probably need to decode the bytes into some struct, which may be a bit of a waste if you're decoding the records again when you ultimately get them from the API (e.g. in a `GetRecordResult`).
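For illustration, the trait-based variant might look roughly like this (a sketch; `RecordSelector` and `MajoritySelector` are hypothetical names, not an existing libp2p API):

```rust
use std::collections::HashMap;

/// Hypothetical trait for choosing which of several diverging records
/// from a quorum > 1 lookup to keep/cache. Unlike a plain `fn` pointer,
/// an implementing type can carry contextual state (config, decoders, ...).
trait RecordSelector {
    fn select(&mut self, candidates: Vec<Vec<u8>>) -> Option<Vec<u8>>;
}

/// Example implementation: pick the most frequently returned value.
struct MajoritySelector;

impl RecordSelector for MajoritySelector {
    fn select(&mut self, candidates: Vec<Vec<u8>>) -> Option<Vec<u8>> {
        let mut counts: HashMap<Vec<u8>, usize> = HashMap::new();
        for c in candidates {
            *counts.entry(c).or_insert(0) += 1;
        }
        counts.into_iter().max_by_key(|(_, n)| *n).map(|(v, _)| v)
    }
}
```

The downsides mentioned above apply: `Kademlia` would need an additional type parameter, and a selector that has to decode the raw bytes duplicates work that client code does again later.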

The above considerations so far led me to believe that it may be preferable to leave this kind of write-back after queries to client code, which in any case needs to decide (usually after decoding) which record(s) to keep for queries with quorum > 1. It is thus conceivable that we could provide a bit more metadata in a `GetRecordResult`, like the closest node(s) to the key that did not return a record. If there were then also a `put_record` API for sending a record to a specific list of target peers, that might enable client code to do such write-back caching in any way it desires.
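Under this approach, client-side write-back might look roughly as follows (a sketch using `Kademlia::put_record_to`, the name eventually introduced by the linked PR below; the `no_record` parameter stands for the proposed metadata, i.e. the peers that did not return a record):

```rust
use libp2p::kad::record::store::MemoryStore;
use libp2p::kad::{Kademlia, Quorum, Record};
use libp2p::PeerId;

/// Sketch: after a successful lookup with quorum > 1, client code picks
/// one record (`chosen`, usually after decoding) and writes it back to
/// the peers that did not return a record (`no_record`).
fn write_back(kad: &mut Kademlia<MemoryStore>, chosen: Record, no_record: Vec<PeerId>) {
    // `put_record_to` sends the record only to the given peers,
    // without running a full iterative query.
    kad.put_record_to(chosen, no_record.into_iter(), Quorum::One);
}
```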

romanb commented Feb 1, 2021

> Once a query with quorum > 1 is finished but is returned to the user with a diverging set of records, allow the user to choose one to be cached at the closest node, e.g. via a function like `cache_record_at_closest_node`.

I just realised that you probably mean something along the lines of what I wrote in the last paragraph of my previous comment. For some reason, my first reading of your comment was that it suggested having a (configurable) function that decides which record to keep / cache, which I referred to in the second-to-last paragraph.

mxinden commented Feb 4, 2021

> For some reason, my first reading of your comment was that it suggested having a (configurable) function that decides which record to keep / cache, which I referred to in the second-to-last paragraph.

As you stated above, I share the concern that this is an interface that is hard to get right. Additional complications might arise from people wanting to call an async function there, which reminds me of the problems we are facing in Substrate with the `GossipValidator` today (see paritytech/substrate#6335).

> It is thus conceivable that we could provide a bit more metadata in a `GetRecordResult`, like the closest node(s) to the key that did not return a record. If there were then also a `put_record` API for sending a record to a specific list of target peers, that might enable client code to do such write-back caching in any way it desires.

I would be in favor of this approach.

romanb pushed a commit to romanb/rust-libp2p that referenced this issue Mar 4, 2021
wip
Closes libp2p#1577
romanb linked a pull request Mar 4, 2021 that will close this issue
romanb added a commit that referenced this issue Mar 9, 2021
* Add `Kademlia::put_record_to` for storing a record at specific nodes,
e.g. for write-back caching after a successful read. In that context,
peers that were queried in a successful `Kademlia::get_record` operation but
did not return a record are now returned in the `GetRecordOk::no_record`
list of peer IDs.

Closes #1577.

* Update protocols/kad/src/behaviour.rs

Co-authored-by: Max Inden <[email protected]>

* Refine implementation.

Rather than returning the peers that are cache candidates
in a `Vec` in an arbitrary order, use a `BTreeMap`. Furthermore,
make the caching configurable, being enabled by default with
`max_peers` of 1. By configuring it with `max_peers` > 1 it is
now also possible to control at how many nodes the record is cached
automatically after a lookup with quorum 1. When enabled,
successful lookups will always return `cache_candidates`, which
can be used explicitly with `Kademlia::put_record_to` after
lookups with a quorum > 1.

* Re-export KademliaCaching

* Update protocols/kad/CHANGELOG.md

Co-authored-by: Max Inden <[email protected]>

* Clarify changelog.

Co-authored-by: Max Inden <[email protected]>
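Based on the commit message above, configuring the new caching behaviour might look like this (a sketch assuming the `KademliaCaching`, `KademliaConfig::set_caching` and `Kademlia::with_config` names from the merged change):

```rust
use libp2p::kad::record::store::MemoryStore;
use libp2p::kad::{Kademlia, KademliaCaching, KademliaConfig};
use libp2p::PeerId;

fn make_kademlia(local_peer_id: PeerId) -> Kademlia<MemoryStore> {
    let mut cfg = KademliaConfig::default();
    // Automatically cache records at up to 3 of the closest nodes that
    // did not return the record, after a successful lookup with quorum 1.
    // The default is `Enabled { max_peers: 1 }`.
    cfg.set_caching(KademliaCaching::Enabled { max_peers: 3 });
    let store = MemoryStore::new(local_peer_id.clone());
    Kademlia::with_config(local_peer_id, store, cfg)
}
```

For lookups with a quorum > 1, successful queries additionally return `cache_candidates`, which client code can pass to `Kademlia::put_record_to` explicitly.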