-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DHT Spec #14
DHT Spec #14
Changes from 5 commits
9d2e6e6
1feb2cf
2135736
751c906
0e260f7
dad0bb7
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
IPFS Routing Protocol Spec | ||
========================== | ||
|
||
Authors: David Dias | ||
|
||
Reviewers: | ||
|
||
TODOS: | ||
|
||
----------------------- | ||
|
||
> This spec defines the routing protocol spec, covering `Peer discovery`, `Routing` and the `DHT`. The spec is a **Work In Progress**. | ||
|
||
## Supports | ||
|
||
- Peer discovery through | ||
- mdns | ||
- custom peers list | ||
- random walking on the network | ||
- Routing primitives | ||
- Publish and fetch content (also providing) | ||
- Maintaining partial state of the network | ||
- DHT | ||
- kbucket | ||
|
||
### Overview | ||
|
||
The Routing Protocol is divided in three major components, these are: | ||
- Peer Discovery: Responsible for filling our kbucket with best candidates. | ||
- Interface: Our routing primitives that are offered for the user, such as finding and publishing content, including the storage and lookup of metadata (Providers). | ||
- Peer-to-peer Structured Overlay Network: Algorithm for the implicit network organization, based on [Coral](http://iptps03.cs.berkeley.edu/final-papers/coral.pdf) and [mainlineDHT](http://www.bittorrent.org/beps/bep_0005.html) | ||
|
||
Bootstrapping the routing happens by connecting to a predefined "railing" peers list, shipped with the go-ipfs release and/or by discovery through mDNS. Once at least one peer is found and added to the kbucket, the routing changes to an active state and our peer becomes able to route and receive messages. | ||
|
||
### Peer Discovery | ||
|
||
#### bootstrap peer list | ||
|
||
List with known and trusted peers shipped with IPFS. | ||
|
||
- _How is this list updated?_ | ||
- _Is this list updated periodically_? | ||
|
||
#### random walk | ||
|
||
IPFS issues random Peer lookups periodically to refresh our kbucket if needed. For impl reference, see: https://github.com/ipfs/go-ipfs/blob/master/routing/dht/dht_bootstrap.go#L88-L109. | ||
|
||
#### mDNS | ||
|
||
In addition to known peers and random lookups, IPFS also performs Peer Discovery through mDNS ([MultiCast DNS](https://tools.ietf.org/html/rfc6762)) | ||
|
||
-_How offen do we issue this searches?_ | ||
|
||
### Routing | ||
|
||
For impl reference, check: https://github.com/ipfs/go-ipfs/blob/master/routing/routing.go#L19-L49 | ||
|
||
#### Find a peer | ||
|
||
_When searching for a peer, do we fetch the kbucket from a peer and see which peer we want to ping next or do we ask for a given Id to a peer and that peer replies to us with the best candidate (or itself if it is the case)?_ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It returns a set (3?) of best candidates I think. |
||
|
||
#### Ping | ||
|
||
Ping mechanism (for heartbeats). Ping a peer and log the time it took to answer. | ||
|
||
_what if the Id doesn't exist? Is there any rule for non existing peers? Should we log time for best matches as well?_ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
if we exhaust the query set and haven't found the peer in question, we exit with an error. |
||
|
||
#### Provide | ||
|
||
Providing is the process of storing/updating the metadata (pointers) of where the blocks of a given file are stored/available in the IPFS network. What this means is that the DHT is not used for block discovery, but for the metadata which identifies where they are, instead. | ||
When a node advertises a block available for download, IPFS stores a record in the DHT with its own Peer.ID. This is termed "providing". the node becomes a "provider". Requesters who wish to retrieve the content, query the DHT (or DSHT) and need only to retrieve a subset of providers, not all of them. (this works better with huge DHTs, and latency-aware DHTs like coral). | ||
|
||
We provide once per block, because every block (even sub-blocks) are independently addressable by their hash. (yes, this is expensive, but we can mitigate the cost with better DHT + record designs, bloom filters, and more) | ||
|
||
There is an optimistic optimization -- which is that if a node is storing a node that is the parent (root/ancestor) of other nodes, then it is much more likely to also be storing the children. So when a requester attempts to pull down a large dag, it first queries the DHT for providers of the root. Once the requester finds some and connects directly to retrieve the blocks, bitswap will optimistically send them the "wantlist", which will usually obviate any more dht queries for that dag. we haven't measured this to be true yet -- we need to -- but in practice it seems to work quite well, else we wouldnt see as quick download speeds. (one way to look at it, is "per-dag swarms that overlap", but it's not a fully correct statement as having a root doesn't necessarily mean a node has any or all children.) | ||
|
||
Providing a block happens as it gets added. Reproviding happens periodically, currently 0.5 * dht record timeout ~= 12 hours. | ||
|
||
#### Get value | ||
|
||
|
||
|
||
#### Put value | ||
|
||
_not 100% about this happens exactly. From what I understand, the IPFS node that is adding the file, breaks the file into blocks, creates the hashes and provides each single one of them. When do we execute a Put? Replicas are done through "Get", right?_ | ||
|
||
### DHT | ||
|
||
explain: | ||
- dht/coral, how the algo works | ||
- kbucket | ||
- each time a contact is made with a new peer, we check to see if it is a better candidate for our kbucket | ||
- xor metric |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should separate out the DHT from Routing. Routing is an interface, that is satisfied by DHT.