🌟 adding Torrent support to IPFS #779
Comments
This sounds like the pragmatic way for me too -- we'll likely get a better idea of what to do with the whole torrent in the process of working on this. Given that the torrent file itself is not already content-addressed, it's also the "correct" way I think. Magnet URIs address the info hash anyway. |
{
"infoHash": "d2474e86c95b19b8bcfdb92bc12c9d44667cfa36",
"infoHashBuffer": {"/": "$infoHashAsCID"},
"name": "Leaves of Grass by Walt Whitman.epub"
} |
Notes from a chat with @jbenet and @whyrusleeping
This leads to the following steps:
1. Implement the IPLD Formats to support torrents
2. Implement a blockstore that uses webtorrent as its storage driver
3. Implement the
|
\o/ |
@diasdavid maybe wait with the torrent blob store for the datastore refactor? |
@dignifiedquire I see the value, but won't block Torrent support because of the datastore refactor, it is not a dependency. |
To keep on log, here is the real structure of both Torrent file and info fields - https://wiki.theory.org/BitTorrentSpecification#Metainfo_File_Structure |
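As a rough illustration of how the info field gets its identifier, here is a minimal sketch that bencodes an info dictionary and hashes it with SHA-1. The `bencode` and `infoHash` helpers are hypothetical names written for this example, not part of js-ipfs or any existing library, and the encoder only covers the four bencode types. Real implementations usually hash the original bytes of the info field rather than re-encoding a decoded object, since a decode/encode round trip is not guaranteed to reproduce the same bytes.

```javascript
const crypto = require('node:crypto')

// Minimal bencode encoder (hypothetical helper for this sketch).
// Supports integers, byte strings, lists and dictionaries.
function bencode (value) {
  if (Buffer.isBuffer(value)) {
    // Byte string: <length>:<bytes>
    return Buffer.concat([Buffer.from(`${value.length}:`), value])
  }
  if (typeof value === 'string') {
    return bencode(Buffer.from(value, 'utf8'))
  }
  if (Number.isInteger(value)) {
    // Integer: i<number>e
    return Buffer.from(`i${value}e`)
  }
  if (Array.isArray(value)) {
    // List: l<items>e
    return Buffer.concat([Buffer.from('l'), ...value.map(bencode), Buffer.from('e')])
  }
  if (value !== null && typeof value === 'object') {
    // Dictionary: d<key><value>...e, with keys sorted as raw byte strings.
    const keys = Object.keys(value).sort((a, b) => Buffer.compare(Buffer.from(a), Buffer.from(b)))
    const parts = keys.flatMap((key) => [bencode(key), bencode(value[key])])
    return Buffer.concat([Buffer.from('d'), ...parts, Buffer.from('e')])
  }
  throw new TypeError(`cannot bencode value of type ${typeof value}`)
}

// The v1 infoHash is the SHA-1 digest of the bencoded info dictionary.
function infoHash (info) {
  return crypto.createHash('sha1').update(bencode(info)).digest('hex')
}
```

Calling `infoHash(info)` on a decoded info dictionary returns a 40-character hex string of the kind that appears in a magnet URI.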
Bringing this one back (🎪 )
It turns out that we might actually just need to do the bencode, because the format, as described in -- https://wiki.theory.org/BitTorrentSpecification#Metainfo_File_Structure -- prescribes that the SHA1 hashes of the pieces be all concatenated, which means that there won't be any This means that we won't be able to use IPLD resolver to traverse through, without transforming the data, as that pieces field will just be a very long byte array value. |
It's pretty ironic, but we can exploit the fact that it prescribes SHA1 and split every 40 bytes. |
20 bytes*, @lgierth we can indeed, that falls into the 'Transformations' category, as IPLD compatible format goes, we are strict about not messing with the data. |
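For reference, the split being discussed can be sketched as follows. This assumes the pieces field has already been decoded into a Node.js Buffer; `splitPieces` is a hypothetical helper name used only for illustration.

```javascript
const SHA1_LENGTH = 20 // a raw SHA-1 digest is 20 bytes (40 hex characters)

// Split the concatenated `pieces` byte string into individual SHA-1 digests.
// The returned Buffers are views onto the original bytes, so nothing is copied
// or re-serialized.
function splitPieces (pieces) {
  if (pieces.length % SHA1_LENGTH !== 0) {
    throw new RangeError('pieces length is not a multiple of 20 bytes')
  }
  const hashes = []
  for (let offset = 0; offset < pieces.length; offset += SHA1_LENGTH) {
    hashes.push(pieces.subarray(offset, offset + SHA1_LENGTH))
  }
  return hashes
}
```

Because `subarray` returns views rather than copies, the underlying block stays byte-for-byte identical, which is the property the strictness argument is concerned with.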
@diasdavid I don't think splitting on 20 bytes for each piece id is any different than biting off the first N bytes for the first parameter of any binary serialization. I would say it's not a transformation if the serialization doesn't need to change. Our thinking seems to diverge here, based on previous discussions around ethereum resolvers. |
@kumavis agreed that there might be space to be a little less strict with the separation of I'll be with @nicola next week and revisit this question for IPLD transformations. Let's continue this thread on the IPLD repo ipld/ipld#13. |
I think ipld/ipld#13 is slightly more complicated (pre-process with hash, split into halfbytes). splitting the concatenated SHA1 refs still falls under (consume path part, return result) which is no more of a transformation than any IPFS resolver performs. |
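To make the "consume path part, return result" idea concrete, here is a hedged sketch of a resolver over an already decoded info dictionary. The `resolve` function and its `{ value, remainderPath }` return shape are assumptions made for this example, loosely modelled on how IPLD format resolvers report results; it is not the js-ipfs implementation.

```javascript
const SHA1_LENGTH = 20

// Walk `path` through a decoded info dictionary, one path part at a time.
// A part that follows `pieces` is treated as an index into the concatenated
// SHA-1 digests, resolved by slicing the original bytes.
function resolve (info, path) {
  const parts = path.split('/').filter((part) => part.length > 0)
  let value = info
  let insidePieces = false

  while (parts.length > 0) {
    const part = parts[0]
    if (insidePieces) {
      const index = Number(part)
      const count = value.length / SHA1_LENGTH
      if (!Number.isInteger(index) || index < 0 || index >= count) {
        throw new RangeError(`no piece at index "${part}"`)
      }
      value = value.subarray(index * SHA1_LENGTH, (index + 1) * SHA1_LENGTH)
      insidePieces = false
    } else if (
      value !== null &&
      typeof value === 'object' &&
      !Buffer.isBuffer(value) &&
      Object.prototype.hasOwnProperty.call(value, part)
    ) {
      value = value[part]
      insidePieces = part === 'pieces' && Buffer.isBuffer(value)
    } else {
      // Nothing more can be consumed; report what is left of the path.
      break
    }
    parts.shift()
  }

  return { value, remainderPath: parts.join('/') }
}
```

With this shape, `resolve(info, 'pieces/1')` returns the second 20-byte digest, and any path parts that cannot be consumed come back in `remainderPath`.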
I wanted to note the release of The BitTorrent Protocol Specification v2. I don't expect it to be fully supported soon, but it's probably worth being aware of it when designing v1 support. My understanding may not be entirely correct, but here are the key points as I understand them: v2 torrents use different structures than v1 in the info dictionary and metainfo .torrent files. v2 torrents are identified using the SHA-2-256 hash of the info dictionary, truncated to 20 bytes to match the length of v1's SHA-1 hashes. It's possible to create hybrid torrents that contain both v1 and v2 structures, and can be identified by either hash. Because a different hash function is used, v1 and v2 torrents' IPFS paths can be distinguished (because that's included in their multihash):
BitTorrent magnet links do not have this information; v1 and v2 magnet links cannot be distinguished. I think you need to connect to the torrent swarm and download the metadata before you can check which version and hash algorithm were used. So it may not be strictly possible to map BitTorrent magnet URLs (e.g. ipfs/ipfs-companion#256) to a specific IPFS path, because the hash algorithm will not be known. |
ping @arvidn Maybe you know if magnet: links uniquely identify content, or if it needs network discovery, and if this is considered a feature or bug for v2? |
What I wrote above is wrong! I apologize for the misinformation. >_< The updated BEP-9 does in fact use a multihash under a different key to identify v2 torrent data. I thought that this was cut out before the final version. (The idea of using multihash elsewhere in the protocol was cut; I didn't realize it remained here.) So I think the direct mapping is: SHA-1 for v1, truncated SHA-2-256 for v2. Hybrid torrents still have two possible addresses, but that shouldn't be a problem. |
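A minimal sketch of that mapping, under some assumptions: v1 hashes appear as `xt=urn:btih:` followed by 40 hex characters, v2 hashes appear as `xt=urn:btmh:` followed by a hex-encoded multihash, and the multihash prefix for SHA-1 is code 0x11 with length 0x14. `magnetToMultihashes` is a hypothetical helper name; the base32 form of `btih` is not handled here.

```javascript
// Extract multihashes from a magnet URI (sketch, see assumptions above).
// Returns one entry per recognised `xt` parameter, so a hybrid torrent
// can yield both a v1 and a v2 entry.
function magnetToMultihashes (magnetUri) {
  const query = magnetUri.slice(magnetUri.indexOf('?') + 1)
  const params = new URLSearchParams(query)
  const result = []

  for (const xt of params.getAll('xt')) {
    if (xt.startsWith('urn:btih:')) {
      const hex = xt.slice('urn:btih:'.length)
      if (!/^[0-9a-fA-F]{40}$/.test(hex)) continue // base32 form not handled
      // Wrap the bare SHA-1 digest in a multihash: code 0x11, length 0x14 (20).
      const prefix = Buffer.from([0x11, 0x14])
      result.push({ version: 1, multihash: Buffer.concat([prefix, Buffer.from(hex, 'hex')]) })
    } else if (xt.startsWith('urn:btmh:')) {
      const hex = xt.slice('urn:btmh:'.length)
      if (!/^([0-9a-fA-F]{2})+$/.test(hex)) continue
      // Assumed to already be a multihash, so it is used as-is.
      result.push({ version: 2, multihash: Buffer.from(hex, 'hex') })
    }
  }
  return result
}
```

Since the hash function is read from the multihash prefix rather than guessed, the two entries of a hybrid torrent stay distinguishable.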
yeah, the hash in the magnet link definitely identifies the content. However, it also identifies some other metadata such as piece size, file names, etc. So even with bittorrent v1, it's possible to have two separate magnet links refer to exactly identical content (but with different piece sizes for instance). |
Great feature (adding torrent support)! What's the current status (no activity on this thread for 4 years)? Also does anyone know if there are similar torrent supporting efforts going on in go-ipfs? |
js-ipfs is being deprecated in favor of Helia. You can follow the migration plan here #4336 and read the migration guide. This piece of work was never completed - there are |
I've started working on enabling Torrent support for js-ipfs, very much in the same way that we have support for: dag-pb, dag-cbor, eth-blocks, eth-tx, zcash (go-ipfs only), git (go-ipfs only) and bitcoin (go-ipfs only).
The end goal is to expose two top level commands to add and retrieve files that are Torrents, from the IPFS or BitTorrent network (through a bridge and in the future, by connecting directly). The commands being:
However, I stumbled upon a question on which we will have to make a decision, and I would like to get feedback before going at full speed. In BitTorrent, torrent files are not referenced by a cryptographic hash due to their ephemeral and mutable nature (in fact, decoding and encoding is not even always idempotent by spec); the only thing that has a cryptographic identifier is the info field in the torrent file.
I started implementing the IPLD format for a Torrent file, but I'm guessing that most people will want to fetch their torrent through the infoHash that they get from something like a magnet URI. The crux is that there is never a file for the info field: as soon as an infoHash query is performed, a Torrent file is retrieved, raising the question of:
Should dag.get(<infoHash>/somePath) resolve through the retrieved Torrent file or only over the info field?
One option is to make the info field a full standalone object that can be transferred independently (the solution I'm leaning towards). This option would result in two multicodecs for Torrents, torrent-file and torrent-info.
Thoughts? //cc @jbenet @whyrusleeping @nicola