Explore using Libtorrent as our database, filesystem, and dissemination solution #3484
We could use Linear Tape File System for this stuff. |
As everyone will have their own blockchain/filesystem, it would be reasonably fast. One could also defragment it occasionally, cleaning out the unneeded stuff. For incremental updates, one could use the torrent "update torrent" feature, adding new variable-sized blocks in the form of files. This could be described as a "streamed filesystem". |
We could use this: https://www.libtorrent.org/manual-ref.html#ssl-torrents for tying the libtorrent download to the channel owner's public key on the transport level (SSL). Here is some prototype code (Python 2) for fitting arbitrary key-value stores into files for a 16 MB piece torrent:

```python
import os

from libtorrent import (add_files, bdecode, bencode, create_torrent,
                        create_torrent_flags_t, file_storage, set_piece_hashes)

PIECE_SIZE = 16 * 1024 * 1024  # 16 MB


def entry_size(key, value):
    # Approximate bencoded size of one dict entry ("<len>:<key><len>:<value>"),
    # with a little slack for the separators.
    return len(str(len(key))) + len(str(len(value))) + len(key) + len(value) + 4


class Chunk(object):
    """A key-value store that fits, bencoded, inside a single torrent piece."""

    def __init__(self):
        super(Chunk, self).__init__()
        self.data = {}
        self.current_length = 0
        self.max_length = PIECE_SIZE - 2  # minus len('d') + len('e'), the bencode dict framing

    def add(self, key, value):
        combined_len = entry_size(key, value)
        if self.current_length + combined_len <= self.max_length:
            self.data[key] = value
            self.current_length += combined_len
            return True
        return False

    def remove(self, key):
        # Tolerate missing keys: ChunkedTable.remove() calls this on every chunk.
        if key in self.data:
            value = self.data.pop(key)
            self.current_length -= entry_size(key, value)

    def serialize(self):
        return bencode(self.data)

    @classmethod
    def unserialize(cls, data):
        out = cls()
        for key, value in bdecode(data).iteritems():
            out.add(key, value)
        return out


class ChunkedTable(object):
    """A key-value store spread over multiple piece-sized chunks."""

    def __init__(self):
        super(ChunkedTable, self).__init__()
        self.chunklist = {}

    def add(self, key, value):
        for chunk in self.chunklist.values():
            if chunk.add(key, value):
                return True
        chunk = Chunk()
        if not chunk.add(key, value):
            return False  # key-value pair too large for any chunk
        self.chunklist[len(self.chunklist)] = chunk
        return True

    def remove(self, key):
        for chunk in self.chunklist.values():
            chunk.remove(key)

    def serialize(self):
        out = {}
        for i in range(len(self.chunklist)):
            out[str(i)] = self.chunklist[i].serialize()
        return out

    @classmethod
    def unserialize(cls, map):
        chunk_table = cls()
        for i in map.keys():
            chunk_table.chunklist[int(i)] = Chunk.unserialize(map[i])
        return chunk_table

    def get_all(self):
        out = {}
        for chunk in self.chunklist.values():
            out.update(chunk.data)
        return out


class Channel(object):
    """A directory of chunk files, published as a single torrent."""

    def __init__(self, name, directory=".", allow_edit=False):
        super(Channel, self).__init__()
        self.name = name
        self.channel_directory = os.path.abspath(os.path.join(directory, name))
        if not os.path.isdir(self.channel_directory):
            os.makedirs(self.channel_directory)
        self.chunked_table = ChunkedTable()

    def add_magnetlink(self, magnetlink):
        self.chunked_table.add(magnetlink, "")

    def remove_magnetlink(self, magnetlink):
        self.chunked_table.remove(magnetlink)

    def get_magnetlinks(self):
        return self.chunked_table.get_all().keys()

    def commit(self):
        # Write each chunk to a file named after its index ("0", "1", ...).
        for filename, content in self.chunked_table.serialize().iteritems():
            with open(os.path.join(self.channel_directory, filename), 'w') as f:
                f.write(content)

    def make_torrent(self):
        fs = file_storage()
        add_files(fs, self.channel_directory)
        flags = create_torrent_flags_t.optimize | create_torrent_flags_t.calculate_file_hashes
        t = create_torrent(fs, piece_size=PIECE_SIZE, flags=flags)
        t.set_priv(False)
        # Hash relative to the parent of the channel directory, where the files live.
        set_piece_hashes(t, os.path.dirname(self.channel_directory))
        torrent_name = os.path.join(self.channel_directory, self.name + ".torrent")
        with open(torrent_name, 'w') as f:
            f.write(bencode(t.generate()))
        return torrent_name

    def load(self):
        data = {}
        for filename in os.listdir(self.channel_directory):
            if filename.isdigit():
                with open(os.path.join(self.channel_directory, filename), 'r') as f:
                    data[filename] = f.read()
        self.chunked_table = ChunkedTable.unserialize(data)


# TEST
channel = Channel('mychannel', allow_edit=True)
channel.add_magnetlink('a' * 20)
channel.add_magnetlink('b' * 20)
channel.remove_magnetlink('a' * 20)
channel.commit()
torrent = channel.make_torrent()

discovered_channel = Channel('mychannel')
discovered_channel.load()
print discovered_channel.get_magnetlinks()
```
|
There's even a mutable_torrent_support flag for |
@arvidn thanks, it seems we won't even need to make it a Merkle tree torrent with our structure then. |
Moved the development of this to my fork: https://github.com/qstokkink/tribler/blob/allchannel2/Tribler/community/allchannel2/structures.py |
Random idea: dissemination of content metadata could be rewarded with bandwidth tokens when performed over anonymous tunnels (see ticket #3337). This would work better if we had (a set of) thumbnails attached to each content torrent. Whether this reward scheme is a good idea or not is open for discussion. We could even provide something like 'hidden channels' where channel content metadata is only seeded over end-to-end tunnels. I'm not sure about the legal implications of this, though. |
@devos50 In principle I think this is a good idea. I also added a secret feature in #3489 to directly store metadata alongside the magnetlinks in the channels. Providing incentive to share the channels would be good. It does strike me as overkill for most channels to actually use tunnels for infohash dissemination. Actually downloading the channel contents might lend itself more to anonymization & payout. This brings up another question: should you be paid equally for relaying tunnel traffic, exiting tunnel traffic and sharing channels? Should it be a marketplace which can be mined? |
Let's keep things as simple as possible for 2018. |
Attacking Merkle Trees with a second preimage attack https://news.ycombinator.com/item?id=16572793 |
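The attack linked above works because a Merkle tree that hashes leaves and internal nodes the same way lets an attacker present the concatenation of two internal hashes as a leaf. A common mitigation (used, e.g., in RFC 6962 Certificate Transparency) is domain separation: prefix leaf and node hashes differently. A minimal sketch in Python (hashlib only; the function names are illustrative, not from any Tribler code):

```python
import hashlib

LEAF_PREFIX = b"\x00"  # RFC 6962-style domain separation
NODE_PREFIX = b"\x01"

def leaf_hash(data):
    # Prefixing leaves differently from internal nodes prevents an attacker
    # from passing off a pair of internal hashes as a single leaf.
    return hashlib.sha256(LEAF_PREFIX + data).digest()

def node_hash(left, right):
    return hashlib.sha256(NODE_PREFIX + left + right).digest()

def merkle_root(leaves):
    level = [leaf_hash(l) for l in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last hash on odd-sized levels
            level.append(level[-1])
        level = [node_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

With the prefixes in place, a "leaf" containing two concatenated child hashes can no longer collide with the internal node built from those children.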
TrustChain blocks can have arbitrary size and contents. Currently, we store the user's TrustChain in the SQLite database and spread it out using IPv8 queries. But TrustChain is, by definition, an append-only data structure. This means we can write it into a file on disk as it grows, and periodically publish it in a torrent, as we do with GigaChannels. In fact, GigaChannel already features simple and efficient code to do just that: periodically dump binary data into file chunks, dynamically compressing the dumped data with LZ4 (and serving queries through IPv8). As a first step towards this goal, we could add sidechain support to TrustChain, so we can experiment with various forms of sidechain offloading. |
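The periodic dump described above can be sketched as a simple append-only chunk log. The sketch below uses stdlib zlib as a stand-in for the LZ4 compression mentioned in the comment, and all names are illustrative rather than taken from the GigaChannel code:

```python
import os
import zlib

CHUNK_SIZE = 1 << 20  # flush a compressed chunk every ~1 MiB (illustrative)

class AppendOnlyChunkLog(object):
    """Append serialized records; flush compressed chunks to numbered files."""

    def __init__(self, directory):
        self.directory = directory
        if not os.path.isdir(directory):
            os.makedirs(directory)
        self.buffer = bytearray()
        # Resume numbering from any chunk files already on disk.
        self.next_chunk = len([f for f in os.listdir(directory) if f.isdigit()])

    def append(self, record):
        # Length-prefix each record so chunks can be re-parsed later.
        self.buffer += len(record).to_bytes(4, "big") + record
        if len(self.buffer) >= CHUNK_SIZE:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        path = os.path.join(self.directory, str(self.next_chunk))
        with open(path, "wb") as f:
            f.write(zlib.compress(bytes(self.buffer)))
        self.buffer = bytearray()
        self.next_chunk += 1

    def records(self):
        # Replay all flushed records in order.
        for i in range(self.next_chunk):
            with open(os.path.join(self.directory, str(i)), "rb") as f:
                data = zlib.decompress(f.read())
            pos = 0
            while pos < len(data):
                n = int.from_bytes(data[pos:pos + 4], "big")
                pos += 4
                yield data[pos:pos + n]
                pos += n
```

Because the log only ever grows, the numbered chunk files are immutable once flushed, which is exactly what makes them cheap to publish in a torrent.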
Please focus on a minimal viable PR. This is outside the sprint scope. We have another ticket on using torrents for TrustChain; it seems like a smart idea. We will explore it later. |
@synctext, this was exactly my point: we first finish the current GigaChannel PR and release 7.2. Then, when we have enough experience with using libtorrent as a channels dissemination engine, we continue with this issue (TrustChain merging). |
We believe this issue has been sufficiently addressed by our recent GigaChannel efforts. We are now using libtorrent as the underlying mechanism to disseminate magnet links and metadata. |
Seed, stop seeding, modify file/blob, redo hash check, seed.
Block alignment is essential. Do BitTorrent pieces align with fixed-size TrustChain records, or variable-size ones? A filesystem for multiple chains?
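One possible answer to the alignment question above: keep records no larger than a piece and pad so that a record never straddles a piece boundary, which lets a downloader parse each piece independently. A hedged sketch (illustrative, not from any Tribler code; assumes record sizes at most one piece):

```python
PIECE_SIZE = 16 * 1024 * 1024  # match the channel torrent's piece size

def place_records(record_sizes, piece_size=PIECE_SIZE):
    """Assign byte offsets so no record straddles a piece boundary.

    When a record would cross a boundary, skip (pad) to the start of
    the next piece instead.
    """
    offsets = []
    pos = 0
    for size in record_sizes:
        first_piece = pos // piece_size
        last_piece = (pos + size - 1) // piece_size
        if first_piece != last_piece:
            pos = (first_piece + 1) * piece_size  # pad to the next piece
        offsets.append(pos)
        pos += size
    return offsets
```

The trade-off is wasted padding space versus being able to hash-check and decode any single piece without its neighbours, which matters for variable-size TrustChain records.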
First: seek related work! This stuff has been beaten to death in the past 30 years. Post papers, architectures, ideas.