-
Notifications
You must be signed in to change notification settings - Fork 451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Content popularity #3660
Content popularity #3660
Conversation
@@ -303,6 +304,21 @@ def load_ipv8_overlays(self): | |||
|
|||
self.ipv8.strategies.append((RandomWalk(self.market_community), 20)) | |||
|
|||
# Popular Community | |||
if self.session.config.get_popular_community_enabled() or True: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this line always evaluate to True?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! good catch
# Popular Community | ||
|
||
def get_popular_community_enabled(self): | ||
return self.config['popular_community']['enabled'] or True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is likely a mistake too.
4e56d0a
to
48bdeea
Compare
retest this please |
21fb686
to
00659e4
Compare
dcc815a
to
4bb446e
Compare
retest this please |
@@ -873,6 +889,9 @@ def early_shutdown(self): | |||
yield self.ipv8.unload_overlay(self.tunnel_community) | |||
yield self.ipv8.unload_overlay(self.triblerchain_community) | |||
|
|||
if self.popular_community: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you unload this separately? The tunnel and triblerchain are only unloaded the way they are to enforce the unload ordering.
0c3d7f1
to
bfc39e5
Compare
query = "ubuntu" | ||
self.nodes[0].overlay.send_torrent_search_request(query) | ||
|
||
yield self.deliver_messages() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are the asserts?
serialized_response = self.serializer.pack_multiple(in_response.to_pack_list()) | ||
|
||
# Deserialize request and test it | ||
(deserialized_ressponse, _) = self.serializer.unpack_multiple(ContentInfoResponse.format_list, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too many s's
bc19910
to
c99e53e
Compare
The |
cbcc9d1
to
97960d5
Compare
@xoriole I guess this PR is almost ready for merge. If you still have some time, please cleanup your commit history so you have a logical sequence of commits (i.e. three commits: added pubsub community, added popularity community, improved tests). @qstokkink can you also give your seal of approval soon? I've updated the SonarQube technical debt threshold to 24 days so it should pass when you retest this PR. This is really not the way we should manage technical debt but I guess that @ichorid will significantly lower the technical debt once the ORM model becomes active (since we can remove all Tribler database crap then). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, address these issues. These are easy to fix, but neglecting them will have a large impact on security.
elif peer in self.publishers: | ||
self.publishers.remove(peer) | ||
|
||
def create_message_packet(self, message_type, payload): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, we definitely have to put this into IPv8 for everyone's convenience.
payload = payload if isinstance(payload, list) else payload.to_pack_list() | ||
return self._ez_pack(self._prefix, message_type, [auth, dist, payload]) | ||
|
||
def broadcast_message(self, packet, peer=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I second this.
""" | ||
# Add or remove the publisher peer | ||
if subscribe: | ||
self.publishers.add(peer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we add publisher twice: once here (on sending sub packet), and once on receiving confirmation packet:
line 178:
if payload.subscribe:
self.publishers.add(peer)
Is it by design or I don't get something?
If I would design this system, I would only add peers to my publishers' list on the response packet. And to prevent abuse I would put in each sub packet some nonce, and remember it, and only accept response packets with this nonce (this would solve the packet loss/timeout problem, by the way).
How do you handle timeouts?
Tribler/community/popular/pubsub.py
Outdated
peer = self.get_peer_from_auth(auth, source_address) | ||
|
||
if payload.subscribe: | ||
self.publishers.add(peer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ноw do you prevent abuse of this packets? Do you track unanswered subscribe requests?
self.publishers.remove(peer) | ||
|
||
# Create subscription packet and send it | ||
subscription = ContentSubscription(cache.number, subscribe) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If cache.number
is incremented sequentially, and after seeing one of them you could guess the next one, you still have a security hole there. Imagine the following attack scenario:
Alice sends a subscription message to Bob. Bob's private key is compromised by Charlie. Charlie eavesdrops the message, extract from it the cache.number
and generates a long sequence of cache.number
s that follow the original one. Next, Charlie creates some sybil private keys and starts to bomb Alice's machine with ContentSubscription
messages. From this moment, any Alice's request for subscription will end up subscribing to Charlie.
You should never use a non-random nonce in crypto.
So, if cache.number
is generated at random, you're good. If not, you're screwed.
Please, check it.
There is another way to fix this problem. You already store the whole packet in the cache, so you can check if the reply was actually from the person you send the original request to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cache.number
is generated randomly.
cache = self.request_cache.add(ContentRequest(self.request_cache, MSG_SUBSCRIBE, None)) | ||
# Add or remove the publisher peer | ||
if subscribe: | ||
self.publishers.add(peer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite you're using a set
to store the publishers
, you should still only add them on receiving successful ContentSubscription
message. Otherwise, the confirmation message is useless, and you become vulnerable to packet loss.
Please, make publishers
only be added on successful confirmation message.
(And BTW, do subscriptions time out?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! will make the change.
Subscriptions do not time out. However, there are periodically refreshed to fill up for disconnected peers.
It would be great to post some measurement graphs here.
Q: Quality of popularity gossip information, freshness of information, coverage of information, usability of information, overlap, bandwidth-cost, etc. |
This pull request provides a community for sharing popular and alive content to the connected peers. Right now the content is torrent only but can be extended to channels and others.
Model for sharing the content is publish subscribe. At bootstrap, a peer subscribes to a set of its connected peers sorted in descending order based on the individual trust score of the peers.
When a peer accepts the subscription request, it sends its most popular torrents it had checked to the subscriber and then periodically sends the updated torrent checker responses.
The case of conflict when two peers send the torrent response for the same torrent, the decision to accept or reject any of the response is based on