state-bridge: Census peer scoring #1501
Comments
I'll propose an alternate framework. Weights are all additive to avoid multipliers, making it much easier to reason about the ranges for different values.
Given the list of interested peers, a weighted random selection should be made from the calculated weights. The main thing I'd like to augment this with would be the data transfer rate, which I think requires us to have an actual ongoing measurement of each individual peer's throughput in something like bytes/sec. With that we can award additional weight based on how they do in comparison with the other peers.
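An ongoing per-peer throughput measurement could be kept as a simple smoothed estimate. Below is a minimal sketch, assuming an exponentially weighted moving average in bytes/sec; the struct, the function names, the 0.3 smoothing factor, and the bonus size 50 are all illustrative, not part of any existing trin API.

```rust
/// Illustrative sketch: smoothed per-peer throughput in bytes/sec,
/// maintained as an exponentially weighted moving average (EWMA).
/// All names and constants here are hypothetical.
#[derive(Default)]
struct ThroughputTracker {
    ewma_bytes_per_sec: Option<f64>,
}

impl ThroughputTracker {
    /// Record one completed transfer of `bytes` bytes over `secs` seconds.
    fn record(&mut self, bytes: u64, secs: f64) {
        const ALPHA: f64 = 0.3; // weight given to the newest sample
        let sample = bytes as f64 / secs;
        self.ewma_bytes_per_sec = Some(match self.ewma_bytes_per_sec {
            None => sample,
            Some(prev) => ALPHA * sample + (1.0 - ALPHA) * prev,
        });
    }
}

/// Additive bonus for peers whose measured throughput beats the group mean
/// (the bonus size 50 is an arbitrary placeholder).
fn throughput_bonus(peer_rate: f64, group_mean: f64) -> u64 {
    if peer_rate > group_mean { 50 } else { 0 }
}

fn main() {
    let mut t = ThroughputTracker::default();
    t.record(1_000_000, 1.0); // first sample: 1_000_000 B/s
    t.record(500_000, 1.0);   // EWMA moves toward 500_000 B/s
    let rate = t.ewma_bytes_per_sec.unwrap();
    println!("ewma = {rate} B/s, bonus = {}", throughput_bonus(rate, 800_000.0));
}
```

An EWMA fits the additive framing above: the comparison against the group mean produces a bounded bonus rather than a multiplier.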
Since you didn't mention it, do you think it's important to prioritize / give higher weight to nodes that are closer to the content?
Can you elaborate, from a probability/algorithm point of view, on how negative weights would be used? I guess your idea can be used in the same way as if the starting weight is 200 and the min-max range is 0-400, but I'm not sure if the probability distribution you had in mind is different. I'm also pretty sure we can implement multiple different "peer scoring" algorithms and compare their performance (e.g. percentage of failed attempts).
For the range, I imagined that implementations would treat this as a weight in the range 0-400. It felt awkward to define the algorithm with each node having a starting weight of 200. Also worth saying that the exact numbers here are utterly arbitrary and I'm totally open to suggestions for better numbers.
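The "start at 200, clamp to 0-400" reading of the additive framing could be sketched like this (the function name is made up, and the constants are the arbitrary ones from the discussion above):

```rust
/// Sketch of the additive framing: every peer starts at 200, bonuses and
/// penalties are summed in, and the result is clamped into 0..=400.
/// (Constants are the arbitrary placeholders from the discussion.)
fn additive_weight(adjustments: &[i64]) -> u64 {
    const START: i64 = 200;
    (START + adjustments.iter().sum::<i64>()).clamp(0, 400) as u64
}

fn main() {
    // A bonus of +100 and a penalty of -50 land at 250.
    println!("{}", additive_weight(&[100, -50]));
    // Penalties can never push a peer below 0.
    println!("{}", additive_weight(&[-500]));
}
```

The clamp is what makes the two framings equivalent in practice: a "negative weight" in the additive view just means the sum bottoms out at 0 and the peer is never selected.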
No, I don't think distance should be a part of this. Neighborhood gossip should take care of getting content to the closest nodes, and I don't think pushing content to the closest nodes is likely to improve our primary performance metric (rate or speed of gossip), nor do I think it improves network health.
If we were able to have throughput measurements, then I would suggest this modification.
I also think that we should add extra penalties for failed transfers. In theory we could do this by measuring wasted bandwidth and time. The MBR tells us how long a transfer should take on average. We should be able to combine this with the payload size to get a time.
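If an average-rate measurement is available, the "how long should this transfer take" estimate could be sketched as follows. This is a hypothetical sketch only: the function names and the penalty constant 25 are made up, and the average rate is treated as plain bytes/sec.

```rust
/// Expected duration in seconds for a payload at a measured average rate.
/// (Sketch only; how the average rate is obtained is out of scope here.)
fn expected_transfer_secs(payload_bytes: u64, avg_bytes_per_sec: u64) -> f64 {
    payload_bytes as f64 / avg_bytes_per_sec as f64
}

/// Extra penalty weight when a failed transfer also overran its expected
/// time (the constant 25 is an arbitrary placeholder).
fn overrun_penalty(actual_secs: f64, expected_secs: f64) -> u64 {
    if actual_secs > expected_secs { 25 } else { 0 }
}

fn main() {
    let expected = expected_transfer_secs(2_000_000, 500_000); // 2 MB at 500 kB/s
    println!("expected {expected} s, penalty {}", overrun_penalty(7.5, expected));
}
```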
Before you implement this in trin, can you make sure that we have measurable metrics for knowing whether the change improved the throughput of the bridge...
One thing to note is that we currently filter out Ultralight nodes; peer scoring mainly helps with peers that don't perform well. As our networks grow, the number of bad peers will grow. Currently, filtering out Ultralight made a big difference. I think peer scoring is the right way to do this, as we shouldn't discriminate against a node based on its implementation but on its performance. Bad nodes block throughput we could be using for working nodes. In a perfect world something like this wouldn't make a difference. My assumption is that as the networks grow, the importance of peer scoring grows.
Problem
The Census only keeps track of all available peers, and their liveness is checked once per hour.
When peers are requested for a given content id, the Census filters out all peers whose radius doesn't include the provided content, and returns the first X of the rest (4 at the moment of writing).
Ideally, we should have some reputation or scoring system for the peers and use it when selecting them.
High level idea
Firstly, we should still filter out peers that are not interested in the content (as we do now). Then, each peer will be assigned a weight, and X peers will be selected randomly based on their weights.
The weight can be calculated as the product of the following: the distance, reputation, and recency weights.
Note: All constants can be configurable parameters. Given values are picked by feeling and could be a good starting point (unless somebody has some other suggestions).
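Putting the three weights together, the selection could be sketched as below. The per-weight formulas here are illustrative stand-ins consistent with the prose in this issue (top 16 bits of the distance, halving per recent failure, doubling per 5 idle minutes); none of the function shapes or constants are final, and the names are hypothetical.

```rust
/// Top 16 bits (2 bytes) of a big-endian 256-bit distance.
fn as_u16(distance: &[u8; 32]) -> u16 {
    u16::from_be_bytes([distance[0], distance[1]])
}

/// Illustrative combined weight: a distance-based base, halved once per
/// recent failed OFFER, doubled once per full 5 minutes without an offer.
fn peer_weight(
    distance: &[u8; 32],
    recent_failed_attempts: u32, // out of the last 16 OFFERs
    minutes_since_last_offer: u64,
) -> u64 {
    // Closer peers (smaller distance) get a larger base weight.
    let base = (u16::MAX - as_u16(distance)) as u64 + 1;
    let halvings = recent_failed_attempts.min(16);
    let doublings = (minutes_since_last_offer / 5).min(16) as u32;
    if doublings >= halvings {
        base.saturating_mul(1u64 << (doublings - halvings))
    } else {
        base >> (halvings - doublings)
    }
}

/// Weighted random pick: given `r` uniform in [0, sum of weights), return
/// the index whose cumulative weight range contains `r`.
fn select_weighted(weights: &[u64], mut r: u64) -> usize {
    for (i, &w) in weights.iter().enumerate() {
        if r < w {
            return i;
        }
        r -= w;
    }
    weights.len() - 1
}

fn main() {
    let healthy = peer_weight(&[0u8; 32], 0, 0);
    let failing = peer_weight(&[0u8; 32], 3, 0);
    println!("healthy={healthy} failing={failing}");
    println!("picked index {}", select_weighted(&[healthy, failing], 70_000));
}
```

Selecting X peers would repeat the weighted pick (removing or re-weighting already-chosen peers); the sketch only shows a single draw.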
Distance weight
This is the main weight: peers should be selected based on their distance to the content.
The as_u16 function takes the 16 most significant bits (2 bytes) of the U256 distance metric.
Reputation weight
This is the main penalty weight: peers with recent failed attempts should be penalized.
The recent_failed_attempts value is the number of recently failed OFFER requests to the peer. Only the last 16 OFFER requests should be considered. Rejected offers shouldn't be counted as failures. Each failed attempt lowers the weight by a factor of 2.
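One way to keep only the last 16 OFFER outcomes per peer is a small ring buffer, where rejected offers are simply never recorded. This is a hypothetical sketch, not trin's actual bookkeeping; the names and the base weight 400 are illustrative.

```rust
/// Sketch: per-peer history of the last 16 recorded OFFER outcomes.
/// Rejected offers are never recorded, so they never count as failures.
struct OfferHistory {
    outcomes: [bool; 16], // true = failed transfer
    len: usize,           // how many slots are filled so far
    next: usize,          // ring-buffer write position
}

impl OfferHistory {
    fn new() -> Self {
        Self { outcomes: [false; 16], len: 0, next: 0 }
    }

    /// Record one completed (accepted) OFFER; `failed` marks a failed transfer.
    fn record(&mut self, failed: bool) {
        self.outcomes[self.next] = failed;
        self.next = (self.next + 1) % 16;
        self.len = (self.len + 1).min(16);
    }

    fn recent_failed_attempts(&self) -> u32 {
        self.outcomes[..self.len].iter().filter(|&&f| f).count() as u32
    }
}

/// Each recent failure halves the base weight.
fn reputation_weight(base: u64, recent_failed_attempts: u32) -> u64 {
    base >> recent_failed_attempts.min(63)
}

fn main() {
    let mut h = OfferHistory::new();
    h.record(true);  // failed transfer
    h.record(false); // successful transfer
    println!(
        "failures={} weight={}",
        h.recent_failed_attempts(),
        reputation_weight(400, h.recent_failed_attempts())
    );
}
```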
Recency weight
This is the main recovery function: it "rewards" peers that weren't offered content in a long time.
Peers that were failing requests in the past but have recovered in the meantime would have a hard time being selected, and this weight should help with that.
For every 5 minutes that we didn't offer content to the peer, their weight increases by a factor of 2.
Assuming all peers are healthy, this shouldn't be a big factor, because of the high volume and random distribution of content ids (meaning all peers should be selected before this weight grows too big).
This only starts playing an important role for peers that are penalized by reputation_weight, in which case it takes 5 minutes to recover one failed attempt.
Future improvements
Some ideas are omitted from the initial approach, as I want to start with something simple and add more complexity afterwards if needed.