feat: blacklist peers sharing more than x IPs #1749
Comments
Per discussion: a strategy that works well is to introduce two limits: a soft limit and a hard limit. The hard limit is set high and acts as an absolute point beyond which no more connections are accepted. The soft limit is used as a target for the number of peers or resources that are allowed; periodically, the software takes action to reach that target, either by connecting to more peers or by disconnecting poor ones. Critically, the software continues to make outgoing connections and accept incoming connections (usually with a token-bucket-controlled limit on the incoming connection rate) in this state.

Disconnection is done according to a varied set of criteria focused on certain metrics, primarily diversity. Diversity can roughly be defined as covering as many bases as possible: geographic, utility, in/out, old/new, etc. Under this strategy, "connecting from the same IP" doesn't earn a point for "additional IP diversity" and therefore scores lower.

Looking at an example with soft limit 100, hard limit 200, a connection rate of 10 peers/s, and one "cleanup" per second: even if the spammer creates 200 connections, these get closed quickly down to 100, leaving 100 free "slots" for others to use for an initial connection. Even if the spammer keeps connecting, the rate limiting ensures there are enough free "slots" in the queue for others to connect as well; in each round of cleanup, the spammer's connection count keeps going down, whereas others get a fair chance based on their diversity contribution.

This strategy can be further strengthened by applying a two-level "token bucket" strategy with a per-IP bucket and a global one. The two-level token bucket also applies to all kinds of other rate-limiting and spam-control scenarios: an individual limit balances resources across peers without significantly affecting burst performance, while a global limit protects the node itself.
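For illustration, here is a minimal sketch of that soft/hard-limit cleanup loop in Go; all names (`softLimit`, `diversityScore`, `Peer`) are hypothetical, not nwaku APIs, and the scoring is deliberately simplistic:

```go
package peermanager

import (
	"sort"
	"time"
)

const (
	softLimit = 100 // target peer count the cleanup prunes down to
	hardLimit = 200 // absolute cap; beyond this nothing is accepted
)

// Peer carries just enough state for diversity scoring.
type Peer struct {
	ID          string
	IP          string
	Inbound     bool
	ConnectedAt time.Time
}

// diversityScore is illustrative: a peer scores lower when its IP is
// already represented ("connecting from the same IP" earns no diversity
// point). A real implementation would also weigh geography, utility,
// connection age, etc.
func diversityScore(p Peer, ipCount map[string]int) float64 {
	score := 1.0
	if ipCount[p.IP] > 1 {
		score -= 0.5
	}
	if !p.Inbound {
		score += 0.25 // outbound peers were chosen by us
	}
	return score
}

// cleanup runs periodically (e.g. once per second): if we are above the
// soft limit, the lowest-scoring peers are disconnected until we are
// back at it, freeing slots for new, more diverse connections.
func cleanup(peers []Peer, disconnect func(Peer)) []Peer {
	if len(peers) <= softLimit {
		return peers
	}
	ipCount := make(map[string]int)
	for _, p := range peers {
		ipCount[p.IP]++
	}
	sort.Slice(peers, func(i, j int) bool {
		return diversityScore(peers[i], ipCount) > diversityScore(peers[j], ipCount)
	})
	for _, p := range peers[softLimit:] {
		disconnect(p) // a spammer's clones cluster at the bottom
	}
	return peers[:softLimit]
}
```

Re-sorting by score each round keeps the sketch simple; production code would maintain scores incrementally rather than recomputing them per cleanup.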
But isn't it better to not even allow these connections in the first place, rather than allowing them and then having to prune? If instead of 1 spammer we have 5, isn't this an attack vector? Why have this "optimistic connection" instead of strict blacklisting?
This is indeed interesting. I can see that Prysm has some leaky-bucket limiter for IPs.
Because being permissive is more useful in general, especially for serving nodes: legitimate users are more common than spammers, and the balance of cost between the two is roughly neutral in terms of resource usage. A spammer that connects without negatively affecting the service isn't achieving their goal, whereas with this strategy you are achieving your goal of serving legitimate users even in the presence of a multi-connection spammer. This explains why being permissive is not worse for legitimate users but is worse for the spammer.

To understand why it's actually better to be permissive, focus on the fact that you don't know that 5 connections from the same IP is bad or wrong: you're trying to catch spammers, not 5-connections-from-the-same-IP, and the two are not the same. One case of malicious use seen in the network right now does not generalize to "all cases of 5 connections from the same IP are bad". 10 students in the same university classroom behind a NAT will exhibit this pattern too, so the proposed solution catches some spammers and some legitimate users.

In short, blacklisting is sometimes bad and sometimes good; being permissive is never bad and sometimes good. The cost of accepting a connection is assumed to be negligible here: it is made negligible by the rate limiter, which ensures that, over time, the majority of resources are spent on legitimate connected users.
Thanks for raising the issue @alrevuelta! IMO, I would follow point iii (limit IPs) and set it as a hard limit. We could carry on with the permissive approach advised by @arnetheduck, using the soft limit as an implicit blacklist. On the other hand, regardless of whether the connections are legit or spam, that hard limit is very important as a means of protection. For example, a legitimate user could run a client app that, due to a particular bug, inadvertently tries to establish as many connections as possible.
One more reason why soft limits work better: actually allowing the spammer to connect often slows down the connect/drop cycle. The technique is generally known as tarpitting and works well in particular against buggy clients that sit in a tight connect loop by mistake: the "spammer" in this case has to deal with not knowing whether their connection attempt is slow on their side or on the target's side.

Regarding hard limits, these are actually there mainly to protect against bugs in nwaku itself. In the example I used, the hard limit would never be hit, because the "cleanup" procedure, combined with rate limiting, runs frequently enough to always stay clear of it; it is simply there so the node doesn't crash if everything goes wrong at once. Keep in mind that the OS already deals with a lot of spam if you just let it: it has its own incoming connection queue strategy, which works well as long as you let it do its work (and don't prematurely accept connections, for example).
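A minimal tarpitting sketch, assuming a hypothetical `handleIncoming` hook and a per-IP counter synchronized by the caller:

```go
package tarpit

import (
	"net"
	"time"
)

// handleIncoming sketches tarpitting: an over-limit IP is not rejected
// immediately but stalled, so a client stuck in a tight connect loop
// spends its time waiting and cannot tell whether the slowness is on
// its side or ours. connsPerIP is assumed to be guarded by the caller.
func handleIncoming(conn net.Conn, connsPerIP map[string]int, perIPLimit int) {
	ip, _, err := net.SplitHostPort(conn.RemoteAddr().String())
	if err != nil {
		conn.Close()
		return
	}
	if connsPerIP[ip] >= perIPLimit {
		time.Sleep(30 * time.Second) // the tarpit: burn the caller's time
		conn.Close()
		return
	}
	connsPerIP[ip]++
	// ... proceed with normal protocol negotiation ...
}
```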
Tracking the inbound connection rate limit here.
I don't think this matters greatly, so long as some limit exists; 10 is as good a starting point as any. Consider the numbers: a connection attempt is more or less a few packets at the TCP/IP level, while the node handles thousands of valid network packets per second, so as long as the ratio between the two is kept under control, it's fine. The key is to not call the OS accept faster than the rate limiter allows. One thing to note is that if the queue is too short or the number is too low, it becomes slightly easier for a spammer to fill the queue. This is where the local/global limit comes in: the global limit, set at a higher rate, should control the accept call, while the per-IP limit should be used to delay (or abort) per-IP connection negotiation after accept.
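A sketch of that two-level scheme, with assumed example rates (10 accepts/s globally with burst 20, 1 connection/s per IP with burst 3):

```go
package ratelimit

import (
	"sync"
	"time"
)

// bucket is a simple token bucket: capacity tokens, refilled at rate/sec.
type bucket struct {
	mu       sync.Mutex
	tokens   float64
	capacity float64
	rate     float64 // tokens added per second
	last     time.Time
}

func newBucket(capacity, rate float64) *bucket {
	return &bucket{tokens: capacity, capacity: capacity, rate: rate, last: time.Now()}
}

// allow refills the bucket for elapsed time, then spends one token if present.
func (b *bucket) allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// Limiter applies the two levels: a generous global bucket gating the
// accept call itself, and stricter per-IP buckets gating (or delaying)
// connection negotiation after accept.
type Limiter struct {
	mu     sync.Mutex
	global *bucket
	perIP  map[string]*bucket
}

func NewLimiter() *Limiter {
	return &Limiter{
		global: newBucket(20, 10), // e.g. 10 accepts/s globally, burst 20
		perIP:  map[string]*bucket{},
	}
}

func (l *Limiter) AllowAccept() bool { return l.global.allow() }

func (l *Limiter) AllowNegotiation(ip string) bool {
	l.mu.Lock()
	b, ok := l.perIP[ip]
	if !ok {
		b = newBucket(3, 1) // e.g. 1 conn/s per IP, burst 3
		l.perIP[ip] = b
	}
	l.mu.Unlock()
	return b.allow()
}
```

The global bucket is deliberately more generous than the per-IP ones, so no single IP can consume the whole accept budget.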
Problem
Each node has a maximum amount of `in`/`out` connections. If an attacker creates multiple peers under the same IP and connects to us, we can run out of `in` slots to serve other honest peers. This attack can be extended from one node to the whole network, and will end up leaving very few available `in` slots for honest peers to connect. It has a huge impact on service protocols like store, because the network (depending on its size) can run out of slots to serve these protocols.

There is no single solution to prevent this attack, but rather a combination of measures:

1) …
2) Maintaining `out` peers, so that the node can control some of the peers it is connected to (implemented)
3) Limiting the amount of peers we allow from each IP (the subject of this issue)

Suggested solution
This issue suggests implementing 3) as follows. It aims to mitigate the attack explained above by relying on the fact that Sybil-attacking with IPs is far harder than with peerIds (which is trivial). It suggests limiting the amount of peers that we see for each IP, so that if a given IP has multiple peers behind it exceeding a threshold, we ignore them:
Solution:

- Add a new configuration option, `collocationFactor`, that limits the amount of peers we allow from each IP. Example: 5.
- Keep a table tracking how many peers are seen behind each IP (sketched at the end of this section): if `ip_1` is shared among `p1..5` peers, then this table will store `ip1->5`.
- When a new peer `p6` is discovered and has `ip1`, don't add it to the peerstore and skip it (since `collocationFactor` = 5). This protects our peerstore from filling up with possible spammers.
- Don't accept more than `collocationFactor` conns from the same IP (unsure if this can be done in nim-libp2p; in go-libp2p there is `ConnectionGater`, which allows defining a handler that is executed before completing the connection).
- Disconnect peers that exceed the `collocationFactor`. This shouldn't happen if they are never added into the peerstore.

Note: this can be enforced in two ways:

- As a hard limit: never allow more than `collocationFactor` peers sharing the same IP.
- As a soft limit: let peers beyond the threshold lower a score until disconnection is triggered (similar to how gossipsub scoring uses its colocation factor).

Inspiration: gossipsub's IP colocation factor scoring.
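A minimal sketch of the proposed table and checks, in Go for illustration (all names here are hypothetical, not existing nwaku or nim-libp2p APIs):

```go
package colocation

import "sync"

// Tracker enforces the proposed collocationFactor: at most N peers per IP.
type Tracker struct {
	mu                sync.Mutex
	collocationFactor int
	peersPerIP        map[string]int // e.g. "ip1" -> 5
}

func NewTracker(factor int) *Tracker {
	return &Tracker{collocationFactor: factor, peersPerIP: map[string]int{}}
}

// AllowDiscovered is the peerstore-side check: a newly discovered peer
// whose IP already hosts collocationFactor peers is skipped, so the
// peerstore never fills up with possible spammers.
func (t *Tracker) AllowDiscovered(ip string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.peersPerIP[ip] >= t.collocationFactor {
		return false
	}
	t.peersPerIP[ip]++
	return true
}

// RemovePeer must be called when a peer is dropped so its IP slot frees up.
func (t *Tracker) RemovePeer(ip string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.peersPerIP[ip] > 0 {
		t.peersPerIP[ip]--
	}
}
```

In go-libp2p, a check like `AllowDiscovered` could run inside a `ConnectionGater` hook before the connection completes; whether nim-libp2p offers an equivalent pre-connection hook is the open question noted above.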
Alternatives considered
Gossipsub scoring takes this into account: the IP colocation parameter is given a weight and is part of a final score along with other parameters. IMHO, limiting the peers we see from each IP should be binary: either OK or ban (disconnect). It should be possible to tweak the weights to simulate this behaviour, so that if the amount of peers behind a given IP is > x, the score automatically drops below the threshold and a disconnection is triggered.
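To make the weight-tweaking concrete, here is a small example modeled on gossipsub v1.1's IP colocation penalty (P6); exact parameter names vary per implementation and are illustrative here:

```go
package scoring

// colocationPenalty is modeled on gossipsub v1.1's IP colocation term
// (P6): peers sharing an IP beyond the threshold incur weight*surplus^2,
// with a negative weight.
func colocationPenalty(peersOnIP, threshold int, weight float64) float64 {
	surplus := peersOnIP - threshold
	if surplus <= 0 {
		return 0
	}
	return weight * float64(surplus*surplus)
}

// Example: with threshold = 5, weight = -100 and a graylist threshold
// of -80, a 6th peer behind the same IP scores -100 and is immediately
// graylisted, so the soft score behaves like a hard per-IP cap.
```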
The main problem with gossipsub scoring is that it only applies to gossipsub, and we need protection for service protocols as well. Gossipsub scoring can be integrated later on, but I would consider adding this layer on top, restricting the peers per IP for all connections.