Burst Denial of Service in Proof of Time #2142
Comments
I'm not sure you understand it quite the same way, so I will clarify. It is not 8 parallel tasks that are being used, it is instruction-level parallelism in modern processors. Modern processors can verify 8 checkpoints (16 effectively independent computations, since we know both inputs and outputs and can therefore run AES encoding and decoding in opposite directions and check whether the middle values match) on a single core ~8 times faster than the prover. We described it in https://forum.subspace.network/t/proof-of-time-optimistic-vs-pessimistic-verification-analysis/1623 (sections "Energy" and "Commodity Hardware"). By the time we have ASICs that are faster, we expect that AVX512 (or AVX10 or whatever they end up calling it) with wider AES instructions (VAES) will be enabled on all modern processors (currently it is "complicated"), so verification time will shrink and efficiency will improve further. I recommend reading the linked forum post in full, it contains a lot of insights.
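The "go both ways and check the middle" idea can be sketched as follows. This is only an illustration under stated assumptions: `step` and `unstep` are a toy invertible round standing in for AES encryption and decryption, and `prove`/`verify` are hypothetical names, not the actual Subspace API.

```rust
/// Toy round key; an assumption for illustration only, not real AES.
const KEY: u64 = 0x9E37_79B9_7F4A_7C15;

/// Toy invertible round standing in for one AES encryption.
fn step(x: u64) -> u64 {
    x.rotate_left(7) ^ KEY
}

/// Exact inverse of `step`, standing in for one AES decryption.
fn unstep(y: u64) -> u64 {
    (y ^ KEY).rotate_right(7)
}

/// Prover: must run every iteration sequentially.
fn prove(input: u64, iterations: u32) -> u64 {
    (0..iterations).fold(input, |x, _| step(x))
}

/// Verifier: knows both ends of the checkpoint, so it walks forward from the
/// input and backward from the output and checks that the chains meet.
fn verify(input: u64, output: u64, iterations: u32) -> bool {
    let forward_steps = iterations / 2;
    let backward_steps = iterations - forward_steps;
    let (mut fwd, mut bwd) = (input, output);
    // The two chains do not depend on each other, so a CPU can overlap them.
    for _ in 0..forward_steps {
        fwd = step(fwd);
        bwd = unstep(bwd);
    }
    // For an odd iteration count the backward chain takes one extra step.
    if backward_steps > forward_steps {
        bwd = unstep(bwd);
    }
    fwd == bwd
}
```

Each chain is half as long as the prover's, and with 8 checkpoints there are 16 such independent chains available to interleave on a single core.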
First of all For those potential proofs that did pass all of the superficial (but cheap) checks of Only if all of those checks succeeded (if any of them do not, the reputation of the peer is decreased by a bit) and we have several candidates, we check whether it is cheaper computation-wise to generate the proof ourselves or to verify the candidates. Either way, those peers that sent us an incorrect proof will be banned on the network level and will not be able to send us proofs going forward.
Yes, the bigger question of identifying "under attack" conditions and becoming more conservative with connections going forward is still applicable, but it is more of a network-related topic. I believe there are more cases like this in the protocol, and it is not just Subspace: you can attack any blockchain node this way, even though the cost here is much larger. @dariolina we should probably prioritize work in this direction.
Additional proposed mitigation
I'm not sure focus on One of the mitigations I can think of for PoT specifically is to always default to verification, but sort peers by reputation and pick a smaller number of those with the highest reputation first. Maybe even check the quorum: if most of those with high reputation have sent us the same set of checkpoints, chances are those are the checkpoints we should verify before checking anything else.
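A minimal sketch of the quorum idea, assuming a hypothetical `Candidate` type with an integer reputation and opaque checkpoint bytes (neither is taken from the Subspace codebase). It looks only at the `top_n` highest-reputation senders and returns the checkpoints that a strict majority of them agree on, if any.

```rust
use std::collections::HashMap;

/// Hypothetical candidate proof received over gossip.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Candidate {
    /// Reputation of the sending peer (higher is better).
    peer_reputation: i32,
    /// Opaque checkpoint bytes, illustrative only.
    checkpoints: Vec<u8>,
}

/// Among the `top_n` highest-reputation candidates, return the checkpoints
/// sent by a strict majority of them, or `None` if there is no such quorum.
fn quorum_checkpoints(mut candidates: Vec<Candidate>, top_n: usize) -> Option<Vec<u8>> {
    // Highest reputation first.
    candidates.sort_by(|a, b| b.peer_reputation.cmp(&a.peer_reputation));
    candidates.truncate(top_n);
    let considered = candidates.len();

    // Count how many of the considered peers sent each set of checkpoints.
    let mut votes: HashMap<Vec<u8>, usize> = HashMap::new();
    for candidate in candidates {
        *votes.entry(candidate.checkpoints).or_insert(0) += 1;
    }

    // At most one entry can hold a strict majority.
    votes
        .into_iter()
        .find(|(_, count)| *count * 2 > considered)
        .map(|(checkpoints, _)| checkpoints)
}
```

The returned checkpoints would still have to be verified; the quorum only decides which candidate to verify first.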
Yes, we understood it like that; however, it is difficult to put into exact words :)
IMHO that is the better way to handle this. This way an attacker first has to play nice for quite some time for just a short attack span.
I agree this sounds promising. Since we already have to store information about the sender, we might as well prioritize verification of PoT from our "usual" Timekeepers.
I think we can even reduce it to one: just pick the sender with the highest reputation, as it should be the most likely to send us a correct proof. We can fall back to more candidates or to proving in case that fails, which will not happen in most cases.
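The "best peer first, then fall back" flow could look roughly like the sketch below. All names (`Candidate`, `Outcome`, `select_proof`) and the `u64` proof type are assumptions made for illustration; the `verify` and `prove` closures stand in for real PoT verification and proving.

```rust
use std::cmp::Reverse;

/// Hypothetical candidate: sender reputation plus an opaque proof value.
struct Candidate {
    reputation: i32,
    proof: u64,
}

/// How the node obtained the proof it ends up using.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// A gossiped proof passed verification.
    Verified(u64),
    /// No tried candidate verified, so the node proved locally.
    ProvedLocally(u64),
}

/// Verify at most `max_verifications` candidates, highest reputation first,
/// and fall back to local proving if none of them is valid.
fn select_proof(
    mut candidates: Vec<Candidate>,
    max_verifications: usize,
    verify: impl Fn(u64) -> bool,
    prove: impl FnOnce() -> u64,
) -> Outcome {
    candidates.sort_by_key(|c| Reverse(c.reputation));
    for candidate in candidates.iter().take(max_verifications) {
        if verify(candidate.proof) {
            return Outcome::Verified(candidate.proof);
        }
    }
    Outcome::ProvedLocally(prove())
}
```

With `max_verifications = 1` this corresponds to trusting only the single best sender; a larger value gives the "fall back to more" variant before proving.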
In the R&D sync we discussed the need for a generally more robust "under attack" mode. The approach suggested here, verifying from the "best" peer first, surely doesn't hurt and prevents some attacks on the lower-effort end of the spectrum. @nazar-pc I will add this to the spec, if you are up for it.
Please do, it is a non-breaking protocol change we can introduce any time for those who upgrade. I'll implement it in the near future.
There is a lack of Substrate API for reputation querying right now, see paritytech/polkadot-sdk#2185. I'll look into exposing it in Substrate once I get some feedback on what it might look like upstream. Un-assigning Dariia now, since the immediate fix for the scope of this issue is implementation-only.
Fix implemented in #2320
[Medium] Burst Denial of Service in Proof of Time
Summary
An attacker can abuse the high cost associated with verifying Proof of Time (PoT).
Issue details
Based on the AES Latency Report, an ASIC is estimated to perform PoT generation 2.4x faster than COTS hardware.
When an ASIC produces a PoT every second, COTS hardware takes 2.4 seconds to generate it independently and 300 ms to verify a PoT message, assuming it can execute 8 parallel tasks, one for each checkpoint.
Attack scenario 1: If up to seven received PoT messages are in the queue for an iteration
let correct_proof = if potentially_matching_proofs.len() < EXPECTED_POT_VERIFICATION_SPEEDUP {
[1] then one PoT message after another is checked, and the first that is successfully verified is taken. An attacker sends a handful of faulty PoT messages from different nodes into the network early (e.g. so that the first messages are the faulty ones).
If e.g. the 6th message is the valid one, the node has spent 5 * 300 ms = 1.5 seconds verifying faulty messages.
As the number of iterations is allowed to be up to 50% above the current iteration number on the chain (`proof.slot_iterations.get() <= current_slot_iterations.get() * 3 / 2`), this becomes 2.25 seconds.
Attack scenario 2: If eight or more PoT messages were received, the node performs the PoT generation itself, which costs 2.4 seconds.
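The arithmetic behind both scenarios can be reproduced with a small sketch. The constants are the estimates quoted in this report (2.4 s proving time, a speedup of 8, and the 3/2 iteration slack); the value 8 for `EXPECTED_POT_VERIFICATION_SPEEDUP` is inferred from the "up to seven" and "eight or more" wording, and the function names are hypothetical.

```rust
/// Estimated time for COTS hardware to generate one PoT, in milliseconds.
const PROVE_MS: u64 = 2_400;
/// Assumed value, inferred from the scenarios described in this report.
const EXPECTED_POT_VERIFICATION_SPEEDUP: u64 = 8;
/// Estimated time to verify one PoT message: 2400 / 8 = 300 ms.
const VERIFY_MS: u64 = PROVE_MS / EXPECTED_POT_VERIFICATION_SPEEDUP;

/// Time spent verifying `faulty` invalid messages before reaching a valid one.
/// With `inflated` set, each message claims 50% more iterations than the chain.
fn wasted_ms(faulty: u64, inflated: bool) -> u64 {
    let base = faulty * VERIFY_MS;
    if inflated { base * 3 / 2 } else { base }
}

/// Cost the node pays for a queue of `queue_len` messages that are all faulty:
/// verify each one while the queue is short, otherwise prove locally.
fn all_faulty_cost_ms(queue_len: u64, inflated: bool) -> u64 {
    if queue_len < EXPECTED_POT_VERIFICATION_SPEEDUP {
        wasted_ms(queue_len, inflated)
    } else {
        PROVE_MS
    }
}
```

Under these assumptions, five faulty messages cost 1500 ms, or 2250 ms with inflated iterations, and seven inflated faulty messages (3150 ms) already cost more than the 2400 ms of proving locally.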
A mitigation mechanism already in place bans nodes that send a faulty PoT. However, an attacker can easily spawn new instances with fresh IP addresses, especially when using IPv6.
Risk
A node, while busy with verification, can miss its opportunity to create a block or perform other actions on the chain.
Mitigation
Changes to the number of iterations can only occur at the beginning of an era, which is currently not checked when allowing the 50% increase in the number of iterations. Enforcing this check would allow `EXPECTED_POT_VERIFICATION_SPEEDUP` to be increased by 1 while, in the worst case, verification would cost the same as calculating the PoT itself.
Additionally, if fewer than `EXPECTED_POT_VERIFICATION_SPEEDUP` messages are present, randomly selecting the next PoT to be verified could also help reduce the impact of such a burst DoS attack.
Lastly, this could operate in two modes: the existing implementation and a trusted mode. In the trusted mode, if no faulty PoT message has been received so far, `EXPECTED_POT_VERIFICATION_SPEEDUP` could be doubled.
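The two-mode proposal might be expressed as in the following sketch. The `Mode` enum and both functions are hypothetical, and the value 8 for the constant is an assumption inferred from the attack scenarios; only the doubling rule comes from the text above.

```rust
/// Assumed value, inferred from the attack scenarios in this report.
const EXPECTED_POT_VERIFICATION_SPEEDUP: usize = 8;

/// Hypothetical operating modes of the PoT gossip verifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    /// Existing behaviour.
    Default,
    /// No faulty PoT message has been received so far.
    Trusted,
}

/// Trusted mode lasts only until the first faulty message is seen.
fn mode_for(faulty_message_seen: bool) -> Mode {
    if faulty_message_seen { Mode::Default } else { Mode::Trusted }
}

/// Verify candidates one by one while the queue is below the mode's limit,
/// otherwise generate the proof locally.
fn should_verify(candidate_count: usize, mode: Mode) -> bool {
    let limit = match mode {
        Mode::Default => EXPECTED_POT_VERIFICATION_SPEEDUP,
        Mode::Trusted => EXPECTED_POT_VERIFICATION_SPEEDUP * 2,
    };
    candidate_count < limit
}
```

The design choice here is that the first faulty message immediately drops the node back to the default limit, so an attacker gains the doubled limit at most once.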
[1] https://github.com/subspace/subspace/blob/d31fe47911151223d4750bf740524265118e4241/crates/sc-proof-of-time/src/source/gossip.rs#L420