availability-recovery: prefer fetching small PoVs from backing group #6832
Comments
At a network level, we kind of expect fetching all data from fewer sources to be somehow cheaper, but how much does this matter? At least BitTorrent really does not impose much overhead. libp2p is crap compared with libtorrent, but still, how much do we really benefit from using fewer connections? We have a systematic erasure code, so if we reconstruct entirely from the first f+1 chunks then we avoid the decoding step. We must still re-encode to check the chunks' Merkle tree though, so this clearly saves something, but I've forgotten how much proportionally. We do not care from whom we fetch these first f+1 systematic chunks either, so we would still benefit from systematic reconstruction if we fetch from a mix of backing validators and the f+1 validators who also hold those f+1 chunks. We should already have a deterministic pseudo-random permutation in the assignment of validators to chunk indices, so doing this should not concentrate the load upon specific f+1 validators. If we benefit plenty from avoiding decoding but little from using fewer connections, then a complex but efficient scheme works like:
If we benefit from both avoiding decoding and using fewer connections, then we might modify this to first ask backers for the systematic chunks for which no availability voter exists. In fact, we would maybe do this anyway, since it helps us decide to give up and take non-systematic chunks sooner. All this sounds like a rabbit hole, so we need some benchmarks that tell us which parts of this, or say batch verification of signatures, or other optimizations, make more sense, or whether any of this even saves enough to be part of https://github.com/orgs/paritytech/projects/63
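A minimal sketch of the systematic-fetch idea, assuming the validator-to-chunk-index permutation is already available as a plain map and that the recovery threshold is f + 1 with f = (n - 1) / 3; the helper names are hypothetical, not the subsystem's actual API.

```rust
use std::collections::HashMap;

/// Recovery threshold for n validators: f + 1, with f = (n - 1) / 3.
fn recovery_threshold(n_validators: usize) -> usize {
    (n_validators - 1) / 3 + 1
}

/// Hypothetical helper: given a deterministic pseudo-random assignment of
/// validators to chunk indices, find the validators that hold the systematic
/// chunks 0..f+1. Fetching exactly those chunks lets reconstruction skip the
/// decoding step (only the re-encoding check against the Merkle root remains).
fn systematic_chunk_holders(
    chunk_index_of_validator: &HashMap<u32, u32>, // validator index -> chunk index
    n_validators: usize,
) -> Vec<u32> {
    let threshold = recovery_threshold(n_validators) as u32;
    chunk_index_of_validator
        .iter()
        .filter(|&(_, &chunk_index)| chunk_index < threshold)
        .map(|(&validator_index, _)| validator_index)
        .collect()
}
```

If that permutation varies per candidate (or per relay parent), each validator holds a systematic chunk for roughly the same fraction of candidates, so fetching only systematic chunks should not concentrate load on a fixed set of f+1 validators.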
One question: is it possible to separate and optimize the availability store in particular, to make it more efficient overall, rather than keeping it in the same RocksDB instance as other data?
FWIW, right now we don't know why we are this slow. We have to dig further, maybe add some more instrumentation, but at first glance it doesn't look like we are disk-I/O-pressured in any way, and we don't use much CPU for reconstruction. One gut feeling I have is that it is related to paritytech/polkadot-sdk#702, and I expect that fixing it would improve our metrics too. I am not optimistic about any separation of A/V data bringing any speed improvement with respect to total availability recovery time.
Less controversial, and the best we could do right now, is to fetch small PoVs (< 128 KiB, for example) from backers and fall back to chunks for bigger ones. We can estimate the size of the PoV from the size of our own chunk.
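A minimal sketch of that heuristic, assuming the 128 KiB cutoff mentioned above and an f + 1 recovery threshold; all names here are hypothetical, not the actual subsystem API.

```rust
/// Cutoff from the comment above; the exact value is a tuning knob.
const SMALL_POV_LIMIT: usize = 128 * 1024;

/// f + 1 chunks suffice to reconstruct, with f = (n - 1) / 3.
fn recovery_threshold(n_validators: usize) -> usize {
    (n_validators - 1) / 3 + 1
}

/// Rough PoV size estimate from the chunk we already hold: with a systematic
/// code the original data spans the first f + 1 chunks, so total size is
/// roughly our chunk size times the recovery threshold.
fn estimated_pov_size(own_chunk_size: usize, n_validators: usize) -> usize {
    own_chunk_size * recovery_threshold(n_validators)
}

/// Hypothetical strategy choice: full-PoV fetch from a backer for small PoVs,
/// chunk recovery for everything else.
enum RecoveryStrategy {
    FullFromBackers,
    Chunks,
}

fn pick_strategy(own_chunk_size: usize, n_validators: usize) -> RecoveryStrategy {
    if estimated_pov_size(own_chunk_size, n_validators) <= SMALL_POV_LIMIT {
        RecoveryStrategy::FullFromBackers
    } else {
        RecoveryStrategy::Chunks
    }
}
```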
I'd conjecture that utilizing more connections would cost us little if our networking code were done properly. This opinion is based upon my belief that libtorrent is quite efficient, but this belief can be re-evaluated by asking people who know about developing BitTorrent clients. If this belief is correct, then there would be little benefit in reconstructing from fewer peers. If one wants an extreme hack, then one could build a testnet that literally uses the C++ libtorrent code base for chunk distribution, and literally puts a torrent file into the candidate receipt. lol. It might not work, however, since LEDBAT would play second fiddle to all other traffic. I've no idea whether any Rust torrent crates provide libtorrent-like performance, probably not.
Do you know what chunk size they are using? |
Couldn't agree more, but currently we are very far away from "done properly" :) This should be a medium-term, temporary fix to save some time and bandwidth.
Currently we are not preferring the fast path, and this starts to be a problem above 300 para-validators. PoV recovery time distribution screenshot below. We need to dive deeper and investigate why it is so slow, and switch to preferring the fast path (fetching from backers) until we make chunk recovery faster.
https://github.com/paritytech/polkadot/blob/master/node/service/src/overseer.rs#L227 should be using the fast path.
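For concreteness, a simplified, self-contained sketch of what preferring the fast path means as control flow: try the backers for the full PoV first, verify whatever we get, and only fall back to chunk recovery if that fails. The closures stand in for the real networking and erasure-root checks, which in the actual subsystem are asynchronous and have timeouts.

```rust
/// Hypothetical, simplified fast-path recovery: ask each backer for the full
/// PoV and verify it against the expected erasure root before accepting it;
/// if every backer fails (offline, slow, or serving bad data), fall back to
/// reconstructing from chunks. This only shows the control flow.
fn recover_with_fast_path<B, F, V, C>(
    backers: &[B],
    fetch_full_pov: F,
    verify_against_erasure_root: V,
    recover_from_chunks: C,
) -> Option<Vec<u8>>
where
    F: Fn(&B) -> Option<Vec<u8>>,
    V: Fn(&[u8]) -> bool,
    C: Fn() -> Option<Vec<u8>>,
{
    for backer in backers {
        if let Some(pov) = fetch_full_pov(backer) {
            if verify_against_erasure_root(&pov) {
                return Some(pov);
            }
        }
    }
    // Slow path: fetch f + 1 chunks and reconstruct.
    recover_from_chunks()
}
```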