
availability-recovery: prefer fetching small PoVs from backing group #6832

Closed
Tracked by #26 ...
sandreim opened this issue Mar 6, 2023 · 7 comments · Fixed by #7173

Comments

@sandreim
Contributor

sandreim commented Mar 6, 2023

Currently we are not preferring the fast path, and this starts to become a problem above 300 para validators. A screenshot of the PoV recovery time distribution is below. We need to dive deeper and investigate why recovery is so slow, and switch to preferring the fast path (fetching from backers) until we improve chunk recovery to be faster.

https://github.com/paritytech/polkadot/blob/master/node/service/src/overseer.rs#L227 should be using the fast path.
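For illustration, a minimal sketch of what the overseer wiring change could look like. The constructor names (`with_fast_path`, `with_chunks_only`) and their arguments are assumptions for the sake of the sketch, not a verified diff against overseer.rs:

```rust
// Hypothetical sketch: prefer the fast path (fetch the full PoV from the
// backing group, falling back to chunks) when wiring up the overseer.
// Names and signatures are illustrative assumptions.
let availability_recovery = AvailabilityRecoverySubsystem::with_fast_path(
    available_data_req_receiver,
    Metrics::register(registry)?,
);

// ...instead of the current chunks-only construction:
// AvailabilityRecoverySubsystem::with_chunks_only(
//     available_data_req_receiver,
//     Metrics::register(registry)?,
// );
```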

[Screenshot, 2023-03-06: PoV recovery time distribution]

@burdges
Contributor

burdges commented Mar 6, 2023

At a network level, we kind of expect fetching all data from fewer sources to be somewhat cheaper, but how much does this matter? At least BitTorrent really does not impose much overhead. libp2p is crap compared with libtorrent, but still, how much do we really benefit by using fewer connections?

We have a systematic erasure code, so if we reconstruct entirely from the first f+1 chunks then we avoid the decoding step. We must still re-encode to check the chunks' Merkle tree, though, so this clearly saves something, but I've forgotten how much proportionally.
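To make the arithmetic concrete: with n = 3f + 1 validators, any f + 1 chunks suffice to reconstruct, and in a systematic code the first f + 1 chunks are the original data. A small standalone sketch of that bound (not the actual erasure-coding crate API):

```rust
/// Sketch: number of chunks needed to reconstruct, assuming the usual
/// Byzantine bound f = floor((n - 1) / 3). With a systematic code the
/// first `recovery_threshold(n)` chunks are the original data, so the
/// decoding step is skipped entirely if we fetch exactly those.
fn recovery_threshold(n_validators: usize) -> usize {
    (n_validators - 1) / 3 + 1
}

fn main() {
    // e.g. at 301 para validators, any 101 chunks reconstruct, and
    // fetching chunk indices 0..=100 avoids decoding altogether.
    assert_eq!(recovery_threshold(301), 101);
}
```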

We do not care from whom we fetch these first f+1 systematic chunks either, so we'd still benefit from systematic reconstruction if we fetch from a mix of backing validators and the f+1 validators who also hold those f+1 chunks. We should already have a deterministic pseudo-random permutation in the assignment of validators to chunk indices, so doing this should not concentrate the load upon a specific set of f+1 validators.
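For illustration, a sketch of the kind of deterministic pseudo-random assignment meant here, seeded per candidate so that no fixed f + 1 validators absorb the systematic-chunk load. The seed derivation and RNG choice are illustrative, not the actual protocol:

```rust
use rand::{seq::SliceRandom, SeedableRng};
use rand_chacha::ChaCha8Rng;

/// Sketch: derive a per-candidate permutation of chunk indices, so that
/// validator `v` serves chunk `indices[v]`. Seeding deterministically
/// (e.g. from the relay-parent hash) means every node computes the same
/// assignment, while different candidates spread the systematic chunks
/// across different validators.
fn chunk_indices(seed: [u8; 32], n_validators: usize) -> Vec<u32> {
    let mut rng = ChaCha8Rng::from_seed(seed);
    let mut indices: Vec<u32> = (0..n_validators as u32).collect();
    indices.shuffle(&mut rng);
    indices
}
```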

If we benefit plenty from avoiding decoding but little from using fewer connections, then a complex but efficient scheme works like this:

  • Mark who knows which chunks, like an internal simulated BitTorrent tracker. Initially we mark backers as having all their systematic chunks, and mark each availability voter as holding its own chunk. We could later even mark other availability checkers as "maybe" having systematic chunks, like some probabilistic BitTorrent tracker.
  • We start downloading chunks with a significant preference towards the first f+1 systematic chunks, but we do give up and take non-systematic chunks at some reasonable point.

If we benefit both from avoiding decoding and from using fewer connections, then we might modify this to first ask backers for the systematic chunks for which no availability voter exists. In fact, we'd maybe do this anyway, since it'll help us decide sooner to give up and take non-systematic chunks. A sketch of the bookkeeping follows.
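A minimal sketch of the bookkeeping described above. `ChunkTracker` and its methods are hypothetical names, not existing code:

```rust
use std::collections::{HashMap, HashSet};

type ValidatorIndex = u32;
type ChunkIndex = u32;

/// Hypothetical "internal simulated tracker": which validators are known
/// to hold which chunks.
#[derive(Default)]
struct ChunkTracker {
    holders: HashMap<ChunkIndex, HashSet<ValidatorIndex>>,
}

impl ChunkTracker {
    /// Backers are marked up front as holding all the systematic chunks.
    fn mark_backer(&mut self, v: ValidatorIndex, systematic: &[ChunkIndex]) {
        for &c in systematic {
            self.holders.entry(c).or_default().insert(v);
        }
    }

    /// Each availability voter is marked as holding its own chunk.
    fn mark_voter(&mut self, v: ValidatorIndex, chunk: ChunkIndex) {
        self.holders.entry(chunk).or_default().insert(v);
    }

    /// Pick fetch targets with a strong preference for systematic chunks
    /// (index < threshold); non-systematic chunks come last, as the
    /// "give up and take what we can" fallback.
    fn fetch_order(&self, threshold: ChunkIndex) -> Vec<(ChunkIndex, ValidatorIndex)> {
        let mut targets: Vec<_> = self
            .holders
            .iter()
            .filter_map(|(&c, vs)| vs.iter().next().map(|&v| (c, v)))
            .collect();
        targets.sort_by_key(|&(c, _)| (c >= threshold, c));
        targets
    }
}
```

The probabilistic "maybe" holders mentioned above would add a lower-confidence tier to `holders`, which this sketch omits.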

All this sounds like a rabbit hole, so we need some benchmarks that tell us which parts of this, or say batch verification of signatures, or other optimizations make more sense... or whether this even saves enough to be part of https://github.com/orgs/paritytech/projects/63

@rphmeier
Contributor

rphmeier commented Mar 8, 2023

One Q: is it possible to separate and optimize the availability store in particular, to make it more efficient overall, rather than having it in the same RocksDB instance as other data?

@sandreim
Contributor Author

sandreim commented Mar 10, 2023

FWIW, right now we don't know why we are this slow. We have to dig further, maybe add some more instrumentation, but at first glance it doesn't look like we are under disk I/O pressure in any way, and we don't use much CPU for reconstruction. One gut feeling I have is that it is related to paritytech/polkadot-sdk#702, and I expect that fixing it would improve our metrics too.

I am not optimistic about any separation of A/V data bringing any speed improvement with respect to total availability recovery time.

@sandreim
Contributor Author

sandreim commented Apr 27, 2023

Less controversial, and the best we could do now: fetch small PoVs (< 128 KiB, for example) from backers, and fall back to chunks for bigger ones.

We can estimate the size of the PoV from the size of our own chunk.
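A sketch of that heuristic, assuming a systematic code where the PoV spans the first f + 1 chunks. `SMALL_POV_LIMIT` and the function are hypothetical, with the 128 KiB example value from above:

```rust
/// Hypothetical threshold from the example above: PoVs estimated below
/// this size are fetched whole from the backing group.
const SMALL_POV_LIMIT: usize = 128 * 1024;

/// Sketch: estimate the full PoV size from our own chunk. With a
/// systematic code over n = 3f + 1 validators, the original data spans
/// the first f + 1 chunks, so pov_size ≈ chunk_size * (f + 1).
fn prefer_backing_group(our_chunk_size: usize, n_validators: usize) -> bool {
    let threshold = (n_validators - 1) / 3 + 1;
    our_chunk_size.saturating_mul(threshold) < SMALL_POV_LIMIT
}
```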

@sandreim changed the title from "availability-recovery: prefer fetching PoV from backing group" to "availability-recovery: prefer fetching small PoVs from backing group" on Apr 27, 2023
@burdges
Contributor

burdges commented Apr 27, 2023

I'd conjecture that utilizing more connections would cost us little if our networking code were done properly. This opinion is based upon my belief that libtorrent is quite efficient, but this belief can be re-evaluated by asking people who know about developing BitTorrent clients. If this belief is correct, then there would be little benefit in reconstructing from fewer peers.

If one wants an extreme hack, then one could build a testnet that literally uses the C++ libtorrent code base for chunk distribution, and literally puts a torrent file into the candidate receipt. lol. It might not work, however, since LEDBAT would play second fiddle to all other traffic. I've no idea whether any Rust torrent crates provide libtorrent-like performance; probably not.

@bkchr
Member

bkchr commented Apr 27, 2023

> I'd conjecture that utilizing more connections would cost us little if our networking code were done properly. This opinion is based upon my belief that libtorrent is quite efficient, but this belief can be re-evaluated by asking people who know about developing BitTorrent clients. If this belief is correct, then there would be little benefit in reconstructing from fewer peers.

Do you know what chunk size they are using?

@sandreim
Contributor Author

> I'd conjecture that utilizing more connections would cost us little if our networking code were done properly.

Couldn't agree more, but currently we are very far away from "done properly" :) This should be a medium-term temporary fix to save some time and bandwidth.
