
availability-recovery: prefer fetching small PoVs from backing group #6832

Closed
Tracked by #26 ...
sandreim opened this issue Mar 6, 2023 · 7 comments · Fixed by #7173

Comments

@sandreim
Contributor

sandreim commented Mar 6, 2023

Currently we are not preferring the fast path, and this starts to become a problem above 300 para validators. A screenshot of the PoV recovery time distribution is below. We need to dive deeper and investigate why recovery is so slow, and switch to preferring the fast path (fetching from backers) until we improve chunk recovery to be faster.

https://github.com/paritytech/polkadot/blob/master/node/service/src/overseer.rs#L227 should be using the fast path.
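For illustration, a minimal sketch of what the overseer wiring change could look like. The constructor names (`with_fast_path`, `with_chunks_only`) and their arguments are assumptions for the sake of the sketch, not a verified diff against overseer.rs:

```rust
// Hypothetical sketch: prefer the fast path (fetch the full PoV from the
// backing group, falling back to chunks) when wiring up the overseer.
// Names and signatures are illustrative assumptions.
let availability_recovery = AvailabilityRecoverySubsystem::with_fast_path(
    available_data_req_receiver,
    Metrics::register(registry)?,
);

// ...instead of the current chunks-only construction:
// AvailabilityRecoverySubsystem::with_chunks_only(
//     available_data_req_receiver,
//     Metrics::register(registry)?,
// );
```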

[Screenshot, 2023-03-06: PoV recovery time distribution]

@burdges
Contributor

burdges commented Mar 6, 2023

At a network level, we kind of expect fetching all data from fewer sources to be somewhat cheaper, but how much does this matter? At least BitTorrent really does not impose much overhead. libp2p is crap compared with libtorrent, but still, how much do we really benefit by using fewer connections?

We have a systematic erasure code, so if we reconstruct entirely from the first f+1 chunks then we avoid the decoding step. We must still re-encode to check the chunks' Merkle tree, though, so this clearly saves something, but I've forgotten how much proportionally.
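To make the arithmetic concrete: with n = 3f + 1 validators, any f + 1 chunks suffice to reconstruct, and in a systematic code the first f + 1 chunks are the original data. A small standalone sketch of that bound (not the actual erasure-coding crate API):

```rust
/// Sketch: number of chunks needed to reconstruct, assuming the usual
/// Byzantine bound f = floor((n - 1) / 3). With a systematic code the
/// first `recovery_threshold(n)` chunks are the original data, so the
/// decoding step is skipped entirely if we fetch exactly those.
fn recovery_threshold(n_validators: usize) -> usize {
    (n_validators - 1) / 3 + 1
}

fn main() {
    // e.g. at 301 para validators, any 101 chunks reconstruct, and
    // fetching chunk indices 0..=100 avoids decoding altogether.
    assert_eq!(recovery_threshold(301), 101);
}
```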

We do not care from whom we fetch these first f+1 systematic chunks either, so we'd still benefit from systematic reconstruction if we fetch from a mix of backing validators and the f+1 validators who also hold those f+1 chunks. We should already have a deterministic pseudo-random permutation in the assignment of validators to chunk indices, so doing this should not concentrate the load upon a specific set of f+1 validators.
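For illustration, a sketch of the kind of deterministic pseudo-random assignment meant here, seeded per candidate so that no fixed f + 1 validators absorb the systematic-chunk load. The seed derivation and RNG choice are illustrative, not the actual protocol:

```rust
use rand::{seq::SliceRandom, SeedableRng};
use rand_chacha::ChaCha8Rng;

/// Sketch: derive a per-candidate permutation of chunk indices, so that
/// validator `v` serves chunk `indices[v]`. Seeding deterministically
/// (e.g. from the relay-parent hash) means every node computes the same
/// assignment, while different candidates spread the systematic chunks
/// across different validators.
fn chunk_indices(seed: [u8; 32], n_validators: usize) -> Vec<u32> {
    let mut rng = ChaCha8Rng::from_seed(seed);
    let mut indices: Vec<u32> = (0..n_validators as u32).collect();
    indices.shuffle(&mut rng);
    indices
}
```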

If we benefit plenty from avoiding decoding but little from using fewer connections, then a complex but efficient scheme works like this:

  • Mark who knows which chunks, like an internal simulated BitTorrent tracker. Initially we mark backers as having all their systematic chunks, and mark each availability voter as holding its own chunk. We could later even mark other availability checkers as "maybe" having systematic chunks, like some probabilistic BitTorrent tracker.
  • We start downloading chunks with a significant preference towards the first f+1 systematic chunks, but we do give up and take non-systematic chunks at some reasonable point.

If we benefit both from avoiding decoding and from using fewer connections, then we might modify this to first ask backers for the systematic chunks for which no availability voter exists. In fact, we'd maybe do this anyway, since it'll help us decide sooner to give up and take non-systematic chunks. A sketch of the bookkeeping follows.
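A minimal sketch of the bookkeeping described above. `ChunkTracker` and its methods are hypothetical names, not existing code:

```rust
use std::collections::{HashMap, HashSet};

type ValidatorIndex = u32;
type ChunkIndex = u32;

/// Hypothetical "internal simulated tracker": which validators are known
/// to hold which chunks.
#[derive(Default)]
struct ChunkTracker {
    holders: HashMap<ChunkIndex, HashSet<ValidatorIndex>>,
}

impl ChunkTracker {
    /// Backers are marked up front as holding all the systematic chunks.
    fn mark_backer(&mut self, v: ValidatorIndex, systematic: &[ChunkIndex]) {
        for &c in systematic {
            self.holders.entry(c).or_default().insert(v);
        }
    }

    /// Each availability voter is marked as holding its own chunk.
    fn mark_voter(&mut self, v: ValidatorIndex, chunk: ChunkIndex) {
        self.holders.entry(chunk).or_default().insert(v);
    }

    /// Pick fetch targets with a strong preference for systematic chunks
    /// (index < threshold); non-systematic chunks come last, as the
    /// "give up and take what we can" fallback.
    fn fetch_order(&self, threshold: ChunkIndex) -> Vec<(ChunkIndex, ValidatorIndex)> {
        let mut targets: Vec<_> = self
            .holders
            .iter()
            .filter_map(|(&c, vs)| vs.iter().next().map(|&v| (c, v)))
            .collect();
        targets.sort_by_key(|&(c, _)| (c >= threshold, c));
        targets
    }
}
```

The probabilistic "maybe" holders mentioned above would add a lower-confidence tier to `holders`, which this sketch omits.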

All this sounds like a rabbit hole, so we need some benchmarks that tell us which parts of this, or say batch verification of signatures, or other optimizations make more sense... or whether this even saves enough to be part of https://github.com/orgs/paritytech/projects/63

@rphmeier
Contributor

rphmeier commented Mar 8, 2023

One Q: is it possible to separate and optimize the availability store in particular, to make it more efficient overall, rather than having it in the same RocksDB instance as other data?

@sandreim
Contributor Author

sandreim commented Mar 10, 2023

FWIW, right now we don't know why we are this slow. We have to dig further, maybe add some more instrumentation, but at first glance it doesn't look like we are under disk I/O pressure in any way, and we don't use much CPU for reconstruction. One gut feeling I have is that it is related to paritytech/polkadot-sdk#702, and I expect that fixing it would improve our metrics too.

I am not optimistic about any separation of A/V data bringing any speed improvement with respect to total availability recovery time.

@sandreim
Contributor Author

sandreim commented Apr 27, 2023

Less controversial, and the best we could do now: fetch small PoVs (< 128 KiB, for example) from backers, and fall back to chunks for bigger ones.

We can estimate the size of the PoV from the size of our own chunk.
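A sketch of that heuristic, assuming a systematic code where the PoV spans the first f + 1 chunks. `SMALL_POV_LIMIT` and the function are hypothetical, with the 128 KiB example value from above:

```rust
/// Hypothetical threshold from the example above: PoVs estimated below
/// this size are fetched whole from the backing group.
const SMALL_POV_LIMIT: usize = 128 * 1024;

/// Sketch: estimate the full PoV size from our own chunk. With a
/// systematic code over n = 3f + 1 validators, the original data spans
/// the first f + 1 chunks, so pov_size ≈ chunk_size * (f + 1).
fn prefer_backing_group(our_chunk_size: usize, n_validators: usize) -> bool {
    let threshold = (n_validators - 1) / 3 + 1;
    our_chunk_size.saturating_mul(threshold) < SMALL_POV_LIMIT
}
```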

@sandreim changed the title from "availability-recovery: prefer fetching PoV from backing group" to "availability-recovery: prefer fetching small PoVs from backing group" on Apr 27, 2023
@burdges
Contributor

burdges commented Apr 27, 2023

I'd conjecture that utilizing more connections would cost us little if our networking code were done properly. This opinion is based upon my belief that libtorrent is quite efficient, but this belief can be re-evaluated by asking people who know about developing BitTorrent clients. If this belief is correct, then there would be little benefit in reconstructing from fewer peers.

If one wants an extreme hack, then one could build a testnet that literally uses the C++ libtorrent code base for chunk distribution, and literally puts a torrent file into the candidate receipt. lol. It might not work, however, since LEDBAT would play second fiddle to all other traffic. I've no idea whether any Rust torrent crates provide libtorrent-like performance; probably not.

@bkchr
Member

bkchr commented Apr 27, 2023

> I'd conjecture that utilizing more connections would cost us little if our networking code were done properly. This opinion is based upon my belief that libtorrent is quite efficient, but this belief can be re-evaluated by asking people who know about developing BitTorrent clients. If this belief is correct, then there would be little benefit in reconstructing from fewer peers.

Do you know what chunk size they are using?

@sandreim
Contributor Author

> I'd conjecture that utilizing more connections would cost us little if our networking code were done properly.

Couldn't agree more, but currently we are very far away from "done properly" :) This should be a medium-term temporary fix to save some time and bandwidth.
