approval-voting: implement parallel processing of signature checks #731
We do have a batch verification for VRFs in https://github.com/w3f/schnorrkel/blob/master/src/vrf.rs#L536 which likely saves 40%. It works across multiple signers but slightly increases gossip overhead by 32 bytes per message; I have an unimplemented variant that avoids even this 32-byte overhead. We could merge all the tranche zero VRFs by the same signer too. We've two options:

I doubt being secretive about refused assignments matters much. I doubt either 1 or 2 helps RelayVrfDelay much, but we should tune parameters so that RelayVrfModulo represents maybe 90% or 85% of assignments. Batch verification helps RelayVrfDelay just fine. All told, we should save over 80% by doing 2, double-checking parameters, and maybe doing batch verifications.
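As a rough illustration of how that batching could be wired in, here is a minimal Rust sketch. The `dleq_verify_batch` call shape is approximated from the linked vrf.rs, and the final `bool` flag is an assumption, so treat this as a sketch rather than the exact API:

```rust
use schnorrkel::{PublicKey, vrf::{VRFInOut, VRFProofBatchable}};

/// Verify a whole batch of assignment VRFs in one combined check.
/// The `dleq_verify_batch` signature below is approximated from the
/// linked vrf.rs; the trailing flag is an assumption.
fn verify_assignments_batched(
    inouts: &[VRFInOut],
    proofs: &[VRFProofBatchable],
    signers: &[PublicKey],
) -> bool {
    // Amortizing verification across all proofs is where the quoted
    // ~40% saving comes from; note it works across multiple signers.
    schnorrkel::vrf::dleq_verify_batch(inouts, proofs, signers, true).is_ok()
}
```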
This issue (as well as #732) is focused on improving performance from an engineering point of view, like solving the bottleneck of having a single-threaded approach for processing distribution and import of assignments and votes.
IIUC, in our case we have a single signer, which would mean we could batch its own RelayVrfDelay assignments for the same tranche (different candidates). Is my understanding correct?
2 sounds very good to me, but I am not a cryptography guy. Can you detail a bit the pros and cons of having RelayVrfModulo represent 85% of assignments in tranche 0? I will create a ticket for further discussion of these improvements.
Answered in the other thread.
FWIW we could go even further by sharding the state and input by
Yeah, I think something along those lines is possible. I don't remember all the details, but I think candidates have to be specifically approved under each fork, right? If so, we can shard by
Since the assignments can also claim more candidates, I expect more latency when handling
We approve candidates under each fork because assignment VRFs are seeded by relay chain VRFs. We could move assignments and votes across forks when relay chain block producers equivocate though, which may be useful.

You might've bigger fish to fry after you merge the tranche 0 assignments, but conversely all those delay assignments add up quickly whenever many no-shows happen.

At a high level, we process gossip messages containing assignments and votes, which result in database writes and deduplication checks, and then our approvals loop reads this database. We should not, afaik, spend too much time in the approvals loop itself, so assignment VRF signatures could be checked by workers who then push valid assignments into a queue for insertion into the database. At the extreme this could be made non-blocking, no?
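A minimal sketch of that worker/queue shape, using crossbeam channels (an assumed dependency; all types and names here are illustrative stand-ins, not the subsystem's real ones):

```rust
use crossbeam_channel::{bounded, Sender};
use std::thread;

// Illustrative stand-ins for the subsystem's real types.
struct Assignment { /* gossip payload: claimed candidate(s), VRF proof, ... */ }
struct CheckedAssignment { /* assignment whose VRF signature was verified */ }

// Stand-in for the actual VRF signature check (the CPU-heavy part).
fn check_assignment_vrf(_a: Assignment) -> Option<CheckedAssignment> {
    todo!("verify the assignment VRF; return None if invalid")
}

// Spawn VRF-check workers plus a single database writer; returns the
// sender the gossip side would feed.
fn spawn_pipeline(num_workers: usize) -> Sender<Assignment> {
    // Bounded queues so a gossip flood applies backpressure upstream
    // instead of growing memory without limit.
    let (gossip_tx, gossip_rx) = bounded::<Assignment>(1024);
    let (checked_tx, checked_rx) = bounded::<CheckedAssignment>(1024);

    // Workers pull gossip messages, do the expensive VRF check off the
    // main loop, and forward valid assignments to the writer.
    for _ in 0..num_workers {
        let rx = gossip_rx.clone();
        let tx = checked_tx.clone();
        thread::spawn(move || {
            for assignment in rx.iter() {
                if let Some(checked) = check_assignment_vrf(assignment) {
                    // Blocks when the writer queue is full: backpressure.
                    let _ = tx.send(checked);
                }
            }
        });
    }

    // One writer serializes database inserts; the approvals loop then
    // only ever reads already-verified assignments from the database.
    thread::spawn(move || {
        for checked in checked_rx.iter() {
            // insert `checked` into the approval-voting database here
            drop(checked);
        }
    });

    gossip_tx
}
```

With this shape the main loop never blocks on a VRF check: it only pushes into `gossip_tx` and reads the database, matching the "workers push valid assignments into a queue" idea above.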
I've been experimenting with using a thread pool to handle VRF signature checks, which appear to be the most expensive operation we are doing in approval voting. After running some benchmarks I got these results on an AMD EPYC 7601 32-Core Processor:

I expect this change to work very well with #732 because it will allow us to multiplex all the CPU-intensive work of the subsystem across multiple CPU cores, improving on our current single-threaded design.
Important note: the number of blocking threads used needs to be bounded, and we would also need an upper limit at which we start applying backpressure.
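A hedged sketch of both bounds, assuming a tokio-style runtime (an assumption about the surrounding code): a semaphore caps how many blocking checks run at once, and awaiting a permit is itself the backpressure point.

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

// Cap on concurrently running blocking VRF checks (value illustrative).
const MAX_BLOCKING_CHECKS: usize = 16;

// Stand-in for the real CPU-bound check.
fn check_vrf_blocking(raw: Vec<u8>) -> bool {
    !raw.is_empty() // placeholder: verify the assignment VRF here
}

async fn handle_assignment(sem: Arc<Semaphore>, raw: Vec<u8>) -> bool {
    // Awaiting the permit bounds how many blocking threads we occupy
    // and stalls intake when saturated: the backpressure point.
    let permit = sem.acquire_owned().await.expect("semaphore never closed");
    tokio::task::spawn_blocking(move || {
        let _permit = permit; // held for the duration of the check
        check_vrf_blocking(raw)
    })
    .await
    .expect("blocking task panicked")
}
```

The subsystem would create the semaphore once, e.g. `Arc::new(Semaphore::new(MAX_BLOCKING_CHECKS))`, and clone the `Arc` for each incoming assignment.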