Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1. fix(perf): Run CPU-intensive state updates in parallel rayon threads #4802

Merged
merged 22 commits into from
Jul 22, 2022

Conversation

teor2345
Copy link
Contributor

@teor2345 teor2345 commented Jul 21, 2022

Motivation

We want to improve Zebra's sync speed to avoid test timeouts in CI (#4155).

This PR moves CPU-intensive verification in state updates to the rayon thread pool.

Close #4790

Partially addresses #4583

Depends-On: #4199
Depends-On: #4795

Specifications

https://docs.rs/rayon/latest/rayon/fn.in_place_scope_fifo.html

Solution

  • Run CPU-intensive chain validation and updates in parallel rayon threads
    • Run non-finalized validation and Chain updates in parallel
    • Run non-finalized sprout, sapling, and orchard tree updates and roots in parallel
    • After a non-finalized chain fork, do note commitment and history tree rebuilds in parallel
    • Run finalized sprout, sapling, and orchard tree updates and roots in parallel

Other performance improvements:

  • Wrap HistoryTree in an Arc in the state
  • Skip generating unused interstitial Sprout treestates

Related changes:

  • Remove redundant checks for empty shielded data

Test fixes:

  • Improve anchor validation debugging and error messages
  • Work around a test data bug, and save some CPU during tests

Review

Anyone can review this PR, it's urgent because CI is failing, and it blocks a release.

Reviewer Checklist

  • Code implements Specs and Designs
  • Tests for Expected Behaviour
  • Tests for Errors

Follow Up Work

@teor2345 teor2345 added C-bug Category: This is a bug P-High 🔥 I-slow Problems with performance or responsiveness I-integration-fail Continuous integration fails, including build and test failures A-state Area: State / database changes A-cryptography Area: Cryptography related labels Jul 21, 2022
@teor2345 teor2345 self-assigned this Jul 21, 2022
@teor2345 teor2345 changed the base branch from main to db-write-blocking July 21, 2022 00:06
@teor2345 teor2345 requested a review from conradoplg July 21, 2022 01:37
@codecov
Copy link

codecov bot commented Jul 21, 2022

Codecov Report

Merging #4802 (0baebfc) into main (7b1d452) will decrease coverage by 0.25%.
The diff coverage is 74.34%.

@@            Coverage Diff             @@
##             main    #4802      +/-   ##
==========================================
- Coverage   78.87%   78.61%   -0.26%     
==========================================
  Files         305      306       +1     
  Lines       38169    38362     +193     
==========================================
+ Hits        30104    30159      +55     
- Misses       8065     8203     +138     

@teor2345 teor2345 marked this pull request as ready for review July 21, 2022 03:36
@teor2345 teor2345 requested review from a team as code owners July 21, 2022 03:36
conradoplg
conradoplg previously approved these changes Jul 21, 2022
Copy link
Collaborator

@conradoplg conradoplg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I was a bit confused with a refactoring and I added some documentation suggestions to hopefully make it clearer

zebra-state/src/service/check/anchors.rs Outdated Show resolved Hide resolved
zebra-state/src/service/check/anchors.rs Outdated Show resolved Hide resolved
@teor2345 teor2345 requested a review from a team as a code owner July 21, 2022 20:58
@teor2345
Copy link
Contributor Author

Sorry, I forgot to push the commits for:
- Run non-finalized sprout, sapling, and orchard tree updates and roots in parallel
- After a non-finalized chain fork, do note commitment and history tree rebuilds in parallel
- Some minor cleanups

@teor2345 teor2345 changed the title 3. fix(perf): Run CPU-intensive state updates in blocking threads 3. fix(perf): Run CPU-intensive state updates in parallel rayon threads Jul 21, 2022
Base automatically changed from db-write-blocking to main July 21, 2022 23:16
@teor2345
Copy link
Contributor Author

I fixed some doc issues and added your suggestions, this is ready for another review.

@teor2345 teor2345 changed the title 3. fix(perf): Run CPU-intensive state updates in parallel rayon threads fix(perf): Run CPU-intensive state updates in parallel rayon threads Jul 21, 2022
@teor2345 teor2345 changed the title fix(perf): Run CPU-intensive state updates in parallel rayon threads 1. fix(perf): Run CPU-intensive state updates in parallel rayon threads Jul 22, 2022
@gustavovalverde gustavovalverde merged commit 6ad445e into main Jul 22, 2022
@gustavovalverde gustavovalverde deleted the db-cpu-rayon branch July 22, 2022 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cryptography Area: Cryptography related A-state Area: State / database changes C-bug Category: This is a bug I-integration-fail Continuous integration fails, including build and test failures I-slow Problems with performance or responsiveness
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move note commitment tree updates to a dedicated blocking and CPU-heavy thread
3 participants