feat: Optimize block download tasks with a simple task scheduler #1999

Merged
merged 5 commits into from
Apr 28, 2020

Collaborator

@driftluo driftluo commented Apr 3, 2020

This PR optimizes how block download tasks are scheduled.

It adds a simple task counter that allocates a number of tasks to each node, and that records and filters for the relatively good nodes to download from.
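
A minimal sketch of what such a per-node task counter could look like. This is not the code in this PR: `Scheduler`, `PeerTasks`, and the grow-by-one / halve-on-timeout policy are hypothetical and chosen only for illustration; the values 16 and 128 are taken from the change list in this description.

```rust
use std::collections::HashMap;

// Hypothetical names and policy; only the numbers 16 and 128 come from the PR text.
const INIT_TASK_LIMIT: usize = 16;
const MAX_TASK_LIMIT: usize = 128;

type PeerId = u32;

/// Per-peer download budget: how many block requests may be inflight at once.
#[derive(Debug)]
struct PeerTasks {
    limit: usize,
    inflight: usize,
}

impl Default for PeerTasks {
    fn default() -> Self {
        PeerTasks { limit: INIT_TASK_LIMIT, inflight: 0 }
    }
}

#[derive(Debug, Default)]
struct Scheduler {
    peers: HashMap<PeerId, PeerTasks>,
}

impl Scheduler {
    /// Number of additional requests this peer may take right now.
    fn available(&mut self, peer: PeerId) -> usize {
        let t = self.peers.entry(peer).or_default();
        t.limit.saturating_sub(t.inflight)
    }

    /// Try to hand `wanted` tasks to `peer`; returns how many were granted.
    fn assign(&mut self, peer: PeerId, wanted: usize) -> usize {
        let granted = wanted.min(self.available(peer));
        self.peers.entry(peer).or_default().inflight += granted;
        granted
    }

    /// A block arrived in time: free one slot and raise the limit by one (assumed policy).
    fn on_received(&mut self, peer: PeerId) {
        if let Some(t) = self.peers.get_mut(&peer) {
            t.inflight = t.inflight.saturating_sub(1);
            t.limit = (t.limit + 1).min(MAX_TASK_LIMIT);
        }
    }

    /// A request timed out: free one slot and halve the limit (assumed policy).
    fn on_timeout(&mut self, peer: PeerId) {
        if let Some(t) = self.peers.get_mut(&peer) {
            t.inflight = t.inflight.saturating_sub(1);
            t.limit = (t.limit / 2).max(1);
        }
    }

    /// Peers ordered from the largest task limit to the smallest (ties by id).
    fn best_peers(&self) -> Vec<PeerId> {
        let mut ids: Vec<PeerId> = self.peers.keys().copied().collect();
        ids.sort_by(|a, b| self.peers[b].limit.cmp(&self.peers[a].limit).then(a.cmp(b)));
        ids
    }
}
```

In this sketch a node that keeps delivering on time earns a larger budget, while a node that times out loses budget quickly, so `best_peers` naturally surfaces the relatively good nodes.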

After about a week of testing and continuous adjustment, the results of the current PR are reasonably satisfactory, though further adjustments in the future are not ruled out.

This PR changes a number of things, including but not limited to:

  1. Raise the maximum inflight block limit per node to 32-128; the default is 16, and the value is adjusted dynamically
  2. Remove the redundant design in which the same block can be requested from two nodes
  3. When inserting an orphan block, start a 1-second countdown on the tip + 1 block corresponding to trace_number; if it is still not completed, clear the task and send it to another node for download (exponentially decreasing the task limit of the corresponding node)
  4. Split the getBlockTransaction task from the getBlocks task, keeping the design in which getBlockTransaction can request from 2 nodes
  5. Clear out nodes whose peer_best_known < tip during IBD
  6. Separate the block fetch process
  7. Clear nodes that do not respond to getblock requests for 30 seconds
  8. Mark a timeout on all < tip + 1 block requests if the request window > tip + 512
  9. Reduce the cost of checking the maximum timeout, from checking all inflight requests to checking only those less than tip + 20
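
Items 7 and 9 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `InflightBlocks`, `prune`, the millisecond timestamps, and the `block_timeout_ms` parameter are hypothetical; only the 30-second silence threshold and the tip + 20 check window come from the list.

```rust
use std::collections::{BTreeMap, HashMap};

type PeerId = u32;
type BlockNumber = u64;

// From the list above: check only requests up to tip + 20; drop peers silent for 30 s.
const CHECK_WINDOW: BlockNumber = 20;
const PEER_SILENCE_MS: u64 = 30_000;

/// One outstanding block request (hypothetical layout).
#[derive(Debug)]
struct Inflight {
    peer: PeerId,
    sent_at_ms: u64,
}

#[derive(Debug, Default)]
struct InflightBlocks {
    // Ordered by block number so the window near the tip is a cheap range scan.
    by_number: BTreeMap<BlockNumber, Inflight>,
    last_response_ms: HashMap<PeerId, u64>,
}

impl InflightBlocks {
    fn request(&mut self, number: BlockNumber, peer: PeerId, now_ms: u64) {
        self.by_number.insert(number, Inflight { peer, sent_at_ms: now_ms });
        self.last_response_ms.entry(peer).or_insert(now_ms);
    }

    fn received(&mut self, number: BlockNumber, peer: PeerId, now_ms: u64) {
        self.by_number.remove(&number);
        self.last_response_ms.insert(peer, now_ms);
    }

    /// Returns (expired block numbers near the tip, peers dropped for silence).
    /// `block_timeout_ms` is a hypothetical parameter; the PR text gives no value.
    fn prune(
        &mut self,
        tip: BlockNumber,
        now_ms: u64,
        block_timeout_ms: u64,
    ) -> (Vec<BlockNumber>, Vec<PeerId>) {
        // Item 7: drop peers that have not responded for 30 seconds.
        let mut silent: Vec<PeerId> = self
            .last_response_ms
            .iter()
            .filter(|(_, last)| now_ms.saturating_sub(**last) >= PEER_SILENCE_MS)
            .map(|(peer, _)| *peer)
            .collect();
        silent.sort_unstable();
        for peer in &silent {
            self.last_response_ms.remove(peer);
        }
        self.by_number.retain(|_, task| !silent.contains(&task.peer));

        // Item 9: scan only requests at or below tip + 20 instead of every inflight one.
        let expired: Vec<BlockNumber> = self
            .by_number
            .range(..=tip.saturating_add(CHECK_WINDOW))
            .filter(|(_, task)| now_ms.saturating_sub(task.sent_at_ms) >= block_timeout_ms)
            .map(|(number, _)| *number)
            .collect();
        for number in &expired {
            self.by_number.remove(number);
        }
        (expired, silent)
    }
}
```

Keeping the requests in an ordered map is one way to make the tip + 20 scan cheap: far-ahead requests are never visited during the timeout check, which is the saving item 9 describes.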

Test machine configuration:
2 cores, 8 GB RAM
IP location: Hong Kong

$ cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c
2  Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz

before:

| net state | outbound | average speed | CPU occupancy | Bandwidth consumption |
| --- | --- | --- | --- | --- |
| relatively good | 8 peer | 102-120 block/s | average 70.49%, max 92.77% | average 4 Mbps, max 7.12 Mbps |
| relatively poor | 8 peer | 93-100 block/s | average 49.99%, max 74.37% | average 4 Mbps, max 7.12 Mbps |

after:

| net state | outbound | average speed | CPU occupancy | Bandwidth consumption |
| --- | --- | --- | --- | --- |
| relatively good | 8 peer | 219-240 block/s | average 78.34%, max 93.17% | average 2 Mbps, max 5.52 Mbps |
| relatively poor | 8 peer | 200-220 block/s | average 70.06%, max 85.83% | average 3 Mbps, max 19.55 Mbps |
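
As a rough reading of the two tables, the speedup can be estimated from the reported block/s ranges. Taking the midpoint of each range is an assumption made here for illustration, and `midpoint` and `speedup` are hypothetical helper names; under that assumption the download speed roughly doubles in both network states.

```rust
/// Midpoint of a reported "low-high" block/s range (an assumption, not a measured mean).
fn midpoint(range: (f64, f64)) -> f64 {
    (range.0 + range.1) / 2.0
}

/// Ratio of the "after" midpoint to the "before" midpoint.
fn speedup(before: (f64, f64), after: (f64, f64)) -> f64 {
    midpoint(after) / midpoint(before)
}

/// Speedups for the (relatively good, relatively poor) network states in the tables.
fn reported_speedups() -> (f64, f64) {
    (
        speedup((102.0, 120.0), (219.0, 240.0)), // 229.5 / 111.0
        speedup((93.0, 100.0), (200.0, 220.0)),  // 210.0 / 96.5
    )
}
```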

@driftluo driftluo force-pushed the download-scheduler branch 6 times, most recently from 990661d to 372c173 Compare April 13, 2020 15:52
@driftluo driftluo force-pushed the download-scheduler branch 7 times, most recently from 1d12b72 to 04f27aa Compare April 16, 2020 04:11
@driftluo driftluo requested review from doitian and quake April 16, 2020 04:11
@driftluo driftluo force-pushed the download-scheduler branch 3 times, most recently from 21dd64b to 942eef1 Compare April 16, 2020 12:39
doitian previously approved these changes Apr 20, 2020
@driftluo
Collaborator Author

benchmark

@yangby-cryptape
Collaborator

Benchmark Result

  • TPS: 281.97
  • Samples Count: 51
  • CKB Version: 0829119
  • Instance Type: c5.xlarge
  • Instances Count: 3
  • Bench Type: 2in2out
  • CKB Logger Filter: info,ckb=debug

@driftluo
Collaborator Author

benchmark

@yangby-cryptape
Collaborator

Benchmark Result

  • TPS: 288.82
  • Samples Count: 51
  • CKB Version: 8dee03d
  • Instance Type: c5.xlarge
  • Instances Count: 3
  • Bench Type: 2in2out
  • CKB Logger Filter: info,ckb=debug

@driftluo
Collaborator Author

benchmark

@yangby-cryptape
Collaborator

Benchmark Result

  • TPS: 363.90
  • Samples Count: 50
  • CKB Version: faf983a
  • Instance Type: c5.xlarge
  • Instances Count: 3
  • Bench Type: 2in2out
  • CKB Logger Filter: info,ckb=debug

quake previously approved these changes Apr 21, 2020
@driftluo
Collaborator Author

bors r=quake, doitian

@bors
Contributor

bors bot commented Apr 28, 2020

Build succeeded:

  • continuous-integration/travis-ci/push

@bors bors bot merged commit 573883f into nervosnetwork:develop Apr 28, 2020
@driftluo driftluo deleted the download-scheduler branch April 28, 2020 07:01

6 participants