Skip to content

Commit

Permalink
[5 / 5] Introduce approval-voting-parallel (paritytech#4849)
Browse files Browse the repository at this point in the history
This is the implementation of the approach described here:
paritytech#1617 (comment)
&
paritytech#1617 (comment)
&
paritytech#1617 (comment).

## Description of changes

The end goal is to have an architecture where we have single
subsystem(`approval-voting-parallel`) and multiple worker types that
would full-fill the work that currently is fulfilled by the
`approval-distribution` and `approval-voting` subsystems. The main loop
of the new subsystem would do just the distribution of work to the
workers.

The new subsystem will have:
- N approval-distribution workers: This would do the work that is
currently being done by the approval-distribution subsystem and in
addition to that will also perform the crypto-checks that an assignment
is valid and that a vote is correctly signed. Work is assigned via the
following formula: `worker_index = msg.validator % WORKER_COUNT`, this
guarantees that all assignments and approvals from the same validator
reach the same worker.
- 1 approval-voting worker: This would receive an already valid message
and do everything the approval-voting currently does, except the
crypto-checking that has been moved already to the approval-distribution
worker.

On the hot path of processing messages **no** synchronisation and
waiting is needed between approval-distribution and approval-voting
workers.

<img width="1431" alt="Screenshot 2024-06-07 at 11 28 08"
src="https://github.com/paritytech/polkadot-sdk/assets/49718502/a196199b-b705-4140-87d4-c6900ba8595e">



## Guidelines for reading

The full implementation is broken in 5 PRs and all of them are
self-contained and improve things incrementally even without the
parallelisation being implemented/enabled, the reason this approach was
taken instead of a big-bang PR, is to make things easier to review and
reduced the risk of breaking this critical subsystems.

After reading the full description of this PR, the changes should be
read in the following order:
1. paritytech#4848, some other
micro-optimizations for networks with a high number of validators. This
change gives us a speed up by itself without any other changes.
2. paritytech#4845 , this contains
only interface changes to decouple the subsystem from the `Context` and
be able to run multiple instances of the subsystem on different threads.
**No functional changes**
3. paritytech#4928, moving of the
crypto checks from approval-voting in approval-distribution, so that the
approval-distribution has no reason to wait after approval-voting
anymore. This change gives us a speed up by itself without any other
changes.
4. paritytech#4846, interface
changes to make approval-voting runnable on a separate thread. **No
functional changes**
5. This PR, where we instantiate an `approval-voting-parallel` subsystem
that runs on different workers the logic currently in
`approval-distribution` and `approval-voting`.
6. The next step after this changes get merged and deploy would be to
bring all the files from approval-distribution, approval-voting,
approval-voting-parallel into a single rust crate, to make it easier to
maintain and understand the structure.

## Results
Running subsystem-benchmarks with 1000 validators 100 fully ocuppied
cores and triggering all assignments and approvals for all tranches

#### Approval does not lags behind. 
 Master
```
Chain selection approved  after 72500 ms hash=0x0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a
```
With this PoC
```
Chain selection approved  after 3500 ms hash=0x0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a
```

#### Gathering enough assignments
 
Enough assignments are gathered in less than 500ms, so that gives un a
guarantee that un-necessary work does not get triggered, on master on
the same benchmark because the subsystems fall behind on work, that
number goes above 32 seconds on master.
 
<img width="2240" alt="Screenshot 2024-06-20 at 15 48 22"
src="https://github.com/paritytech/polkadot-sdk/assets/49718502/d2f2b29c-5ff6-44b4-a245-5b37ab8e58bc">


#### Cpu usage:
Master
```
CPU usage, seconds                     total   per block
approval-distribution                96.9436      9.6944
approval-voting                     117.4676     11.7468
test-environment                     44.0092      4.4009
```
With this PoC
```
CPU usage, seconds                     total   per block
approval-distribution                 0.0014      0.0001 --- unused
approval-voting                       0.0437      0.0044.  --- unused
approval-voting-parallel              5.9560      0.5956
approval-voting-parallel-0           22.9073      2.2907
approval-voting-parallel-1           23.0417      2.3042
approval-voting-parallel-2           22.0445      2.2045
approval-voting-parallel-3           22.7234      2.2723
approval-voting-parallel-4           21.9788      2.1979
approval-voting-parallel-5           23.0601      2.3060
approval-voting-parallel-6           22.4805      2.2481
approval-voting-parallel-7           21.8330      2.1833
approval-voting-parallel-db          37.1954      3.7195.  --- the approval-voting thread.
```

# Enablement strategy

Because just some trivial plumbing is needed in approval-distribution
and approval-voting to be able to run things in parallel and because
this subsystems plays a critical part in the system this PR proposes
that we keep both ways of running the approval work, as separated
subsystems and just a single subsystem(`approval-voting-parallel`) which
has multiple workers for the distribution work and one worker for the
approval-voting work and switch between them with a comandline flag.

The benefits for this is twofold.
1. With the same polkadot binary we can easily switch just a few
validators to use the parallel approach and gradually make this the
default way of running, if now issues arise.
2. In the worst case scenario were it becomes the default way of running
things, but we discover there are critical issues with it we have the
path to quickly disable it by asking validators to adjust their command
line flags.


# Next steps
- [x] Make sure through various testing we are not missing anything 
- [x] Polish the implementations to make them production ready
- [x] Add Unittest Tests for approval-voting-parallel.
- [x] Define and implement the strategy for rolling this change, so that
the blast radius is minimal(single validator) in case there are problems
with the implementation.
- [x]  Versi long running tests.
- [x] Add relevant metrics.

@ordian @eskimor @sandreim @AndreiEres, let me know what you think.

---------

Signed-off-by: Alexandru Gheorghe <[email protected]>
  • Loading branch information
alexggh authored Sep 26, 2024
1 parent f6d08e6 commit b16237a
Show file tree
Hide file tree
Showing 53 changed files with 3,565 additions and 217 deletions.
8 changes: 8 additions & 0 deletions .gitlab/pipeline/zombienet/polkadot.yml
Original file line number Diff line number Diff line change
Expand Up @@ -223,6 +223,14 @@ zombienet-polkadot-functional-0015-coretime-shared-core:
--local-dir="${LOCAL_DIR}/functional"
--test="0015-coretime-shared-core.zndsl"

zombienet-polkadot-functional-0016-approval-voting-parallel:
extends:
- .zombienet-polkadot-common
script:
- /home/nonroot/zombie-net/scripts/ci/run-test-local-env-manager.sh
--local-dir="${LOCAL_DIR}/functional"
--test="0016-approval-voting-parallel.zndsl"

zombienet-polkadot-smoke-0001-parachains-smoke-test:
extends:
- .zombienet-polkadot-common
Expand Down
46 changes: 46 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -158,6 +158,7 @@ members = [
"polkadot/erasure-coding/fuzzer",
"polkadot/node/collation-generation",
"polkadot/node/core/approval-voting",
"polkadot/node/core/approval-voting-parallel",
"polkadot/node/core/av-store",
"polkadot/node/core/backing",
"polkadot/node/core/bitfield-signing",
Expand Down Expand Up @@ -1032,6 +1033,7 @@ polkadot-gossip-support = { path = "polkadot/node/network/gossip-support", defau
polkadot-network-bridge = { path = "polkadot/node/network/bridge", default-features = false }
polkadot-node-collation-generation = { path = "polkadot/node/collation-generation", default-features = false }
polkadot-node-core-approval-voting = { path = "polkadot/node/core/approval-voting", default-features = false }
polkadot-node-core-approval-voting-parallel = { path = "polkadot/node/core/approval-voting-parallel", default-features = false }
polkadot-node-core-av-store = { path = "polkadot/node/core/av-store", default-features = false }
polkadot-node-core-backing = { path = "polkadot/node/core/backing", default-features = false }
polkadot-node-core-bitfield-signing = { path = "polkadot/node/core/bitfield-signing", default-features = false }
Expand Down
1 change: 1 addition & 0 deletions cumulus/client/relay-chain-inprocess-interface/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -379,6 +379,7 @@ fn build_polkadot_full_node(
execute_workers_max_num: None,
prepare_workers_hard_max_num: None,
prepare_workers_soft_max_num: None,
enable_approval_voting_parallel: false,
},
)?;

Expand Down
7 changes: 7 additions & 0 deletions polkadot/cli/src/cli.rs
Original file line number Diff line number Diff line change
Expand Up @@ -151,6 +151,13 @@ pub struct RunCmd {
/// TESTING ONLY: disable the version check between nodes and workers.
#[arg(long, hide = true)]
pub disable_worker_version_check: bool,

/// Enable approval-voting message processing in parallel.
///
///**Dangerous!** This is an experimental feature and should not be used in production, unless
/// explicitly advised to.
#[arg(long)]
pub enable_approval_voting_parallel: bool,
}

#[allow(missing_docs)]
Expand Down
1 change: 1 addition & 0 deletions polkadot/cli/src/command.rs
Original file line number Diff line number Diff line change
Expand Up @@ -244,6 +244,7 @@ where
execute_workers_max_num: cli.run.execute_workers_max_num,
prepare_workers_hard_max_num: cli.run.prepare_workers_hard_max_num,
prepare_workers_soft_max_num: cli.run.prepare_workers_soft_max_num,
enable_approval_voting_parallel: cli.run.enable_approval_voting_parallel,
},
)
.map(|full| full.task_manager)?;
Expand Down
55 changes: 55 additions & 0 deletions polkadot/node/core/approval-voting-parallel/Cargo.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
[package]
name = "polkadot-node-core-approval-voting-parallel"
version = "7.0.0"
authors.workspace = true
edition.workspace = true
license.workspace = true
description = "Approval Voting Subsystem running approval work in parallel"

[lints]
workspace = true

[dependencies]
async-trait = { workspace = true }
futures = { workspace = true }
futures-timer = { workspace = true }
gum = { workspace = true }
itertools = { workspace = true }
thiserror = { workspace = true }

polkadot-node-core-approval-voting = { workspace = true, default-features = true }
polkadot-approval-distribution = { workspace = true, default-features = true }
polkadot-node-subsystem = { workspace = true, default-features = true }
polkadot-node-subsystem-util = { workspace = true, default-features = true }
polkadot-overseer = { workspace = true, default-features = true }
polkadot-primitives = { workspace = true, default-features = true }
polkadot-node-primitives = { workspace = true, default-features = true }
polkadot-node-jaeger = { workspace = true, default-features = true }
polkadot-node-network-protocol = { workspace = true, default-features = true }
polkadot-node-metrics = { workspace = true, default-features = true }

sc-keystore = { workspace = true, default-features = false }
sp-consensus = { workspace = true, default-features = false }
sp-consensus-slots = { workspace = true, default-features = false }
sp-application-crypto = { workspace = true, default-features = false, features = ["full_crypto"] }
sp-runtime = { workspace = true, default-features = false }

rand = { workspace = true }
rand_core = { workspace = true }
rand_chacha = { workspace = true }

[dev-dependencies]
async-trait = { workspace = true }
parking_lot = { workspace = true }
sp-keyring = { workspace = true, default-features = true }
sp-keystore = { workspace = true, default-features = true }
sp-core = { workspace = true, default-features = true }
sp-consensus-babe = { workspace = true, default-features = true }
sp-tracing = { workspace = true }
polkadot-node-subsystem-test-helpers = { workspace = true, default-features = true }
assert_matches = { workspace = true }
kvdb-memorydb = { workspace = true }
polkadot-primitives-test-helpers = { workspace = true, default-features = true }
log = { workspace = true, default-features = true }
polkadot-subsystem-bench = { workspace = true, default-features = true }
schnorrkel = { workspace = true, default-features = true }
Loading

0 comments on commit b16237a

Please sign in to comment.