Optimistic signature verification for signed batch info #14644

Merged: 9 commits, Oct 23, 2024

@vusirikala vusirikala commented Sep 15, 2024

Description

This PR implements optimistic signature verification to reduce the time spent verifying SignedBatchInfo messages.
When the optimistic signature verification feature flag is enabled, we do not verify these messages up front. Instead, we accumulate the unverified messages, and once their accumulated voting power exceeds a threshold, we aggregate all of the signatures and verify the aggregated signature in a single check.
If that verification fails, we fall back to verifying each signature individually. The ValidatorVerifier stores the list of authors that submitted bad messages and disables optimistic signature verification for those voters.
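The flow described above can be sketched roughly as follows. This is a minimal, self-contained model and not the code from this PR: `OptimisticAggregator`, `toy_sign`, and every field name are hypothetical, the "signature" is a plain integer with a wrapping sum standing in for real signature aggregation (it offers no security), and the sketch assumes the threshold is met when the accumulated power is greater than or equal to it.

```rust
use std::collections::{BTreeMap, BTreeSet};

type Author = u32;

/// Toy stand-in for signing (NOT cryptography): the expected value per author and message.
fn toy_sign(author: Author, msg: u64) -> u64 {
    (author as u64 + 1).wrapping_mul(0x9E37_79B9_7F4A_7C15) ^ msg
}

/// Hypothetical aggregator; names are illustrative, not taken from the PR.
struct OptimisticAggregator {
    voting_power: BTreeMap<Author, u64>,
    threshold: u64,
    unverified: BTreeMap<Author, u64>,
    verified: BTreeMap<Author, u64>,
    /// Authors that sent a bad signature; they are verified up front from then on.
    pessimistic: BTreeSet<Author>,
}

impl OptimisticAggregator {
    fn new(voting_power: BTreeMap<Author, u64>, threshold: u64) -> Self {
        Self {
            voting_power,
            threshold,
            unverified: BTreeMap::new(),
            verified: BTreeMap::new(),
            pessimistic: BTreeSet::new(),
        }
    }

    fn power_of<'a>(&self, authors: impl Iterator<Item = &'a Author>) -> u64 {
        authors.filter_map(|a| self.voting_power.get(a)).sum()
    }

    /// Records a signature; returns true once verified signatures reach the threshold.
    fn add(&mut self, author: Author, sig: u64, msg: u64) -> bool {
        let known = self.voting_power.contains_key(&author);
        if known && !self.verified.contains_key(&author) {
            if self.pessimistic.contains(&author) {
                // Previously misbehaving author: verify immediately, drop on failure.
                if sig == toy_sign(author, msg) {
                    self.verified.insert(author, sig);
                }
            } else {
                // Optimistic path: store without verifying.
                self.unverified.insert(author, sig);
            }
        }
        let pending =
            self.power_of(self.verified.keys()) + self.power_of(self.unverified.keys());
        if pending >= self.threshold && !self.unverified.is_empty() {
            self.verify_pending(msg);
        }
        self.power_of(self.verified.keys()) >= self.threshold
    }

    fn verify_pending(&mut self, msg: u64) {
        let pending = std::mem::take(&mut self.unverified);
        let aggregate = pending.values().fold(0u64, |acc, s| acc.wrapping_add(*s));
        let expected = pending
            .keys()
            .fold(0u64, |acc, a| acc.wrapping_add(toy_sign(*a, msg)));
        if aggregate == expected {
            // One aggregate check covers every pending signature.
            self.verified.extend(pending);
            return;
        }
        // Aggregate check failed: verify individually and remember the offenders.
        for (author, sig) in pending {
            if sig == toy_sign(author, msg) {
                self.verified.insert(author, sig);
            } else {
                self.pessimistic.insert(author);
            }
        }
    }
}
```

With four equally weighted validators and a threshold of 3, one bad signature forces a single fallback pass; later signatures from honest validators are again checked only in aggregate, while the offending author stays on the pessimistic list.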

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Other (specify)

How Has This Been Tested?

Key Areas to Review

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

trunk-io bot commented Sep 15, 2024

⏱️ 5h 51m total CI duration on this PR
| Slowest 15 Jobs | Cumulative Duration | Recent Runs |
| --- | --- | --- |
| execution-performance / single-node-performance | 1h 26m | 🟩🟩🟩🟩 |
| forge-compat-test / forge | 1h 7m | 🟩🟩🟩 |
| forge-e2e-test / forge | 52m | 🟩🟩🟩 |
| test-target-determinator | 18m | 🟩🟩🟩🟩 |
| execution-performance / test-target-determinator | 18m | 🟩🟩🟩🟩 |
| check | 15m | 🟩🟩🟩🟩 |
| rust-move-tests | 10m | 🟩 |
| rust-move-tests | 10m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| general-lints | 7m | 🟩🟩🟩🟩 |
| rust-cargo-deny | 7m | 🟩🟩🟩🟩 |
| check-dynamic-deps | 7m | 🟩🟩🟩🟩🟩 |
| indexer-grpc-e2e-tests / test-indexer-grpc-docker-compose | 6m | 🟥🟩🟩🟩 |
| rust-doc-tests | 5m | 🟩 |

🚨 1 job on the last run was significantly faster/slower than expected

| Job | Duration | vs 7d avg | Delta |
| --- | --- | --- | --- |
| permission-check | 4m | 3s | +7640% |


@vusirikala vusirikala changed the base branch from main to satya/osv_commit_votes September 15, 2024 01:42
@vusirikala vusirikala added the CICD:run-e2e-tests label Sep 15, 2024

consensus/src/quorum_store/proof_coordinator.rs (outdated)
let all_voters = self.all_voters();
epoch_state
    .verifier
    .check_voting_power(all_voters.iter(), true)

We still don't use f+1 for quorum store cert? cc @zekun000
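For context on the question above, the two thresholds can be worked out with standard BFT arithmetic. The helper names below are hypothetical and the formulas are the textbook ones for a validator set tolerating f faults out of 3f+1 total voting power. The thread does not say which threshold the boolean argument to `check_voting_power` selects, so this sketch makes no claim about that.

```rust
/// Smallest voting power strictly greater than two thirds of the total
/// (equals 2f+1 when total = 3f+1). Hypothetical helper for illustration.
fn quorum_voting_power(total: u64) -> u64 {
    total * 2 / 3 + 1
}

/// Smallest voting power guaranteed to include at least one honest validator
/// (equals f+1 when total = 3f+1). Hypothetical helper for illustration.
fn weak_voting_power(total: u64) -> u64 {
    total - quorum_voting_power(total) + 1
}
```

For a total voting power of 10 (f = 3), these give 7 for the 2f+1 quorum and 4 for the f+1 threshold.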

self.unverified_signatures = PartialSignatures::empty();
let aggregated_sig = epoch_state
    .verifier
    .aggregate_signatures(&self.verified_signatures)

Check voting power before aggregating
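The ordering suggested in this comment can be illustrated with a small sketch: sum the signers' voting power first and return early if it falls short, so the comparatively expensive aggregation only runs when it can succeed. Everything here is hypothetical (`check_power`, `aggregate_if_quorum`, `AggregateError`), and the wrapping sum is only a placeholder for real signature aggregation.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum AggregateError {
    UnknownAuthor(u32),
    TooLittleVotingPower { have: u64, need: u64 },
}

/// Cheap check: sums the voting power of the signers and compares it to `need`.
fn check_power(
    powers: &BTreeMap<u32, u64>,
    signers: impl Iterator<Item = u32>,
    need: u64,
) -> Result<u64, AggregateError> {
    let mut have = 0u64;
    for author in signers {
        have += *powers
            .get(&author)
            .ok_or(AggregateError::UnknownAuthor(author))?;
    }
    if have < need {
        return Err(AggregateError::TooLittleVotingPower { have, need });
    }
    Ok(have)
}

/// Aggregates only after the voting power check has passed.
fn aggregate_if_quorum(
    powers: &BTreeMap<u32, u64>,
    sigs: &BTreeMap<u32, u64>,
    need: u64,
) -> Result<u64, AggregateError> {
    check_power(powers, sigs.keys().copied(), need)?;
    // Placeholder for real aggregation (assumption: a wrapping sum of toy values).
    Ok(sigs.values().fold(0u64, |acc, s| acc.wrapping_add(*s)))
}
```

Returning the error before aggregating keeps the failure path cheap and makes the reason for the failure explicit to the caller.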

@vusirikala vusirikala changed the base branch from main to satya/refactor_sig_aggregator October 22, 2024 21:50

@zekun000 zekun000 left a comment

revert the config before landing

Base automatically changed from satya/refactor_sig_aggregator to main October 23, 2024 18:59
@vusirikala vusirikala enabled auto-merge (squash) October 23, 2024 19:58

✅ Forge suite realistic_env_max_load success on 8b3ae380311afb49a971f989e152b0cdc5699dbf

two traffics test: inner traffic : committed: 14100.12 txn/s, latency: 2816.21 ms, (p50: 2700 ms, p70: 2700, p90: 3000 ms, p99: 3600 ms), latency samples: 5361320
two traffics test : committed: 100.07 txn/s, latency: 1715.15 ms, (p50: 1400 ms, p70: 1500, p90: 1600 ms, p99: 10600 ms), latency samples: 1720
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 1.921, avg: 1.551", "ConsensusProposalToOrdered: max: 0.330, avg: 0.297", "ConsensusOrderedToCommit: max: 0.402, avg: 0.374", "ConsensusProposalToCommit: max: 0.697, avg: 0.672"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.87s no progress at version 2736115 (avg 0.21s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 8.03s no progress at version 2736111 (avg 8.03s) [limit 15].
Test Ok

✅ Forge suite framework_upgrade success on b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 8b3ae380311afb49a971f989e152b0cdc5699dbf

Compatibility test results for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 8b3ae380311afb49a971f989e152b0cdc5699dbf (PR)
Upgrade the nodes to version: 8b3ae380311afb49a971f989e152b0cdc5699dbf
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1202.31 txn/s, submitted: 1204.15 txn/s, failed submission: 1.85 txn/s, expired: 1.85 txn/s, latency: 2541.99 ms, (p50: 2100 ms, p70: 2400, p90: 4500 ms, p99: 6200 ms), latency samples: 104260
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1243.57 txn/s, submitted: 1246.89 txn/s, failed submission: 3.32 txn/s, expired: 3.32 txn/s, latency: 2421.92 ms, (p50: 2100 ms, p70: 2400, p90: 3600 ms, p99: 5700 ms), latency samples: 112360
5. check swarm health
Compatibility test for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 8b3ae380311afb49a971f989e152b0cdc5699dbf passed
Upgrade the remaining nodes to version: 8b3ae380311afb49a971f989e152b0cdc5699dbf
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1246.56 txn/s, submitted: 1247.71 txn/s, failed submission: 1.15 txn/s, expired: 1.15 txn/s, latency: 2522.64 ms, (p50: 2200 ms, p70: 2700, p90: 4200 ms, p99: 5900 ms), latency samples: 108440
Test Ok

✅ Forge suite compat success on b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 8b3ae380311afb49a971f989e152b0cdc5699dbf

Compatibility test results for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 8b3ae380311afb49a971f989e152b0cdc5699dbf (PR)
1. Check liveness of validators at old version: b29f09f57e898d8d211c8bc3e303f6e50bba2266
compatibility::simple-validator-upgrade::liveness-check : committed: 13883.67 txn/s, latency: 2427.67 ms, (p50: 2100 ms, p70: 2200, p90: 2700 ms, p99: 7800 ms), latency samples: 450460
2. Upgrading first Validator to new version: 8b3ae380311afb49a971f989e152b0cdc5699dbf
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 5888.34 txn/s, latency: 4874.74 ms, (p50: 5600 ms, p70: 5800, p90: 6000 ms, p99: 6100 ms), latency samples: 111620
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 5281.85 txn/s, latency: 5877.19 ms, (p50: 6200 ms, p70: 6300, p90: 6800 ms, p99: 7900 ms), latency samples: 198360
3. Upgrading rest of first batch to new version: 8b3ae380311afb49a971f989e152b0cdc5699dbf
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 5603.02 txn/s, latency: 5085.42 ms, (p50: 5800 ms, p70: 6100, p90: 6400 ms, p99: 6800 ms), latency samples: 106780
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 5473.33 txn/s, latency: 5951.63 ms, (p50: 6500 ms, p70: 6600, p90: 7100 ms, p99: 7900 ms), latency samples: 189660
4. upgrading second batch to new version: 8b3ae380311afb49a971f989e152b0cdc5699dbf
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 8167.18 txn/s, latency: 3259.74 ms, (p50: 3000 ms, p70: 4000, p90: 4600 ms, p99: 5300 ms), latency samples: 151380
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 7629.21 txn/s, latency: 4181.71 ms, (p50: 4300 ms, p70: 5000, p90: 5800 ms, p99: 7200 ms), latency samples: 259700
5. check swarm health
Compatibility test for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 8b3ae380311afb49a971f989e152b0cdc5699dbf passed
Test Ok

@vusirikala vusirikala merged commit 38e33fe into main Oct 23, 2024
91 of 93 checks passed
@vusirikala vusirikala deleted the satya/osv_signed_batch_info branch October 23, 2024 20:33
Labels
CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR
3 participants