[Consensus Observer] Add startup period before fallback. #14990

Merged
JoshLind merged 1 commit into main from duration_sync_5 on Oct 22, 2024

Conversation

JoshLind

Description

This PR adds a grace period to consensus observer (CO) that prevents CO from entering fallback mode until the grace period is over. This helps to prevent fallback from occurring when a network first starts (e.g., forge).
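
The grace-period check described above might look roughly like the following. This is a minimal sketch, not the code from this PR: `FallbackConfig`, `FallbackTracker`, and every field and method name are hypothetical, and the no-progress timeout is an assumed fallback trigger used only to make the example complete.

```rust
use std::time::{Duration, Instant};

/// Hypothetical config (names are illustrative, not taken from the PR).
struct FallbackConfig {
    /// Grace period after startup during which fallback is suppressed.
    startup_period: Duration,
    /// Assumed trigger: how long the observer may go without progress.
    progress_timeout: Duration,
}

/// Hypothetical tracker of observer start time and last progress time.
struct FallbackTracker {
    config: FallbackConfig,
    start_time: Instant,
    last_progress_time: Instant,
}

impl FallbackTracker {
    fn new(config: FallbackConfig, now: Instant) -> Self {
        Self {
            config,
            start_time: now,
            last_progress_time: now,
        }
    }

    /// Records that the observer made progress at `now`.
    fn record_progress(&mut self, now: Instant) {
        self.last_progress_time = now;
    }

    /// Returns true only when the startup period is over and the
    /// observer has made no progress for at least `progress_timeout`.
    fn should_enter_fallback(&self, now: Instant) -> bool {
        // Still inside the grace period: never enter fallback.
        let since_start = now.saturating_duration_since(self.start_time);
        if since_start < self.config.startup_period {
            return false;
        }

        // Past the grace period: apply the usual no-progress check.
        let since_progress = now.saturating_duration_since(self.last_progress_time);
        since_progress >= self.config.progress_timeout
    }
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside the check, keeps the logic deterministic under test.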

Testing Plan

New and existing test infrastructure.

trunk-io bot commented Oct 16, 2024

⏱️ 4h 43m total CI duration on this PR
| Slowest 15 Jobs | Cumulative Duration | Recent Runs |
| --- | --- | --- |
| execution-performance / single-node-performance | 1h 53m | 🟩🟩🟩🟩🟩 |
| test-target-determinator | 23m | 🟩🟩🟩🟩🟩 |
| execution-performance / test-target-determinator | 23m | 🟩🟩🟩🟩🟩 |
| check | 18m | 🟩🟩🟩🟩🟩 |
| rust-cargo-deny | 10m | 🟩🟩🟩🟩🟩 (+1 more) |
| rust-move-tests | 10m | 🟩 |
| rust-move-tests | 10m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| fetch-last-released-docker-image-tag | 8m | 🟩🟩🟩🟩🟩 |
| rust-doc-tests | 5m | 🟩 |
| rust-doc-tests | 5m | 🟩 |
| rust-doc-tests | 5m | 🟩 |
| rust-doc-tests | 5m | 🟩 |

@JoshLind added the CICD:run-e2e-tests label (when this label is present, GitHub Actions will run all land-blocking e2e tests from the PR) on Oct 16, 2024

@zekun000

It looks like we saw 100 ms of extra latency between the VN and the VFN in the experiment, but the network latency is only 50 ms. We should figure out where the remaining 50 ms comes from. My gut feeling is that it comes from sequential verification, but we should confirm.

@JoshLind requested review from bchocho and hariria on October 22, 2024 18:54
@JoshLind enabled auto-merge (rebase) on October 22, 2024 19:58

✅ Forge suite realistic_env_max_load success on 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b

two traffics test: inner traffic : committed: 13944.68 txn/s, latency: 2847.77 ms, (p50: 2700 ms, p70: 2700, p90: 3000 ms, p99: 3300 ms), latency samples: 5302080
two traffics test : committed: 99.94 txn/s, latency: 1673.35 ms, (p50: 1500 ms, p70: 1500, p90: 1600 ms, p99: 10600 ms), latency samples: 1760
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.250, avg: 0.226", "QsPosToProposal: max: 1.219, avg: 1.186", "ConsensusProposalToOrdered: max: 0.329, avg: 0.301", "ConsensusOrderedToCommit: max: 0.418, avg: 0.400", "ConsensusProposalToCommit: max: 0.715, avg: 0.701"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 1.10s no progress at version 2397542 (avg 0.21s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 8.57s no progress at version 2397540 (avg 8.57s) [limit 15].
Test Ok

✅ Forge suite compat success on b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b

Compatibility test results for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b (PR)
1. Check liveness of validators at old version: b29f09f57e898d8d211c8bc3e303f6e50bba2266
compatibility::simple-validator-upgrade::liveness-check : committed: 13665.47 txn/s, latency: 2297.46 ms, (p50: 1600 ms, p70: 1800, p90: 2200 ms, p99: 25400 ms), latency samples: 543400
2. Upgrading first Validator to new version: 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 6577.28 txn/s, latency: 4179.27 ms, (p50: 4700 ms, p70: 5000, p90: 5500 ms, p99: 5600 ms), latency samples: 118420
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6970.74 txn/s, latency: 4587.53 ms, (p50: 4800 ms, p70: 5000, p90: 6300 ms, p99: 6800 ms), latency samples: 232060
3. Upgrading rest of first batch to new version: 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 6913.98 txn/s, latency: 4099.60 ms, (p50: 4700 ms, p70: 4900, p90: 5000 ms, p99: 5100 ms), latency samples: 129160
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6417.98 txn/s, latency: 4638.41 ms, (p50: 5000 ms, p70: 5000, p90: 5200 ms, p99: 6100 ms), latency samples: 243480
4. upgrading second batch to new version: 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 8928.56 txn/s, latency: 3136.54 ms, (p50: 3300 ms, p70: 3500, p90: 4600 ms, p99: 5000 ms), latency samples: 158840
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 8932.92 txn/s, latency: 3553.47 ms, (p50: 3400 ms, p70: 4300, p90: 4900 ms, p99: 5600 ms), latency samples: 292740
5. check swarm health
Compatibility test for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b passed
Test Ok

✅ Forge suite framework_upgrade success on b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b

Compatibility test results for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b (PR)
Upgrade the nodes to version: 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1185.37 txn/s, submitted: 1188.45 txn/s, failed submission: 3.08 txn/s, expired: 3.08 txn/s, latency: 2563.05 ms, (p50: 2400 ms, p70: 2700, p90: 3600 ms, p99: 5100 ms), latency samples: 107720
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1184.41 txn/s, submitted: 1186.90 txn/s, failed submission: 2.49 txn/s, expired: 2.49 txn/s, latency: 2580.85 ms, (p50: 2400 ms, p70: 2700, p90: 4200 ms, p99: 5700 ms), latency samples: 104760
5. check swarm health
Compatibility test for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b passed
Upgrade the remaining nodes to version: 5bf7a93388da91d1a630d3b1f8f475cb0c7a7f4b
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1096.60 txn/s, submitted: 1098.81 txn/s, failed submission: 2.21 txn/s, expired: 2.21 txn/s, latency: 2711.47 ms, (p50: 2400 ms, p70: 3000, p90: 4000 ms, p99: 5500 ms), latency samples: 99200
Test Ok

@JoshLind merged commit daa4616 into main on Oct 22, 2024
48 checks passed
@JoshLind deleted the duration_sync_5 branch on October 22, 2024 22:43