[execution window] use blocking txn provider #14584

Draft
wants to merge 1 commit into
base: brian/exec-window-exec-base

Conversation

bchocho
Contributor

@bchocho bchocho commented Sep 10, 2024

Description

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Other (specify)

How Has This Been Tested?

Key Areas to Review

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

trunk-io bot commented Sep 10, 2024

⏱️ 5h 28m total CI duration on this PR
Slowest 15 Jobs            Cumulative Duration   Recent Runs
forge-e2e-test / forge     3h 50m                🟥🟥🟥🟥 (+2 more)
test-target-determinator   26m                   🟩🟩🟩🟩
rust-images / rust-all     14m                   🟩
rust-move-tests            8m                    🟩
rust-move-tests            8m                    🟥
rust-cargo-deny            8m                    🟩🟩🟩🟩
rust-move-tests            8m                    🟩
general-lints              7m                    🟩🟩🟩🟩
check-dynamic-deps         7m                    🟩🟩🟩🟩🟩 (+2 more)
rust-move-tests            5m                    🟥
semgrep/ci                 2m                    🟩🟩🟩🟩🟩
file_change_determinator   53s                   🟩🟩🟩🟩
file_change_determinator   48s                   🟩🟩🟩🟩
file_change_determinator   46s                   🟩🟩🟩🟩
permission-check           19s                   🟩🟩🟩🟩🟩

@bchocho bchocho added the CICD:run-forge-e2e-perf label ("Run the e2e perf forge only") on Sep 10, 2024

@bchocho bchocho changed the title from "Brian/exec window exec optimize 2" to "[execution window] optimized 2" on Sep 11, 2024

@bchocho bchocho force-pushed the brian/exec-window-exec-optimize branch from 16da500 to 893fde7 on September 25, 2024 17:27
@bchocho bchocho force-pushed the brian/exec-window-exec-optimize-2 branch from 725d66e to 3d99ab6 on September 26, 2024 17:00

@bchocho bchocho changed the title from "[execution window] optimized 2" to "[execution window] use blocking txn provider" on Sep 26, 2024
@bchocho bchocho force-pushed the brian/exec-window-exec-optimize-2 branch from 3d99ab6 to fac9a47 on September 30, 2024 18:41

@bchocho bchocho force-pushed the brian/exec-window-exec-optimize-2 branch from 49ca098 to b3d33c6 on September 30, 2024 23:18

@bchocho bchocho added the CICD:build-failpoints-images ("Build failpoints docker image") and CICD:build-performance-images ("build performance docker image variants") labels on Oct 15, 2024

@bchocho bchocho force-pushed the brian/exec-window-exec-optimize-2 branch from 88df965 to 5b71c29 on October 16, 2024 18:07
@bchocho bchocho changed the base branch from brian/exec-window-exec-optimize to brian/exec-window-exec-base on October 16, 2024 18:07

@bchocho bchocho force-pushed the brian/exec-window-exec-optimize-2 branch from d0db51e to e395dca on October 17, 2024 20:36

… round-1 in execution phase.

This requires using a blocking txn provider, so that shuffled txns can be handed to the execution phase as soon as they become available.

Various improvements to avoid cloning transactions.
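
To make the idea of a blocking txn provider concrete, here is a minimal sketch in Rust. It is an illustration under stated assumptions, not the code in this PR: `BlockingTxnProvider`, `set_txn`, `get_txn`, and `run_demo` are hypothetical names, and the real provider may be structured differently. Each slot starts empty, a producer fills slots as shuffled txns become available, and a reader of slot `idx` blocks until that slot is filled. Txns are handed out as `Arc<T>` so that readers share one allocation instead of cloning the transaction.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical blocking provider (illustrative only, not the PR's type).
/// Slot `idx` holds the txn at position `idx` once it has been produced.
pub struct BlockingTxnProvider<T> {
    slots: Vec<(Mutex<Option<Arc<T>>>, Condvar)>,
}

impl<T> BlockingTxnProvider<T> {
    /// Creates a provider with `num_txns` empty slots.
    pub fn new(num_txns: usize) -> Self {
        let slots = (0..num_txns)
            .map(|_| (Mutex::new(None), Condvar::new()))
            .collect();
        Self { slots }
    }

    pub fn num_txns(&self) -> usize {
        self.slots.len()
    }

    /// Fills slot `idx` and wakes every reader waiting on it.
    pub fn set_txn(&self, idx: usize, txn: T) {
        let (lock, cvar) = &self.slots[idx];
        let mut slot = lock.lock().unwrap();
        *slot = Some(Arc::new(txn));
        cvar.notify_all();
    }

    /// Blocks until slot `idx` is filled, then returns a shared handle.
    /// Only the `Arc` pointer is cloned, never the txn itself.
    pub fn get_txn(&self, idx: usize) -> Arc<T> {
        let (lock, cvar) = &self.slots[idx];
        let mut slot = lock.lock().unwrap();
        while slot.is_none() {
            slot = cvar.wait(slot).unwrap();
        }
        Arc::clone(slot.as_ref().unwrap())
    }
}

/// Demo: a producer thread fills the slots in order while the caller
/// consumes them, blocking whenever it gets ahead of the producer.
pub fn run_demo() -> Vec<Arc<String>> {
    let provider = Arc::new(BlockingTxnProvider::<String>::new(3));
    let producer = {
        let provider = Arc::clone(&provider);
        thread::spawn(move || {
            for idx in 0..provider.num_txns() {
                provider.set_txn(idx, format!("txn-{idx}"));
            }
        })
    };
    let seen = (0..provider.num_txns())
        .map(|idx| provider.get_txn(idx))
        .collect();
    producer.join().unwrap();
    seen
}
```

With this shape the consumer can start on slot 0 before the remaining slots are filled, which is the property the commit message relies on. One condition variable per slot keeps wake-ups targeted at the readers of that slot. A production version would also need a way to signal that no more txns will arrive.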
@bchocho bchocho force-pushed the brian/exec-window-exec-optimize-2 branch from e395dca to a3feb00 on October 30, 2024 18:20

Contributor

❌ Forge suite realistic_env_max_load failure on a3feb008e0f9fed5e9d0d5900d56b6cd79d99eef

two traffics test: inner traffic : committed: 12084.50 txn/s, latency: 3289.89 ms, (p50: 2700 ms, p70: 2700, p90: 3000 ms, p99: 13300 ms), latency samples: 4594980
two traffics test : committed: 100.07 txn/s, latency: 4105.77 ms, (p50: 2300 ms, p70: 2400, p90: 7000 ms, p99: 33300 ms), latency samples: 1860
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 8.483, avg: 1.423", "ConsensusProposalToOrdered: max: 0.392, avg: 0.310", "ConsensusOrderedToCommit: max: 1.459, avg: 0.514", "ConsensusProposalToCommit: max: 1.851, avg: 0.824"]
Test Failed: check for success

Caused by:
    "MempoolToBlockCreation" metric violated threshold of 2.85, max_breach_pct: 5, breach_pct: 7 

Stack backtrace:
   0: anyhow::error::<impl anyhow::Error>::msg
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.89/src/error.rs:85:36
   1: aptos_forge::success_criteria::MetricsThreshold::ensure_metrics_threshold
   2: aptos_forge::success_criteria::LatencyBreakdownThreshold::ensure_threshold
             at ./testsuite/forge/src/success_criteria.rs:156:13
   3: aptos_forge::success_criteria::SuccessCriteriaChecker::check_for_success::{{closure}}
             at ./testsuite/forge/src/success_criteria.rs:314:13
   4: aptos_forge::interface::network::NetworkContext::check_for_success::{{closure}}
             at ./testsuite/forge/src/interface/network.rs:112:10
   5: <dyn aptos_testcases::NetworkLoadTest as aptos_forge::interface::network::NetworkTest>::run::{{closure}}
             at ./testsuite/testcases/src/lib.rs:325:14
   6: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/future/future.rs:123:9
   7: <aptos_testcases::two_traffics_test::TwoTrafficsTest as aptos_forge::interface::network::NetworkTest>::run::{{closure}}
             at ./testsuite/testcases/src/two_traffics_test.rs:77:47
   8: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/future/future.rs:123:9
   9: <aptos_testcases::CompositeNetworkTest as aptos_forge::interface::network::NetworkTest>::run::{{closure}}
             at ./testsuite/testcases/src/lib.rs:631:37
  10: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/future/future.rs:123:9
  11: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/park.rs:281:63
  12: tokio::runtime::coop::with_budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/coop.rs:107:5
  13: tokio::runtime::coop::budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/coop.rs:73:5
  14: tokio::runtime::park::CachedParkThread::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/park.rs:281:31
  15: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context/blocking.rs:66:9
  16: tokio::runtime::handle::Handle::block_on_inner::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/handle.rs:324:22
  17: tokio::runtime::context::runtime::enter_runtime
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context/runtime.rs:65:16
  18: tokio::runtime::handle::Handle::block_on_inner
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/handle.rs:323:9
  19: tokio::runtime::handle::Handle::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/handle.rs:302:18
  20: aptos_forge::runner::Forge<F>::run::{{closure}}
             at ./testsuite/forge/src/runner.rs:637:42
  21: aptos_forge::runner::run_test
             at ./testsuite/forge/src/runner.rs:710:11
  22: aptos_forge::runner::Forge<F>::run
             at ./testsuite/forge/src/runner.rs:637:30
  23: forge::run_forge
             at ./testsuite/forge-cli/src/main.rs:431:11
  24: forge::main
             at ./testsuite/forge-cli/src/main.rs:311:21
  25: core::ops::function::FnOnce::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
  26: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys_common/backtrace.rs:155:18
  27: std::rt::lang_start::{{closure}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:166:18
  28: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:284:13
  29: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
  30: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  31: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  32: std::rt::lang_start_internal::{{closure}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:148:48
  33: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
  34: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  35: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  36: std::rt::lang_start_internal
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:148:20
  37: main
  38: __libc_start_main
  39: _start
Trailing Log Lines:


Swarm logs can be found here: See fgi output for more information.
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:381"},"thread_name":"main","hostname":"forge-e2e-pr-14584-1730313769-a3feb008e0f9fed5e9d0d5900d56b6cd7","timestamp":"2024-10-30T18:56:23.666906Z","message":"Deleting namespace forge-e2e-pr-14584: Some(NamespaceStatus { conditions: None, phase: Some(\"Terminating\") })"}
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:398"},"thread_name":"main","hostname":"forge-e2e-pr-14584-1730313769-a3feb008e0f9fed5e9d0d5900d56b6cd7","timestamp":"2024-10-30T18:56:23.666942Z","message":"aptos-node resources for Forge removed in namespace: forge-e2e-pr-14584"}

failures:
    CompositeNetworkTest

test result: FAILED. 0 passed; 1 failed; 0 filtered out

Failed to run tests:
Tests Failed
Error: Tests Failed

Stack backtrace:
   0: anyhow::error::<impl anyhow::Error>::msg
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.89/src/error.rs:85:36
   1: aptos_forge::runner::Forge<F>::run
             at ./testsuite/forge/src/runner.rs:662:13
   2: forge::run_forge
             at ./testsuite/forge-cli/src/main.rs:431:11
   3: forge::main
             at ./testsuite/forge-cli/src/main.rs:311:21
   4: core::ops::function::FnOnce::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
   5: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys_common/backtrace.rs:155:18
   6: std::rt::lang_start::{{closure}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:166:18
   7: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:284:13
   8: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
   9: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  10: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  11: std::rt::lang_start_internal::{{closure}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:148:48
  12: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
  13: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  14: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  15: std::rt::lang_start_internal
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:148:20
  16: main
  17: __libc_start_main
  18: _start
Debugging output:
NAME                                    READY   STATUS      RESTARTS        AGE
aptos-node-0-fullnode-eforge191-0       1/1     Running     0               13m
aptos-node-0-validator-0                1/1     Running     1 (6m19s ago)   13m
aptos-node-1-fullnode-eforge191-0       1/1     Running     0               13m
aptos-node-1-validator-0                1/1     Running     1 (6m35s ago)   13m
aptos-node-2-fullnode-eforge191-0       1/1     Running     0               13m
aptos-node-2-validator-0                1/1     Running     1 (6m20s ago)   13m
aptos-node-3-fullnode-eforge191-0       1/1     Running     0               13m
aptos-node-3-validator-0                1/1     Running     1 (9m3s ago)    13m
aptos-node-4-fullnode-eforge191-0       1/1     Running     0               13m
aptos-node-4-validator-0                1/1     Running     1 (9m13s ago)   13m
aptos-node-5-validator-0                1/1     Running     0               13m
aptos-node-6-validator-0                1/1     Running     0               13m
genesis-aptos-genesis-eforge191-txx76   0/1     Completed   0               13m
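
For readers interpreting the failure above ("MempoolToBlockCreation" metric violated threshold of 2.85, max_breach_pct: 5, breach_pct: 7), the following sketch shows one plausible way such a check could work. It is an assumption for illustration and not the Forge implementation: `breach_pct` and `ensure_threshold` are hypothetical helpers, and the real checker may sample and round differently. The reading assumed here is that a run fails when the share of samples above the threshold exceeds the allowed breach percentage.

```rust
/// Hypothetical helper: percentage (0 to 100, rounded down) of samples
/// that are strictly above `threshold`.
pub fn breach_pct(samples: &[f64], threshold: f64) -> u64 {
    if samples.is_empty() {
        return 0;
    }
    let breaches = samples.iter().filter(|&&s| s > threshold).count();
    (breaches * 100 / samples.len()) as u64
}

/// Hypothetical check: fails when more than `max_breach_pct` percent of
/// the samples breach the threshold. The error text mirrors the log above.
pub fn ensure_threshold(
    name: &str,
    samples: &[f64],
    threshold: f64,
    max_breach_pct: u64,
) -> Result<(), String> {
    let pct = breach_pct(samples, threshold);
    if pct > max_breach_pct {
        Err(format!(
            "\"{name}\" metric violated threshold of {threshold}, \
             max_breach_pct: {max_breach_pct}, breach_pct: {pct}"
        ))
    } else {
        Ok(())
    }
}
```

Under this reading, the average of 1.423 in the latency breakdown can sit well below 2.85 while the check still fails, because 7% of the samples exceeded the threshold and only 5% were allowed to.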

Labels

  • CICD:build-failpoints-images (Build failpoints docker image)
  • CICD:build-performance-images (build performance docker image variants)
  • CICD:run-forge-e2e-perf (Run the e2e perf forge only)