
Execution speed backpressure to handle consistent gas miscalibration #13829

Merged
igor-aptos merged 5 commits into main from igor/execution_backpressure on Jul 24, 2024

Conversation

igor-aptos (Contributor)

Description

We can have gas miscalibration for various reasons, leading to a block taking longer than the intended 300ms even after the block gas limit has been applied. So here we make sure the consensus side handles it gracefully.

We track the execution speed of historical blocks, and their sizes, and compute what each block's size should have been for it to execute in 300ms. We then take the p50 of the previous 10 blocks and use that as the block size limit for the next block.
Once we have improved reordering and the execution pool, we can additionally use a reordering_overpacking_factor to create larger blocks but only execute the wanted number of transactions.

Currently the execution stage is almost always the bottleneck, so this PR only touches that. But in practice we can do the same for each stage, to make sure the block is reduced if, for example, the commit stage is slow.
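Below is a minimal sketch of that calibration idea. The names (`BlockSummary`, `calibrated_block_size`) are illustrative assumptions, not the actual code in this PR:

```rust
// Sketch only: estimate the block size that should execute within the
// target time, by rescaling each recent block's size by how much it
// overshot the target, then taking a percentile of the candidates.
struct BlockSummary {
    num_txns: u64,
    execution_time_ms: u64,
}

fn calibrated_block_size(
    history: &[BlockSummary],  // e.g. the last 10 blocks
    target_block_time_ms: u64, // e.g. 300
    percentile: f64,           // e.g. 0.5 for p50
) -> Option<u64> {
    let mut candidates: Vec<u64> = history
        .iter()
        // A block of N txns that ran for T ms suggests that roughly
        // N * target / T txns would have fit within the target time.
        .map(|b| b.num_txns * target_block_time_ms / b.execution_time_ms.max(1))
        .collect();
    if candidates.is_empty() {
        return None;
    }
    candidates.sort_unstable();
    let idx = ((candidates.len() - 1) as f64 * percentile) as usize;
    Some(candidates[idx])
}
```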

Type of Change

  • Performance improvement

Which Components or Systems Does This Change Impact?

  • Validator Node

How Has This Been Tested?

forge test from #13789

Key Areas to Review

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation


igor-aptos force-pushed the igor/execution_backpressure branch from 5bc70e3 to 55e3c5c on June 28, 2024 20:20
igor-aptos force-pushed the igor/execution_backpressure branch from 55e3c5c to 680a819 on June 28, 2024 21:02
igor-aptos force-pushed the igor/execution_backpressure branch from 680a819 to 01404b6 on July 11, 2024 20:51
}
})
.sorted()
// .sorted_by_key(|key| key.unwrap_or(u64::MAX))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove

Comment on lines 430 to 433
self.compute_status_for_input_txns()
    .iter()
    .filter(|status| if let TransactionStatus::Keep(_) = status { true } else { false })
    .count()
Contributor:

hmm.. why has this changed?

Contributor Author:

it was wrong, and it is used in tests.

it was passing because those tests had no block limit.

Contributor:

hmmm.. wouldn't this make the execution pipeline fail any time the block limit is hit??

Contributor Author:

this function (transactions_to_commit_len) is only used in tests.

transactions_to_commit is used throughout, and that one is correct.
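
For context, a standalone sketch of the corrected counting logic in idiomatic form; the enum here is a simplified stand-in, not the real aptos-core TransactionStatus:

```rust
// Count only the transactions whose status is Keep; everything else
// (e.g. Discard) is excluded. Simplified illustration.
enum TransactionStatus {
    Keep(&'static str),
    Discard(&'static str),
}

fn transactions_to_commit_len(statuses: &[TransactionStatus]) -> usize {
    statuses
        .iter()
        .filter(|status| matches!(status, TransactionStatus::Keep(_)))
        .count()
}
```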

@guy-goren left a comment:

Left some comments. Nothing serious or blocking as far as I'm concerned.
The main issue with the slowdown is that it's a local decision, which raises the free-riding concern. However, I do not think it is a severe risk -- just one to be aware of.

},
PipelineBackpressureValues {
-   back_pressure_pipeline_latency_limit_ms: 3500,
+   back_pressure_pipeline_latency_limit_ms: 6000,
    max_sending_block_txns_override: 1000,
    max_sending_block_bytes_override: MIN_BLOCK_BYTES_OVERRIDE,
    backpressure_proposal_delay_ms: 500,


In this block of PipelineBackpressureValues I don't understand why the gas is not limited. Wouldn't that affect the execution time better than # of txs?
Also, what is the logic behind the chosen numbers?

Contributor Author:

> Also, what is the logic behind the chosen numbers?

nothing more than a gradual reduction as values change. we could've also just had a formula instead of multiple discrete values; not a big deal either way

> In this block of PipelineBackpressureValues I don't understand why the gas is not limited. Wouldn't that affect the execution time better than # of txs?

the number of txns is known upfront (i.e. when the block is being created), while how the block gas limit affects the block is only known after execution. So if the proposer wants to create a smaller block for smaller messages, the # of txns needs to be adjusted.

Once we have the execution pool, and the latency overhead of a transaction being included in the block but hitting the limit is drastically reduced, the plan is to change most backpressures to reduce the block gas limit instead of the number of transactions (i.e. max_sending_block_txns_override will stay in txns, and max_txns_from_block_to_execute will be changed to max_block_gas_limit_override)

@@ -104,9 +106,28 @@ impl PipelinedBlock {
     mut self,
     input_transactions: Vec<SignedTransaction>,
     result: StateComputeResult,
+    execution_time: Duration,


I suppose a validator can write any value she wishes here. Can this value be verified to be true?
The validator might have an incentive to lie due to a "free-riding" argument. That is, let other validators slow down (and receive fewer rewards); their slowdown will also "solve the problem for me" while not slowing down my rewards.

Contributor Author:

this is a local value, from the validator's own computation. there is no need to lie here; a validator can just adjust its block creation computation in any way it wants.

that's why, long term, getting the incentives/protocol right is needed here for best results

    parent_block_id,
    block_executor_onchain_config,
)
.map(|output| (output, start.elapsed()))


Is this start.elapsed() value being propagated to other validators (is it in the block data)? And if so, does writing a different (incorrect) value affect the block's validity or the txs' validity?

Contributor Author:

no, this is a local value, only used by a validator to decide how big of a block to create.
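
To illustrate the locality being discussed, here is a minimal sketch of the timing pattern; the wrapper is hypothetical, the PR simply threads `start.elapsed()` into the block summary:

```rust
use std::time::{Duration, Instant};

// The measured duration never leaves this node: it only feeds the
// proposer's own block-size decision, so there is nothing for other
// validators to verify and nothing for consensus to validate.
fn execute_and_time<T>(execute: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let output = execute();
    (output, start.elapsed())
}
```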

@@ -17,18 +17,18 @@ use crate::{
     util::time_service::TimeService,
 };
 use anyhow::{bail, ensure, format_err, Context};
-use aptos_config::config::{ChainHealthBackoffValues, PipelineBackpressureValues};
+use aptos_config::config::{
+    ChainHealthBackoffValues, ExecutionBackpressureConfig, PipelineBackpressureValues,


What is the reason for the different naming convention? (Backoff != Backpressure and Values != Config)

Contributor Author:

Backoff vs Backpressure is an old artifact.

Config here generally means a set of configs (i.e. a struct with fields), and Values here means a vector of sorts, either of raw values or of config structs.

} else {
0.0
},
);


This seems to be a main place where the logic is applied. I would have benefited from some commenting here. But maybe it's just me and my (lack of) coding skill ;-)

}
}
}
EXECUTION_BACKPRESSURE_ON_PROPOSAL_TRIGGERED.observe(
Contributor:

do we need both this and the timestamp-based pipeline backpressure? aren't they achieving a similar goal?

Contributor Author:

this drastically reduces when pipeline backpressure kicks in. but:

  • this only considers execution. if commit or another stage becomes a bottleneck, we need to fall back to pipeline backpressure
  • this is only local. pipeline backpressure sees if building the commit cert is delayed, and kicks in even if the current node is processing things very fast

pub back_pressure_pipeline_latency_limit_ms: u64,
pub num_blocks_to_look_at: usize,
pub min_blocks_to_activate: usize,
pub percentile: f64,
Contributor:

nit: many of the configs could use some comments explaining the behavior.
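
For what it's worth, here is how those fields might be documented. The field names come from this PR; the comments are one reading of the behavior, and the integer types are guesses where the diff doesn't show them:

```rust
/// Hypothetical annotated version of the execution backpressure config.
pub struct ExecutionBackpressureConfig {
    /// How many of the most recent blocks to inspect.
    pub num_blocks_to_look_at: usize,
    /// Only blocks that took at least this long to execute contribute
    /// a candidate size to the computation.
    pub min_block_time_ms_to_activate: u64,
    /// Backpressure activates only if at least this many of the
    /// inspected blocks produced a candidate size.
    pub min_blocks_to_activate: usize,
    /// Which percentile of the candidate block sizes to pick (0.5 = p50).
    pub percentile: f64,
    /// Desired wall-clock execution time per block, in milliseconds.
    pub target_block_time_ms: u64,
}
```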

num_blocks_to_look_at: 12,
min_blocks_to_activate: 4,
percentile: 0.5,
target_block_time_ms: 300,
Contributor:

Should we reduce the target block time to 250 ms, given we can do ~3.5-4 blocks per second under load?

Contributor:

Can you address this? Apart from it, the PR looks good to me.

Contributor Author:

oops, thought I did

Contributor:

Does this still have to be changed?

block_executor_onchain_config,
)
let start = Instant::now();
executor
Contributor:

Hmm, wondering if it's better to measure both the execution and state checkpoint latency here, or just the execution latency?

block_execution_times
);

self.execution.as_ref().and_then(|config| {
Contributor:

This function could use a comment explaining the backoff behavior.

Contributor Author:

added comments

.iter()
.flat_map(|summary| {
    let execution_time_ms = summary.execution_time.as_millis();
    if execution_time_ms > config.min_block_time_ms_to_activate as u128 {
Contributor:

I am wondering if this is the optimal way to calculate the block size. Instead of this, how about calculating some sort of normalize time to execute per transaction in the entire window and use that to compute the block size?

Contributor Author:

updated the comment explaining why I am doing it this way

igor-aptos force-pushed the igor/execution_backpressure branch from 01404b6 to 651c559 on July 24, 2024 19:42
Comment on lines 16 to 17
const MAX_SENDING_BLOCK_UNIQUE_TXNS: u64 = 1900;
const MAX_SENDING_BLOCK_TXNS: u64 = 4500;
Contributor Author:

updating block size here

igor-aptos enabled auto-merge (squash) on July 24, 2024 20:13
igor-aptos force-pushed the igor/execution_backpressure branch from 651c559 to b6e6120 on July 24, 2024 20:24


igor-aptos force-pushed the igor/execution_backpressure branch from b6e6120 to 14d7a7b on July 24, 2024 22:40


Contributor:

✅ Forge suite compat success on 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> 14d7a7bec99b6643b1a036c1a3fdcd23d4260119

Compatibility test results for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> 14d7a7bec99b6643b1a036c1a3fdcd23d4260119 (PR)
1. Check liveness of validators at old version: 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5
compatibility::simple-validator-upgrade::liveness-check : committed: 7094.3538299060265 txn/s, latency: 4509.015389327526 ms, (p50: 3600 ms, p90: 5000 ms, p99: 30000 ms), latency samples: 293840
2. Upgrading first Validator to new version: 14d7a7bec99b6643b1a036c1a3fdcd23d4260119
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7055.582046739028 txn/s, latency: 3584.588829281543 ms, (p50: 3900 ms, p90: 4600 ms, p99: 4900 ms), latency samples: 147260
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6770.448883882847 txn/s, latency: 4402.684050118523 ms, (p50: 4000 ms, p90: 7300 ms, p99: 7900 ms), latency samples: 236240
3. Upgrading rest of first batch to new version: 14d7a7bec99b6643b1a036c1a3fdcd23d4260119
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 6337.082570203247 txn/s, latency: 4184.062984848485 ms, (p50: 4400 ms, p90: 5800 ms, p99: 6000 ms), latency samples: 132000
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6745.363646733425 txn/s, latency: 4542.086414141414 ms, (p50: 4200 ms, p90: 7100 ms, p99: 7600 ms), latency samples: 237600
4. upgrading second batch to new version: 14d7a7bec99b6643b1a036c1a3fdcd23d4260119
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 3885.600109795848 txn/s, latency: 6504.483040112596 ms, (p50: 6200 ms, p90: 10600 ms, p99: 10900 ms), latency samples: 85260
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 8810.864502409922 txn/s, latency: 4300.138118958667 ms, (p50: 3500 ms, p90: 7600 ms, p99: 11800 ms), latency samples: 337260
5. check swarm health
Compatibility test for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> 14d7a7bec99b6643b1a036c1a3fdcd23d4260119 passed
Test Ok

Contributor:

✅ Forge suite realistic_env_max_load success on 14d7a7bec99b6643b1a036c1a3fdcd23d4260119

two traffics test: inner traffic : committed: 9180.498385919194 txn/s, submitted: 9437.242207233749 txn/s, failed submission: 0.4628858077377725 txn/s, expired: 256.74382131455314 txn/s, latency: 2782.0563529897095 ms, (p50: 2700 ms, p90: 3300 ms, p99: 4800 ms), latency samples: 3490640
two traffics test : committed: 100.00119940587238 txn/s, latency: 2090.9995 ms, (p50: 2100 ms, p90: 2400 ms, p99: 2900 ms), latency samples: 2000
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.253, avg: 0.217", "QsPosToProposal: max: 0.956, avg: 0.390", "ConsensusProposalToOrdered: max: 0.330, avg: 0.297", "ConsensusOrderedToCommit: max: 0.391, avg: 0.375", "ConsensusProposalToCommit: max: 0.682, avg: 0.672"]
Max round gap was 1 [limit 4] at version 1908973. Max no progress secs was 5.641538 [limit 15] at version 1908973.
Test Ok

Contributor:

✅ Forge suite framework_upgrade success on 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> 14d7a7bec99b6643b1a036c1a3fdcd23d4260119

Compatibility test results for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> 14d7a7bec99b6643b1a036c1a3fdcd23d4260119 (PR)
Upgrade the nodes to version: 14d7a7bec99b6643b1a036c1a3fdcd23d4260119
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1076.4254986383735 txn/s, submitted: 1079.7193343135214 txn/s, failed submission: 3.2938356751480216 txn/s, expired: 3.2938356751480216 txn/s, latency: 2837.436148510812 ms, (p50: 2100 ms, p90: 5400 ms, p99: 9600 ms), latency samples: 98040
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1045.4683999432552 txn/s, submitted: 1048.1095832694277 txn/s, failed submission: 2.6411833261724342 txn/s, expired: 2.6411833261724342 txn/s, latency: 3087.9588315789474 ms, (p50: 2100 ms, p90: 6100 ms, p99: 10900 ms), latency samples: 95000
5. check swarm health
Compatibility test for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> 14d7a7bec99b6643b1a036c1a3fdcd23d4260119 passed
Upgrade the remaining nodes to version: 14d7a7bec99b6643b1a036c1a3fdcd23d4260119
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1156.3398722809857 txn/s, submitted: 1159.0330204061372 txn/s, failed submission: 2.693148125151565 txn/s, expired: 2.693148125151565 txn/s, latency: 2903.5106711835697 ms, (p50: 2400 ms, p90: 5100 ms, p99: 8100 ms), latency samples: 94460
Test Ok

igor-aptos merged commit 8e75c7c into main on Jul 24, 2024
47 checks passed
igor-aptos deleted the igor/execution_backpressure branch on July 24, 2024 23:10