[release-builder] local simulation of governance proposals #13949

vgao1996 · 2024-07-09T11:44:45Z

Description

This introduces a new release builder command that enables the simulation of governance proposals. Currently only multi-step proposals are supported.

It uses the remote debugger infrastructure to fetch real chain state for local simulation, and adds an in-memory database on top to store the new side effects generated by the governance scripts.
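
The layering described above can be pictured with a minimal sketch. This is an illustration only, not the code in simulate.rs: the `BaseState` trait, the `OverlayState` type, and the string keys are all hypothetical stand-ins, with a plain `HashMap` playing the role of the remote chain state.

```rust
use std::collections::HashMap;

/// Read-only source of chain state. Hypothetical trait for illustration;
/// in the PR this role is played by the remote debugger infrastructure.
pub trait BaseState {
    fn read(&self, key: &str) -> Option<Vec<u8>>;
}

/// A fixed map standing in for the remote chain state.
impl BaseState for HashMap<String, Vec<u8>> {
    fn read(&self, key: &str) -> Option<Vec<u8>> {
        self.get(key).cloned()
    }
}

/// Writes land in `delta`; reads consult `delta` first and fall back to `base`.
pub struct OverlayState<B: BaseState> {
    base: B,
    /// `Some` records a creation or modification, `None` records a deletion.
    delta: HashMap<String, Option<Vec<u8>>>,
}

impl<B: BaseState> OverlayState<B> {
    pub fn new(base: B) -> Self {
        Self { base, delta: HashMap::new() }
    }

    /// Returns the locally written value if there is one, else the base value.
    pub fn read(&self, key: &str) -> Option<Vec<u8>> {
        match self.delta.get(key) {
            Some(local) => local.clone(),
            None => self.base.read(key),
        }
    }

    /// Records a side effect without ever touching the base state.
    pub fn write(&mut self, key: &str, value: Option<Vec<u8>>) {
        self.delta.insert(key.to_string(), value);
    }
}
```

Because deletions are stored as `None` in the overlay, a key removed by one script stays removed for later scripts even though the underlying chain state still has it.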

Normally, governance scripts need to be approved through on-chain governance before they can be executed. This process involves setting up various states (e.g., staking pool, delegated voter), which can be quite complex.

This simulation bypasses these challenges by patching specific Move functions with mock versions, most notably fun resolve_multi_step_proposal, thus allowing the governance process to be skipped altogether. In other words, this simulation is intended for checking whether a governance proposal will execute successfully, assuming it gets approved.

How to run simulation

First generate the proposal

cargo run -p aptos-release-builder generate-proposals --release-config data/release.yaml --output-dir output

Then run simulation via the following command

cargo run -p aptos-release-builder simulate-multi-step-proposal --network mainnet --proposal-dir output/sources/v1.14/step_1_upgrade_framework/

Here's what the output should look like

Found 2 scripts
    output/sources/v1.14/step_1_upgrade_framework/0-gas-schedule.move
    output/sources/v1.14/step_1_upgrade_framework/1-features.move
Compiling scripts...
Compiling, may take a little while to download git dependencies...
INCLUDING DEPENDENCY AptosFramework
INCLUDING DEPENDENCY AptosStdlib
INCLUDING DEPENDENCY MoveStdlib
BUILDING script
Compiling, may take a little while to download git dependencies...
INCLUDING DEPENDENCY AptosFramework
INCLUDING DEPENDENCY AptosStdlib
INCLUDING DEPENDENCY MoveStdlib
BUILDING script
Patching framework functions to bypass governance.. done
Creating and funding sender account.. done
Executing governance scripts...
    0-gas-schedule.move
        Keep(
            Success,
        )
    1-features.move
        Keep(
            Success,
        )
All scripts succeeded!

Type of Change

Which Components or Systems Does This Change Impact?

Key Areas to Review

simulate.rs

Checklist

I have read and followed the CONTRIBUTING doc
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I identified and added all stakeholders and component owners affected by this change as reviewers
I tested both happy and unhappy path of the functionality
I have made corresponding changes to the documentation

trunk-io · 2024-07-09T11:44:49Z

⏱️ 2h 21m total CI duration on this PR

Job	Cumulative Duration	Recent Runs
test-fuzzers	1h 51m	🟩 🟩 🟩
rust-move-tests	6m	🟩
general-lints	6m	🟩 🟩 🟩
rust-move-tests	5m	🟩
check-dynamic-deps	4m	🟩 🟩 🟩
rust-cargo-deny	3m	🟩 🟩
rust-move-tests	2m	🟩
semgrep/ci	1m	🟩 🟩 🟩
file_change_determinator	36s	🟩 🟩 🟩
file_change_determinator	30s	🟩 🟩 🟩
permission-check	12s	🟩 🟩 🟩
permission-check	9s	🟩 🟩 🟩
permission-check	9s	🟩 🟩 🟩
permission-check	8s	🟩 🟩 🟩


aptos-move/aptos-release-builder/src/main.rs

aptos-move/aptos-release-builder/src/simulate.rs

georgemitenkov · 2024-07-12T13:18:26Z

aptos-move/aptos-release-builder/src/simulate.rs

+    let txn_gas_params = &mut gas_params.vm.txn;
+    // Use the alternative limits for governance proposals
+    // TODO: In the future, consider adding the execution hashes of the scripts to the approval list.
+    txn_gas_params.max_execution_gas = txn_gas_params.max_execution_gas_gov;


Doesn't your recent change to automatically bump these numbers to _gov if it is a governance proposal work here?

This is one of the more ugly parts of the current implementation.

Doesn't your recent change to automatically bump these numbers to _gov if it is a governance proposal work here?

Yes, but the alt limits will only kick in if the script has its hash added to the list of approved execution hashes, which gets skipped by the mock version of resolve_multi_step_proposal as well.

It's possible for us to manually add it in Rust and I tried it, but there are some complexities involved.
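
To make the dependency concrete, here is a rough sketch of the selection logic as described in this thread. The field names `max_execution_gas` and `max_execution_gas_gov` come from the diff above; the `TxnGasLimits` struct, the `effective_max_execution_gas` function, and the use of a `HashSet` for the approved list are assumptions made for illustration and do not mirror the VM's actual types.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for the transaction gas parameters.
pub struct TxnGasLimits {
    pub max_execution_gas: u64,
    pub max_execution_gas_gov: u64,
}

/// Picks the governance limit only when the script hash has been approved.
/// Since the mocked resolve function skips adding the hash, a simulated
/// script would fall through to the regular limit unless it is overridden.
pub fn effective_max_execution_gas(
    limits: &TxnGasLimits,
    approved_hashes: &HashSet<Vec<u8>>,
    script_hash: &[u8],
) -> u64 {
    if approved_hashes.contains(script_hash) {
        limits.max_execution_gas_gov
    } else {
        limits.max_execution_gas
    }
}
```

Under this model, overwriting `max_execution_gas` with the `_gov` value, as the diff does, makes both branches return the same number, which is why it works even though the approved list is empty.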

aptos-move/aptos-release-builder/src/simulate.rs

vgao1996 · 2024-07-17T01:36:45Z

Made a major update to the PR

@georgemitenkov I addressed all your comments and added the should_restart_execution check you requested.
@runtian-zhou I implemented the check you requested, ensuring that the last script cannot have a next execution hash.
Also fixed two bugs
- The warm vm cache is now flushed every time we execute a script, so that we always load the latest framework code cc @msmouse
- The patching of the framework functions is also done every time we execute a script -- this is needed in case the framework has been overwritten by a previous script.

georgemitenkov · 2024-07-17T12:18:04Z

aptos-move/aptos-release-builder/src/simulate.rs

+            &log_context,
+        );
+        // We require all governance scripts to trigger reconfiguration so check it here.
+        if AptosVM::should_restart_execution(vm_output.events()) {


Should it be the negation of this?

Good catch.

However, it looks like this would not be an easy fix. With DKG, reconfiguration is started by the script but may not actually happen until the next epoch.

I guess I'll have to remove this check for now, and we'll need to discuss what the proper solution might be. I'm thinking that maybe reconfiguration_with_dkg::try_start should emit its own event indicating this.

georgemitenkov · 2024-07-17T12:20:00Z

aptos-move/aptos-release-builder/src/simulate.rs

+        let script_name = script_path.file_name().unwrap().to_string_lossy();
+        println!("    {}", script_name);
+
+        // Create a new VM to ensure the loader is clean.


Have you checked this? Because I think warm vm cache does load PackageMetadata for core packages to see if it has changed... or this is the reason, we patch code and package metadata is the same?

Yes, I spent a few hours debugging this yesterday.

or this is the reason, we patch code and package metadata is the same?

Exactly, the patching done here does not change the metadata, so the warm vm cache fails to see it needs to reload everything.
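
The failure mode can be reproduced with a toy model. The sketch below is not the real warm VM cache: `Package`, `WarmCache`, and their methods are hypothetical names, and the only property carried over from the discussion is that staleness is decided by comparing metadata.

```rust
/// Hypothetical package snapshot: metadata plus module bytecode.
#[derive(Clone)]
pub struct Package {
    pub metadata: Vec<u8>,
    pub code: Vec<u8>,
}

/// Toy cache that decides whether to reload by comparing metadata only.
pub struct WarmCache {
    cached: Option<Package>,
}

impl WarmCache {
    pub fn new() -> Self {
        Self { cached: None }
    }

    /// Returns the code the VM would run. Reloads only if metadata changed,
    /// so a patch that leaves metadata untouched is not noticed.
    pub fn load(&mut self, on_chain: &Package) -> Vec<u8> {
        let stale = match &self.cached {
            Some(cached) => cached.metadata != on_chain.metadata,
            None => true,
        };
        if stale {
            self.cached = Some(on_chain.clone());
        }
        self.cached
            .as_ref()
            .expect("cache was populated above")
            .code
            .clone()
    }

    /// Drops everything, forcing the next load to read fresh code.
    pub fn flush(&mut self) {
        self.cached = None;
    }
}
```

Loading a package, patching its code while keeping the metadata, and loading again returns the old code; only after `flush` does the patched code become visible, which matches the fix of flushing before every script.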

perryjrandall

This is fucking magical, a really important tool for release verification <3

aptos-move/aptos-release-builder/data/release.yaml

perryjrandall · 2024-07-17T19:54:59Z

aptos-move/aptos-release-builder/src/main.rs

+        ///
+        /// Possible values: devnet, testnet, mainnet, <url to rest endpoint>
+        #[clap(long)]
+        network: NetworkSelection,


nice! validate proposal also has an "endpoint" argument, it would be great if we could replace its usage of URL with network so we don't have to specify the testnet / mainnet / devnet url all the time there either

aptos-move/aptos-release-builder/src/main.rs

aptos-move/aptos-release-builder/src/simulate.rs

runtian-zhou

I think the overall logic makes sense to me. The patching logic looked ugly but I don't see a way round for now. We still need a separate PR in the aptos-network to use this command btw.

aptos-move/aptos-release-builder/src/main.rs

aptos-move/aptos-release-builder/src/simulate.rs

runtian-zhou · 2024-07-17T22:08:39Z

aptos-move/aptos-release-builder/src/simulate.rs

+        state_view.apply_write_set(write_set);
+    }
+
+    println!("All scripts succeeded!");


We need to check there's no pending script hash there. Can you add the check or at least add a todo here? I think it would be important for our release safety.

I think I've implemented this check, no?

Oh nvm. I thought you didn't.

vgao1996 · 2024-07-17T22:15:36Z

aptos-move/aptos-release-builder/src/simulate.rs

+            if forbid_next_execution_hash {
+                // If it is needed to forbid a next execution hash, inject additional Move
+                // code at the beginning that aborts with a magic number if the vector
+                // representing the hash is not empty.
+                //
+                //     if (!vector::is_empty(&next_execution_hash)) {
+                //         abort MAGIC_FAILED_NEXT_EXECUTION_HASH_CHECK;
+                //     }
+                //
+                // The magic number can later be checked in Rust to determine if such violation
+                // has happened.
+                code.code.extend([
+                    ImmBorrowLoc(2),
+                    VecLen(sig_u8_idx),
+                    LdU64(0),
+                    Eq,
+                    BrTrue(7),
+                    LdU64(MAGIC_FAILED_NEXT_EXECUTION_HASH_CHECK),
+                    Abort,
+                ]);
+            }


@runtian-zhou here I check whether the next execution hash is non-empty, and if so, abort with a magic number
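
For readers less familiar with Move bytecode, the injected instructions behave like the Rust function below. This is a paraphrase for illustration: the function name and the `Result`-based signature are made up, and the constant's value is a placeholder because the real value is not shown in this thread.

```rust
/// Placeholder value; the actual constant in the PR is not shown here.
pub const MAGIC_FAILED_NEXT_EXECUTION_HASH_CHECK: u64 = 0xDEAD_BEEF;

/// Mirrors the injected prologue: when the check is enabled and the next
/// execution hash is non-empty, fail with the magic abort code.
pub fn check_next_execution_hash(
    next_execution_hash: &[u8],
    forbid_next_execution_hash: bool,
) -> Result<(), u64> {
    if forbid_next_execution_hash && !next_execution_hash.is_empty() {
        // Corresponds to LdU64(MAGIC_...) followed by Abort.
        return Err(MAGIC_FAILED_NEXT_EXECUTION_HASH_CHECK);
    }
    // Corresponds to BrTrue(7) jumping past the abort when the length is 0.
    Ok(())
}
```

The caller can then match on the returned code to distinguish this specific violation from any other abort raised by the script.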

vgao1996 · 2024-07-18T20:58:54Z

@perryjrandall I've made some updates to the PR

- Fixed the bug that caused the gas schedule hash to change, by adding script hashes to the approved list properly.
- Added .context(..) and .with_context(..) in many places to improve error reporting
- Added a helper to modify on-chain config

Additionally I've also created this issue #14044 for tracking follow-up items that I don't plan to address immediately.

aptos-move/aptos-release-builder/src/main.rs

vgao1996 · 2024-07-23T18:05:54Z

Updated the PR again.

@perryjrandall I've renamed the command to "simulate" as you requested. Now it also searches the whole directory recursively, so you can pass in the output dir or any of its sub directories containing the proposals.
@georgemitenkov I had to change the way I inject create_signer cuz my previous implementation broke module compatibility. This results in an extra flag being passed into the VM, which is a bit ugly, but I guess this is something we can live with and refactor later.

Given that the PR is rather polished right now I'll proceed to landing. If there are additional feature requests I'll address them separately.
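
As a rough idea of what a recursive search for proposal scripts can look like, here is a standard-library-only sketch. It is an assumption-laden illustration rather than the PR's implementation: the function name `find_move_scripts` is hypothetical, and it simply collects every file with a `.move` extension under a directory.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Walks `dir` and all of its subdirectories, returning every `.move` file
/// found, sorted by path so the result is deterministic.
pub fn find_move_scripts(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    // Explicit stack instead of recursion, so deep trees cannot overflow.
    let mut pending = vec![dir.to_path_buf()];

    while let Some(current) = pending.pop() {
        for entry in fs::read_dir(&current)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if path.extension().map_or(false, |ext| ext == "move") {
                found.push(path);
            }
        }
    }

    found.sort();
    Ok(found)
}
```

With a helper like this, passing either the top-level output directory or one of its step subdirectories yields the scripts beneath it.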

github-actions · 2024-07-23T19:04:08Z

✅ Forge suite `realistic_env_max_load` success on `a57745955e2a99457654d0214db1e840448ed85e`

two traffics test: inner traffic : committed: 9312.372942499891 txn/s, latency: 4276.053475807454 ms, (p50: 4200 ms, p90: 4700 ms, p99: 10500 ms), latency samples: 3540760
two traffics test : committed: 99.99821397799764 txn/s, latency: 2277.031111111111 ms, (p50: 2100 ms, p90: 2400 ms, p99: 8900 ms), latency samples: 1800
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.242, avg: 0.224", "QsPosToProposal: max: 1.865, avg: 1.823", "ConsensusProposalToOrdered: max: 0.316, avg: 0.294", "ConsensusOrderedToCommit: max: 0.421, avg: 0.406", "ConsensusProposalToCommit: max: 0.712, avg: 0.699"]
Max round gap was 1 [limit 4] at version 1938233. Max no progress secs was 5.740667 [limit 15] at version 1938233.
Test Ok

github-actions · 2024-07-23T19:05:18Z

✅ Forge suite `framework_upgrade` success on `1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5` ==> `a57745955e2a99457654d0214db1e840448ed85e`

Compatibility test results for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> a57745955e2a99457654d0214db1e840448ed85e (PR)
Upgrade the nodes to version: a57745955e2a99457654d0214db1e840448ed85e
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1337.7657697781233 txn/s, submitted: 1340.6729029250514 txn/s, failed submission: 2.9071331469281922 txn/s, expired: 2.9071331469281922 txn/s, latency: 2641.978069540022 ms, (p50: 2100 ms, p90: 4800 ms, p99: 10200 ms), latency samples: 110440
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1068.9917850611882 txn/s, submitted: 1071.180547929864 txn/s, failed submission: 2.1887628686756515 txn/s, expired: 2.1887628686756515 txn/s, latency: 2809.9613124488124 ms, (p50: 2100 ms, p90: 5200 ms, p99: 10500 ms), latency samples: 97680
5. check swarm health
Compatibility test for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> a57745955e2a99457654d0214db1e840448ed85e passed
Upgrade the remaining nodes to version: a57745955e2a99457654d0214db1e840448ed85e
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1055.6579369475385 txn/s, submitted: 1057.8521958062945 txn/s, failed submission: 2.1942588587560556 txn/s, expired: 2.1942588587560556 txn/s, latency: 2995.9792766576597 ms, (p50: 2200 ms, p90: 5400 ms, p99: 10500 ms), latency samples: 96220
Test Ok

github-actions · 2024-07-23T19:05:23Z

✅ Forge suite `compat` success on `1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5` ==> `a57745955e2a99457654d0214db1e840448ed85e`

Compatibility test results for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> a57745955e2a99457654d0214db1e840448ed85e (PR)
1. Check liveness of validators at old version: 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5
compatibility::simple-validator-upgrade::liveness-check : committed: 7526.7189752704135 txn/s, latency: 3789.1083689890797 ms, (p50: 2800 ms, p90: 4800 ms, p99: 27200 ms), latency samples: 313180
2. Upgrading first Validator to new version: a57745955e2a99457654d0214db1e840448ed85e
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7427.550906292737 txn/s, latency: 3572.8080405598403 ms, (p50: 4000 ms, p90: 4200 ms, p99: 4400 ms), latency samples: 140040
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6434.249928332043 txn/s, latency: 4715.601646791596 ms, (p50: 4600 ms, p90: 5400 ms, p99: 8400 ms), latency samples: 246540
3. Upgrading rest of first batch to new version: a57745955e2a99457654d0214db1e840448ed85e
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 7196.484471921162 txn/s, latency: 3660.051948614319 ms, (p50: 4100 ms, p90: 4400 ms, p99: 4600 ms), latency samples: 138560
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6797.026845312571 txn/s, latency: 4693.143285443909 ms, (p50: 4800 ms, p90: 5400 ms, p99: 6100 ms), latency samples: 232480
4. upgrading second batch to new version: a57745955e2a99457654d0214db1e840448ed85e
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 2361.955587406585 txn/s, latency: 10972.883252517717 ms, (p50: 13700 ms, p90: 17500 ms, p99: 18300 ms), latency samples: 53620
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 9257.193034621272 txn/s, latency: 3469.895835780571 ms, (p50: 3000 ms, p90: 6900 ms, p99: 9300 ms), latency samples: 340520
5. check swarm health
Compatibility test for 1c2ee7082d6eff8c811ee25d6f5a7d00860a75d5 ==> a57745955e2a99457654d0214db1e840448ed85e passed
Test Ok

vgao1996 requested review from perryjrandall, junkil-park and runtian-zhou July 9, 2024 11:44

vgao1996 requested review from gregnazario and banool as code owners July 9, 2024 11:44

vgao1996 requested a review from davidiw July 9, 2024 16:48

vgao1996 force-pushed the gov-sim branch 2 times, most recently from a62dd2d to ffe4265 Compare July 11, 2024 23:52

georgemitenkov reviewed Jul 12, 2024

vgao1996 force-pushed the gov-sim branch from ffe4265 to fc65ac9 Compare July 17, 2024 01:32

vgao1996 requested review from wrwg and zekun000 as code owners July 17, 2024 01:32

vgao1996 requested a review from msmouse July 17, 2024 01:40

vgao1996 force-pushed the gov-sim branch from fc65ac9 to b09f03d Compare July 17, 2024 02:03

georgemitenkov reviewed Jul 17, 2024

vgao1996 force-pushed the gov-sim branch 3 times, most recently from 030dd06 to 8f3c907 Compare July 17, 2024 19:10

perryjrandall approved these changes Jul 17, 2024

runtian-zhou reviewed Jul 17, 2024

runtian-zhou approved these changes Jul 17, 2024

vgao1996 commented Jul 17, 2024

vgao1996 force-pushed the gov-sim branch from 8f3c907 to 52553af Compare July 18, 2024 20:36

vgao1996 force-pushed the gov-sim branch from 52553af to 528fcc7 Compare July 19, 2024 01:45

perryjrandall reviewed Jul 20, 2024

perryjrandall reviewed Jul 20, 2024

vgao1996 force-pushed the gov-sim branch from 528fcc7 to b7ae374 Compare July 20, 2024 10:04

vgao1996 requested a review from movekevin as a code owner July 20, 2024 10:04

vgao1996 force-pushed the gov-sim branch from b7ae374 to bda9f8a Compare July 23, 2024 17:56

vgao1996 enabled auto-merge (squash) July 23, 2024 18:06

[release-builder] local simulation of governance proposals

a577459

vgao1996 force-pushed the gov-sim branch from bda9f8a to a577459 Compare July 23, 2024 18:35

vgao1996 merged commit 9301d80 into aptos-labs:main Jul 23, 2024
48 checks passed
