Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

randomness #6: dkg manager update from randomnet #12226

Merged
merged 5 commits into from
Feb 29, 2024

Conversation

zjma
Copy link
Contributor

@zjma zjma commented Feb 25, 2024

Description

Test Plan

Copy link

trunk-io bot commented Feb 25, 2024

⏱️ 26h 42m total CI duration on this PR
Job Cumulative Duration Recent Runs
rust-unit-coverage 12h 🟥
rust-smoke-coverage 3h 16m 🟩
execution-performance / single-node-performance 2h 8m 🟥🟩🟥🟩🟩
windows-build 1h 52m 🟩🟩🟩🟩🟩
rust-unit-tests 1h 44m 🟩🟩🟩
rust-smoke-tests 1h 40m 🟩🟩🟩
rust-images / rust-all 49m 🟩🟩🟩
forge-e2e-test / forge 43m 🟩🟩🟩
forge-compat-test / forge 41m 🟩🟩🟩
rust-lints 28m 🟩🟩🟩
run-tests-main-branch 20m 🟥🟥🟥
cli-e2e-tests / run-cli-tests 17m 🟩🟩🟩
check 13m 🟩🟩🟩
check-dynamic-deps 10m 🟩🟩🟩🟩🟩
indexer-grpc-e2e-tests / test-indexer-grpc-docker-compose 7m 🟩🟩🟩
general-lints 6m 🟩🟩🟩
node-api-compatibility-tests / node-api-compatibility-tests 3m 🟩🟩🟩
semgrep/ci 2m 🟩🟩🟩🟩🟩
file_change_determinator 52s 🟩🟩🟩
file_change_determinator 43s 🟩🟩🟩
execution-performance / file_change_determinator 42s 🟩🟩🟩🟩
file_change_determinator 34s 🟩🟩🟩
permission-check 17s 🟩🟩🟩
permission-check 15s 🟩🟩🟩🟩🟩
permission-check 13s 🟩🟩🟩🟩🟩
permission-check 13s 🟩🟩🟩🟩🟩
permission-check 11s 🟩🟩🟩🟩🟩
determine-docker-build-metadata 5s 🟩🟩🟩

🚨 2 jobs on the last run were significantly faster/slower than expected

Job Duration vs 7d avg Delta
rust-unit-coverage 12h 4h 26m +170%
run-tests-main-branch 7m 4m +71%

settingsfeedbackdocs ⋅ learn more about trunk.io

@zjma zjma marked this pull request as ready for review February 25, 2024 12:03
@zjma zjma added the CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR label Feb 25, 2024
@zjma zjma changed the title dkg manager update from randomnet randomness #6: dkg manager update from randomnet Feb 25, 2024

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

},
close_req = close_rx.select_next_some() => {
self.process_close_cmd(close_req.ok())
}
},
_ = interval.tick().fuse() => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is the best practice for logging.. @zekun000 thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's okay but this will print out the whole transcript bytes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i see there's a Debug impl for the transcript

);
self.setup_deal_broadcast(start_time_us, &metadata)
.await
.expect("[DKG] setup_deal_broadcast() should be infallible");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe alert instead of expect?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only error is if inner state is not NotStarted?

@@ -27,15 +29,26 @@ impl<S: DKGTrait> Default for TranscriptAggregator<S> {
}

pub struct TranscriptAggregationState<DKG: DKGTrait> {
start_time: Duration,
my_addr: AccountAddress,
valid_peer_transcript_seen: bool,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is never set to be true?

@zjma zjma force-pushed the zjma/consensus_update_from_randomnet branch from 911292f to 0db1642 Compare February 28, 2024 04:15
@zjma zjma force-pushed the zjma/consensus_update_from_randomnet branch from 166f1b6 to 2392d2a Compare February 28, 2024 14:31
@zjma zjma force-pushed the zjma/dkg_manager_update_from_randomnet branch from a205b1e to e675e73 Compare February 28, 2024 14:38

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Base automatically changed from zjma/consensus_update_from_randomnet to main February 28, 2024 22:34
);
self.setup_deal_broadcast(start_time_us, &metadata)
.await
.expect("[DKG] setup_deal_broadcast() should be infallible");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only error is if inner state is not NotStarted?

},
close_req = close_rx.select_next_some() => {
self.process_close_cmd(close_req.ok())
}
},
_ = interval.tick().fuse() => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's okay but this will print out the whole transcript bytes?

},
close_req = close_rx.select_next_some() => {
self.process_close_cmd(close_req.ok())
}
},
_ = interval.tick().fuse() => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i see there's a Debug impl for the transcript

"[DKG] txn executed and entering new epoch.",
);

drop(vtxn_guard);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for explicit drop?

@zjma zjma enabled auto-merge (squash) February 29, 2024 02:01

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

✅ Forge suite compat success on aptos-node-v1.9.5 ==> 879a77039da081fcaee3bfb268ce650c0e5dc5d8

Compatibility test results for aptos-node-v1.9.5 ==> 879a77039da081fcaee3bfb268ce650c0e5dc5d8 (PR)
1. Check liveness of validators at old version: aptos-node-v1.9.5
compatibility::simple-validator-upgrade::liveness-check : committed: 5265 txn/s, latency: 5005 ms, (p50: 4800 ms, p90: 8400 ms, p99: 11900 ms), latency samples: 231700
2. Upgrading first Validator to new version: 879a77039da081fcaee3bfb268ce650c0e5dc5d8
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 931 txn/s, latency: 30428 ms, (p50: 35400 ms, p90: 44800 ms, p99: 47000 ms), latency samples: 58700
3. Upgrading rest of first batch to new version: 879a77039da081fcaee3bfb268ce650c0e5dc5d8
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 422 txn/s, submitted: 784 txn/s, expired: 361 txn/s, latency: 26131 ms, (p50: 13100 ms, p90: 57200 ms, p99: 57600 ms), latency samples: 29604
4. upgrading second batch to new version: 879a77039da081fcaee3bfb268ce650c0e5dc5d8
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 2021 txn/s, latency: 15163 ms, (p50: 16500 ms, p90: 18700 ms, p99: 20400 ms), latency samples: 88940
5. check swarm health
Compatibility test for aptos-node-v1.9.5 ==> 879a77039da081fcaee3bfb268ce650c0e5dc5d8 passed
Test Ok

Copy link
Contributor

✅ Forge suite realistic_env_max_load success on 879a77039da081fcaee3bfb268ce650c0e5dc5d8

two traffics test: inner traffic : committed: 7473 txn/s, latency: 5238 ms, (p50: 5100 ms, p90: 6000 ms, p99: 11700 ms), latency samples: 3236180
two traffics test : committed: 100 txn/s, latency: 1883 ms, (p50: 1800 ms, p90: 2000 ms, p99: 7000 ms), latency samples: 1800
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.241, avg: 0.204", "QsPosToProposal: max: 0.314, avg: 0.280", "ConsensusProposalToOrdered: max: 0.508, avg: 0.456", "ConsensusOrderedToCommit: max: 0.329, avg: 0.304", "ConsensusProposalToCommit: max: 0.783, avg: 0.760"]
Max round gap was 1 [limit 4] at version 1392074. Max no progress secs was 3.8778481 [limit 15] at version 1392074.
Test Ok

@zjma zjma merged commit cd35002 into main Feb 29, 2024
59 of 81 checks passed
@zjma zjma deleted the zjma/dkg_manager_update_from_randomnet branch February 29, 2024 02:34
zjma added a commit that referenced this pull request Mar 1, 2024
* consensus update from randomnet

* update

* update execution client api

* dkg manager update from randomnet

* avoid panic
zjma added a commit that referenced this pull request Mar 2, 2024
* consensus update from randomnet

* update

* update execution client api

* dkg manager update from randomnet

* avoid panic
zjma added a commit that referenced this pull request Mar 3, 2024
* randomness type update 3 (#12202)

* randomness #4: RandManager update from randomnet (#12224)

* RandManager update from randomnet

* lint

* lint

* randomness #5: consensus update from randomnet (#12225)

* consensus update from randomnet

* update

* randomness #6: dkg manager update from randomnet (#12226)

* consensus update from randomnet

* update

* update execution client api

* dkg manager update from randomnet

* avoid panic

* make api, indexer, fake aptos db aware of block metadata ext txns (#12227)

* randomness #8: framework update from randomnet (#12228)

* framework update from randomnet

Squashed commit of the following to fix jwk smoke tests:

commit 3bd0154
Author: zhoujun.ma <[email protected]>
Date:   Tue Feb 27 02:47:57 2024 -0800

    update

commit 2eb6add
Author: zhoujun.ma <[email protected]>
Date:   Tue Feb 27 02:12:27 2024 -0800

    update

commit 9d82151
Author: zhoujun.ma <[email protected]>
Date:   Tue Feb 27 01:51:08 2024 -0800

    debug

fix doc test

fix spec

fix doc

update initialization in genesis

update features.move

initialize randomness in genesis

update golden files

private entry fun check and vm updates

* postpone release builder changes

* update goldenfiles

* fix is_safe_call spec

* randomness #9: smoke tests from randomnet (#12282)

* smoke test deps and 1st case from randomnet

* update

* more smoke tests

* randomness #10: randomness API update from randomnet (#12335)

* [move] fixes to `randomness.move` (#12250)

* [move] fixes to randomness.move

* Fixed the Prover spec

Fixed the spec to unblock the PR.

Need to prove the introduced assumptions with proper loop invariants, which should be provable.

* lint

---------

Co-authored-by: Junkil Park <[email protected]>
Co-authored-by: danielxiangzl <[email protected]>

* fix specs

---------

Co-authored-by: Alin Tomescu <[email protected]>
Co-authored-by: Junkil Park <[email protected]>
Co-authored-by: danielxiangzl <[email protected]>

* lint

* update genesis

* on-chain resources to indicate dkg/randomness failure injection (#12345)

* dkg/randomness failure injection

* update

* smoke test

* update

* update

* fix scripts

---------

Co-authored-by: Alin Tomescu <[email protected]>
Co-authored-by: Junkil Park <[email protected]>
Co-authored-by: danielxiangzl <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants