[Executor] Merge sequential & parallel execution flow #4683

Merged: 2 commits, Nov 29, 2022

Conversation

@gelash (Contributor) commented Sep 30, 2022

De-spaghettify the aptos-vm execution flow.

  • Remove the unused status (it was only used in harness tests; fix those tests to match production behavior).
  • Merge the sequential executor into the parallel executor crate, re-using code while keeping the sequential algorithm for redundancy, testing, and fallback (Block-STM could also run sequentially with one thread, probably without much overhead).

Benefits:

  • Cleaner flow.
  • Can hook into the same executor regardless of whether we are executing sequentially or in parallel (a sketch follows this list).
  • Can use the powerful testing framework we have for the parallel executor for sequential execution as well.
  • Can re-use the executor's rayon threadpool for block signature checks.
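
To make the merged flow concrete, here is a minimal sketch of the dispatch it enables; apart from `concurrency_level` (which comes up in review below), all names and types are illustrative stand-ins:

```rust
// Sketch only: `concurrency_level` matches the field discussed later in this
// thread; the remaining names and types are illustrative stand-ins.
struct Txn;
struct Output;
#[derive(Debug)]
struct VmError;

struct BlockExecutor {
    concurrency_level: usize,
}

impl BlockExecutor {
    // Single entry point: Block-STM when more than one thread is configured,
    // otherwise the sequential fallback that reuses the same executor code.
    fn execute_block(&self, txns: Vec<Txn>) -> Result<Vec<Output>, VmError> {
        if self.concurrency_level > 1 {
            self.execute_parallel(txns)
        } else {
            self.execute_sequential(txns)
        }
    }

    fn execute_parallel(&self, _txns: Vec<Txn>) -> Result<Vec<Output>, VmError> {
        unimplemented!("Block-STM across self.concurrency_level worker threads")
    }

    fn execute_sequential(&self, _txns: Vec<Txn>) -> Result<Vec<Output>, VmError> {
        unimplemented!("one-by-one execution kept for redundancy and fallback")
    }
}
```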

EDIT: These are coming in a different diff.
Additional improvements to tests:
- Test over/underflows based on storage values (close to 0 or close to u128::MAX).
- Test different numbers of cores.
- Refactors: ModulePath enum, isolate BaselineState for generating baselines with different configurations (e.g. aggregator values materialized or not), check delta sequences, etc.
- Make sure StorageError takes precedence over DeltaApplicationFailure in speculative executions when an aggregator is deleted but deltas on top also fail (a sketch follows below). Not relevant for the current use-case (no under/overflows and no deletes), and the proper algorithm for handling the general case may not have this issue anyway.
After the testing PR, @perryjrandall, we should start running on different platforms too.
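
A minimal sketch of the precedence rule from the list above; the enum and function here are illustrative, not the crate's actual API:

```rust
// Hedged sketch of the intended precedence; types and names are illustrative.
#[derive(Debug, PartialEq)]
enum SpeculativeError {
    StorageError,            // base value missing, e.g. the aggregator was deleted
    DeltaApplicationFailure, // delta under/overflowed the u128 bounds
}

fn apply_deltas(base: Option<u128>, deltas: &[i128]) -> Result<u128, SpeculativeError> {
    // StorageError must win: if the aggregator was deleted, report that even
    // when applying the deltas on top would also have failed.
    let mut value = base.ok_or(SpeculativeError::StorageError)?;
    for d in deltas {
        value = if *d >= 0 {
            value.checked_add(*d as u128)
        } else {
            value.checked_sub(d.unsigned_abs())
        }
        .ok_or(SpeculativeError::DeltaApplicationFailure)?;
    }
    Ok(value)
}
```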

@dariorussi (Contributor) commented Oct 1, 2022

Just a quick comment as I am reviewing this; it will be slow for me, as it touches a lot of core stuff I have not seen before. Though it is a wonderful exercise, thanks!
I am not sure when we cut the branch (if we have not already) or what all the mainnet implications are, but as far as I can tell this touches a bunch of core stuff, and we definitely want it in after we are done with mainnet and all of that, right?
Can you share your thoughts?

Also, please review the errors, which look legit.

@gelash (Contributor, Author) commented Oct 4, 2022

@dariorussi - yes, if we like it, we should roll this in after mainnet. There should be nothing breaking: more tests, a common flow, and it should facilitate lots of future improvements (even just by virtue of having a single executor object for callbacks and the like).

Will definitely fix all the linter and related errors. And (TODO to self): will also experiment with DashSet/DashMap configuration settings (mainly the number of shards); we currently use the defaults, and maybe there is some performance to squeeze out.
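
For context on that TODO, a hedged sketch of the kind of knob meant here, assuming the dashmap crate's `with_shard_amount` constructor (the key/value types and the sizing heuristic are made up):

```rust
use dashmap::DashMap;

// Illustrative tuning only: dashmap derives its default shard count from
// available parallelism; with_shard_amount overrides it and panics unless
// given a power of two.
fn tuned_versioned_map(num_threads: usize) -> DashMap<u64, u64> {
    let shards = (num_threads * 4).next_power_of_two();
    DashMap::with_shard_amount(shards)
}
```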

@gelash requested a review from danielxiangzl on October 27, 2022 00:33
@gelash force-pushed the seqinpar branch 2 times, most recently from e7ebf45 to 1511c1e on November 7, 2022 01:46
@zekun000 (Contributor) left a comment:

Early comments; need to do another pass.

Review threads (outdated, resolved):
  • aptos-move/e2e-tests/src/executor.rs
  • aptos-move/aptos-vm/src/parallel_executor/vm_wrapper.rs
  • aptos-move/aptos-vm/src/data_cache.rs
  • aptos-move/aptos-vm/src/parallel_executor/mod.rs
@gelash requested a review from movekevin as a code owner on November 17, 2022 00:55
@gelash force-pushed the seqinpar branch 2 times, most recently from f597fd3 to c53ccd5 on November 17, 2022 07:20
@dariorussi (Contributor) left a comment:

I think you are splitting this up into multiple PRs, which is a very good idea, so I love it; maybe you want to mark it somehow.
Just commenting since I have a small comment I think you may enjoy.

Review thread (outdated, resolved): aptos-move/aptos-vm/src/parallel_executor/mod.rs
@gelash (Contributor, Author) commented Nov 18, 2022

> I think you are splitting this up into multiple PRs, which is a very good idea, so I love it; maybe you want to mark it somehow. Just commenting since I have a small comment I think you may enjoy.

Will do ASAP. To document here, the plan is to do proptest changes in a separate diff.

I will make sure to stack the proptest PR on top though and not land the first PR without ensuring the second one (with new and stronger tests) passes.

@gelash force-pushed the seqinpar branch 2 times, most recently from dc6df84 to e3cddb9 on November 19, 2022 22:03
@gelash (Contributor, Author) commented Nov 19, 2022

Split the PR; this is the first one, refactoring the aptos-vm execution flow, removing the unused status, and merging sequential & parallel execution. Renamed parallel_executor -> block_executor per @dariorussi's suggestion, but since that touches a lot of lines, I made it a separate commit for ease of review.

@runtian-zhou can you have a look now?

@gelash (Contributor, Author) commented Nov 19, 2022

@zekun000 @sasha8 ping ping

@gelash changed the title from "[Executor] Merge sequential & parallel execution flow, refactor, test" to "[Executor] Merge sequential & parallel execution flow" on Nov 19, 2022
@gelash added the CICD:run-e2e-tests label (when this label is present, github actions will run all land-blocking e2e tests from the PR) on Nov 21, 2022

@wrwg requested a review from vgao1996 on November 26, 2022 08:06
@wrwg (Contributor) commented Nov 26, 2022

Let's give some more time for review from Move team members. (I know you filed this a while ago, so sorry for the lack of earlier attention.)

@gelash (Contributor, Author) commented Nov 26, 2022

> Let's give some more time for review from Move team members. (I know you filed this a while ago, so sorry for the lack of earlier attention.)

I was never going to land this without @runtian-zhou 's approval, but appreciate more eyes.

For context:
I already have 3 follow-ups implemented (2 drafts are linked in the comments, also visible here: main...seqparoutput; the third adds proptests, which I separated from here for clarity). They are intended to improve and simplify the whole executor <-> aptos-vm integration (plus sequential now), delta resolution, etc.

But it all starts here; once we have all the executor flow nicely in block_executor, it opens up the extension and gas hooks that we need for various outstanding tasks (I can explain offline).

@wrwg (Contributor) left a comment:

This generally seems to go in the right direction (that is, unifying sequential/parallel block execution). We need a larger refactoring of the adapter (see comment below), but this PR can be an incremental step toward it. Leaving detailed review to folks more acquainted with PE.

}

// Wrapper to avoid orphan rule
pub(crate) struct AptosTransactionOutput(TransactionOutputExt);
@wrwg (Contributor):

Not a big fan of avoiding the orphan rule, which is there for a reason (and not a technical one). It perhaps makes notation a bit more convenient, but it generally makes the code harder to understand. Not saying this needs to be done differently; just an opinion.

@gelash (Contributor, Author):

I really dislike this as well and tried to get rid of it. I can solve some of the issues, but the problem I couldn't easily overcome is a potential circular dependency between aptos-aggregator (where TransactionOutputExt is defined) and block-executor (which defines the traits and also uses aggregators). Hopefully we will get to a place where we restructure this wrapper, or more generally restructure the crates and factor out the common types (like we currently do at the aptos-core level; we could do something similar at the aptos-vm or aptos-move level).
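
To make the constraint concrete, a minimal sketch of why the wrapper exists; modules stand in for the separate crates here, and all names are simplified stand-ins:

```rust
// Minimal illustration of the orphan-rule constraint (modules stand in for
// separate crates; within one real crate the rule would not fire).
mod block_executor {
    // Trait defined by the block-executor crate.
    pub trait TransactionOutput {
        fn write_count(&self) -> usize;
    }
}

mod aptos_aggregator {
    // Type defined by the aptos-aggregator crate.
    pub struct TransactionOutputExt {
        pub writes: usize,
    }
}

// In aptos-vm, both the trait and the type are foreign, so across real crate
// boundaries the impl is only legal on a locally defined newtype wrapper.
pub struct AptosTransactionOutput(pub aptos_aggregator::TransactionOutputExt);

impl block_executor::TransactionOutput for AptosTransactionOutput {
    fn write_count(&self) -> usize {
        self.0.writes
    }
}
```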


pub(crate) struct AptosVMWrapper<'a, S> {
vm: AptosVM,
base_view: &'a S,
@wrwg (Contributor):

This S is always a StateView, right? Then this is really strange, and the need for this wrapper type just demonstrates how broken the AptosVM is (it requires a larger overhaul). If you look at AptosVM, you see that it wraps AptosVMImpl, which in turn is created from a state view. That state view is then hidden inside the storage adapter. The need to get hold of an already-created StateView has led to acrobatic code in other places.

This is probably another set of PRs after this one, but I really wish we could drastically simplify the architecture here:

  • Only one AptosVM (no VMImpl, BlockVM, MoveVmExt, and other complications)
  • That AptosVM implements all the parallel and sequential execution logic. Multiple impl AptosVM blocks split over files can help to tame the complexity (a sketch follows this list).

Because parallel execution is built into our Move adapter, there is really no need to maintain a code layer underneath without PE.
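
A hedged sketch of what that proposal might look like; this is entirely hypothetical and does not reflect the current code:

```rust
// Hypothetical shape: a single AptosVM holding the state view directly, with
// execution modes as plain methods. The impl blocks could live in separate files.
trait StateView {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
}

struct AptosVM<'a, S: StateView> {
    base_view: &'a S,
}

// e.g. in transactions.rs: single-transaction logic.
impl<'a, S: StateView> AptosVM<'a, S> {
    fn execute_user_transaction(&self) {
        // ... validate, run the Move session, produce an output ...
    }
}

// e.g. in block.rs: block-level logic, both sequential and parallel.
impl<'a, S: StateView> AptosVM<'a, S> {
    fn execute_block_parallel(&self) {
        // ... Block-STM over the transactions ...
    }
    fn execute_block_sequential(&self) {
        // ... same flow with a single worker ...
    }
}
```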

@gelash (Contributor, Author):

I think that would be a great place to get to.

The next PR will help a bit with the StateView wrapping business, but it won't fully solve it; it will rather unify what we currently use (StateViewCache and VersionedView) to represent the storage state view plus some writes from the block, and isolate that within block_executor as an implementation detail (together with making delta resolution an implementation detail) as a better temporary place. But I fully subscribe to revamping these boundaries as soon as possible and will try to make that more feasible with the current queue of changes.
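
A rough sketch of that unification idea; all names are hypothetical, and the real version would also handle deltas and versioning:

```rust
use std::collections::HashMap;

// Hedged sketch: a single view that layers the block's writes-so-far over the
// base storage state view, kept private to the block executor.
trait TStateView {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

struct LatestView<'a, S: TStateView> {
    base_view: &'a S,
    block_writes: HashMap<String, Vec<u8>>,
}

impl<'a, S: TStateView> TStateView for LatestView<'a, S> {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        // Prefer a value written earlier in the block; fall back to storage.
        self.block_writes
            .get(key)
            .cloned()
            .or_else(|| self.base_view.get(key))
    }
}
```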

Review thread (outdated, resolved): aptos-move/aptos-vm/src/block_executor/vm_wrapper.rs
@gelash (Contributor, Author) commented Nov 28, 2022

> This generally seems to go in the right direction (that is, unifying sequential/parallel block execution). We need a larger refactoring of the adapter (see comment below), but this PR can be an incremental step toward it. Leaving detailed review to folks more acquainted with PE.

Totally agreed, that's precisely the intention. There are some specific fixes, but regarding the overall refactoring aspect, my rationale at the moment is to make some incremental local simplifications and improvements while hopefully moving in a good global direction (allowing a single "executor" that can be cleanly connected to other parts of aptos-vm, re-used across blocks, etc.). Hopefully with some iteration (and there are indeed 2-3 PRs coming right after), we will also start seeing the big picture better. One heuristic for me for now is to push pieces of logic to the block_executor side (e.g., in the next PR, the StateView wrapper business currently done in different ways in aptos-vm), since that code is newer and should have a more fixed structure and flow.

However, we should then absolutely look into precisely the question of how the block executor is incorporated into aptos_vm. For some context, I believe the current state of affairs is due to two things (@runtian-zhou can confirm or deny as the author and the ultimate expert on the wrapper flow):
(a) making the parallel executor a standalone crate with generic parameters, so it can be tested without the Move VM;
(b) a clear separation of abstractions to facilitate development at the time.
The testing (and prop-testing) framework is probably the best thing we got out of building it this way. But all these layers in aptos-vm / aptos_vm_impl / adapter / wrapper now need to eventually be simplified as well, especially since we have a lot of use-cases where we'd need hooks to and from other parts of aptos-vm.


@runtian-zhou (Contributor) commented:

The way the parallel executor is structured is exactly what @gelash suggests. The whole intention of the parallel_executor crate abstraction is to:

  1. Help test the core scheduling logic without worrying about the AptosVM implementation (see the sketch below).
  2. Be able to run benchmarks that are independent of the AptosVM.

The testing aspect is probably more important in this context, because it's generally quite hard to test parallel code like this.
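
A sketch of the generic seam being described; `Transaction` mirrors the associated Key/Value types visible in the diff, while the rest is illustrative:

```rust
// Sketch of the generic boundary that lets the scheduler be tested without
// the AptosVM; method and struct names beyond Transaction are illustrative.
pub trait Transaction {
    type Key;
    type Value;
}

pub trait ExecutorTask {
    type Txn: Transaction;
    type Output;
    type Error;

    fn execute_transaction(&self, txn: &Self::Txn) -> Result<Self::Output, Self::Error>;
}

// The scheduler is generic over the task, so tests and benchmarks can plug in
// a toy ExecutorTask (e.g. counters with artificial conflicts) instead of the
// full AptosVM.
pub struct BlockExecutor<E: ExecutorTask> {
    task: E,
}
```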

@runtian-zhou (Contributor) left a comment:

Almost looks good to me! Agreed with @wrwg that we need more refactoring here, but this is a great step forward!

@@ -3,9 +3,6 @@

#[derive(Debug, PartialEq, Eq)]
pub enum Error<E> {
/// Invariant violation that happens internally inside of scheduler, usually an indication of
/// implementation error.
InvariantViolation,
@runtian-zhou (Contributor):

Why is this error removed?

@gelash (Contributor, Author):

It was never used; happy to bring it back whenever needed.

Review thread (outdated, resolved): aptos-move/block-executor/src/executor.rs
) -> Result<
(
Vec<E::Output>,
OutputDeltaResolver<<T as Transaction>::Key, <T as Transaction>::Value>,
),
E::Error,
> {
assert!(self.concurrency_level > 1, "Must use sequential execution");
@runtian-zhou (Contributor):

I was wondering whether these assertions could fail in production, as they would be a good vector for a network availability attack. Would it be better to return an error when this condition is violated?

@gelash (Contributor, Author):

Currently this must be true due to
https://github.com/aptos-labs/aptos-core/blob/e197a64f990b349b888eec4624c1adf945d0ef67/aptos-move/aptos-vm/src/block_executor/mod.rs#L150:

let mut ret = if self.concurrency_level > 1 {

The second follow-up PR moves this dispatching inside the block_executor crate, and execute_parallel becomes pub(crate), so we could also just delete the assert at that point. In fact, I like the idea of deleting just this assert then.

If you really think we should return an error here, let me know what kind of error and what should happen in that case; do we skip the whole block? It would complicate some places, and we should worry about determinism; it's probably not worth it for an invariant violation that trivially should never happen.
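
For reference, a minimal sketch of the error-returning alternative under discussion; the error type and its handling are hypothetical, and the thread above leaves the actual decision open:

```rust
// Hedged sketch: surface a deterministic error instead of panicking, leaving
// it to the caller to fall back to sequential execution or reject the block.
#[derive(Debug, PartialEq)]
enum BlockExecutionError {
    UnexpectedConcurrencyLevel(usize),
}

fn ensure_parallel(concurrency_level: usize) -> Result<(), BlockExecutionError> {
    if concurrency_level <= 1 {
        return Err(BlockExecutionError::UnexpectedConcurrencyLevel(
            concurrency_level,
        ));
    }
    Ok(())
}
```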


@github-actions commented:

✅ Forge suite compat success on testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> c419c2153ba3336fac166f9b1184ec28869179d5

Compatibility test results for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> c419c2153ba3336fac166f9b1184ec28869179d5 (PR)
1. Check liveness of validators at old version: testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b
   compatibility::simple-validator-upgrade::liveness-check: 7405 TPS, 5244 ms latency, 7000 ms p99 latency, no expired txns
2. Upgrading first validator to new version: c419c2153ba3336fac166f9b1184ec28869179d5
   compatibility::simple-validator-upgrade::single-validator-upgrade: 4806 TPS, 8401 ms latency, 12200 ms p99 latency, no expired txns
3. Upgrading rest of first batch to new version: c419c2153ba3336fac166f9b1184ec28869179d5
   compatibility::simple-validator-upgrade::half-validator-upgrade: 4763 TPS, 8435 ms latency, 11000 ms p99 latency, no expired txns
4. Upgrading second batch to new version: c419c2153ba3336fac166f9b1184ec28869179d5
   compatibility::simple-validator-upgrade::rest-validator-upgrade: 6904 TPS, 5806 ms latency, 11100 ms p99 latency, no expired txns
5. Check swarm health
Compatibility test for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> c419c2153ba3336fac166f9b1184ec28869179d5 passed
Test Ok
Test Ok

@github-actions commented:

✅ Forge suite land_blocking success on c419c2153ba3336fac166f9b1184ec28869179d5

performance benchmark with full nodes: 6943 TPS, 5706 ms latency, 8700 ms p99 latency, (!) expired 540 out of 2965300 txns
Test Ok

@runtian-zhou merged commit feec33f into main on Nov 29, 2022
@runtian-zhou deleted the seqinpar branch on November 29, 2022 19:57
@Markuze mentioned this pull request on Dec 5, 2022
areshand pushed a commit to areshand/aptos-core-1 that referenced this pull request on Dec 18, 2022
* Merge sequential and parallel flows

* rename parallel to block
@Markuze mentioned this pull request on Dec 26, 2022