[compiler-v2] Test cases reduced from the framework showcasing need for stack optimizations #14800

Merged
merged 2 commits into from
Oct 28, 2024

Contributor

@vineethk vineethk commented Sep 30, 2024

Description

This PR adds test cases that expose opportunities to optimize the generated stack-based bytecode. The tests were reduced from aptos-framework.

I have put them in a folder called eager-pushes because they can generally be resolved by eagerly pushing the operand onto the stack at the appropriate point.
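As a rough illustration of what "eagerly pushing the operand" can save, here is a minimal sketch on a toy stack machine. The opcode names loosely echo Move bytecode, but the interpreter, the two programs, and the locals `a`, `b`, `t0`, `t1` are all hypothetical and are not taken from the tests in this PR.

```python
# Toy stack machine. Opcode names echo Move bytecode, but this is NOT the
# Move VM, and both programs below are made up for illustration.

def run(program, locals_):
    """Execute a list of (opcode, operand) pairs and return the final stack."""
    locals_ = dict(locals_)  # work on a copy so the caller's dict is untouched
    stack = []
    for op, arg in program:
        if op == "CopyLoc":      # push a copy of a local
            stack.append(locals_[arg])
        elif op == "MoveLoc":    # push a local and invalidate it
            stack.append(locals_.pop(arg))
        elif op == "StLoc":      # pop the top of the stack into a local
            locals_[arg] = stack.pop()
        elif op == "LdU64":      # push a constant
            stack.append(arg)
        elif op == "Add":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "Mul":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs * rhs)
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack

# (a + 1) * (b + 2), with each temporary spilled to a local and reloaded.
SPILLED = [
    ("CopyLoc", "a"), ("LdU64", 1), ("Add", None), ("StLoc", "t0"),
    ("CopyLoc", "b"), ("LdU64", 2), ("Add", None), ("StLoc", "t1"),
    ("MoveLoc", "t0"), ("MoveLoc", "t1"), ("Mul", None),
]

# Same computation, leaving each operand on the stack as soon as it is ready.
EAGER = [
    ("CopyLoc", "a"), ("LdU64", 1), ("Add", None),
    ("CopyLoc", "b"), ("LdU64", 2), ("Add", None),
    ("Mul", None),
]
```

In this toy setting both programs compute the same value, and the eager version drops the four `StLoc`/`MoveLoc` instructions. The real test cases may differ in shape.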

How Has This Been Tested?

New tests are added.

Key Areas to Review

Tests and the generated bytecode. It might be helpful to look at the stackless bytecode generated as well.

Type of Change

  • Tests

Which Components or Systems Does This Change Impact?

  • Move Compiler

trunk-io bot commented Sep 30, 2024

⏱️ 55m total CI duration on this PR
Job Cumulative Duration Recent Runs
rust-move-unit-coverage 16m 🟩
rust-move-unit-coverage 10m 🟩
rust-move-tests 9m 🟩
rust-move-tests 9m 🟩
check 4m 🟩
rust-cargo-deny 4m 🟩🟩
check-dynamic-deps 1m 🟩🟩
general-lints 1m 🟩🟩
semgrep/ci 41s 🟩🟩
file_change_determinator 22s 🟩🟩
permission-check 5s 🟩🟩
permission-check 5s 🟩🟩
permission-check 4s 🟩🟩


Contributor Author

vineethk commented Sep 30, 2024

codecov bot commented Sep 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 59.8%. Comparing base (3c9937d) to head (f384f11).

Additional details and impacted files
@@                      Coverage Diff                       @@
##           vk/variable-window-peephole   #14800     +/-   ##
==============================================================
- Coverage                         59.8%    59.8%   -0.1%     
==============================================================
  Files                              853      853             
  Lines                           207923   207923             
==============================================================
- Hits                            124365   124348     -17     
- Misses                           83558    83575     +17     


@vineethk vineethk force-pushed the vk/variable-window-peephole branch from 17a143c to 3c9937d Compare October 1, 2024 20:56
@vineethk vineethk force-pushed the vk/framework-reduced-tests branch from e8f3b89 to f384f11 Compare October 1, 2024 21:00
@vineethk vineethk marked this pull request as ready for review October 1, 2024 21:04
Contributor

@brmataptos brmataptos left a comment

The eager_load_03.exp disassembly looks as good as we can do without IPA (interprocedural analysis). I guess having another test doesn't really hurt, but I'm not sure why it's here.

The others do look like they could benefit from a more stack-aware stack-code-generation phase that builds expression trees and rearranges computations to push values on the stack in the order needed for the next operation.
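A minimal sketch of the expression-tree idea mentioned above, assuming a toy IR: the node classes `Local`, `Const`, and `BinOp`, and the `emit` function, are hypothetical and do not correspond to anything in compiler-v2. A post-order walk emits operands in exactly the order the operator consumes them, so no temporaries need to be stored and reloaded.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical expression-tree nodes for illustration only.

@dataclass
class Local:
    name: str

@dataclass
class Const:
    value: int

@dataclass
class BinOp:
    op: str        # e.g. "Add" or "Mul" in this sketch
    lhs: "Expr"
    rhs: "Expr"

Expr = Union[Local, Const, BinOp]

def emit(expr):
    """Post-order walk: push the left operand, then the right, then apply the operator."""
    if isinstance(expr, Local):
        return [("CopyLoc", expr.name)]
    if isinstance(expr, Const):
        return [("LdU64", expr.value)]
    if isinstance(expr, BinOp):
        return emit(expr.lhs) + emit(expr.rhs) + [(expr.op, None)]
    raise TypeError(f"unsupported node: {expr!r}")

# (a + 1) * (b + 2) as a tree.
tree = BinOp("Mul",
             BinOp("Add", Local("a"), Const(1)),
             BinOp("Add", Local("b"), Const(2)))
code = emit(tree)
```

This only covers pure expressions. Calls with side effects or mutable borrows constrain how far computations can be rearranged, which a real phase would have to respect.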

1: MutBorrowLoc[0](Arg0: u64)
2: Call bar(&mut u64)
3: StLoc[1](loc0: u64)
4: MoveLoc[0](Arg0: u64)
Contributor

Without interprocedural analysis, this code looks pretty optimal. You can't load Arg0 earlier because of the call to bar(&mut Arg0). Meanwhile, loc0 comes from the call to one() and you have to rearrange the two arguments before calling baz. A swap operation would be useful here, but without that this looks good.

Of course, if you can inline the other calls you can do much better.
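To make the hazard concrete, here is a small hypothetical model of why Arg0 cannot be loaded before the call that mutably borrows it. The functions `bar` and `baz` below are stand-ins with invented bodies; only their rough roles (a callee taking `&mut u64`, and a consumer of two arguments) come from the discussion above, and the argument order of `baz` is an assumption.

```python
# Hypothetical model: locals are slots in a list, and a "mutable borrow" is
# just the slot index handed to the callee.

def bar(locals_, slot):
    """Stand-in for a callee taking &mut u64: mutates the local, returns a value."""
    locals_[slot] += 10
    return 1

def baz(x, y):
    """Stand-in consumer whose result depends on argument order."""
    return x - y

def preload_then_call(arg0):
    """Unsafe ordering: push Arg0 before bar runs, so the stack holds a stale value."""
    locals_ = [arg0]
    stack = [locals_[0]]              # eager push of Arg0 (too early here)
    stack.append(bar(locals_, 0))     # bar mutates Arg0 through the borrow
    y = stack.pop()
    x = stack.pop()
    return baz(x, y)

def call_then_load(arg0):
    """Ordering from the disassembly: Call bar, StLoc loc0, then load Arg0 and loc0."""
    locals_ = [arg0, None]
    locals_[1] = bar(locals_, 0)      # Call bar(&mut Arg0); StLoc loc0
    stack = [locals_[0], locals_[1]]  # MoveLoc Arg0; MoveLoc loc0
    y = stack.pop()
    x = stack.pop()
    return baz(x, y)
```

In this model the two orderings disagree (the preloaded version sees the value of Arg0 from before the mutation), which is the behavior a negative test case would guard against.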

Contributor Author

Yes, this was a test case to show that you should not pre-load Arg0 because of the &mut to it, like you point out.

I should have mentioned in the PR description that there are some negative test cases as well - this one is actually handcrafted and not reduced from framework code.

Base automatically changed from vk/variable-window-peephole to main October 12, 2024 00:48
@vineethk vineethk force-pushed the vk/framework-reduced-tests branch 5 times, most recently from 49c643f to 510d05e Compare October 22, 2024 23:07
@vineethk vineethk enabled auto-merge (squash) October 22, 2024 23:34

This comment has been minimized.

@vineethk vineethk disabled auto-merge October 24, 2024 17:32
@vineethk vineethk force-pushed the vk/framework-reduced-tests branch from cbacd4a to 75e38e4 Compare October 25, 2024 17:28
@vineethk vineethk enabled auto-merge (squash) October 28, 2024 14:33

This comment has been minimized.

Contributor

✅ Forge suite realistic_env_max_load success on a9afe5bbd161f86281a6a8ca898905e3133f5e8a

two traffics test: inner traffic : committed: 14388.56 txn/s, latency: 2762.45 ms, (p50: 2700 ms, p70: 2700, p90: 3000 ms, p99: 3300 ms), latency samples: 5470960
two traffics test : committed: 99.94 txn/s, latency: 1603.66 ms, (p50: 1400 ms, p70: 1400, p90: 1500 ms, p99: 9400 ms), latency samples: 1800
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 2.021, avg: 1.585", "ConsensusProposalToOrdered: max: 0.326, avg: 0.294", "ConsensusOrderedToCommit: max: 0.362, avg: 0.349", "ConsensusProposalToCommit: max: 0.653, avg: 0.643"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 1.00s no progress at version 29367 (avg 0.20s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 8.48s no progress at version 2603800 (avg 8.48s) [limit 15].
Test Ok

Contributor

✅ Forge suite framework_upgrade success on f38ec72e975a4dff4e9919e7a1d8118a75858bab ==> a9afe5bbd161f86281a6a8ca898905e3133f5e8a

Compatibility test results for f38ec72e975a4dff4e9919e7a1d8118a75858bab ==> a9afe5bbd161f86281a6a8ca898905e3133f5e8a (PR)
Upgrade the nodes to version: a9afe5bbd161f86281a6a8ca898905e3133f5e8a
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1213.46 txn/s, submitted: 1216.20 txn/s, failed submission: 2.74 txn/s, expired: 2.74 txn/s, latency: 2548.96 ms, (p50: 2400 ms, p70: 2700, p90: 3600 ms, p99: 4800 ms), latency samples: 106460
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1339.98 txn/s, submitted: 1342.63 txn/s, failed submission: 2.65 txn/s, expired: 2.65 txn/s, latency: 2245.65 ms, (p50: 2100 ms, p70: 2400, p90: 3000 ms, p99: 4800 ms), latency samples: 121180
5. check swarm health
Compatibility test for f38ec72e975a4dff4e9919e7a1d8118a75858bab ==> a9afe5bbd161f86281a6a8ca898905e3133f5e8a passed
Upgrade the remaining nodes to version: a9afe5bbd161f86281a6a8ca898905e3133f5e8a
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1245.77 txn/s, submitted: 1249.51 txn/s, failed submission: 3.74 txn/s, expired: 3.74 txn/s, latency: 2360.90 ms, (p50: 2100 ms, p70: 2400, p90: 3300 ms, p99: 5000 ms), latency samples: 113200
Test Ok

Contributor

✅ Forge suite compat success on f38ec72e975a4dff4e9919e7a1d8118a75858bab ==> a9afe5bbd161f86281a6a8ca898905e3133f5e8a

Compatibility test results for f38ec72e975a4dff4e9919e7a1d8118a75858bab ==> a9afe5bbd161f86281a6a8ca898905e3133f5e8a (PR)
1. Check liveness of validators at old version: f38ec72e975a4dff4e9919e7a1d8118a75858bab
compatibility::simple-validator-upgrade::liveness-check : committed: 14543.84 txn/s, latency: 2243.56 ms, (p50: 1900 ms, p70: 2100, p90: 2700 ms, p99: 8500 ms), latency samples: 559600
2. Upgrading first Validator to new version: a9afe5bbd161f86281a6a8ca898905e3133f5e8a
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 6070.01 txn/s, latency: 4666.37 ms, (p50: 5200 ms, p70: 5500, p90: 5700 ms, p99: 5900 ms), latency samples: 114760
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6092.89 txn/s, latency: 5326.57 ms, (p50: 5700 ms, p70: 6000, p90: 6900 ms, p99: 7100 ms), latency samples: 207120
3. Upgrading rest of first batch to new version: a9afe5bbd161f86281a6a8ca898905e3133f5e8a
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 6217.15 txn/s, latency: 4591.84 ms, (p50: 5300 ms, p70: 5500, p90: 5600 ms, p99: 5700 ms), latency samples: 113640
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6375.59 txn/s, latency: 5156.66 ms, (p50: 5700 ms, p70: 5700, p90: 5900 ms, p99: 6000 ms), latency samples: 216380
4. upgrading second batch to new version: a9afe5bbd161f86281a6a8ca898905e3133f5e8a
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 8945.92 txn/s, latency: 3091.68 ms, (p50: 3200 ms, p70: 3500, p90: 4600 ms, p99: 4900 ms), latency samples: 158440
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 8538.19 txn/s, latency: 3752.34 ms, (p50: 3400 ms, p70: 4700, p90: 5200 ms, p99: 6500 ms), latency samples: 280040
5. check swarm health
Compatibility test for f38ec72e975a4dff4e9919e7a1d8118a75858bab ==> a9afe5bbd161f86281a6a8ca898905e3133f5e8a passed
Test Ok

@vineethk vineethk merged commit 374f2cd into main Oct 28, 2024
84 of 92 checks passed
@vineethk vineethk deleted the vk/framework-reduced-tests branch October 28, 2024 15:45
3 participants