-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[compiler-v2] Test cases reduced from the framework showcasing need for stack optimizations #14800
Conversation
⏱️ 55m total CI duration on this PR
|
This stack of pull requests is managed by Graphite. Learn more about stacking. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## vk/variable-window-peephole #14800 +/- ##
==============================================================
- Coverage 59.8% 59.8% -0.1%
==============================================================
Files 853 853
Lines 207923 207923
==============================================================
- Hits 124365 124348 -17
- Misses 83558 83575 +17 ☔ View full report in Codecov by Sentry. |
17a143c
to
3c9937d
Compare
e8f3b89
to
f384f11
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The eager_load_03.exp
disassembly looks as good as we can do without IPA. I guess having another test doesn't really hurt, but I'm not sure why it's here.
Others do look like they could benefit from a more stack-aware stack-code-generation phase, that builds expression trees and rearranges computations to push values on the stack in the order needed for the next operation.
1: MutBorrowLoc[0](Arg0: u64) | ||
2: Call bar(&mut u64) | ||
3: StLoc[1](loc0: u64) | ||
4: MoveLoc[0](Arg0: u64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without interprocedural analysis, this code looks pretty optimal. You can't load Arg0
earlier because of the call to bar(&mut Arg0)
. Meanwhile, loc0
comes from the call to one()
and you have to rearrange the two arguments before calling baz
. A swap
operation would be useful here, but without that this looks good.
Of course, if you can inline the other calls you can do much better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this was a test case to show that you should not pre-load Arg0
because of the &mut
to it, like you point out.
I should have mentioned in the PR description that there are some negative test cases as well - this one is actually handcrafted and not reduced from framework code.
49c643f
to
510d05e
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…or stack optimizations
cbacd4a
to
75e38e4
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
✅ Forge suite
|
✅ Forge suite
|
✅ Forge suite
|
Description
In this PR, we add a bunch of test cases that show opportunities for optimization of generated stack-based bytecode. These tests were reduced from
aptos-framework
.I have put them in a folder called
eager-pushes
because they can generally be resolved by eagerly pushing the operand on to the stack at the appropriate point.How Has This Been Tested?
New tests are added.
Key Areas to Review
Tests and the generated bytecode. It might be helpful to look at the stackless bytecode generated as well.
Type of Change
Which Components or Systems Does This Change Impact?