Global flag optimizations #4027

Merged 19 commits on Sep 8, 2024

alyssarosenzweig
Collaborator

Soup up our flag opt pass to optimize globally, for a speedup with multiblock. Then use that framework to implement peephole passes that fuse comparisons & axflag into branches when the flags don't otherwise escape.

Perf #s aren't as impressive as I'd hoped, but Billy found a 0.9% improvement on Geekbench on x13s with an early version of this series. Hopefully it's better now. At this point I'd like to cut my losses and get this in, since it's no worse and XTA does something similar, so I was going to get to it eventually anyway.

alyssarosenzweig force-pushed the opt/global-flag branch 5 times, most recently from 45fbec5 to 1c2ecf0 (September 6, 2024 15:21)
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
more optimized than TestNZ if we don't care about the sign bit.

Signed-off-by: Alyssa Rosenzweig <[email protected]>
so we can get the new axflag optimizations on billy's x13s.

Signed-off-by: Alyssa Rosenzweig <[email protected]>
so we can optimize it globally

Signed-off-by: Alyssa Rosenzweig <[email protected]>
so we can gate optimizations efficiently

Signed-off-by: Alyssa Rosenzweig <[email protected]>
alyssarosenzweig force-pushed the opt/global-flag branch 2 times, most recently from 699e861 to cf055aa (September 7, 2024 14:45)
this makes reasoning about them a little easier, e.g. for flags. about 1% win
in nodejs.

Signed-off-by: Alyssa Rosenzweig <[email protected]>
not needed and getting in the way

Signed-off-by: Alyssa Rosenzweig <[email protected]>
nonzero <==> not dummy

Signed-off-by: Alyssa Rosenzweig <[email protected]>
Gather a control flow graph and use it to propagate flags throughout the
program.

Signed-off-by: Alyssa Rosenzweig <[email protected]>
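
For readers unfamiliar with the technique, the CFG-based flag propagation described in the commit message above can be approximated as a classic backward liveness analysis. The sketch below is not FEX's code: the `Block` summary, the flag bit names, and both function names are hypothetical, and it assumes every flag may be observed once translated code exits (`exit_live`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flag bits, for illustration only.
enum : uint8_t { FLAG_N = 1, FLAG_Z = 2, FLAG_C = 4, FLAG_V = 8, FLAG_ALL = 15 };

// Hypothetical per-block summary gathered in a first walk over the IR.
struct Block {
  uint8_t read_before_write = 0; // flags read before any write in the block
  uint8_t written = 0;           // flags overwritten somewhere in the block
  std::vector<int> successors;   // CFG edges; empty means the block exits
};

// Flags live on exit from block i: the union of what each successor needs.
// A block without successors leaves translated code, so assume exit_live.
uint8_t FlagsLiveOut(const std::vector<Block>& blocks,
                     const std::vector<uint8_t>& live_in, size_t i,
                     uint8_t exit_live = FLAG_ALL) {
  const Block& b = blocks[i];
  uint8_t out = b.successors.empty() ? exit_live : 0;
  for (int s : b.successors) {
    assert(s >= 0 && static_cast<size_t>(s) < blocks.size());
    out |= live_in[static_cast<size_t>(s)];
  }
  return out;
}

// Iterate to a fixed point:
//   live_in(b) = read_before_write(b) | (live_out(b) & ~written(b))
std::vector<uint8_t> ComputeFlagsLiveIn(const std::vector<Block>& blocks,
                                        uint8_t exit_live = FLAG_ALL) {
  std::vector<uint8_t> live_in(blocks.size(), 0);
  bool changed = true;
  while (changed) {
    changed = false;
    // Visiting blocks in reverse order tends to converge faster.
    for (size_t i = blocks.size(); i-- > 0;) {
      const uint8_t out = FlagsLiveOut(blocks, live_in, i, exit_live);
      const uint8_t in = static_cast<uint8_t>(
          blocks[i].read_before_write | (out & ~blocks[i].written));
      if (in != live_in[i]) {
        live_in[i] = in;
        changed = true;
      }
    }
  }
  return live_in;
}
```

A flag write whose bit is absent from the block's live-out set is dead on every outgoing edge, which is the kind of fact a global pass can use that a block-local pass cannot.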
now that we know whether flags are killed on the edge, we can improve branch
isel

Signed-off-by: Alyssa Rosenzweig <[email protected]>
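
As a rough illustration of why edge liveness helps branch selection: on arm64, `cbz` tests a register and branches without writing NZCV, so it can stand in for a compare plus conditional branch only when nothing downstream reads the flags that compare would have produced. The function below is a hypothetical sketch of that decision, emitting assembly text for readability. It is an assumption about the general shape of the optimization, not the instruction selection code in this PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical lowering of "compare reg against zero, branch if equal".
// flags_live_out: true if any successor may read NZCV before overwriting it.
std::string LowerCompareZeroBranch(const std::string& reg,
                                   const std::string& target,
                                   bool flags_live_out) {
  if (!flags_live_out) {
    // Flags are dead on every outgoing edge, so the compare only feeds the
    // branch. CBZ tests the register directly and leaves NZCV untouched.
    return "cbz " + reg + ", " + target;
  }
  // Something downstream still reads the flags, so they must be materialized.
  return "cmp " + reg + ", #0\nb.eq " + target;
}
```

For example, `LowerCompareZeroBranch("x0", "target", false)` yields the single fused instruction, while passing `true` keeps the two-instruction form.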
this saves uops.

Signed-off-by: Alyssa Rosenzweig <[email protected]>
another global CFG-based optimization -- if we know that the raw PF is already
1-bit we can skip parity evaluation, saving work with floating point compares.

Signed-off-by: Alyssa Rosenzweig <[email protected]>
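
To make the saving concrete, here is a minimal sketch under a stated assumption: PF is kept as a "raw" value and only reduced to x86 parity (PF = 1 when the low 8 bits contain an even number of set bits) when it is read. Both function names are hypothetical, and the exact raw encoding FEX uses is not given in this thread. If the raw value is known to be 0 or 1, the fold over 8 bits collapses to a single XOR.

```cpp
#include <cassert>
#include <cstdint>

// General path: reduce the low 8 bits of the raw value to x86 PF.
// PF is 1 when the number of set bits in the low byte is even.
inline uint32_t ParityFlagFull(uint32_t raw) {
  uint32_t b = raw & 0xFFu;
  b ^= b >> 4; // fold the high nibble into the low nibble
  b ^= b >> 2; // fold down to 2 bits
  b ^= b >> 1; // bit 0 now holds the XOR of all 8 bits (1 = odd count)
  return (b & 1u) ^ 1u;
}

// Fast path: valid only when the raw value is known to be 0 or 1, e.g.
// because a global analysis proved every reaching producer wrote one bit.
inline uint32_t ParityFlagOneBit(uint32_t raw) {
  assert(raw <= 1u);
  return raw ^ 1u; // 0 has even parity (PF = 1), 1 has odd parity (PF = 0)
}
```

The two functions agree on the inputs 0 and 1, so the fast path is a pure strength reduction whenever the 1-bit precondition holds on every path into the read.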
it's only really load bearing for pf/af, which is handled as a global flag opt
now. this mitigates some of the compile time hit from globalizing flag opts.

Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
@Sonicadvance1
Member

Nice little opts

Sonicadvance1 merged commit 1c59bfe into FEX-Emu:main on Sep 8, 2024
12 checks passed