[AIEX] Delay metalizing of multi-slot until iterative scheduling is converged #182

krishnamtibrewala · 2024-09-06T21:58:57Z

This PR allows multi-slot instructions to be used during iterative scheduling of loops.
Before this PR, we materialized multi-slot instructions to the selected opcode/slot after the first iteration of iterative scheduling.
Now we wait until PostRA scheduling has converged to an acceptable schedule and we have decided to commit that schedule and move on to the next MBB.

When is the materialization triggered now? When we leave an MBB.
Does this change depending on the region type? No, the changes are agnostic to region type.
Could you think of a case where the materialization is not triggered before moving on to another MBB? None that I can think of.

Note: The information about which alternate opcode/descriptor was selected is stored in the Hazard Recognizer for a region.
By the time we reach the end of the MBB (i.e. leaveMBB()), we no longer have access to those Hazard Recognizer instances, so we need to make the alternate opcode/descriptor part of the BlockState.
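
For readers unfamiliar with the scheduler, the idea in the note above can be sketched as follows. This is a minimal, self-contained illustration and not the code in this PR: `Instr`, `BlockState`, `recordAlternate`, `resetForNextIteration`, `leaveBlock` and the opcode names are hypothetical stand-ins, and clearing the selection between iterations is an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a machine instruction (not llvm::MachineInstr).
struct Instr {
  std::string Opcode; // multi-slot pseudo opcode until it is materialized
};

// Hypothetical per-block state that outlives the per-region hazard recognizers.
struct BlockState {
  std::vector<Instr> Instrs;
  // Instruction index -> alternate opcode picked by the latest iteration.
  std::unordered_map<std::size_t, std::string> SelectedAlternates;
};

// What a region's hazard recognizer would call when it picks a slot.
// Only the choice is recorded; the instruction itself is left untouched.
void recordAlternate(BlockState &BS, std::size_t Idx,
                     const std::string &Opcode) {
  assert(Idx < BS.Instrs.size() && "instruction index out of range");
  BS.SelectedAlternates[Idx] = Opcode;
}

// Assumed behaviour: a new scheduling iteration starts from a clean slate,
// so multi-slot instructions can be assigned to a different slot.
void resetForNextIteration(BlockState &BS) { BS.SelectedAlternates.clear(); }

// Called once the schedule has converged and we leave the block:
// only now are the recorded choices written into the instructions.
void leaveBlock(BlockState &BS) {
  for (const auto &Entry : BS.SelectedAlternates)
    BS.Instrs[Entry.first].Opcode = Entry.second;
  BS.SelectedAlternates.clear();
}
```

In this sketch an instruction keeps its multi-slot opcode across all iterations, and only the choice made by the final iteration is committed when the block is left.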

krishnamtibrewala · 2024-09-06T22:01:38Z

@martien-de-jong, @andcarminati,
Kindly review and provide early feedback on the approach.

llvm/lib/Target/AIE/AIEHazardRecognizer.h

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

andcarminati · 2024-09-09T07:47:42Z

Hi @krishnamtibrewala, nice work! Do you have some results for the PixelShuffle*/PixelUnshuffle* benchmarks? If I remember correctly, we have some suboptimal mov desc assignments (movx should be selected instead of mova so that loads are not shifted up).

llvm/lib/Target/AIE/AIEMachineScheduler.cpp

martien-de-jong

Could we have a test example where we see improvement?

llvm/lib/Target/AIE/AIEInterBlockScheduling.h

krishnamtibrewala · 2024-09-11T22:45:17Z

> Do you have some results for the PixelShuffle*/PixelUnshuffle* benchmarks? If I remember correctly, we have some suboptimal mov desc assignments (movx should be selected instead of mova so that loads are not shifted up).

@andcarminati I tried, but I did not see any change. I am still investigating why.

> Could we have a test example where we see improvement?

@martien-de-jong I am still figuring that out.

Based on a discussion with @gbossu, we were expecting some impact, but with the current implementation QoR does not change.

krishnamtibrewala · 2024-09-13T00:56:24Z

llvm/test/CodeGen/AIE/aie2/schedule/loopaware/loop-multiSlot.mir

+    $wh10 = VMOV_mv_w $wl0
+    JNZ $r3, %bb.1
+    DelayedSchedBarrier
+  bb.2:


Attached is a diff. I am still trying to figure out how things interact.
I will also try to come up with a smaller test case.

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

andcarminati · 2024-09-19T14:56:54Z

As general advice, I think we should have a class to manage the descriptor handling. We can encapsulate it and use it in both HazardRecognizer and InterBlockScheduling.
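
One possible shape for such a class is sketched below. This is only an illustration under assumptions, not the contents of AIEAlternateDescriptors.h: `AlternateDescriptorTable`, `RecognizerSketch`, `BlockSchedulerSketch`, the integer instruction ids and the opcode strings are all hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical helper that owns the "instruction -> selected alternate" map.
class AlternateDescriptorTable {
  std::map<int, std::string> Selected; // keyed by a hypothetical instruction id

public:
  void select(int InstrId, std::string Opcode) {
    Selected[InstrId] = std::move(Opcode);
  }

  // Returns nullptr when no alternate has been selected for InstrId.
  const std::string *lookup(int InstrId) const {
    auto It = Selected.find(InstrId);
    return It == Selected.end() ? nullptr : &It->second;
  }

  void clear() { Selected.clear(); }
};

// Stand-in for the hazard recognizer: it only records selections.
struct RecognizerSketch {
  AlternateDescriptorTable &Table;
  void emit(int InstrId, const std::string &Chosen) {
    Table.select(InstrId, Chosen);
  }
};

// Stand-in for the inter-block scheduler: it only reads selections.
struct BlockSchedulerSketch {
  const AlternateDescriptorTable &Table;
  std::string finalOpcode(int InstrId, const std::string &Current) const {
    const std::string *Alt = Table.lookup(InstrId);
    return Alt ? *Alt : Current;
  }
};
```

Both clients hold a reference to the same table, so the selection survives after the recognizer for a region is gone, and neither client needs to know how the mapping is stored.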

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

llvm/lib/Target/AIE/AIEAlternateDescriptors.h

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

llvm/lib/Target/AIE/AIEMachineScheduler.cpp

gbossu · 2024-11-04T08:05:59Z

Could you summarize what this PR does? Maybe in the PR description. I'm particularly interested in:

When is the materialization triggered now?
Does this change depending on the region type?
Could you think of the case where the materialization is not triggered before moving on to another MBB?

llvm/lib/Target/AIE/AIEMachineScheduler.cpp

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

llvm/lib/Target/AIE/AIEMachineScheduler.cpp

gbossu · 2024-11-19T12:41:36Z

llvm/test/CodeGen/AIE/aie2/schedule/loopaware/loop-multiSlot.mir

+    liveins: $r1, $r2
+    successors: %bb.3
+    $r2 = OR $r2, $r1
+  bb.3:


That test is unfortunately very hard to read. Could you think of something smaller that shows a diff? I'd also suggest avoiding two labels like on/off and sticking to unique CHECK lines unless the test is very concise.

krishnamtibrewala · 2024-11-20T01:26:25Z

QoR results are as follows. There are a few regressions; the rest of the benchmarks have the same QoR.

Select_aie2_bf16 409 440 REGR(+7.58%)
BitwiseXor_aie2_int8 731 776 REGR(+6.16%)
BilinearInterpolation_2 996 1018 REGR(+2.21%)
BilinearInterpolation_3 996 1018 REGR(+2.21%)
BilinearInterpolation_4 996 1018 REGR(+2.21%)
BilinearInterpolation_0 780 794 REGR(+1.79%)
Conv2D_bf16_2 19089 19281 REGR(+1.01%)
Conv2D_bf16_5 19089 19281 REGR(+1.01%)
Conv2D_bf16_8 20607 20799 REGR(+0.93%)
BilinearInterpolation_1 474 478 REGR(+0.84%)
Conv2D_bf16_59 6253 6301 REGR(+0.77%)

andcarminati reviewed Sep 9, 2024

llvm/lib/Target/AIE/AIEHazardRecognizer.h Outdated

andcarminati reviewed Sep 9, 2024

llvm/lib/Target/AIE/AIEHazardRecognizer.h Outdated

andcarminati reviewed Sep 9, 2024

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp Outdated

martien-de-jong reviewed Sep 10, 2024

llvm/lib/Target/AIE/AIEMachineScheduler.cpp Outdated

martien-de-jong reviewed Sep 10, 2024

llvm/lib/Target/AIE/AIEInterBlockScheduling.h Outdated

llvm/lib/Target/AIE/AIEInterBlockScheduling.h Outdated

krishnamtibrewala force-pushed the aie-loop-multiOpcode branch from 46df116 to 54977ed Compare September 10, 2024 23:45

krishnamtibrewala force-pushed the aie-loop-multiOpcode branch from 54977ed to 12024f9 Compare September 13, 2024 00:52

andcarminati reviewed Sep 19, 2024

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp Outdated

krishnamtibrewala force-pushed the aie-loop-multiOpcode branch 3 times, most recently from 98514f1 to 4033a60 Compare September 24, 2024 04:29

krishnamtibrewala mentioned this pull request Oct 1, 2024

[AIEX] NFC: Refactor Alternate Descriptor codebase #193

Merged

krishnamtibrewala force-pushed the aie-loop-multiOpcode branch from 4033a60 to aaf0aa8 Compare October 3, 2024 12:24

krishnamtibrewala marked this pull request as ready for review October 3, 2024 12:24

krishnamtibrewala requested review from abhinay-anubola, abnikant, gbossu, khallouh, konstantinschwarz, SagarMaheshwari99 and stephenneuendorffer as code owners October 3, 2024 12:24

krishnamtibrewala commented Oct 3, 2024

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

krishnamtibrewala force-pushed the aie-loop-multiOpcode branch from aaf0aa8 to 904d379 Compare October 5, 2024 12:19

krishnamtibrewala force-pushed the aie-loop-multiOpcode branch from 904d379 to 49f3ad1 Compare October 14, 2024 06:12

krishnamtibrewala changed the title ~~[AIEX] Re-assign multi-slot instructions during iterative scheduling~~ [AIEX] Delay metalizing of multi-slot until iterative scheduling is converged Oct 14, 2024

gbossu reviewed Oct 14, 2024

llvm/lib/Target/AIE/AIEAlternateDescriptors.h Outdated

gbossu reviewed Oct 14, 2024

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp Outdated

krishnamtibrewala force-pushed the aie-loop-multiOpcode branch 2 times, most recently from 02d0b50 to 8bf356f Compare October 14, 2024 11:24

gbossu reviewed Oct 15, 2024

llvm/lib/Target/AIE/AIEMachineScheduler.cpp Outdated

andcarminati reviewed Oct 18, 2024

llvm/lib/Target/AIE/AIEMachineScheduler.cpp Outdated

krishnamtibrewala force-pushed the aie-loop-multiOpcode branch 2 times, most recently from 7a241b2 to fdab2af Compare October 22, 2024 18:32

gbossu reviewed Nov 4, 2024

llvm/lib/Target/AIE/AIEMachineScheduler.cpp Outdated

krishnamtibrewala force-pushed the aie-loop-multiOpcode branch from fdab2af to 74da8ee Compare November 7, 2024 15:57

gbossu reviewed Nov 14, 2024

llvm/lib/Target/AIE/AIEMachineScheduler.cpp Outdated

gbossu reviewed Nov 14, 2024

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp Outdated

gbossu reviewed Nov 14, 2024

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

gbossu reviewed Nov 14, 2024

llvm/lib/Target/AIE/AIEMachineScheduler.cpp Outdated

[AIEX] Re-assign multi-slot instructions during iterative scheduling

f7ef02c

krishnamtibrewala force-pushed the aie-loop-multiOpcode branch from 74da8ee to f7ef02c Compare November 18, 2024 21:20

gbossu reviewed Nov 19, 2024
