Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIE2] Rewrite G_CONCAT_VECTOR selection as table-gen #48

Merged
merged 3 commits into from
Jun 18, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions llvm/lib/Target/AIE/AIE2InstrPatterns.td
Original file line number Diff line number Diff line change
Expand Up @@ -961,6 +961,22 @@ def : Concat256<int_aie2_concat_512_256_acc, ACC512, ACC256>;
def : Concat512<int_aie2_concat_1024_512_acc, ACC1024, ACC512>;
def : Concat1024<int_aie2_concat_1024_256_acc, ACC1024, ACC256>;

// concat_vector generic opcode
def : Pat<(v16i32 (concat_vectors (v8i32 VEC256:$src0), (v8i32 VEC256:$src1))),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are patterns for CONCAT above. Can we replace the intrinsics int_aie2_concat_I512_I256 and friends with the generic ones?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they should be mostly be the same. Would you like me to update the intrinsics definition so that they rely on concat_vectors or just replace them out right? The former is a bit of breaking change for people that might depend on those intrinsics.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot define intrinsics like that these concat_vectors cannot be on the output side of a table-gen file. So my suggestion would be just keeping it place to give a transition time and remove it later.

(v16i32 (REG_SEQUENCE VEC512, VEC256:$src0, sub_256_lo, VEC256:$src1, sub_256_hi))>;
def : Pat<(v32i32 (concat_vectors (v16i32 VEC512:$src0), (v16i32 VEC512:$src1))),
(v32i32 (REG_SEQUENCE VEC1024, VEC512:$src0, sub_512_lo, VEC512:$src1, sub_512_hi))>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL G_CONCAT_VECTOR exists. Where are such opcodes created, is this from a generic combiner that identifies a certain G_SHUFFLE_VECTOR pattern?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

G_CONCAT_VECTOR is created by the legalizer whenever it needs to combine two vectors together. So, for example, if you ask for 1024-bit vector you need to build 2 512-bit vectors and concat them together. There is also a combiner for it, which I enable in #41.


def : Pat<(v32i16 (concat_vectors (v16i16 VEC256:$src0), (v16i16 VEC256:$src1))),
(v32i16 (REG_SEQUENCE VEC512, VEC256:$src0, sub_256_lo, VEC256:$src1, sub_256_hi))>;
def : Pat<(v64i16 (concat_vectors (v32i16 VEC512:$src0), (v32i16 VEC512:$src1))),
(v64i16 (REG_SEQUENCE VEC1024, VEC512:$src0, sub_512_lo, VEC512:$src1, sub_512_hi))>;

def : Pat<(v64i8 (concat_vectors (v32i8 VEC256:$src0), (v32i8 VEC256:$src1))),
(v64i8 (REG_SEQUENCE VEC512, VEC256:$src0, sub_256_lo, VEC256:$src1, sub_256_hi))>;
def : Pat<(v128i8 (concat_vectors (v64i8 VEC512:$src0), (v64i8 VEC512:$src1))),
(v128i8 (REG_SEQUENCE VEC1024, VEC512:$src0, sub_512_lo, VEC512:$src1, sub_512_hi))>;

// Extract
def : Pat<(int_aie2_ext_I256_I512 VEC512:$src, 0x0),
(v8i32 (EXTRACT_SUBREG VEC512:$src, sub_256_lo))>;
Expand Down
49 changes: 0 additions & 49 deletions llvm/lib/Target/AIE/AIE2InstructionSelector.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -142,7 +142,6 @@ class AIE2InstructionSelector : public InstructionSelector {
bool isWrite);
bool selectG_AIE_ADD_VECTOR_ELT_LEFT(MachineInstr &I,
MachineRegisterInfo &MRI);
bool selectG_CONCAT_VECTORS(MachineInstr &I, MachineRegisterInfo &MRI);
bool selectG_BRCOND(MachineInstr &I, MachineRegisterInfo &MRI);
bool selectG_BRINDIRECT(MachineInstr &I, MachineRegisterInfo &MRI);
bool selectG_JUMP_TABLE(MachineInstr &I, MachineRegisterInfo &MRI);
Expand Down Expand Up @@ -742,8 +741,6 @@ bool AIE2InstructionSelector::select(MachineInstr &I) {
return selectG_UNMERGE_VALUES(I, MRI);
case AIE2::G_AIE_ADD_VECTOR_ELT_LEFT:
return selectG_AIE_ADD_VECTOR_ELT_LEFT(I, MRI);
case G_CONCAT_VECTORS:
return selectG_CONCAT_VECTORS(I, MRI);
case AIE2::G_AIE_OFFSET_STORE:
case AIE2::G_AIE_POSTINC_STORE:
case AIE2::G_AIE_POSTINC_2D_STORE:
Expand Down Expand Up @@ -843,52 +840,6 @@ bool AIE2InstructionSelector::selectG_AIE_ADD_VECTOR_ELT_LEFT(
return constrainSelectedInstRegOperands(MI, TII, TRI, RBI);
}

// WIP: Implement this as a tablegen pattern instead, it is very similar to the
// definition used for instrinsics.
bool AIE2InstructionSelector::selectG_CONCAT_VECTORS(MachineInstr &I,
MachineRegisterInfo &MRI) {
const Register DstVecReg = I.getOperand(0).getReg();
const Register Src1VecReg = I.getOperand(1).getReg();
const Register Src2VecReg = I.getOperand(2).getReg();

const unsigned DstVecSize = MRI.getType(DstVecReg).getSizeInBits();
const unsigned Src1VecSize = MRI.getType(Src1VecReg).getSizeInBits();
const unsigned Src2VecSize = MRI.getType(Src2VecReg).getSizeInBits();

assert(
Src1VecSize == Src2VecSize && (DstVecSize == 2 * Src1VecSize) &&
(I.getNumOperands() == 3) &&
"Vectors can only be concatenated if the size of the two operands are "
"the same and are, if added together, equal to the destination vector");

// We are using a Reg Sequence here instead of copies since using
// subregisters causes the SSA violations to occur since it sees %x_hi and
// %x_lo as the same register.
MachineInstrBuilder RegSeq;
const TargetRegisterClass *TRC = nullptr;
if (DstVecSize == 512) {
RegSeq = MIB.buildInstr(AIE2::REG_SEQUENCE, {DstVecReg}, {})
.addReg(Src1VecReg)
.addImm(AIE2::sub_256_lo)
.addReg(Src2VecReg)
.addImm(AIE2::sub_256_hi);
TRC = &AIE2::VEC512RegClass;
} else if (DstVecSize == 1024) {
RegSeq = MIB.buildInstr(AIE2::REG_SEQUENCE, {DstVecReg}, {})
.addReg(Src1VecReg)
.addImm(AIE2::sub_512_lo)
.addReg(Src2VecReg)
.addImm(AIE2::sub_512_hi);
TRC = &AIE2::VEC1024RegClass;
}

constrainOperandRegClass(*MF, TRI, MRI, TII, RBI, *RegSeq, *TRC,
RegSeq->getOperand(0));

I.eraseFromParent();
return true;
}

bool AIE2InstructionSelector::selectG_BRCOND(MachineInstr &I,
MachineRegisterInfo &MRI) {
Register CondReg = I.getOperand(0).getReg();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,8 @@ stack:
body: |
bb.0.entry:
; CHECK-LABEL: name: vconcat_1024
; CHECK: [[DEF:%[0-9]+]]:vec512 = IMPLICIT_DEF
; CHECK-NEXT: [[DEF1:%[0-9]+]]:vec512 = IMPLICIT_DEF
; CHECK: [[DEF:%[0-9]+]]:exe = IMPLICIT_DEF
; CHECK-NEXT: [[DEF1:%[0-9]+]]:exo = IMPLICIT_DEF
; CHECK-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vec1024 = REG_SEQUENCE [[DEF]], %subreg.sub_512_lo, [[DEF1]], %subreg.sub_512_hi
; CHECK-NEXT: PseudoRET implicit $lr, implicit [[REG_SEQUENCE]]
%0:vregbank(<16 x s32>) = G_IMPLICIT_DEF
Expand All @@ -29,20 +29,100 @@ body: |
...

---
name: vconcat_512
name: vconcat_512_32
legalized: true
regBankSelected: true
tracksRegLiveness: true
stack:
- { id: 0, name: "", size: 128, alignment: 32 }
body: |
bb.0.entry:
; CHECK-LABEL: name: vconcat_512
; CHECK: [[DEF:%[0-9]+]]:vec256 = IMPLICIT_DEF
; CHECK-NEXT: [[DEF1:%[0-9]+]]:vec256 = IMPLICIT_DEF
; CHECK-LABEL: name: vconcat_512_32
; CHECK: [[DEF:%[0-9]+]]:ewl = IMPLICIT_DEF
; CHECK-NEXT: [[DEF1:%[0-9]+]]:ewh = IMPLICIT_DEF
; CHECK-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vec512 = REG_SEQUENCE [[DEF]], %subreg.sub_256_lo, [[DEF1]], %subreg.sub_256_hi
; CHECK-NEXT: PseudoRET implicit $lr, implicit [[REG_SEQUENCE]]
%0:vregbank(<8 x s32>) = G_IMPLICIT_DEF
%1:vregbank(<8 x s32>) = G_IMPLICIT_DEF
%2:vregbank(<16 x s32>) = G_CONCAT_VECTORS %0(<8 x s32>), %1(<8 x s32>)
PseudoRET implicit $lr, implicit %2
...

---
name: vconcat_512_16
legalized: true
regBankSelected: true
tracksRegLiveness: true
stack:
- { id: 0, name: "", size: 128, alignment: 32 }
body: |
bb.0.entry:
; CHECK-LABEL: name: vconcat_512_16
; CHECK: [[DEF:%[0-9]+]]:ewl = IMPLICIT_DEF
; CHECK-NEXT: [[DEF1:%[0-9]+]]:ewh = IMPLICIT_DEF
; CHECK-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vec512 = REG_SEQUENCE [[DEF]], %subreg.sub_256_lo, [[DEF1]], %subreg.sub_256_hi
; CHECK-NEXT: PseudoRET implicit $lr, implicit [[REG_SEQUENCE]]
%0:vregbank(<16 x s16>) = G_IMPLICIT_DEF
%1:vregbank(<16 x s16>) = G_IMPLICIT_DEF
%2:vregbank(<32 x s16>) = G_CONCAT_VECTORS %0(<16 x s16>), %1(<16 x s16>)
PseudoRET implicit $lr, implicit %2
...

---
name: vconcat_1024_16
legalized: true
regBankSelected: true
tracksRegLiveness: true
stack:
- { id: 0, name: "", size: 128, alignment: 32 }
body: |
bb.0.entry:
; CHECK-LABEL: name: vconcat_1024_16
; CHECK: [[DEF:%[0-9]+]]:exe = IMPLICIT_DEF
; CHECK-NEXT: [[DEF1:%[0-9]+]]:exo = IMPLICIT_DEF
; CHECK-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vec1024 = REG_SEQUENCE [[DEF]], %subreg.sub_512_lo, [[DEF1]], %subreg.sub_512_hi
; CHECK-NEXT: PseudoRET implicit $lr, implicit [[REG_SEQUENCE]]
%0:vregbank(<32 x s16>) = G_IMPLICIT_DEF
%1:vregbank(<32 x s16>) = G_IMPLICIT_DEF
%2:vregbank(<64 x s16>) = G_CONCAT_VECTORS %0(<32 x s16>), %1(<32 x s16>)
PseudoRET implicit $lr, implicit %2
...

---
name: vconcat_512_8
legalized: true
regBankSelected: true
tracksRegLiveness: true
stack:
- { id: 0, name: "", size: 128, alignment: 32 }
body: |
bb.0.entry:
; CHECK-LABEL: name: vconcat_512_8
; CHECK: [[DEF:%[0-9]+]]:ewl = IMPLICIT_DEF
; CHECK-NEXT: [[DEF1:%[0-9]+]]:ewh = IMPLICIT_DEF
; CHECK-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vec512 = REG_SEQUENCE [[DEF]], %subreg.sub_256_lo, [[DEF1]], %subreg.sub_256_hi
; CHECK-NEXT: PseudoRET implicit $lr, implicit [[REG_SEQUENCE]]
%0:vregbank(<32 x s8>) = G_IMPLICIT_DEF
%1:vregbank(<32 x s8>) = G_IMPLICIT_DEF
%2:vregbank(<64 x s8>) = G_CONCAT_VECTORS %0(<32 x s8>), %1(<32 x s8>)
PseudoRET implicit $lr, implicit %2
...

---
name: vconcat_1024_8
legalized: true
regBankSelected: true
tracksRegLiveness: true
stack:
- { id: 0, name: "", size: 128, alignment: 32 }
body: |
bb.0.entry:
; CHECK-LABEL: name: vconcat_1024_8
; CHECK: [[DEF:%[0-9]+]]:exe = IMPLICIT_DEF
; CHECK-NEXT: [[DEF1:%[0-9]+]]:exo = IMPLICIT_DEF
; CHECK-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vec1024 = REG_SEQUENCE [[DEF]], %subreg.sub_512_lo, [[DEF1]], %subreg.sub_512_hi
; CHECK-NEXT: PseudoRET implicit $lr, implicit [[REG_SEQUENCE]]
%0:vregbank(<64 x s8>) = G_IMPLICIT_DEF
%1:vregbank(<64 x s8>) = G_IMPLICIT_DEF
%2:vregbank(<128 x s8>) = G_CONCAT_VECTORS %0(<64 x s8>), %1(<64 x s8>)
PseudoRET implicit $lr, implicit %2
Loading