[AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (llvm#77438)
Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait
instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into
separate LOADCNT, SAMPLECNT and BVHCNT counters.
jayfoad authored and ampandey-1995 committed Jan 19, 2024
1 parent 5af0b8d commit cc728c1
Showing 109 changed files with 5,939 additions and 3,912 deletions.
3 changes: 2 additions & 1 deletion llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
@@ -1242,7 +1242,8 @@ bool GCNHazardRecognizer::fixSMEMtoVectorWriteHazards(MachineInstr *MI) {
     case AMDGPU::S_WAITCNT: {
       const int64_t Imm = MI.getOperand(0).getImm();
       AMDGPU::Waitcnt Decoded = AMDGPU::decodeWaitcnt(IV, Imm);
-      return (Decoded.LgkmCnt == 0);
+      // DsCnt corresponds to LGKMCnt here.
+      return (Decoded.DsCnt == 0);
     }
     default:
       // SOPP instructions cannot mitigate the hazard.
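
The renamed field in this hunk can be illustrated outside LLVM with a minimal standalone sketch. It assumes a hypothetical `DecodedWaitSketch` struct and a hypothetical `waitMitigatesHazard` helper; neither is the real `AMDGPU::Waitcnt` type or any LLVM API. It only mirrors the shape of the check above: the wait counts as a mitigation when the decoded DsCnt field (which carries the LGKMcnt value in this code path) is zero.

```cpp
#include <cassert>

// Hypothetical stand-in for a decoded wait-count record.
// This is NOT the real AMDGPU::Waitcnt; field names follow the hunk above.
struct DecodedWaitSketch {
  unsigned DsCnt;    // carries the LGKMcnt value on targets without split counters
  unsigned StoreCnt; // unrelated counter, included only to show it is ignored
};

// Hypothetical helper mirroring `return (Decoded.DsCnt == 0);` from the diff:
// only a wait that drains the DS/LGKM counter to zero is treated as a mitigation.
inline bool waitMitigatesHazard(const DecodedWaitSketch &W) {
  return W.DsCnt == 0;
}

// Small usage example for the helper.
inline void waitSketchUsage() {
  DecodedWaitSketch Full{0, 5};    // waits for DsCnt == 0
  DecodedWaitSketch Partial{2, 0}; // leaves two DS/LGKM operations outstanding
  assert(waitMitigatesHazard(Full));
  assert(!waitMitigatesHazard(Partial));
}
```

Only the one field matters to the check; the value of any other counter in the record does not change the result.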
4 changes: 4 additions & 0 deletions llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -1200,6 +1200,10 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,

   bool hasRestrictedSOffset() const { return HasRestrictedSOffset; }
 
+  /// \returns true if the target uses LOADcnt/SAMPLEcnt/BVHcnt, DScnt/KMcnt
+  /// and STOREcnt rather than VMcnt, LGKMcnt and VScnt respectively.
+  bool hasExtendedWaitCounts() const { return getGeneration() >= GFX12; }
+
   /// Return the maximum number of waves per SIMD for kernels using \p SGPRs
   /// SGPRs
   unsigned getOccupancyWithNumSGPRs(unsigned SGPRs) const;
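
The counter correspondence described in the doc comment for `hasExtendedWaitCounts()` can be written out as a small standalone table. This is a sketch, not LLVM code: the `LegacyCounter` enum and the `countersFor` function are hypothetical names introduced for illustration, and the boolean parameter stands in for the result of the subtarget predicate.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical enum naming the three legacy counters mentioned in the doc comment.
enum class LegacyCounter { VMcnt, LGKMcnt, VScnt };

// Hypothetical lookup: returns the counter names that track the same events.
// With extended wait counts, VMcnt splits three ways and LGKMcnt splits in two,
// as stated in the doc comment; otherwise the legacy name is returned unchanged.
inline std::vector<std::string> countersFor(LegacyCounter C,
                                            bool HasExtendedWaitCounts) {
  using Names = std::vector<std::string>;
  switch (C) {
  case LegacyCounter::VMcnt:
    return HasExtendedWaitCounts ? Names{"LOADcnt", "SAMPLEcnt", "BVHcnt"}
                                 : Names{"VMcnt"};
  case LegacyCounter::LGKMcnt:
    return HasExtendedWaitCounts ? Names{"DScnt", "KMcnt"} : Names{"LGKMcnt"};
  case LegacyCounter::VScnt:
    return HasExtendedWaitCounts ? Names{"STOREcnt"} : Names{"VScnt"};
  }
  return {}; // unreachable for valid enum values
}
```

Called with `true`, `countersFor(LegacyCounter::VMcnt, true)` yields the three names LOADcnt, SAMPLEcnt and BVHcnt, which matches the commit message's note that VMCNT is split into separate counters and that each counter gets its own wait instruction (e.g. S_WAIT_LOADCNT).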