Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMDGPU] Add GFX12 WMMA and SWMMAC instructions #77795

Merged
merged 10 commits into from
Jan 24, 2024

Conversation

mbrkusanin
Copy link
Collaborator

No description provided.

@mbrkusanin
Copy link
Collaborator Author

mbrkusanin commented Jan 11, 2024

Note that the first commit in this PR is: #77785 (merged and removed from this PR)

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen mc Machine (object) code llvm:globalisel llvm:ir llvm:analysis labels Jan 11, 2024
@llvmbot
Copy link

llvmbot commented Jan 11, 2024

@llvm/pr-subscribers-mlir-llvm
@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-clang-codegen

Author: Mirko Brkušanin (mbrkusanin)

Changes

Patch is 1002.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77795.diff

62 Files Affected:

  • (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+61)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+159-13)
  • (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl (+156)
  • (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w64.cl (+155)
  • (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w32.cl (+135)
  • (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w64.cl (+134)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32.cl (+8-9)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w64.cl (+8-10)
  • (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w32.cl (+107)
  • (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w64.cl (+104)
  • (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w32.cl (+110)
  • (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w64.cl (+109)
  • (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+93-28)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUGISel.td (+27-3)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (+331-1)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h (+11-1)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+214-1)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+14-1)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+23)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (+16)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td (+16)
  • (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+141-5)
  • (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+8)
  • (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+5-3)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+37)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+4)
  • (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+3)
  • (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1)
  • (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+31-1)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrFormats.td (+5)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+8)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+12-1)
  • (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+5)
  • (modified) llvm/lib/Target/AMDGPU/VOP3PInstructions.td (+498-10)
  • (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+3)
  • (modified) llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll (+118-18)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+504)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-imm.ll (+519)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-iu-modifiers.ll (+309)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-swmmac-index_key.ll (+321)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32.ll (+370)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+459)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-imm.ll (+430)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-iu-modifiers.ll (+274)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-swmmac-index_key.ll (+472)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64.ll (+333)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+499)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-imm.ll (+431)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-iu-modifiers.ll (+309)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-swmmac-index_key.ll (+321)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32.ll (+370)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+456)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-imm.ll (+373)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-iu-modifiers.ll (+274)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-swmmac-index_key.ll (+472)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64.ll (+333)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w32.mir (+358)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w64.mir (+359)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w32.s (+1529)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w64.s (+1529)
  • (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w32.txt (+1628)
  • (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w64.txt (+1628)
diff --git a/clang/include/clang/Basic/BuiltinsAMDGPU.def b/clang/include/clang/Basic/BuiltinsAMDGPU.def
index e562ef04a30194..026c0af65c92bb 100644
--- a/clang/include/clang/Basic/BuiltinsAMDGPU.def
+++ b/clang/include/clang/Basic/BuiltinsAMDGPU.def
@@ -423,6 +423,67 @@ TARGET_BUILTIN(__builtin_amdgcn_s_wakeup_barrier, "vi", "n", "gfx12-insts")
 TARGET_BUILTIN(__builtin_amdgcn_s_barrier_leave, "b", "n", "gfx12-insts")
 TARGET_BUILTIN(__builtin_amdgcn_s_get_barrier_state, "Uii", "n", "gfx12-insts")
 
+//===----------------------------------------------------------------------===//
+// WMMA builtins.
+// Postfix w32 indicates the builtin requires wavefront size of 32.
+// Postfix w64 indicates the builtin requires wavefront size of 64.
+//
+// Some of these are very similar to their GFX11 counterparts, but they don't
+// require replication of the A,B matrices, so they use fewer vector elements.
+// Therefore, we add an "_gfx12" suffix to distinguish them from the existing
+// builtins.
+//===----------------------------------------------------------------------===//
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_f16_w32_gfx12, "V8fV8hV8hV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32_gfx12, "V8fV8sV8sV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f16_16x16x16_f16_w32_gfx12, "V8hV8hV8hV8h", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32_gfx12, "V8sV8sV8sV8s", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32_gfx12, "V8iIbV2iIbV2iV8iIb", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32_gfx12, "V8iIbiIbiV8iIb", "nc", "gfx12-insts,wavefrontsize32")
+// These are gfx12-only, but for consistency with the other WMMA variants we're
+// keeping the "_gfx12" suffix.
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w32_gfx12, "V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w32_gfx12, "V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w32_gfx12, "V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w32_gfx12, "V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x32_iu4_w32_gfx12, "V8iIbV2iIbV2iV8iIb", "nc", "gfx12-insts,wavefrontsize32")
+
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_f16_w64_gfx12, "V4fV4hV4hV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64_gfx12, "V4fV4sV4sV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f16_16x16x16_f16_w64_gfx12, "V4hV4hV4hV4h", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64_gfx12, "V4sV4sV4sV4s", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64_gfx12, "V4iIbiIbiV4iIb", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64_gfx12, "V4iIbiIbiV4iIb", "nc", "gfx12-insts,wavefrontsize64")
+// These are gfx12-only, but for consistency with the other WMMA variants we're
+// keeping the "_gfx12" suffix.
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w64_gfx12, "V4fiiV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w64_gfx12, "V4fiiV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w64_gfx12, "V4fiiV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w64_gfx12, "V4fiiV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x32_iu4_w64_gfx12, "V4iIbiIbiV4iIb", "nc", "gfx12-insts,wavefrontsize64")
+
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_f16_w32, "V8fV8hV16hV8fs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w32, "V8fV8sV16sV8fs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f16_16x16x32_f16_w32, "V8hV8hV16hV8hs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w32, "V8sV8sV16sV8ss", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w32, "V8iIbV2iIbV4iV8isIb", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w32, "V8iIbiIbV2iV8isIb", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w32, "V8iIbV2iIbV4iV8isIb", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w32, "V8fV2iV4iV8fs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w32, "V8fV2iV4iV8fs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w32, "V8fV2iV4iV8fs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w32, "V8fV2iV4iV8fs", "nc", "gfx12-insts,wavefrontsize32")
+
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_f16_w64, "V4fV4hV8hV4fs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w64, "V4fV4sV8sV4fs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f16_16x16x32_f16_w64, "V4hV4hV8hV4hs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w64, "V4sV4sV8sV4ss", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w64, "V4iIbiIbV2iV4isIb", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w64, "V4iIbiIbiV4isIb", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w64, "V4iIbiIbV2iV4isIb", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w64, "V4fiV2iV4fs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w64, "V4fiV2iV4fs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w64, "V4fiV2iV4fs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w64, "V4fiV2iV4fs", "nc", "gfx12-insts,wavefrontsize64")
 
 #undef BUILTIN
 #undef TARGET_BUILTIN
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 998fcc3af58175..c588b32f698bf5 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18240,65 +18240,211 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
   case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32:
   case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64:
   case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32:
-  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64: {
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64:
+  case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x32_iu4_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x32_iu4_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_f16_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_f16_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f16_16x16x32_f16_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f16_16x16x32_f16_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w64: {
 
     // These operations perform a matrix multiplication and accumulation of
     // the form:
     //             D = A * B + C
-    // The return type always matches the type of matrix C.
-    unsigned ArgForMatchingRetType;
+    // We need to specify one type for matrices AB and one for matrices CD.
+    SmallVector<unsigned, 2> ArgsForMatchingMatrixTypes;
+    // Some intrinsics expect "false" as an extra bool argument.
+    bool AppendExtraBoolArg = false;
     unsigned BuiltinWMMAOp;
 
     switch (BuiltinID) {
     case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w64:
-      ArgForMatchingRetType = 2;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_f16;
       break;
     case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64:
-      ArgForMatchingRetType = 2;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_bf16;
       break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w64_gfx12:
+      AppendExtraBoolArg = true;
+      LLVM_FALLTHROUGH;
     case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w64:
-      ArgForMatchingRetType = 2;
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f16_16x16x16_f16;
       break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64_gfx12:
+      AppendExtraBoolArg = true;
+      LLVM_FALLTHROUGH;
     case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64:
-      ArgForMatchingRetType = 2;
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_bf16_16x16x16_bf16;
       break;
     case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_tied_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_tied_w64:
-      ArgForMatchingRetType = 2;
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f16_16x16x16_f16_tied;
       break;
     case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_tied_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_tied_w64:
-      ArgForMatchingRetType = 2;
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_bf16_16x16x16_bf16_tied;
       break;
     case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64:
-      ArgForMatchingRetType = 4;
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {1, 4};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_i32_16x16x16_iu8;
       break;
     case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64:
-      ArgForMatchingRetType = 4;
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {1, 4};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_i32_16x16x16_iu4;
       break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
+      BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_fp8_fp8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
+      BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_fp8_bf8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
+      BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_bf8_fp8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
+      BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_bf8_bf8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x32_iu4_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x32_iu4_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {1, 4};
+      BuiltinWMMAOp = Intrinsic::amdgcn_wmma_i32_16x16x32_iu4;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_f16_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_f16_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_f16;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_bf16;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f16_16x16x32_f16_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f16_16x16x32_f16_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f16_16x16x32_f16;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_bf16_16x16x32_bf16;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w64:
+      ArgsForMatchingMatrixTypes = {1, 3, 4, 5};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_i32_16x16x32_iu8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w64:
+      ArgsForMatchingMatrixTypes = {1, 3, 4, 5};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_i32_16x16x32_iu4;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w64:
+      ArgsForMatchingMatrixTypes = {1, 3, 4, 5};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_i32_16x16x64_iu4;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_fp8_fp8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_fp8_bf8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_bf8_fp8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_bf8_bf8;
+      break;
     }
 
     SmallVector<Value *, 6> Args;
     for (int i = 0, e = E->getNumArgs(); i != e; ++i)
       Args.push_back(EmitScalarExpr(E->getArg(i)));
+    if (AppendExtraBoolArg)
+      Args.push_back(Builder.getFalse());
 
-    Function *F = CGM.getIntrinsic(BuiltinWMMAOp,
-                                   {Args[ArgForMatchingRetType]->getType()});
+    SmallVector<llvm::Type *, 6> ArgTypes;
+    for (auto ArgIdx : ArgsForMatchingMatrixTypes)
+      ArgTypes.push_back(Args[ArgIdx]->getType());
 
+    Function *F = CGM.getIntrinsic(BuiltinWMMAOp, ArgTypes);
     return Builder.CreateCall(F, Args);
   }
 
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl
new file mode 100644
index 00000000000000..a5d8bb34a7842d
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl
@@ -0,0 +1,156 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu gfx1200 -target-feature +wavefrontsize32 -S -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-GFX1200
+
+typedef int    v2i   __attribute__((ext_vector_type(2)));
+typedef float  v8f   __attribute__((ext_vector_type(8)));
+typedef half   v8h   __attribute__((ext_vector_type(8)));
+typedef short  v8s   __attribute__((ext_vector_type(8)));
+typedef int    v8i   __attribute__((ext_vector_type(8)));
+
+// Wave32
+
+//
+// amdgcn_wmma_f32_16x16x16_f16
+//
+
+// CHECK-GFX1200-LABEL: @test_amdgcn_wmma_f32_16x16x16_f16_w32(
+// CHECK-GFX1200-NEXT:  entry:
+// CHECK-GFX1200-NEXT:    [[TMP0:%.*]] = tail call <8 x float> @llvm.amdgcn.wmma.f32.16x16x16.f16.v8f16.v8f32(<8 x half> [[A:%.*]], <8 x half> [[B:%.*]], <8 x float> [[C:%.*]])
+// CHECK-GFX1200-NEXT:    store <8 x float> [[TMP0]], ptr addrspace(1) [[OUT:%.*]], align 32, !tbaa [[TBAA4:![0-9]+]]
+// CHECK-GFX1200-NEXT:    ret void
+//
+void test_amdgcn_wmma_f32_16x16x16_f16_w32(global v8f* out, v8h a, v8h b, v8f c)
+{
+  *out = __builtin_amdgcn_wmma_f32_16x16x16_f16_w32_gfx12(a, b, c);
+}
+
+//
+// amdgcn_wmma_f...
[truncated]

@llvmbot
Copy link

llvmbot commented Jan 11, 2024

@llvm/pr-subscribers-mc

Author: Mirko Brkušanin (mbrkusanin)

Changes

Patch is 1002.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77795.diff

62 Files Affected:

  • (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+61)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+159-13)
  • (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl (+156)
  • (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w64.cl (+155)
  • (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w32.cl (+135)
  • (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w64.cl (+134)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32.cl (+8-9)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w64.cl (+8-10)
  • (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w32.cl (+107)
  • (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w64.cl (+104)
  • (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w32.cl (+110)
  • (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w64.cl (+109)
  • (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+93-28)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUGISel.td (+27-3)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (+331-1)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h (+11-1)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+214-1)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+14-1)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+23)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (+16)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td (+16)
  • (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+141-5)
  • (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+8)
  • (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+5-3)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+37)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+4)
  • (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+3)
  • (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1)
  • (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+31-1)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrFormats.td (+5)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+8)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+12-1)
  • (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+5)
  • (modified) llvm/lib/Target/AMDGPU/VOP3PInstructions.td (+498-10)
  • (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+3)
  • (modified) llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll (+118-18)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+504)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-imm.ll (+519)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-iu-modifiers.ll (+309)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-swmmac-index_key.ll (+321)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32.ll (+370)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+459)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-imm.ll (+430)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-iu-modifiers.ll (+274)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-swmmac-index_key.ll (+472)
  • (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64.ll (+333)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+499)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-imm.ll (+431)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-iu-modifiers.ll (+309)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-swmmac-index_key.ll (+321)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32.ll (+370)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+456)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-imm.ll (+373)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-iu-modifiers.ll (+274)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-swmmac-index_key.ll (+472)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64.ll (+333)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w32.mir (+358)
  • (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w64.mir (+359)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w32.s (+1529)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w64.s (+1529)
  • (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w32.txt (+1628)
  • (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w64.txt (+1628)
diff --git a/clang/include/clang/Basic/BuiltinsAMDGPU.def b/clang/include/clang/Basic/BuiltinsAMDGPU.def
index e562ef04a30194..026c0af65c92bb 100644
--- a/clang/include/clang/Basic/BuiltinsAMDGPU.def
+++ b/clang/include/clang/Basic/BuiltinsAMDGPU.def
@@ -423,6 +423,67 @@ TARGET_BUILTIN(__builtin_amdgcn_s_wakeup_barrier, "vi", "n", "gfx12-insts")
 TARGET_BUILTIN(__builtin_amdgcn_s_barrier_leave, "b", "n", "gfx12-insts")
 TARGET_BUILTIN(__builtin_amdgcn_s_get_barrier_state, "Uii", "n", "gfx12-insts")
 
+//===----------------------------------------------------------------------===//
+// WMMA builtins.
+// Postfix w32 indicates the builtin requires wavefront size of 32.
+// Postfix w64 indicates the builtin requires wavefront size of 64.
+//
+// Some of these are very similar to their GFX11 counterparts, but they don't
+// require replication of the A,B matrices, so they use fewer vector elements.
+// Therefore, we add an "_gfx12" suffix to distinguish them from the existing
+// builtins.
+//===----------------------------------------------------------------------===//
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_f16_w32_gfx12, "V8fV8hV8hV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32_gfx12, "V8fV8sV8sV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f16_16x16x16_f16_w32_gfx12, "V8hV8hV8hV8h", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32_gfx12, "V8sV8sV8sV8s", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32_gfx12, "V8iIbV2iIbV2iV8iIb", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32_gfx12, "V8iIbiIbiV8iIb", "nc", "gfx12-insts,wavefrontsize32")
+// These are gfx12-only, but for consistency with the other WMMA variants we're
+// keeping the "_gfx12" suffix.
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w32_gfx12, "V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w32_gfx12, "V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w32_gfx12, "V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w32_gfx12, "V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x32_iu4_w32_gfx12, "V8iIbV2iIbV2iV8iIb", "nc", "gfx12-insts,wavefrontsize32")
+
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_f16_w64_gfx12, "V4fV4hV4hV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64_gfx12, "V4fV4sV4sV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f16_16x16x16_f16_w64_gfx12, "V4hV4hV4hV4h", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64_gfx12, "V4sV4sV4sV4s", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64_gfx12, "V4iIbiIbiV4iIb", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64_gfx12, "V4iIbiIbiV4iIb", "nc", "gfx12-insts,wavefrontsize64")
+// These are gfx12-only, but for consistency with the other WMMA variants we're
+// keeping the "_gfx12" suffix.
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w64_gfx12, "V4fiiV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w64_gfx12, "V4fiiV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w64_gfx12, "V4fiiV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w64_gfx12, "V4fiiV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x32_iu4_w64_gfx12, "V4iIbiIbiV4iIb", "nc", "gfx12-insts,wavefrontsize64")
+
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_f16_w32, "V8fV8hV16hV8fs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w32, "V8fV8sV16sV8fs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f16_16x16x32_f16_w32, "V8hV8hV16hV8hs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w32, "V8sV8sV16sV8ss", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w32, "V8iIbV2iIbV4iV8isIb", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w32, "V8iIbiIbV2iV8isIb", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w32, "V8iIbV2iIbV4iV8isIb", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w32, "V8fV2iV4iV8fs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w32, "V8fV2iV4iV8fs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w32, "V8fV2iV4iV8fs", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w32, "V8fV2iV4iV8fs", "nc", "gfx12-insts,wavefrontsize32")
+
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_f16_w64, "V4fV4hV8hV4fs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w64, "V4fV4sV8sV4fs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f16_16x16x32_f16_w64, "V4hV4hV8hV4hs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w64, "V4sV4sV8sV4ss", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w64, "V4iIbiIbV2iV4isIb", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w64, "V4iIbiIbiV4isIb", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w64, "V4iIbiIbV2iV4isIb", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w64, "V4fiV2iV4fs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w64, "V4fiV2iV4fs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w64, "V4fiV2iV4fs", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w64, "V4fiV2iV4fs", "nc", "gfx12-insts,wavefrontsize64")
 
 #undef BUILTIN
 #undef TARGET_BUILTIN
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 998fcc3af58175..c588b32f698bf5 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18240,65 +18240,211 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
   case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32:
   case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64:
   case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32:
-  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64: {
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64:
+  case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x32_iu4_w32_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x32_iu4_w64_gfx12:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_f16_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_f16_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f16_16x16x32_f16_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f16_16x16x32_f16_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w64:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w32:
+  case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w64: {
 
     // These operations perform a matrix multiplication and accumulation of
     // the form:
     //             D = A * B + C
-    // The return type always matches the type of matrix C.
-    unsigned ArgForMatchingRetType;
+    // We need to specify one type for matrices AB and one for matrices CD.
+    SmallVector<unsigned, 2> ArgsForMatchingMatrixTypes;
+    // Some intrinsics expect "false" as an extra bool argument.
+    bool AppendExtraBoolArg = false;
     unsigned BuiltinWMMAOp;
 
     switch (BuiltinID) {
     case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w64:
-      ArgForMatchingRetType = 2;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_f16_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_f16;
       break;
     case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64:
-      ArgForMatchingRetType = 2;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_bf16;
       break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w64_gfx12:
+      AppendExtraBoolArg = true;
+      LLVM_FALLTHROUGH;
     case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_w64:
-      ArgForMatchingRetType = 2;
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f16_16x16x16_f16;
       break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64_gfx12:
+      AppendExtraBoolArg = true;
+      LLVM_FALLTHROUGH;
     case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64:
-      ArgForMatchingRetType = 2;
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_bf16_16x16x16_bf16;
       break;
     case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_tied_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_f16_16x16x16_f16_tied_w64:
-      ArgForMatchingRetType = 2;
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f16_16x16x16_f16_tied;
       break;
     case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_tied_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_bf16_16x16x16_bf16_tied_w64:
-      ArgForMatchingRetType = 2;
+      ArgsForMatchingMatrixTypes = {0, 2};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_bf16_16x16x16_bf16_tied;
       break;
     case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64:
-      ArgForMatchingRetType = 4;
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {1, 4};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_i32_16x16x16_iu8;
       break;
     case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32:
     case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64:
-      ArgForMatchingRetType = 4;
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {1, 4};
       BuiltinWMMAOp = Intrinsic::amdgcn_wmma_i32_16x16x16_iu4;
       break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
+      BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_fp8_fp8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
+      BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_fp8_bf8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
+      BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_bf8_fp8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {0, 2};
+      BuiltinWMMAOp = Intrinsic::amdgcn_wmma_f32_16x16x16_bf8_bf8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x32_iu4_w32_gfx12:
+    case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x32_iu4_w64_gfx12:
+      ArgsForMatchingMatrixTypes = {1, 4};
+      BuiltinWMMAOp = Intrinsic::amdgcn_wmma_i32_16x16x32_iu4;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_f16_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_f16_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_f16;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf16_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_bf16;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f16_16x16x32_f16_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f16_16x16x32_f16_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f16_16x16x32_f16;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_bf16_16x16x32_bf16_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_bf16_16x16x32_bf16;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu8_w64:
+      ArgsForMatchingMatrixTypes = {1, 3, 4, 5};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_i32_16x16x32_iu8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x32_iu4_w64:
+      ArgsForMatchingMatrixTypes = {1, 3, 4, 5};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_i32_16x16x32_iu4;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_i32_16x16x64_iu4_w64:
+      ArgsForMatchingMatrixTypes = {1, 3, 4, 5};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_i32_16x16x64_iu4;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_fp8_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_fp8_fp8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_fp8_bf8_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_fp8_bf8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_fp8_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_bf8_fp8;
+      break;
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w32:
+    case AMDGPU::BI__builtin_amdgcn_swmmac_f32_16x16x32_bf8_bf8_w64:
+      ArgsForMatchingMatrixTypes = {0, 1, 2, 3};
+      BuiltinWMMAOp = Intrinsic::amdgcn_swmmac_f32_16x16x32_bf8_bf8;
+      break;
     }
 
     SmallVector<Value *, 6> Args;
     for (int i = 0, e = E->getNumArgs(); i != e; ++i)
       Args.push_back(EmitScalarExpr(E->getArg(i)));
+    if (AppendExtraBoolArg)
+      Args.push_back(Builder.getFalse());
 
-    Function *F = CGM.getIntrinsic(BuiltinWMMAOp,
-                                   {Args[ArgForMatchingRetType]->getType()});
+    SmallVector<llvm::Type *, 6> ArgTypes;
+    for (auto ArgIdx : ArgsForMatchingMatrixTypes)
+      ArgTypes.push_back(Args[ArgIdx]->getType());
 
+    Function *F = CGM.getIntrinsic(BuiltinWMMAOp, ArgTypes);
     return Builder.CreateCall(F, Args);
   }
 
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl
new file mode 100644
index 00000000000000..a5d8bb34a7842d
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl
@@ -0,0 +1,156 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu gfx1200 -target-feature +wavefrontsize32 -S -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-GFX1200
+
+typedef int    v2i   __attribute__((ext_vector_type(2)));
+typedef float  v8f   __attribute__((ext_vector_type(8)));
+typedef half   v8h   __attribute__((ext_vector_type(8)));
+typedef short  v8s   __attribute__((ext_vector_type(8)));
+typedef int    v8i   __attribute__((ext_vector_type(8)));
+
+// Wave32
+
+//
+// amdgcn_wmma_f32_16x16x16_f16
+//
+
+// CHECK-GFX1200-LABEL: @test_amdgcn_wmma_f32_16x16x16_f16_w32(
+// CHECK-GFX1200-NEXT:  entry:
+// CHECK-GFX1200-NEXT:    [[TMP0:%.*]] = tail call <8 x float> @llvm.amdgcn.wmma.f32.16x16x16.f16.v8f16.v8f32(<8 x half> [[A:%.*]], <8 x half> [[B:%.*]], <8 x float> [[C:%.*]])
+// CHECK-GFX1200-NEXT:    store <8 x float> [[TMP0]], ptr addrspace(1) [[OUT:%.*]], align 32, !tbaa [[TBAA4:![0-9]+]]
+// CHECK-GFX1200-NEXT:    ret void
+//
+void test_amdgcn_wmma_f32_16x16x16_f16_w32(global v8f* out, v8h a, v8h b, v8f c)
+{
+  *out = __builtin_amdgcn_wmma_f32_16x16x16_f16_w32_gfx12(a, b, c);
+}
+
+//
+// amdgcn_wmma_f...
[truncated]

Comment on lines +440 to +454
TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32_gfx12, "V8iIbV2iIbV2iV8iIb", "nc", "gfx12-insts,wavefrontsize32")
TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32_gfx12, "V8iIbiIbiV8iIb", "nc", "gfx12-insts,wavefrontsize32")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the signed-unsigned fusion with control bit is a bit ugly, would have been nicer to have separate signed and unsigned variants with the types changed. I suppose this was already the mistake made with the gfx11 builtins though

clang/lib/CodeGen/CGBuiltin.cpp Outdated Show resolved Hide resolved
TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x32_iu4_w32_gfx12, "V8iIbV2iIbV2iV8iIb", "nc", "gfx12-insts,wavefrontsize32")

TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_f16_w64_gfx12, "V4fV4hV4hV4f", "nc", "gfx12-insts,wavefrontsize64")
TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64_gfx12, "V4fV4sV4sV4f", "nc", "gfx12-insts,wavefrontsize64")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we switch new bf16 types to use the natural __bf16?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to bfloat but GlobalISel does not handle it properly yet. Should we use i16 for now until we update GlobalISel?

@jayfoad
Copy link
Contributor

jayfoad commented Jan 19, 2024

Some of the tests in this patch need regenerating now that #77438 has been merged.

@mbrkusanin
Copy link
Collaborator Author

Rebased.

@mbrkusanin mbrkusanin force-pushed the gfx12-wmma-swmmac branch 2 times, most recently from 46509ac to 732186b Compare January 22, 2024 17:08
@mbrkusanin
Copy link
Collaborator Author

If there are no further comments, should I merge this?

@mbrkusanin
Copy link
Collaborator Author

Ping

// The content of the other 16-bit half is undefined.
// GFX12: The op_sel bit must be 0.
def int_amdgcn_wmma_f16_16x16x16_f16 : AMDGPUWmmaIntrinsicOPSEL<llvm_anyfloat_ty, llvm_anyfloat_ty>;
def int_amdgcn_wmma_bf16_16x16x16_bf16 : AMDGPUWmmaIntrinsicOPSEL<llvm_any_ty, llvm_any_ty>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this using any_Ty? Should just be the one?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sizes are halved. GFX11 basically contained same matrix twice.

This is how intrinsics look like at the moment:
gfx11:
declare <16 x i16> @llvm.amdgcn.wmma.bf16.16x16x16.bf16(<16 x i16>, <16 x i16> , <16 x i16>, i1 immarg)
gfx12:
declare <8 x bfloat> @llvm.amdgcn.wmma.bf16.16x16x16.bf16(<8 x bfloat>, <8 x bfloat>, <8 x bfloat>, i1 immarg)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of horrible. It's not at all clear you're supposed to use one type for one target and a different one for another. I wonder if they should just be renamed?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we take a step back, and push the previous version of the patch where the bf16 intrinsics used i16 for consistency with gfx11.

Then, in a follow-up commit we will add new bf16 intrinsics with the proper bfloat type (I realize the naming could be contentious, but we could discuss it in the follow-up review).

@mbrkusanin
Copy link
Collaborator Author

Rebased and reverted bfloat

@mbrkusanin
Copy link
Collaborator Author

Rebased and updated after #76143

@mbrkusanin mbrkusanin merged commit 7fdf608 into llvm:main Jan 24, 2024
3 of 4 checks passed
@mbrkusanin mbrkusanin deleted the gfx12-wmma-swmmac branch January 24, 2024 12:43
@anlunx
Copy link
Member

anlunx commented Jan 25, 2024

Also need to be updated:

def SMEMOffsetMod : NamedIntOperand<i32, "offset", 0>;

@jayfoad
Copy link
Contributor

jayfoad commented Jan 25, 2024

Also need to be updated:

def SMEMOffsetMod : NamedIntOperand<i32, "offset", 0>;

What needs to be updated and why?

mbrkusanin added a commit that referenced this pull request Jan 25, 2024
Co-authored-by: Petar Avramovic <[email protected]>
Co-authored-by: Piotr Sobczak <[email protected]>
mbrkusanin added a commit to mbrkusanin/llvm-project that referenced this pull request Jan 26, 2024
Co-authored-by: Petar Avramovic <[email protected]>
Co-authored-by: Piotr Sobczak <[email protected]>
tstellar pushed a commit that referenced this pull request Jan 27, 2024
Co-authored-by: Petar Avramovic <[email protected]>
Co-authored-by: Piotr Sobczak <[email protected]>
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Mar 28, 2024
Co-authored-by: Petar Avramovic <[email protected]>
Co-authored-by: Piotr Sobczak <[email protected]>
Change-Id: I6ab1132823033fb047665f3a527cff748ff69589
@pointhex pointhex mentioned this pull request May 7, 2024
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Aug 23, 2024
Co-authored-by: Petar Avramovic <[email protected]>
Co-authored-by: Piotr Sobczak <[email protected]>
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Aug 23, 2024
Co-authored-by: Petar Avramovic <[email protected]>
Co-authored-by: Piotr Sobczak <[email protected]>
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Sep 5, 2024
Co-authored-by: Petar Avramovic <[email protected]>
Co-authored-by: Piotr Sobczak <[email protected]>
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Sep 6, 2024
Co-authored-by: Petar Avramovic <[email protected]>
Co-authored-by: Piotr Sobczak <[email protected]>
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Sep 9, 2024
Co-authored-by: Petar Avramovic <[email protected]>
Co-authored-by: Piotr Sobczak <[email protected]>
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Sep 10, 2024
Co-authored-by: Petar Avramovic <[email protected]>
Co-authored-by: Piotr Sobczak <[email protected]>

[AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (llvm#78414)

…bf8 instructions

    Add VOP1, VOP1_DPP8, VOP1_DPP16, VOP3, VOP3_DPP8, VOP3_DPP16
    instructions that were supported on GFX940 (MI300):
    - V_CVT_F32_FP8
    - V_CVT_F32_BF8
    - V_CVT_PK_F32_FP8
    - V_CVT_PK_F32_BF8
    - V_CVT_PK_FP8_F32
    - V_CVT_PK_BF8_F32
    - V_CVT_SR_FP8_F32
    - V_CVT_SR_BF8_F32

---------

Co-authored-by: Mateja Marjanovic <[email protected]>
Co-authored-by: Mirko Brkušanin <[email protected]>
(cherry picked from commit cfddb59)

[RISCV] Support __riscv_v_fixed_vlen for vbool types. (llvm#76551)

This adopts a similar behavior to AArch64 SVE, where bool vectors are
represented as a vector of chars with 1/8 the number of elements. This
ensures the vector always occupies a power of 2 number of bytes.

A consequence of this is that vbool64_t, vbool32_t, and vool16_t can
only be used with a vector length that guarantees at least 8 bits.

[Docs] Fix documentation build.

Missing ending `` after c92ad41

Backport '[clang] static operators should evaluate object argument (reland)' to release/18.x (llvm#80109)

Cherry picked from commit ee01a2c.

Closes llvm#80041, backport llvm#80108.

Co-authored-by: Shafik Yaghmour <[email protected]>
Co-authored-by: cor3ntin <[email protected]>
Co-authored-by: Aaron Ballman <[email protected]>

PR for llvm#79568 (llvm#80120)

Backporting llvm#79568 to clang 18.

[docs] Add release notes for Windows specific changes in 18.x (llvm#80011)

[AArch64] Add some release notes items (llvm#79983)

[C++20] [Modules] Don't perform ODR checks in GMF

Close llvm#79240.

See the linked issue for details. Given the frequency of issue reporting
about false positive ODR checks (I received private issue reports too),
I'd like to backport this to 18.x too.

[clang] Fix unexpected `-Wconstant-logical-operand` in C23 (llvm#80724)

C23 has `bool`, but logical operators still return int. Check that
we're not in C to avoid false-positive -Wconstant-logical-operand.

Fixes llvm#64356

(cherry picked from commit a18e92d)

[18.x][Docs] Add release note about Clang-defined target OS macros (llvm#80044)

The change is included in the 18.x release. Move the release note to the
release branch and reformat.

(cherry picked from commit b40d5b1)

ReleaseNotes: mention -mtls-dialect=desc (llvm#82731)

[Clang] Fixes to immediate-escalating functions (llvm#82281)

* Consider that immediate escalating function can appear at global
scope, fixing a crash

* Lambda conversion to function pointer was sometimes not performed in
an immediate function context when it should be.

Fixes llvm#82258

(cherry picked from commit baf6bd3)

[Clang] [Sema] Handle placeholders in '.*' expressions (llvm#83103)

When analysing whether we should handle a binary expression as an
overloaded operator call or a builtin operator, we were calling
`checkPlaceholderForOverload()`, which takes care of any placeholders
that are not overload sets—which would usually make sense since those
need to be handled as part of overload resolution.

Unfortunately, we were also doing that for `.*`, which is not
overloadable, and then proceeding to create a builtin operator anyway,
which would crash if the RHS happened to be an unresolved overload set
(due hitting an assertion in `CreateBuiltinBinOp()`—specifically, in one
of its callees—in the `.*` case that makes sure its arguments aren’t
placeholders).

This pr instead makes it so we check for *all* placeholders early if the
operator is `.*`.

It’s worth noting that,
1. In the `.*` case, we now additionally also check for *any*
placeholders (not just non-overload-sets) in the LHS; this shouldn’t
make a difference, however—at least I couldn’t think of a way to trigger
the assertion with an overload set as the LHS of `.*`; it is worth
noting that the assertion in question would also complain if the LHS
happened to be of placeholder type, though.
2. There is another case in which we also don’t perform overload
resolution—namely `=` if the LHS is not of class or enumeration type
after handling non-overload-set placeholders—as in the `.*` case, but
similarly to 1., I first couldn’t think of a way of getting this case to
crash, and secondly, `CreateBuiltinBinOp()` doesn’t seem to care about
placeholders in the LHS or RHS in the `=` case (from what I can tell,
it, or rather one of its callees, only checks that the LHS is not a
pseudo-object type, but those will have already been handled by the call
to `checkPlaceholderForOverload()` by the time we get to this function),
so I don’t think this case suffers from the same problem.

This fixes llvm#53815.

---------

Co-authored-by: Aaron Ballman <[email protected]>

[InstCombine] Fix miscompilation in PR83947 (llvm#83993)

https://github.com/llvm/llvm-project/blob/762f762504967efbe159db5c737154b989afc9bb/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp#L394-L407

Comment from @topperc:
> This transforms assumes the mask is a non-zero splat. We only know its
a splat and not provably all 0s. The mask is a constexpr that includes
the address of the global variable. We can't resolve the constant
expression to an exact value.

Fixes llvm#83947.

SystemZ release notes for 18.x. (llvm#84560)

Remove support for EXPORTAS in def files to maintain ABI compatibility for COFFShortExport

[clang][Sema] Fix a CTAD regression after 42239d2 (llvm#86914)

The most recent declaration of a template as a friend can introduce a
different template parameter depth compared to what we anticipate from a
CTAD guide.

Fixes llvm#86769

[clang] Avoid -Wshadow warning when init-capture named same as class field (llvm#74512)

Shadowing warning doesn't make much sense since field is not available
in lambda's body without capturing this.

Fixes llvm#71976

[SLP]Fix a crash if the argument of call was affected by minbitwidth analysis.

Need to support proper type conversion for function arguments to avoid
compiler crash.

Fix override keyword being print to the left side

Previously, the `override` keyword in C++ was being print in the left
side of a method decl, which is unsupported by C++ standard. This commit
fixes that by setting the `CanPrintOnLeft` field to 0, forcing it to be
print on the right side of the decl.

Signed-off-by: Giuliano Belinassi <[email protected]>

[clang codegen] Fix MS ABI detection of user-provided constructors. (llvm#90151)

In the context of determining whether a class counts as an "aggregate",
a constructor template counts as a user-provided constructor.

Fixes llvm#86384

(cherry picked from commit 3ab4ae9)

release/18.x: [libclc] Fix linking against libIRReader

Fixes llvm#91551

Update llvm/test/Transforms/InstCombine/bit_ceil.ll

Co-authored-by: Yingwei Zheng <[email protected]>

[RISCV] Add a unaligned-scalar-mem feature like we had in clang 17.

This is ORed with the fast-unaligned-access feature which applies
to scalar and vector together.:

Regular squash.

Cobol/PlI changes from 785ddc60@https://gitlab.phidani.be/Chirag.Patel/lldb.git

Cobol/PLI support added from 6cefb217f097ac@https://gitlab.phidani.be/Chirag.Patel/llvm.git

[LLDB] added lldb rpmbuild spec file

[RPMBuild] Added lldb rpmbuild support.

[LLDBRpm] added version for yum update.

[lldb_rpm] minor cleanup.

Build fix rleated to RTTI.

build fix.

[DWARFASTParserLegacy] Initial support for union type.

[lldb][LegacyTypeSystem] Changed struct/union member offset from bytes to bits to support DW_AT_data_bit_offset.

[lldbrpm] changed version suffix.

[LegacyASTContext] Fix var string length display.

[LLDB][CobolUserExpression] Added AST node Function call place holder.

build fix.

[LLVM][Test] Fixed assembler round trip test.

[LLDB][CobolUserExpression] Added ast evaluation for sizeof operator.

[LLDB][CobolUserExpression] Added placeholder in parser for func call.

[LLDB][CobolUserExpression] Added sizeof operator, temporary placeholder for LENGTH OF.

[LLDB][PLIUserExpression] Fixed array indexing.

[LLDB][ValueObjectPrinter] Skip summary if custom format is requested.

[LLDB][PLILanguage] changed bitset size, read from array type name.

[LLDB][ValueObject] Fixed pli var string length read.

[LLDB][PLILanguage] Fixed support for var string summary formatter.

[LLVM][DIBuilder][C-API] Added changes to add lexical scope info for auto variable for functions.

[LLVM][AsmCodegen] Added raincode extention AT_lexical_scope.

[LLDB][SymbolFileDWARF] Added support for RAINCODE_lexical_scope attribute.

[LLDB][PLIUserExpression] Added sizeof operator support, it will be renamed to proper functiona name later.

[LLDB][ValueObject] minor cleanup.

[LLDB][PLIUserExpression] Added STORAGE/STG builtin func support.

[LLDB][PLIUserExpression][CobolUserExpression] Added LENGTH() for Cobol and STG/STORAGE() for PL/I, removed sizeof operator for both.

[LLDB][DWARFASTParserLegacy] Added placeholder for DW_TAG_reference_type.

[LLVM][CodeGen][AsmPrinter] Added DW_AT_name attribute to TAG_array_type.

[LLDB][DWARFExpression] minor cleanup.

[LLDB][PLILanguage] Fixed bitset array size read from array type.

[LLDB][LegacyASTContext] Moved bit array calculation to type system.

[LLVM][AsmPrinter] Fixed TAG_array duplicate attribute export.

[LLVM][C-API] Changes array type api.

[LLDB][PLI/Cobol UserExpression] Fixed array indexing, removed c style [] index access.

[LLDB][PLIUserExpression] minor fix.

[LLDB] Fixed bug relating struct member name access for Cobol/PL1.

fixed coding style/whitespace/typos.

[LLDB][LegacyTypeSystem] Place holder beautification for edited types.

[LLDB] rpm version upgrade.

[LLDB][CobolUserExpression] Fixed pic string ref modifier.

[LLDB][TypeSystem] Added support to mutate existing type length, fixed cobolUserExpression refmod for display type.

[LLDB][CobolUserExpression] Fixed lower bound ref modifier.

[LLDB][Cobol/PLI UserExpression] Fixed error while searching for var, do not fully resolve type.

[LLDB][RPM] fixed rpm version.

cleanup, reduce number of changes from trunk.

cleanup, reduce number of changes fomr trunk.

build fix.

Build fix.

Build fix.

[LegacyASTContext] Fixed bug relating packed decimal going through ebcdic iconv.

[LLVM][DebugInfo] Fixed DIExpression node uniquness issue.

build fix.

cleanup.

Refactoring, moved LegacyASTContext to Plugin typeSystem Legacy.

Refactoring, renamed TypeSystem class.

build fix.

build fix, cleanup.

cleanup

fixed assert failure.

[CobolUserExpression] Added placeholder for simpe assignment operator.

[CobolUserExpression] Added basic support for assigment to variables.

[CobolUserExpression] Added cobol move-to set-to syntex support.

[LegacyTypeSystem][CobolUserExpression] Added literal type double,string. Added TypeSystem Encoding helper functions.

[CobolLexer] Added string,float literal type support.

[PLIUserExpression] Added assignment operator support.

[CobolUserExpression] Added assignment comp-3 support.

[PLIUserExpression] Assignment added pli display type support.

temporary build fix. DIExpression asmprinter, print null for invalid entry.

[Cobol/PLI UserExpression] Assignment endianity bug fix.

[Cobol/PLIUserExpression] fixed data extractpr assert, fixed int precision convertion written as zero.

[Cobol/PLIUserExpression] Assignment display type assert failure.

build fix.

[PLIUserExpression] Added support for var string assignment.

[LegacyUserExpression] Simple semantic check place-holder.

[CobolUserexpression] Assignment expression fixed display, array types.

[TypeSystem] fixed encode int precision bug for i64 to i32.

[PLI/Cobol UserExpression] fixed support for assinment into refmod data type.

[CobolUserExpression] Fixed assignment display/comp-3 regression.

temporary build fix. python3 lib on server needs few changes for this build.

[TypeSystemLegacy] Fixed edited type display, skip formatting for edited type.

[lldbrpm] package lldb-python-script too.

[CobolUserExpression] Fixed SelctorOf expression with array index access e.g (lldb)p LastName1 of VAR(1) of TAB.

[CobolUserExpression] Assignment to packed decimal fixed, added digit count read support from dwarf instead of runtime calculation.

[CoboUserExpression] Fixed assignment string invalid byte order.

[PLIUserExpression] Fixed string padding with space.

[CobolUserExpression] Fixed Assignment string space padding.

[LegacyTypeSysten] fixed crash in encoding due to long length the assignment.

[CobolUserExpression][PLIUserExpression] fixed segfault.

[DebugInfo] export identifier case as insensitive for PLI/Cobol compiled units.

[TypeSystemLegacy] Fixed minor bug with dataencoding.

rebase build fix.

[StackFrame] fixed support for cobol/pli modref select syntex.

case-insensitive breakpoint resolution for PLI/Cobol languages.

cleanup.

build fix.

added initial support for TAG_dynamic_type.

added c/c++ api to create dynamic type debug info.

[DebugInfo] Added support to generate dwarf attribute  DW_AT_allocated for DW_TAG_dynamic_type

[PLIUserExpression][CobolUserExpression] Fixed name variable lookup for few cases.

[DWARFASTParserLegacy] Initial support to parse TAG_dynamic_type.

[AsmPrinter] Fix minor mistake for TAG_dynamic DW_AT_allocated.

[TypeSystemLegacy] Added dynamic type place holder.

[LLVM][AsmPrinter] Allow OP_call2/4 expression on local variable location.

build fix.

[LLDB][CompilerType] Added support to fetch dynamic type info.

[lldb][ValueObjectVariable] Added dynamic variable read support.

[LLDB][ValueObjectVariable] Added allocated check for dynamic types.

[LLDB][ValueObjectVariable] fixed TAG_dynamic type attributes optional.

[LLVM][DebugInfoMetadata] Fixed minor function call.

[LLDB][TypeSystemLegacy] Added dynamic type info support.

build fix.

for jekins, use python3  sharedlibs

lldbrpm use python3.

temporary build fix.

[LLDB][DWARFExpression] Added temporary operation extension for address calculation with file address in dwarf v5.

[LLVM][CodeGen] Fixed dynamic type dwarf expression call2/call4 assert.

[LLVM][Verifier] Added dynamic type check.

[LLVM][Verifier] Added debugInfo verifier dynamic type extra checks.

[LLDB][TypeSystemLegacy] Added check to avoid direct nested dynaic types.

[LLVM][DebugInfo] Adding DW_OP_call2/4 support in TAG_subrange attributes DW_AT_lower_bound, DW_AT_upper_bound.

[LLDB] Added option to hide frames with invalid line entry target.hide-invalid-legacy-frames, this is a temporary placeholder and it will be moved to more suitable location in future.

[LLDB][DataFormatters] Fixed printing of char arrays with non-default format.

[LLDB][StackFrame] Added check for member name lookup to reject array of structs.

[lldb][DataFormatters] fixed multi-dimesional string formatting.

[LLDB][ValueObjectVariable] cleanup: proper error message.

[LLVM][DwarfUnit] Added DW_OP_call2/call4 support for array type.

[LLVM][DwarfCompileUnit] fixed assert failure with DW_OP_call2/call4.

[DIBuilder] Added DW_AT_static_link support.

[LLVM][C-API][DebugInfo] Added support for DW_AT_static_link.

[DebugInfo] fixed minor bug with Staticlink attribute generation.

[DebugInfo] static link cleanup.

rebase build fix.

[LLDB][DWARFParser] Added initial support to parse DW_AT_static_link.

[LLDB][StackFrame] Added support to read static link address.

[LLDB][StackFrameList] Added helper function to search stack list using static link.

[LLDB][ValueObjectPrinter] regression fix for hex format value print.

[LLDB] build fix.

[LLDB][ExpressionParser] bug fixed for positive int expression e.g. p move +3 to var.

[LLDB][TypeSystemLegacy] Fixed bcd signed preferred value encoding.

[LLVM][DebuggerTuning] default tune for lldb.

[LLDB][TypeSystemLegacy] iconv try approximate and ignore if not possible, for character decoding.

rebase build fix.

[LLDB][CobolUserExpression][PLIUserExpression] fixed variable name overwriting.

[LLDB][UserExpression] Temporary revert variable name bug.

rebase build fix.

rebase build fix.

rebase build fix.

initial placeholder for DW_AT_RAINCODE_static_link_recv.

[LLDB][CobolUserExpression][PLIUserExpression] fixed variable name overwrite.

[LLDB][Test] fixed UnsupportedLanguage test failure.

[LLDB][CobolUserExpression] Place holder for compare operations.

lldbrpm, temporary skip python dir.

[CobolUserExpression] Adding placeholder for equality comparision.

[PLIUserExpression] PLILexer, added partial support for comparision operators.

[LLDB][DataExtractor] bytes compare func.

rebase build fix.

rebase build fix.

Added DW_AT_RAINCODE_frame_base

Patch by Amin!

[LLDB][DWARFParser] Added support to parse DW_AT_RAINCODE_frame_base.

build fix.

[LLVM] Fix dynamic type

[LLVM-C][API] Add api to create a dynamic DISubrange

[LLDB] Add support for DW_AT_count as a DWARFExpression

- Add DWARFExpression in ArrayInfo;
- Add LegacyDynamicArray type for dynamic arrays;
- Evaluate count expression every time we re-evaluate DW_AT_location.

Rebase and fix compilation failures

Only print case sensitiveness if source language is
Cobol or PL/1.
Fixes the following regressions:
LLVM :: DebugInfo/X86/dwarf-public-names.ll
LLVM :: DebugInfo/X86/length_symbol_difference.ll
LLVM :: MC/X86/dwarf-size-field-overflow.test
LLVM :: tools/llvm-dwarfdump/X86/statistics.ll

(cherry picked from commit ff848081162f81ef3c5d8f447b6c28dd564d4ada)

Use correct record size of DIDerivedType
Use last index for Annotations

replace dyn_cast with dyn_cast_or_null to handle invalid input smoothly

Rebasing on LLVM-17-init and fixes regressions

LZLANG-2470 valgrind vs. lldb_private::TargetCharsetReader::convert
    - remove the static buffer_length variable, which may not be big enough.
    - remove the loop
    - add lldb console errno logging when there is an iconv error.

(cherry picked from commit 120402f28f787a90f65f725307519343b5937fee)

LZLANG-2470 Fixes for previous lldb_private::TargetCharsetReader::convert changes.

(cherry picked from commit 918c9b62a63b71347ebee5a7ccd0bd42bbdfc118)

Lexer Bug Fix

COBOL/PLI lexer would return variable name with '\n' at the start.

1155199180

(cherry picked from commit 7266c35747b19a11081b3fab07f6773bfb15fa1f)

Ported Abhishek's Fix
-Set is_singed for int variables

[lldb] Bridge the gap when debugging the variable with command and codelldb

(cherry picked from commit d88ad8abed856d239628d4cda3fad393fef1ba0e)

Build Fixes after cherry-pick previous commit

strings set by codelldb must be enclosed in quotes

(cherry picked from commit 0072c09fbe9f5ead6bde25060dc8e9f4265989b3)

Bug fix:

p var = val in PLI didn't work

(cherry picked from commit 9f3d16f85434cbd17e26d429622cd6b557eddacb)

Port Abhisheks Fixes
-Fix for MOVE val TO VAR

[lldb] Added the DemangledNameContainsPath overload for pli/cobol

(cherry picked from commit 552cf62d001beb59327e4fb81cd4620ee0d62c55)

Fix warnings

Fields of a struct array can now be used with `p`

e.g FIELD(5) is equivalent to FIELD OF ARR(5)

See ticket 1152892604

(cherry picked from commit 5e02341b015fddaca13a674b34228fe2b080a54c)

Cobol-style multi-index support added

(cherry picked from commit 7b0e7ae494ca2a9799e1f09d87146113de2e0f38)

Fixed LENGTH(var) expression

-get the size of var from lldb

(cherry picked from commit 50657e2e7b2ec81a13764ca0105c130cc95ccfc7)

Warning Fixes

Make breakpoint Cases Insensitive

Fixed Build and Regression failure after rebase

Fixed warnings seen during lldb build

[lldb] Store real bitwidth from debuginfo in Scalar Type

Storing in higher bitwidth than required or specified by debug info
creates problem when byteswap is done.

Make comparison of breakpoint names case insensitive in `findEntryOffsetInCurrentIndex`

1156642284

typo fix: s/key/Key/

[lldb] Fix DWARFASTParser to correctly parse DW_AT_count for dynamic arrays

[lldb] Change the way we look for variables in StackFrame for Legacy Languages 1156032652

[lldb] Bugfix in LENGTH(var) [cobol] and STG(var) [pli]

We were encoding 4 bytes of LENGTH data and reading 8 bytes which cause a problem.
Using size_t instead of uint32_t fixes the problem.

[lldb] Fix cast failure in FindFieldInStructArray

Complicated expressions in lldb broke the assumption that the expression is an identifier, thus we got a cast error.
This fix removes that from happening and also fixes the bug that if the identifier is an array itself the last index specified in the input is used to index that variable itself.

e.g 01 SAMPLE-TABLE.
           05 TABLE-DEPTH OCCURS 3 TIMES.
             10 TABLE-ROW OCCURS 3 TIMES.
               15 TABLE-COLUMN OCCURS 3 TIMES PIC 9(8).

Here TABLE-ROW(1, 2) means second element of TABLE-ROW OF TABLE-DEPTH(1).

Revert "[lldb] Fix cast failure in FindFieldInStructArray"

This reverts commit c1bab0e0b6a798698196434c7bb6cbe391fcdc1b.

[lldb] Add support for IBM array-indexing syntax

see 1156841764

[lldb] Fix cast error and support non-ibm indexing syntax

see 1156841764

[lldb] Fixes After Rebase on llvmorg-18.1.4

[lldb] Fix bug in display of varying PLI strings

See 1156884604

The STG function also should include the prefix when counting the size, which for now is 2 bytes for all strings because the PLI compiler doesn't support COMPAT(V3) version.
If in the future we do support it, we would need to fix this again.

(cherry picked from commit 4b39f3e1b55c3df09f5cb89dcdd347682f790ba9)

[lldb] Add basic support for Level88 conditions

[lldb] Add support for calling the runtime function rc_cob_level88 directly from the "p" command

[lldb] Print the value of level88 variables as true/false with parent name.

Prints the value of level88 condition names by calling the runtime functions and formatting it nicely.

[lldb] Add support for indexed level88 variables

[lldb] Fixes After Rebase on llvm main

[LLDB] Preparation for upstream
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Sep 10, 2024
Co-authored-by: Petar Avramovic <[email protected]>
Co-authored-by: Piotr Sobczak <[email protected]>

[AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (llvm#78414)

…bf8 instructions

    Add VOP1, VOP1_DPP8, VOP1_DPP16, VOP3, VOP3_DPP8, VOP3_DPP16
    instructions that were supported on GFX940 (MI300):
    - V_CVT_F32_FP8
    - V_CVT_F32_BF8
    - V_CVT_PK_F32_FP8
    - V_CVT_PK_F32_BF8
    - V_CVT_PK_FP8_F32
    - V_CVT_PK_BF8_F32
    - V_CVT_SR_FP8_F32
    - V_CVT_SR_BF8_F32

---------

Co-authored-by: Mateja Marjanovic <[email protected]>
Co-authored-by: Mirko Brkušanin <[email protected]>
(cherry picked from commit cfddb59)

[RISCV] Support __riscv_v_fixed_vlen for vbool types. (llvm#76551)

This adopts a similar behavior to AArch64 SVE, where bool vectors are
represented as a vector of chars with 1/8 the number of elements. This
ensures the vector always occupies a power of 2 number of bytes.

A consequence of this is that vbool64_t, vbool32_t, and vool16_t can
only be used with a vector length that guarantees at least 8 bits.

[Docs] Fix documentation build.

Missing ending `` after c92ad41

Backport '[clang] static operators should evaluate object argument (reland)' to release/18.x (llvm#80109)

Cherry picked from commit ee01a2c.

Closes llvm#80041, backport llvm#80108.

Co-authored-by: Shafik Yaghmour <[email protected]>
Co-authored-by: cor3ntin <[email protected]>
Co-authored-by: Aaron Ballman <[email protected]>

PR for llvm#79568 (llvm#80120)

Backporting llvm#79568 to clang 18.

[docs] Add release notes for Windows specific changes in 18.x (llvm#80011)

[AArch64] Add some release notes items (llvm#79983)

[C++20] [Modules] Don't perform ODR checks in GMF

Close llvm#79240.

See the linked issue for details. Given the frequency of issue reporting
about false positive ODR checks (I received private issue reports too),
I'd like to backport this to 18.x too.

[clang] Fix unexpected `-Wconstant-logical-operand` in C23 (llvm#80724)

C23 has `bool`, but logical operators still return int. Check that
we're not in C to avoid false-positive -Wconstant-logical-operand.

Fixes llvm#64356

(cherry picked from commit a18e92d)

[18.x][Docs] Add release note about Clang-defined target OS macros (llvm#80044)

The change is included in the 18.x release. Move the release note to the
release branch and reformat.

(cherry picked from commit b40d5b1)

ReleaseNotes: mention -mtls-dialect=desc (llvm#82731)

[Clang] Fixes to immediate-escalating functions (llvm#82281)

* Consider that immediate escalating function can appear at global
scope, fixing a crash

* Lambda conversion to function pointer was sometimes not performed in
an immediate function context when it should be.

Fixes llvm#82258

(cherry picked from commit baf6bd3)

[Clang] [Sema] Handle placeholders in '.*' expressions (llvm#83103)

When analysing whether we should handle a binary expression as an
overloaded operator call or a builtin operator, we were calling
`checkPlaceholderForOverload()`, which takes care of any placeholders
that are not overload sets—which would usually make sense since those
need to be handled as part of overload resolution.

Unfortunately, we were also doing that for `.*`, which is not
overloadable, and then proceeding to create a builtin operator anyway,
which would crash if the RHS happened to be an unresolved overload set
(due hitting an assertion in `CreateBuiltinBinOp()`—specifically, in one
of its callees—in the `.*` case that makes sure its arguments aren’t
placeholders).

This pr instead makes it so we check for *all* placeholders early if the
operator is `.*`.

It’s worth noting that,
1. In the `.*` case, we now additionally also check for *any*
placeholders (not just non-overload-sets) in the LHS; this shouldn’t
make a difference, however—at least I couldn’t think of a way to trigger
the assertion with an overload set as the LHS of `.*`; it is worth
noting that the assertion in question would also complain if the LHS
happened to be of placeholder type, though.
2. There is another case in which we also don’t perform overload
resolution—namely `=` if the LHS is not of class or enumeration type
after handling non-overload-set placeholders—as in the `.*` case, but
similarly to 1., I first couldn’t think of a way of getting this case to
crash, and secondly, `CreateBuiltinBinOp()` doesn’t seem to care about
placeholders in the LHS or RHS in the `=` case (from what I can tell,
it, or rather one of its callees, only checks that the LHS is not a
pseudo-object type, but those will have already been handled by the call
to `checkPlaceholderForOverload()` by the time we get to this function),
so I don’t think this case suffers from the same problem.

This fixes llvm#53815.

---------

Co-authored-by: Aaron Ballman <[email protected]>

[InstCombine] Fix miscompilation in PR83947 (llvm#83993)

https://github.com/llvm/llvm-project/blob/762f762504967efbe159db5c737154b989afc9bb/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp#L394-L407

Comment from @topperc:
> This transforms assumes the mask is a non-zero splat. We only know its
a splat and not provably all 0s. The mask is a constexpr that includes
the address of the global variable. We can't resolve the constant
expression to an exact value.

Fixes llvm#83947.

SystemZ release notes for 18.x. (llvm#84560)

Remove support for EXPORTAS in def files to maintain ABI compatibility for COFFShortExport

[clang][Sema] Fix a CTAD regression after 42239d2 (llvm#86914)

The most recent declaration of a template as a friend can introduce a
different template parameter depth compared to what we anticipate from a
CTAD guide.

Fixes llvm#86769

[clang] Avoid -Wshadow warning when init-capture named same as class field (llvm#74512)

Shadowing warning doesn't make much sense since field is not available
in lambda's body without capturing this.

Fixes llvm#71976

[SLP]Fix a crash if the argument of call was affected by minbitwidth analysis.

Need to support proper type conversion for function arguments to avoid
compiler crash.

Fix override keyword being print to the left side

Previously, the `override` keyword in C++ was being print in the left
side of a method decl, which is unsupported by C++ standard. This commit
fixes that by setting the `CanPrintOnLeft` field to 0, forcing it to be
print on the right side of the decl.

Signed-off-by: Giuliano Belinassi <[email protected]>

[clang codegen] Fix MS ABI detection of user-provided constructors. (llvm#90151)

In the context of determining whether a class counts as an "aggregate",
a constructor template counts as a user-provided constructor.

Fixes llvm#86384

(cherry picked from commit 3ab4ae9)

release/18.x: [libclc] Fix linking against libIRReader

Fixes llvm#91551

Update llvm/test/Transforms/InstCombine/bit_ceil.ll

Co-authored-by: Yingwei Zheng <[email protected]>

[RISCV] Add a unaligned-scalar-mem feature like we had in clang 17.

This is ORed with the fast-unaligned-access feature which applies
to scalar and vector together.:

Regular squash.

Cobol/PlI changes from 785ddc60@https://gitlab.phidani.be/Chirag.Patel/lldb.git

Cobol/PLI support added from 6cefb217f097ac@https://gitlab.phidani.be/Chirag.Patel/llvm.git

[LLDB] added lldb rpmbuild spec file

[RPMBuild] Added lldb rpmbuild support.

[LLDBRpm] added version for yum update.

[lldb_rpm] minor cleanup.

Build fix rleated to RTTI.

build fix.

[DWARFASTParserLegacy] Initial support for union type.

[lldb][LegacyTypeSystem] Changed struct/union member offset from bytes to bits to support DW_AT_data_bit_offset.

[lldbrpm] changed version suffix.

[LegacyASTContext] Fix var string length display.

[LLDB][CobolUserExpression] Added AST node Function call place holder.

build fix.

[LLVM][Test] Fixed assembler round trip test.

[LLDB][CobolUserExpression] Added ast evaluation for sizeof operator.

[LLDB][CobolUserExpression] Added placeholder in parser for func call.

[LLDB][CobolUserExpression] Added sizeof operator, temporary placeholder for LENGTH OF.

[LLDB][PLIUserExpression] Fixed array indexing.

[LLDB][ValueObjectPrinter] Skip summary if custom format is requested.

[LLDB][PLILanguage] changed bitset size, read from array type name.

[LLDB][ValueObject] Fixed pli var string length read.

[LLDB][PLILanguage] Fixed support for var string summary formatter.

[LLVM][DIBuilder][C-API] Added changes to add lexical scope info for auto variable for functions.

[LLVM][AsmCodegen] Added raincode extention AT_lexical_scope.

[LLDB][SymbolFileDWARF] Added support for RAINCODE_lexical_scope attribute.

[LLDB][PLIUserExpression] Added sizeof operator support, it will be renamed to proper functiona name later.

[LLDB][ValueObject] minor cleanup.

[LLDB][PLIUserExpression] Added STORAGE/STG builtin func support.

[LLDB][PLIUserExpression][CobolUserExpression] Added LENGTH() for Cobol and STG/STORAGE() for PL/I, removed sizeof operator for both.

[LLDB][DWARFASTParserLegacy] Added placeholder for DW_TAG_reference_type.

[LLVM][CodeGen][AsmPrinter] Added DW_AT_name attribute to TAG_array_type.

[LLDB][DWARFExpression] minor cleanup.

[LLDB][PLILanguage] Fixed bitset array size read from array type.

[LLDB][LegacyASTContext] Moved bit array calculation to type system.

[LLVM][AsmPrinter] Fixed TAG_array duplicate attribute export.

[LLVM][C-API] Changes array type api.

[LLDB][PLI/Cobol UserExpression] Fixed array indexing, removed c style [] index access.

[LLDB][PLIUserExpression] minor fix.

[LLDB] Fixed bug relating struct member name access for Cobol/PL1.

fixed coding style/whitespace/typos.

[LLDB][LegacyTypeSystem] Place holder beautification for edited types.

[LLDB] rpm version upgrade.

[LLDB][CobolUserExpression] Fixed pic string ref modifier.

[LLDB][TypeSystem] Added support to mutate existing type length, fixed cobolUserExpression refmod for display type.

[LLDB][CobolUserExpression] Fixed lower bound ref modifier.

[LLDB][Cobol/PLI UserExpression] Fixed error while searching for var, do not fully resolve type.

[LLDB][RPM] fixed rpm version.

cleanup, reduce number of changes from trunk.

cleanup, reduce number of changes fomr trunk.

build fix.

Build fix.

Build fix.

[LegacyASTContext] Fixed bug relating packed decimal going through ebcdic iconv.

[LLVM][DebugInfo] Fixed DIExpression node uniquness issue.

build fix.

cleanup.

Refactoring, moved LegacyASTContext to Plugin typeSystem Legacy.

Refactoring, renamed TypeSystem class.

build fix.

build fix, cleanup.

cleanup

fixed assert failure.

[CobolUserExpression] Added placeholder for simpe assignment operator.

[CobolUserExpression] Added basic support for assigment to variables.

[CobolUserExpression] Added cobol move-to set-to syntex support.

[LegacyTypeSystem][CobolUserExpression] Added literal type double,string. Added TypeSystem Encoding helper functions.

[CobolLexer] Added string,float literal type support.

[PLIUserExpression] Added assignment operator support.

[CobolUserExpression] Added assignment comp-3 support.

[PLIUserExpression] Assignment added pli display type support.

temporary build fix. DIExpression asmprinter, print null for invalid entry.

[Cobol/PLI UserExpression] Assignment endianity bug fix.

[Cobol/PLIUserExpression] fixed data extractpr assert, fixed int precision convertion written as zero.

[Cobol/PLIUserExpression] Assignment display type assert failure.

build fix.

[PLIUserExpression] Added support for var string assignment.

[LegacyUserExpression] Simple semantic check place-holder.

[CobolUserexpression] Assignment expression fixed display, array types.

[TypeSystem] fixed encode int precision bug for i64 to i32.

[PLI/Cobol UserExpression] fixed support for assinment into refmod data type.

[CobolUserExpression] Fixed assignment display/comp-3 regression.

temporary build fix. python3 lib on server needs few changes for this build.

[TypeSystemLegacy] Fixed edited type display, skip formatting for edited type.

[lldbrpm] package lldb-python-script too.

[CobolUserExpression] Fixed SelctorOf expression with array index access e.g (lldb)p LastName1 of VAR(1) of TAB.

[CobolUserExpression] Assignment to packed decimal fixed, added digit count read support from dwarf instead of runtime calculation.

[CoboUserExpression] Fixed assignment string invalid byte order.

[PLIUserExpression] Fixed string padding with space.

[CobolUserExpression] Fixed Assignment string space padding.

[LegacyTypeSysten] fixed crash in encoding due to long length the assignment.

[CobolUserExpression][PLIUserExpression] fixed segfault.

[DebugInfo] export identifier case as insensitive for PLI/Cobol compiled units.

[TypeSystemLegacy] Fixed minor bug with dataencoding.

rebase build fix.

[StackFrame] fixed support for cobol/pli modref select syntex.

case-insensitive breakpoint resolution for PLI/Cobol languages.

cleanup.

build fix.

added initial support for TAG_dynamic_type.

added c/c++ api to create dynamic type debug info.

[DebugInfo] Added support to generate dwarf attribute  DW_AT_allocated for DW_TAG_dynamic_type

[PLIUserExpression][CobolUserExpression] Fixed name variable lookup for few cases.

[DWARFASTParserLegacy] Initial support to parse TAG_dynamic_type.

[AsmPrinter] Fix minor mistake for TAG_dynamic DW_AT_allocated.

[TypeSystemLegacy] Added dynamic type place holder.

[LLVM][AsmPrinter] Allow OP_call2/4 expression on local variable location.

build fix.

[LLDB][CompilerType] Added support to fetch dynamic type info.

[lldb][ValueObjectVariable] Added dynamic variable read support.

[LLDB][ValueObjectVariable] Added allocated check for dynamic types.

[LLDB][ValueObjectVariable] fixed TAG_dynamic type attributes optional.

[LLVM][DebugInfoMetadata] Fixed minor function call.

[LLDB][TypeSystemLegacy] Added dynamic type info support.

build fix.

for jekins, use python3  sharedlibs

lldbrpm use python3.

temporary build fix.

[LLDB][DWARFExpression] Added temporary operation extension for address calculation with file address in dwarf v5.

[LLVM][CodeGen] Fixed dynamic type dwarf expression call2/call4 assert.

[LLVM][Verifier] Added dynamic type check.

[LLVM][Verifier] Added debugInfo verifier dynamic type extra checks.

[LLDB][TypeSystemLegacy] Added check to avoid direct nested dynaic types.

[LLVM][DebugInfo] Adding DW_OP_call2/4 support in TAG_subrange attributes DW_AT_lower_bound, DW_AT_upper_bound.

[LLDB] Added option to hide frames with invalid line entry target.hide-invalid-legacy-frames, this is a temporary placeholder and it will be moved to more suitable location in future.

[LLDB][DataFormatters] Fixed printing of char arrays with non-default format.

[LLDB][StackFrame] Added check for member name lookup to reject array of structs.

[lldb][DataFormatters] fixed multi-dimesional string formatting.

[LLDB][ValueObjectVariable] cleanup: proper error message.

[LLVM][DwarfUnit] Added DW_OP_call2/call4 support for array type.

[LLVM][DwarfCompileUnit] fixed assert failure with DW_OP_call2/call4.

[DIBuilder] Added DW_AT_static_link support.

[LLVM][C-API][DebugInfo] Added support for DW_AT_static_link.

[DebugInfo] fixed minor bug with Staticlink attribute generation.

[DebugInfo] static link cleanup.

rebase build fix.

[LLDB][DWARFParser] Added initial support to parse DW_AT_static_link.

[LLDB][StackFrame] Added support to read static link address.

[LLDB][StackFrameList] Added helper function to search stack list using static link.

[LLDB][ValueObjectPrinter] regression fix for hex format value print.

[LLDB] build fix.

[LLDB][ExpressionParser] bug fixed for positive int expression e.g. p move +3 to var.

[LLDB][TypeSystemLegacy] Fixed bcd signed preferred value encoding.

[LLVM][DebuggerTuning] default tune for lldb.

[LLDB][TypeSystemLegacy] iconv try approximate and ignore if not possible, for character decoding.

rebase build fix.

[LLDB][CobolUserExpression][PLIUserExpression] fixed variable name overwriting.

[LLDB][UserExpression] Temporary revert variable name bug.

rebase build fix.

rebase build fix.

rebase build fix.

initial placeholder for DW_AT_RAINCODE_static_link_recv.

[LLDB][CobolUserExpression][PLIUserExpression] fixed variable name overwrite.

[LLDB][Test] fixed UnsupportedLanguage test failure.

[LLDB][CobolUserExpression] Place holder for compare operations.

lldbrpm, temporary skip python dir.

[CobolUserExpression] Adding placeholder for equality comparision.

[PLIUserExpression] PLILexer, added partial support for comparision operators.

[LLDB][DataExtractor] bytes compare func.

rebase build fix.

rebase build fix.

Added DW_AT_RAINCODE_frame_base

Patch by Amin!

[LLDB][DWARFParser] Added support to parse DW_AT_RAINCODE_frame_base.

build fix.

[LLVM] Fix dynamic type

[LLVM-C][API] Add api to create a dynamic DISubrange

[LLDB] Add support for DW_AT_count as a DWARFExpression

- Add DWARFExpression in ArrayInfo;
- Add LegacyDynamicArray type for dynamic arrays;
- Evaluate count expression every time we re-evaluate DW_AT_location.

Rebase and fix compilation failures

Only print case sensitiveness if source language is
Cobol or PL/1.
Fixes the following regressions:
LLVM :: DebugInfo/X86/dwarf-public-names.ll
LLVM :: DebugInfo/X86/length_symbol_difference.ll
LLVM :: MC/X86/dwarf-size-field-overflow.test
LLVM :: tools/llvm-dwarfdump/X86/statistics.ll

(cherry picked from commit ff848081162f81ef3c5d8f447b6c28dd564d4ada)

Use correct record size of DIDerivedType
Use last index for Annotations

replace dyn_cast with dyn_cast_or_null to handle invalid input smoothly

Rebasing on LLVM-17-init and fixes regressions

LZLANG-2470 valgrind vs. lldb_private::TargetCharsetReader::convert
    - remove the static buffer_length variable, which may not be big enough.
    - remove the loop
    - add lldb console errno logging when there is an iconv error.

(cherry picked from commit 120402f28f787a90f65f725307519343b5937fee)

LZLANG-2470 Fixes for previous lldb_private::TargetCharsetReader::convert changes.

(cherry picked from commit 918c9b62a63b71347ebee5a7ccd0bd42bbdfc118)

Lexer Bug Fix

COBOL/PLI lexer would return variable name with '\n' at the start.

1155199180

(cherry picked from commit 7266c35747b19a11081b3fab07f6773bfb15fa1f)

Ported Abhishek's Fix
-Set is_singed for int variables

[lldb] Bridge the gap when debugging the variable with command and codelldb

(cherry picked from commit d88ad8abed856d239628d4cda3fad393fef1ba0e)

Build Fixes after cherry-pick previous commit

strings set by codelldb must be enclosed in quotes

(cherry picked from commit 0072c09fbe9f5ead6bde25060dc8e9f4265989b3)

Bug fix:

p var = val in PLI didn't work

(cherry picked from commit 9f3d16f85434cbd17e26d429622cd6b557eddacb)

Port Abhisheks Fixes
-Fix for MOVE val TO VAR

[lldb] Added the DemangledNameContainsPath overload for pli/cobol

(cherry picked from commit 552cf62d001beb59327e4fb81cd4620ee0d62c55)

Fix warnings

Fields of a struct array can now be used with `p`

e.g FIELD(5) is equivalent to FIELD OF ARR(5)

See ticket 1152892604

(cherry picked from commit 5e02341b015fddaca13a674b34228fe2b080a54c)

Cobol-style multi-index support added

(cherry picked from commit 7b0e7ae494ca2a9799e1f09d87146113de2e0f38)

Fixed LENGTH(var) expression

-get the size of var from lldb

(cherry picked from commit 50657e2e7b2ec81a13764ca0105c130cc95ccfc7)

Warning Fixes

Make breakpoint Cases Insensitive

Fixed Build and Regression failure after rebase

Fixed warnings seen during lldb build

[lldb] Store real bitwidth from debuginfo in Scalar Type

Storing in higher bitwidth than required or specified by debug info
creates problem when byteswap is done.

Make comparison of breakpoint names case insensitive in `findEntryOffsetInCurrentIndex`

1156642284

typo fix: s/key/Key/

[lldb] Fix DWARFASTParser to correctly parse DW_AT_count for dynamic arrays

[lldb] Change the way we look for variables in StackFrame for Legacy Languages 1156032652

[lldb] Bugfix in LENGTH(var) [cobol] and STG(var) [pli]

We were encoding 4 bytes of LENGTH data and reading 8 bytes which cause a problem.
Using size_t instead of uint32_t fixes the problem.

[lldb] Fix cast failure in FindFieldInStructArray

Complicated expressions in lldb broke the assumption that the expression is an identifier, thus we got a cast error.
This fix removes that from happening and also fixes the bug that if the identifier is an array itself the last index specified in the input is used to index that variable itself.

e.g 01 SAMPLE-TABLE.
           05 TABLE-DEPTH OCCURS 3 TIMES.
             10 TABLE-ROW OCCURS 3 TIMES.
               15 TABLE-COLUMN OCCURS 3 TIMES PIC 9(8).

Here TABLE-ROW(1, 2) means second element of TABLE-ROW OF TABLE-DEPTH(1).

Revert "[lldb] Fix cast failure in FindFieldInStructArray"

This reverts commit c1bab0e0b6a798698196434c7bb6cbe391fcdc1b.

[lldb] Add support for IBM array-indexing syntax

see 1156841764

[lldb] Fix cast error and support non-ibm indexing syntax

see 1156841764

[lldb] Fixes After Rebase on llvmorg-18.1.4

[lldb] Fix bug in display of varying PLI strings

See 1156884604

The STG function also should include the prefix when counting the size, which for now is 2 bytes for all strings because the PLI compiler doesn't support COMPAT(V3) version.
If in the future we do support it, we would need to fix this again.

(cherry picked from commit 4b39f3e1b55c3df09f5cb89dcdd347682f790ba9)

[lldb] Add basic support for Level88 conditions

[lldb] Add support for calling the runtime function rc_cob_level88 directly from the "p" command

[lldb] Print the value of level88 variables as true/false with parent name.

Prints the value of level88 condition names by calling the runtime functions and formatting it nicely.

[lldb] Add support for indexed level88 variables

[lldb] Fixes After Rebase on llvm main

[LLDB] Preparation for upstream
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Oct 10, 2024
pradt2 pushed a commit to pradt2/llvm-project that referenced this pull request Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU clang:codegen clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category llvm:analysis llvm:globalisel llvm:ir mc Machine (object) code mlir:llvm mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants