Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ARM] Fix -mno-omit-leaf-frame-pointer flag doesn't works on 32-bit ARM #109628

Merged
merged 1 commit into from
Oct 17, 2024

Conversation

guoxin049
Copy link
Contributor

The -mno-omit-leaf-frame-pointer flag works on 32-bit ARM architectures and addresses the bug reported in #108019

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Sep 23, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-arm

Author: gxlayer (guoxin049)

Changes

The -mno-omit-leaf-frame-pointer flag works on 32-bit ARM architectures and addresses the bug reported in #108019


Full diff: https://github.com/llvm/llvm-project/pull/109628.diff

20 Files Affected:

  • (modified) llvm/include/llvm/Target/TargetOptions.h (+4)
  • (modified) llvm/lib/CodeGen/TargetOptionsImpl.cpp (+13)
  • (modified) llvm/lib/Target/ARM/ARMFrameLowering.cpp (+2-1)
  • (modified) llvm/test/CodeGen/ARM/arm-shrink-wrapping.ll (+1-1)
  • (modified) llvm/test/CodeGen/ARM/call-tc.ll (+2-2)
  • (modified) llvm/test/CodeGen/ARM/debug-frame.ll (+1-1)
  • (modified) llvm/test/CodeGen/ARM/ehabi.ll (+1-1)
  • (modified) llvm/test/CodeGen/ARM/frame-chain.ll (+7-4)
  • (modified) llvm/test/CodeGen/ARM/ifcvt5.ll (+1-1)
  • (modified) llvm/test/CodeGen/ARM/ldrd.ll (+2-2)
  • (modified) llvm/test/CodeGen/ARM/stack-size-section.ll (+1-1)
  • (modified) llvm/test/CodeGen/ARM/v7k-abi-align.ll (+2-2)
  • (modified) llvm/test/CodeGen/Thumb/frame-chain.ll (+10-6)
  • (modified) llvm/test/CodeGen/Thumb2/frame-pointer.ll (+1-1)
  • (modified) llvm/test/CodeGen/Thumb2/frameless.ll (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/frameless2.ll (+1-1)
  • (modified) llvm/test/CodeGen/Thumb2/machine-licm.ll (+2-2)
  • (modified) llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll (+1-1)
  • (modified) llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.generated.expected (+1-1)
  • (modified) llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.nogenerated.expected (+1-1)
diff --git a/llvm/include/llvm/Target/TargetOptions.h b/llvm/include/llvm/Target/TargetOptions.h
index 94e0fa2404d6fc..37cff4ef57541e 100644
--- a/llvm/include/llvm/Target/TargetOptions.h
+++ b/llvm/include/llvm/Target/TargetOptions.h
@@ -161,6 +161,10 @@ namespace llvm {
     /// DisableFramePointerElim - This returns true if frame pointer elimination
     /// optimization should be disabled for the given machine function.
     bool DisableFramePointerElim(const MachineFunction &MF) const;
+    
+    /// EnableLeafFramePointerElim - This returns true if leaf frame pointer elimination
+    /// optimization should be disabled for the given machine function.
+    bool EnableLeafFramePointerElim(const MachineFunction &MF) const;
 
     /// FramePointerIsReserved - This returns true if the frame pointer must
     /// always either point to a new frame record or be un-modified in the given
diff --git a/llvm/lib/CodeGen/TargetOptionsImpl.cpp b/llvm/lib/CodeGen/TargetOptionsImpl.cpp
index 5bf1d265092f6f..3a44de6de51ff7 100644
--- a/llvm/lib/CodeGen/TargetOptionsImpl.cpp
+++ b/llvm/lib/CodeGen/TargetOptionsImpl.cpp
@@ -40,6 +40,19 @@ bool TargetOptions::DisableFramePointerElim(const MachineFunction &MF) const {
   llvm_unreachable("unknown frame pointer flag");
 }
 
+/// EnableLeafFramePointerElim - This returns true if leaf frame pointer elimination
+/// optimization should be disabled for the given machine function.
+bool TargetOptions::EnableLeafFramePointerElim(const MachineFunction &MF) const {
+  const Function &F = MF.getFunction();
+
+  if (!F.hasFnAttribute("frame-pointer"))
+    return false;
+  StringRef FP = F.getFnAttribute("frame-pointer").getValueAsString();
+  if (FP == "all")
+    return true;
+  return false;
+}
+
 bool TargetOptions::FramePointerIsReserved(const MachineFunction &MF) const {
   // Check to see if the target want to forcibly keep frame pointer.
   if (MF.getSubtarget().getFrameLowering()->keepFramePointer(MF))
diff --git a/llvm/lib/Target/ARM/ARMFrameLowering.cpp b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
index 40354f99559896..473490badb7d12 100644
--- a/llvm/lib/Target/ARM/ARMFrameLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
@@ -2271,6 +2271,7 @@ void ARMFrameLowering::determineCalleeSaves(MachineFunction &MF,
   // is spilled in the order specified by getCalleeSavedRegs() to make it easier
   // to combine multiple loads / stores.
   bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF));
+  bool CanEliminateLeafFrame = !MF.getTarget().Options.EnableLeafFramePointerElim(MF);
   bool CS1Spilled = false;
   bool LRSpilled = false;
   unsigned NumGPRSpills = 0;
@@ -2513,7 +2514,7 @@ void ARMFrameLowering::determineCalleeSaves(MachineFunction &MF,
                     << "; EstimatedFPStack: " << MaxFixedOffset - MaxFPOffset
                     << "; BigFrameOffsets: " << BigFrameOffsets << "\n");
   if (BigFrameOffsets ||
-      !CanEliminateFrame || RegInfo->cannotEliminateFrame(MF)) {
+      !CanEliminateFrame || RegInfo->cannotEliminateFrame(MF) || !CanEliminateLeafFrame) {
     AFI->setHasStackFrame(true);
 
     if (HasFP) {
diff --git a/llvm/test/CodeGen/ARM/arm-shrink-wrapping.ll b/llvm/test/CodeGen/ARM/arm-shrink-wrapping.ll
index aa79e4156dac11..b6adc995091cea 100644
--- a/llvm/test/CodeGen/ARM/arm-shrink-wrapping.ll
+++ b/llvm/test/CodeGen/ARM/arm-shrink-wrapping.ll
@@ -1732,7 +1732,7 @@ if.end:
 ; Another infinite loop test this time with two nested infinite loop.
 ; infiniteloop3
 ; bx lr
-define void @infiniteloop3() "frame-pointer"="all" {
+define void @infiniteloop3() "frame-pointer"="none" {
 ; ARM-LABEL: infiniteloop3:
 ; ARM:       @ %bb.0: @ %entry
 ; ARM-NEXT:    mov r0, #0
diff --git a/llvm/test/CodeGen/ARM/call-tc.ll b/llvm/test/CodeGen/ARM/call-tc.ll
index 18d83bdc03e22f..9c70bac0322fe1 100644
--- a/llvm/test/CodeGen/ARM/call-tc.ll
+++ b/llvm/test/CodeGen/ARM/call-tc.ll
@@ -17,7 +17,7 @@ define void @t1() "frame-pointer"="all" {
         ret void
 }
 
-define void @t2() "frame-pointer"="all" {
+define void @t2() "frame-pointer"="none" {
 ; CHECKV6-LABEL: t2:
 ; CHECKV6: bx r0
 ; CHECKT2D-LABEL: t2:
@@ -102,7 +102,7 @@ bb:
 
 ; Make sure codegenprep is duplicating ret instructions to enable tail calls.
 ; rdar://11140249
-define i32 @t8(i32 %x) nounwind ssp "frame-pointer"="all" {
+define i32 @t8(i32 %x) nounwind ssp "frame-pointer"="none" {
 entry:
 ; CHECKT2D-LABEL: t8:
 ; CHECKT2D-NOT: push
diff --git a/llvm/test/CodeGen/ARM/debug-frame.ll b/llvm/test/CodeGen/ARM/debug-frame.ll
index faeafdf45dc392..72e7cfcab487a7 100644
--- a/llvm/test/CodeGen/ARM/debug-frame.ll
+++ b/llvm/test/CodeGen/ARM/debug-frame.ll
@@ -526,7 +526,7 @@ entry:
 ; Test 4
 ;-------------------------------------------------------------------------------
 
-define void @test4() nounwind {
+define void @test4() nounwind "frame-pointer"="none" {
 entry:
   ret void
 }
diff --git a/llvm/test/CodeGen/ARM/ehabi.ll b/llvm/test/CodeGen/ARM/ehabi.ll
index fea497076030f1..d1a4e9a6bccad0 100644
--- a/llvm/test/CodeGen/ARM/ehabi.ll
+++ b/llvm/test/CodeGen/ARM/ehabi.ll
@@ -575,7 +575,7 @@ entry:
 ; Test 4
 ;-------------------------------------------------------------------------------
 
-define void @test4() nounwind {
+define void @test4() nounwind "frame-pointer"="none" {
 entry:
   ret void
 }
diff --git a/llvm/test/CodeGen/ARM/frame-chain.ll b/llvm/test/CodeGen/ARM/frame-chain.ll
index e37213e4aaf8b8..7b722cd5fcef24 100644
--- a/llvm/test/CodeGen/ARM/frame-chain.ll
+++ b/llvm/test/CodeGen/ARM/frame-chain.ll
@@ -10,11 +10,14 @@
 define dso_local noundef i32 @leaf(i32 noundef %0) {
 ; LEAF-FP-LABEL: leaf:
 ; LEAF-FP:       @ %bb.0:
-; LEAF-FP-NEXT:    .pad #4
-; LEAF-FP-NEXT:    sub sp, sp, #4
-; LEAF-FP-NEXT:    str r0, [sp]
+; LEAF-FP-NEXT:    .save {r11, lr}
+; LEAF-FP-NEXT:    push {r11, lr}
+; LEAF-FP-NEXT:    .setfp r11, sp
+; LEAF-FP-NEXT:    mov r11, sp
+; LEAF-FP-NEXT:    push {r0}
 ; LEAF-FP-NEXT:    add r0, r0, #4
-; LEAF-FP-NEXT:    add sp, sp, #4
+; LEAF-FP-NEXT:    mov sp, r11
+; LEAF-FP-NEXT:    pop {r11, lr}
 ; LEAF-FP-NEXT:    mov pc, lr
 ;
 ; LEAF-FP-AAPCS-LABEL: leaf:
diff --git a/llvm/test/CodeGen/ARM/ifcvt5.ll b/llvm/test/CodeGen/ARM/ifcvt5.ll
index dc9a3400b691ac..30a92eb34989a6 100644
--- a/llvm/test/CodeGen/ARM/ifcvt5.ll
+++ b/llvm/test/CodeGen/ARM/ifcvt5.ll
@@ -5,7 +5,7 @@
 
 @x = external global ptr		; <ptr> [#uses=1]
 
-define void @foo(i32 %a) "frame-pointer"="all" {
+define void @foo(i32 %a) "frame-pointer"="none" {
 ; A8-LABEL: foo:
 ; A8:       @ %bb.0: @ %entry
 ; A8-NEXT:    movw r1, :lower16:(L_x$non_lazy_ptr-(LPC0_0+8))
diff --git a/llvm/test/CodeGen/ARM/ldrd.ll b/llvm/test/CodeGen/ARM/ldrd.ll
index cf5c2dfe5ef60b..3cf10f0e64b4d1 100644
--- a/llvm/test/CodeGen/ARM/ldrd.ll
+++ b/llvm/test/CodeGen/ARM/ldrd.ll
@@ -168,7 +168,7 @@ define void @ldrd_postupdate_inc(ptr %p0) "frame-pointer"="all" {
 ; NORMAL: strd r1, r2, [r0], #-8
 ; CONSERVATIVE-NOT: strd
 ; CHECK: bx lr
-define ptr @strd_postupdate_dec(ptr %p0, i32 %v0, i32 %v1) "frame-pointer"="all" {
+define ptr @strd_postupdate_dec(ptr %p0, i32 %v0, i32 %v1) "frame-pointer"="none" {
   %p0.1 = getelementptr i32, ptr %p0, i32 1
   store i32 %v0, ptr %p0
   store i32 %v1, ptr %p0.1
@@ -180,7 +180,7 @@ define ptr @strd_postupdate_dec(ptr %p0, i32 %v0, i32 %v1) "frame-pointer"="all"
 ; NORMAL: strd r1, r2, [r0], #8
 ; CONSERVATIVE-NOT: strd
 ; CHECK: bx lr
-define ptr @strd_postupdate_inc(ptr %p0, i32 %v0, i32 %v1) "frame-pointer"="all" {
+define ptr @strd_postupdate_inc(ptr %p0, i32 %v0, i32 %v1) "frame-pointer"="none" {
   %p0.1 = getelementptr i32, ptr %p0, i32 1
   store i32 %v0, ptr %p0
   store i32 %v1, ptr %p0.1
diff --git a/llvm/test/CodeGen/ARM/stack-size-section.ll b/llvm/test/CodeGen/ARM/stack-size-section.ll
index fb23e358d856ee..8272389719a691 100644
--- a/llvm/test/CodeGen/ARM/stack-size-section.ll
+++ b/llvm/test/CodeGen/ARM/stack-size-section.ll
@@ -29,4 +29,4 @@ define void @dynalloc(i32 %N) #0 {
   ret void
 }
 
-attributes #0 = { "frame-pointer"="all" }
+attributes #0 = { "frame-pointer"="none" }
diff --git a/llvm/test/CodeGen/ARM/v7k-abi-align.ll b/llvm/test/CodeGen/ARM/v7k-abi-align.ll
index 20c7aea5dcbe6b..b27c4354f432a1 100644
--- a/llvm/test/CodeGen/ARM/v7k-abi-align.ll
+++ b/llvm/test/CodeGen/ARM/v7k-abi-align.ll
@@ -117,7 +117,7 @@ define void @test_dpr_unwind_align_no_dprs() "frame-pointer"="all" {
 
 ; 128-bit vectors should use 128-bit (i.e. correctly aligned) slots on
 ; the stack.
-define <4 x float> @test_v128_stack_pass([8 x double], float, <4 x float> %in) "frame-pointer"="all" {
+define <4 x float> @test_v128_stack_pass([8 x double], float, <4 x float> %in) "frame-pointer"="none" {
 ; CHECK-LABEL: test_v128_stack_pass:
 ; CHECK: add r[[ADDR:[0-9]+]], sp, #16
 ; CHECK: vld1.64 {d0, d1}, [r[[ADDR]]:128]
@@ -140,7 +140,7 @@ define void @test_v128_stack_pass_varargs(<4 x float> %in) "frame-pointer"="all"
 
 ; To be compatible with AAPCS's va_start model (store r0-r3 at incoming SP, give
 ; a single pointer), 64-bit quantities must be pass
-define i64 @test_64bit_gpr_align(i32, i64 %r2_r3, i32 %sp) "frame-pointer"="all" {
+define i64 @test_64bit_gpr_align(i32, i64 %r2_r3, i32 %sp) "frame-pointer"="none" {
 ; CHECK-LABEL: test_64bit_gpr_align:
 ; CHECK: ldr [[RHS:r[0-9]+]], [sp]
 ; CHECK: adds r0, [[RHS]], r2
diff --git a/llvm/test/CodeGen/Thumb/frame-chain.ll b/llvm/test/CodeGen/Thumb/frame-chain.ll
index eb62ce09caf1be..e68fc626be9819 100644
--- a/llvm/test/CodeGen/Thumb/frame-chain.ll
+++ b/llvm/test/CodeGen/Thumb/frame-chain.ll
@@ -8,12 +8,16 @@
 define dso_local noundef i32 @leaf(i32 noundef %0) {
 ; LEAF-FP-LABEL: leaf:
 ; LEAF-FP:       @ %bb.0:
-; LEAF-FP-NEXT:    .pad #4
-; LEAF-FP-NEXT:    sub sp, #4
-; LEAF-FP-NEXT:    str r0, [sp]
-; LEAF-FP-NEXT:    adds r0, r0, #4
-; LEAF-FP-NEXT:    add sp, #4
-; LEAF-FP-NEXT:    bx lr
+; LEAF-FP-NEXT:    .save	{r7, lr}
+; LEAF-FP-NEXT:    push	{r7, lr}
+; LEAF-FP-NEXT:	   .setfp	r7, sp
+; LEAF-FP-NEXT:	   add	r7, sp, #0
+; LEAF-FP-NEXT:	   .pad	#4
+; LEAF-FP-NEXT:	   sub	sp, #4
+; LEAF-FP-NEXT:	   str	r0, [sp]
+; LEAF-FP-NEXT:	   adds	r0, r0, #4
+; LEAF-FP-NEXT:	   add	sp, #4
+; LEAF-FP-NEXT:	   pop	{r7, pc}
 ;
 ; LEAF-FP-AAPCS-LABEL: leaf:
 ; LEAF-FP-AAPCS:       @ %bb.0:
diff --git a/llvm/test/CodeGen/Thumb2/frame-pointer.ll b/llvm/test/CodeGen/Thumb2/frame-pointer.ll
index ae3c1c8a50e2b4..85c919a50d88c1 100644
--- a/llvm/test/CodeGen/Thumb2/frame-pointer.ll
+++ b/llvm/test/CodeGen/Thumb2/frame-pointer.ll
@@ -14,7 +14,7 @@ define void @leaf() {
 
 ; Leaf function, frame pointer is requested but we don't need any stack frame,
 ; so don't create a frame pointer.
-define void @leaf_nofpelim() "frame-pointer"="all" {
+define void @leaf_nofpelim() "frame-pointer"="none" {
 ; CHECK-LABEL: leaf_nofpelim:
 ; CHECK-NOT: push
 ; CHECK-NOT: sp
diff --git a/llvm/test/CodeGen/Thumb2/frameless.ll b/llvm/test/CodeGen/Thumb2/frameless.ll
index 01e0414de37d93..44914136b1f839 100644
--- a/llvm/test/CodeGen/Thumb2/frameless.ll
+++ b/llvm/test/CodeGen/Thumb2/frameless.ll
@@ -1,5 +1,5 @@
-; RUN: llc < %s -mtriple=thumbv7-apple-darwin -frame-pointer=all | not grep mov
-; RUN: llc < %s -mtriple=thumbv7-linux -frame-pointer=all | not grep mov
+; RUN: llc < %s -mtriple=thumbv7-apple-darwin -frame-pointer=none | not grep mov
+; RUN: llc < %s -mtriple=thumbv7-linux -frame-pointer=none | not grep mov
 
 define void @t() nounwind readnone {
   ret void
diff --git a/llvm/test/CodeGen/Thumb2/frameless2.ll b/llvm/test/CodeGen/Thumb2/frameless2.ll
index 4750527ae555cd..4848deaf8a1e4c 100644
--- a/llvm/test/CodeGen/Thumb2/frameless2.ll
+++ b/llvm/test/CodeGen/Thumb2/frameless2.ll
@@ -1,4 +1,4 @@
-; RUN: llc < %s -mtriple=thumbv7-apple-darwin -frame-pointer=all | not grep r7
+; RUN: llc < %s -mtriple=thumbv7-apple-darwin -frame-pointer=none | not grep r7
 
 %struct.noise3 = type { [3 x [17 x i32]] }
 %struct.noiseguard = type { i32, i32, i32 }
diff --git a/llvm/test/CodeGen/Thumb2/machine-licm.ll b/llvm/test/CodeGen/Thumb2/machine-licm.ll
index 5a2ec9280de770..a2f379f7b54384 100644
--- a/llvm/test/CodeGen/Thumb2/machine-licm.ll
+++ b/llvm/test/CodeGen/Thumb2/machine-licm.ll
@@ -1,5 +1,5 @@
-; RUN: llc < %s -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -relocation-model=dynamic-no-pic -frame-pointer=all | FileCheck %s
-; RUN: llc < %s -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -relocation-model=pic -frame-pointer=all | FileCheck %s --check-prefix=PIC
+; RUN: llc < %s -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -relocation-model=dynamic-no-pic -frame-pointer=none | FileCheck %s
+; RUN: llc < %s -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -relocation-model=pic -frame-pointer=none | FileCheck %s --check-prefix=PIC
 ; rdar://7353541
 ; rdar://7354376
 
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll
index bae66d456f89a5..174cca4fab0982 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll
@@ -60,4 +60,4 @@ define dso_local i32 @main() #0 {
   ret i32 0
 }
 
-attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
+attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="none" }
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.generated.expected b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.generated.expected
index de5571f6436154..2dfb725f556655 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.generated.expected
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.generated.expected
@@ -61,7 +61,7 @@ define dso_local i32 @main() #0 {
   ret i32 0
 }
 
-attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
+attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="none" }
 ; CHECK-LABEL: check_boundaries:
 ; CHECK:       @ %bb.0:
 ; CHECK-NEXT:    sub sp, sp, #20
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.nogenerated.expected b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.nogenerated.expected
index 4f623384ade602..85d3389cdaaf92 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.nogenerated.expected
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.nogenerated.expected
@@ -121,4 +121,4 @@ define dso_local i32 @main() #0 {
   ret i32 0
 }
 
-attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
+attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="none" }

@guoxin049 guoxin049 force-pushed the fp-test branch 2 times, most recently from 9a2f502 to 45efaee Compare September 23, 2024 13:14
Copy link

github-actions bot commented Sep 23, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@ilovepi ilovepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this patch fully. We already have ways of specifying how frame pointer elimination works through function attributes, so if those are incorrect, then we should correct how they are applied. Can you elaborate on why we need to add an interface to TargetOptions instead of using the existing mechanisms?

Additionally, it seems that perhaps the inconsistency is in how clang applies attribute and perhaps the logic in

static bool mustMaintainValidFrameChain(const llvm::opt::ArgList &Args,
is incomplete. My guess is that if we shore up the logic here and get the right attributes on functions, then we don't need any additional machinery in the backend.

Lastly, I'm not sure why so many existing tests were changed from frame-pointer=all to frame-pointer=none. That seems incorrect, though if the tests didn't change, perhaps the attribute isn't needed in many cases. In any case, I'm hesitant to say those are appropriate w/o more rationale.

@ilovepi
Copy link
Contributor

ilovepi commented Sep 23, 2024

Maybe

static bool framePointerImpliesLeafFramePointer(const llvm::opt::ArgList &Args,
was a more relevant function to point out.

@@ -40,6 +40,20 @@ bool TargetOptions::DisableFramePointerElim(const MachineFunction &MF) const {
llvm_unreachable("unknown frame pointer flag");
}

/// DisableLeafFramePointerElim - This returns true if leaf frame pointer
/// elimination optimization should be disabled for the given machine function.
bool TargetOptions::DisableLeafFramePointerElim(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DisableFramePointerElim itself should be sufficient; it already checks whether the function is a leaf function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right, but the logic in determineCalleeSaves currently does not align with this scenario, which involves saving frame pointer (FP) registers with -mno-omit-leaf-frame-pointer and -fno-omit-frame-pointer

void ARMFrameLowering::determineCalleeSaves(MachineFunction &MF,
BitVector &SavedRegs,
RegScavenger *RS) const {
TargetFrameLowering::determineCalleeSaves(MF, SavedRegs, RS);
// This tells PEI to spill the FP as if it is any other callee-save register
// to take advantage the eliminateFrameIndex machinery. This also ensures it
// is spilled in the order specified by getCalleeSavedRegs() to make it easier
// to combine multiple loads / stores.
bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF));

I would appreciate any suggestions for improvements or modifications you might have. Thank you!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, DisableFramePointerElim() should return true in all the cases where we want to disable frame-pointer elimination. So you shouldn't need a separate API specifically for leaf functions. I think if you just replace the call to DisableLeafFramePointerElim with a call to DisableFramePointerElim, you should get the same behavior. So the target-independent code shouldn't need changes.

Probably at that point, the logic could be refactored a bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thank you for your advice. I'll remove DisableFramePointerElim() and refactor CanEliminateFrame like this:
bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF)) || !(MF.getFunction().hasFnAttribute("frame-pointer") && MF.getFunction().getFnAttribute("frame-pointer").getValueAsString() == "all");

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite follow this... I'd prefer to continue using the target-independent DisableFramePointerElim() where possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did an experiment and changed it like this bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF)) && !MF.getTarget().Options.DisableFramePointerElim(MF);
However, fp is retained in the following scenarios:

bool ARMFrameLowering::keepFramePointer(const MachineFunction &MF) const {
// iOS always has a FP for backtracking, force other targets to keep their FP
// when doing FastISel. The emitted code is currently superior, and in cases
// like test-suite's lencod FastISel isn't quite correct when FP is eliminated.
return MF.getSubtarget<ARMSubtarget>().useFastISel();
}

I don't know if it makes sense. For example, the fast-isel-intrinsic.ll test case saves fp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like this,

@ %bb.0:
	push	{r7, lr}
	mov	r7, sp
	movw	r0, :lower16:_message1
	movt	r0, :upper16:_message1
	add	r0, r0, #5
	movw	r1, #64
	movw	r2, #10
	and	r1, r1, #255
	bl	_memset
	pop	{r7, pc}
                                        @ -- End function
	.globl	_t2                             @ -- Begin function t2
	.p2align	2
	.code	32                              @ @t2

Because this judgment is at the front

bool TargetOptions::DisableFramePointerElim(const MachineFunction &MF) const {
// Check to see if the target want to forcibly keep frame pointer.
if (MF.getSubtarget().getFrameLowering()->keepFramePointer(MF))
return true;
const Function &F = MF.getFunction();
if (!F.hasFnAttribute("frame-pointer"))

Copy link
Collaborator

@efriedma-quic efriedma-quic Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to preserve the existing behavior of fast-isel, maybe ARMFrameLowering::keepFramePointer() should check MF.getFrameInfo().hasCalls()?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, hmm. I guess the way this works on ARM specifically is that we have a state between "non-leaf" and "leaf": basically, we have a frame pointer if we spill any registers. (Or a few other obscure reasons which are mostly irrelevant outside of M-class targets.) And currently, the ARM target translates "leaf" to this.

I think the best thing to do would be to get rid of ARMFrameLowering::keepFramePointer(), and add the useFastISel() check to ARMFrameLowering::hasFP() instead. That should preserve the behavior we want here, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, I've made the code modifications. I hope you can help me check it.

@efriedma-quic
Copy link
Collaborator

Lastly, I'm not sure why so many existing tests were changed from frame-pointer=all to frame-pointer=none

I think the issue here is that the function attributes were requesting a frame pointer, but the CHECK lines check that there isn't one. So the tests need to be fixed one way or the other. Changing the attribute seems like the least invasive fix.

@ilovepi
Copy link
Contributor

ilovepi commented Sep 23, 2024

Lastly, I'm not sure why so many existing tests were changed from frame-pointer=all to frame-pointer=none

I think the issue here is that the function attributes were requesting a frame pointer, but the CHECK lines check that there isn't one. So the tests need to be fixed one way or the other. Changing the attribute seems like the least invasive fix.

Ah thanks. That was not obvious to me when I first looked at the tests. So, I guess in that case, updating the attribute should be fine.

@guoxin049
Copy link
Contributor Author

Maybe

static bool framePointerImpliesLeafFramePointer(const llvm::opt::ArgList &Args,

was a more relevant function to point out.

Thank you for your comments. The framePointerImpliesLeafFramePointer and mustMaintainValidFrameChain interfaces only adjust the getFramePointerKind function in the clang to return None, Reserved, NonLeaf, and All.

clang::CodeGenOptions::FramePointerKind
getFramePointerKind(const llvm::opt::ArgList &Args,
const llvm::Triple &Triple) {

However, in the backend, the logic bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF)); is used, where both requiresAAPCSFrameRecord and hasFP are true. I noticed that to save the frame pointer (FP) with -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer, I must enable AAPCS, which I understand is not required. Therefore, I added the DisableLeafFramePointerElim interface to accommodate this scenario.
void ARMFrameLowering::determineCalleeSaves(MachineFunction &MF,
BitVector &SavedRegs,
RegScavenger *RS) const {
TargetFrameLowering::determineCalleeSaves(MF, SavedRegs, RS);
// This tells PEI to spill the FP as if it is any other callee-save register
// to take advantage the eliminateFrameIndex machinery. This also ensures it
// is spilled in the order specified by getCalleeSavedRegs() to make it easier
// to combine multiple loads / stores.
bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF));

bool ARMFrameLowering::requiresAAPCSFrameRecord(
const MachineFunction &MF) const {
const auto &Subtarget = MF.getSubtarget<ARMSubtarget>();
return Subtarget.createAAPCSFrameChain() && hasFP(MF);
}

Additionally, I'm curious about why the CanEliminateFrame value is determined this way and whether requiresAAPCSFrameRecord can be removed. I understand this aligns with the AAPCS standard, but it seems unnecessary to modify it in this context.

Copy link
Contributor

@dongjianqiang2 dongjianqiang2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -22,10 +22,6 @@ using namespace llvm;
/// DisableFramePointerElim - This returns true if frame pointer elimination
/// optimization should be disabled for the given machine function.
bool TargetOptions::DisableFramePointerElim(const MachineFunction &MF) const {
// Check to see if the target want to forcibly keep frame pointer.
if (MF.getSubtarget().getFrameLowering()->keepFramePointer(MF))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keepFramePointer is implemented as a target independent method... if you're going to get rid of the call here, you should also get rid of the interface from TargetFrameLowering. (No other in-tree target implements it anyway, but we want to keep the code clean, and give an obvious error in case there is someone with an out-of-tree dependency.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the interface from TargetFrameLowering in the latest submission.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excuse me, Please check the code again, thank you. @efriedma-quic

@@ -203,6 +203,10 @@ bool ARMFrameLowering::hasFP(const MachineFunction &MF) const {
const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();
const MachineFrameInfo &MFI = MF.getFrameInfo();

// Check to see if the target want to forcibly keep frame pointer.
if (MF.getSubtarget().getFrameLowering()->keepFramePointer(MF))
return true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine.

@@ -2270,7 +2274,8 @@ void ARMFrameLowering::determineCalleeSaves(MachineFunction &MF,
// to take advantage the eliminateFrameIndex machinery. This also ensures it
// is spilled in the order specified by getCalleeSavedRegs() to make it easier
// to combine multiple loads / stores.
bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF));
bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF)) &&
!MF.getTarget().Options.DisableFramePointerElim(MF);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also seems fine.

@guoxin049 guoxin049 force-pushed the fp-test branch 2 times, most recently from e7c6606 to 08cca47 Compare October 8, 2024 06:29
Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a release note to llvm/docs/ReleaseNotes.rst and clang/docs/ReleaseNotes.rst.

clang, as far as I can tell, currently defaults to -mno-omit-leaf-frame-pointer; do we want to change that, so new versions of clang are consistent with older versions and gcc by default? This is useLeafFramePointerForTargetByDefault in clang/lib/Driver/ToolChains/CommonArgs.cpp .

(I probably should have mentioned both of these earlier, but I was focused more on the code change itself, not the bigger picture...)

@DavidSpickett
Copy link
Collaborator

Please add a release note to llvm/docs/ReleaseNotes.rst and clang/docs/ReleaseNotes.rst.

LLVM is .md (Markdown) now but same folder still: https://github.com/llvm/llvm-project/blob/main/llvm/docs/ReleaseNotes.md

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Oct 15, 2024
@guoxin049
Copy link
Contributor Author

Please add a release note to llvm/docs/ReleaseNotes.rst and clang/docs/ReleaseNotes.rst.

clang, as far as I can tell, currently defaults to -mno-omit-leaf-frame-pointer; do we want to change that, so new versions of clang are consistent with older versions and gcc by default? This is useLeafFramePointerForTargetByDefault in clang/lib/Driver/ToolChains/CommonArgs.cpp .

(I probably should have mentioned both of these earlier, but I was focused more on the code change itself, not the bigger picture...)

I've added llvm/docs/ReleaseNotes.md and clang/docs/ReleaseNotes.rst.
As far as I know, gcc retains fp after adding the -fno-omit-frame-pointer option, which is currently the same as gcc. I understand clang/lib/Driver/ToolChains/CommonArgs.cpp there is no need to change here. Here's my test demo:
https://godbolt.org/z/5TY6E4P5a
So I think it's reasonable to default to -mno-omit-leaf-frame-pointer.

Leaf Functions Do Not Retain FP Bug Fixed
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- Fixed ``-fno-omit-frame-pointer`` leaf functions do not retain fp.(#GH108019)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this under "Target-specific changes".

Maybe mention -momit-leaf-frame-pointer here as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK,Could you kindly review the changes I've made? Thank you.

* The default behavior for frame pointers in leaf functions has been updated. When
`-fno-omit-frame-pointer` is specified, the frame pointer (FP) will now be retained
in leaf functions by default. To eliminate the frame pointer in leaf functions, the
`-momit-leaf-frame-pointer` option must be explicitly provided.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also mention "frame-pointer"="all"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK,Could you kindly review the changes I've made? Thank you.

@hstk30-hw
Copy link
Contributor

LGTM, thx.

@hstk30-hw hstk30-hw merged commit 4a2bd78 into llvm:main Oct 17, 2024
7 checks passed
Copy link

@guoxin049 Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

@DavidSpickett
Copy link
Collaborator

FYI we found an issue with this when testing lldb - #113154

No clear indication yet whether this change uncovered an existing problem, or has caused a new one.

DavidSpickett added a commit that referenced this pull request Oct 21, 2024
…en issue

Since #109628 landed, this test
has been failing on 32-bit Arm.

This is due to a codegen problem (whether added or uncovered by the change,
not known) where the trap instruction is placed after the frame pointer
and link register are restored.

#113154

So the code was:
```
std::__1::vector<int>::operator[](unsigned int):
  sub sp, sp, #8
  str r0, [sp, #4]
  str r1, [sp]
  add sp, sp, #8
  .inst 0xe7ffdefe
  bx lr
```
When lldb saw the trap, the PC was inside operator[] but the frame
information actually pointed to g.

This bug only happens for leaf functions so adding a return type
works around it:
```
std::__1::vector<int>::operator[](unsigned int):
  push {r11, lr}
  mov r11, sp
  sub sp, sp, #8
  str r0, [sp, #4]
  str r1, [sp]
  mov sp, r11
  pop {r11, lr}
  .inst 0xe7ffdefe
  bx lr
```
(and operator[] should return T& anyway)

Now the PC location and frame information should match and the
test passes.
EricWF pushed a commit to efcs/llvm-project that referenced this pull request Oct 22, 2024
…RM (llvm#109628)

The -mno-omit-leaf-frame-pointer flag works on 32-bit ARM architectures
and addresses the bug reported in llvm#108019
EricWF pushed a commit to efcs/llvm-project that referenced this pull request Oct 22, 2024
…en issue

Since llvm#109628 landed, this test
has been failing on 32-bit Arm.

This is due to a codegen problem (whether added or uncovered by the change,
not known) where the trap instruction is placed after the frame pointer
and link register are restored.

llvm#113154

So the code was:
```
std::__1::vector<int>::operator[](unsigned int):
  sub sp, sp, llvm#8
  str r0, [sp, #4]
  str r1, [sp]
  add sp, sp, llvm#8
  .inst 0xe7ffdefe
  bx lr
```
When lldb saw the trap, the PC was inside operator[] but the frame
information actually pointed to g.

This bug only happens for leaf functions so adding a return type
works around it:
```
std::__1::vector<int>::operator[](unsigned int):
  push {r11, lr}
  mov r11, sp
  sub sp, sp, llvm#8
  str r0, [sp, #4]
  str r1, [sp]
  mov sp, r11
  pop {r11, lr}
  .inst 0xe7ffdefe
  bx lr
```
(and operator[] should return T& anyway)

Now the PC location and frame information should match and the
test passes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:ARM clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants