-
Notifications
You must be signed in to change notification settings - Fork 12.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ARM] Fix -mno-omit-leaf-frame-pointer flag doesn't works on 32-bit ARM #109628
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-clang @llvm/pr-subscribers-backend-arm Author: gxlayer (guoxin049) ChangesThe -mno-omit-leaf-frame-pointer flag works on 32-bit ARM architectures and addresses the bug reported in #108019 Full diff: https://github.com/llvm/llvm-project/pull/109628.diff 20 Files Affected:
diff --git a/llvm/include/llvm/Target/TargetOptions.h b/llvm/include/llvm/Target/TargetOptions.h
index 94e0fa2404d6fc..37cff4ef57541e 100644
--- a/llvm/include/llvm/Target/TargetOptions.h
+++ b/llvm/include/llvm/Target/TargetOptions.h
@@ -161,6 +161,10 @@ namespace llvm {
/// DisableFramePointerElim - This returns true if frame pointer elimination
/// optimization should be disabled for the given machine function.
bool DisableFramePointerElim(const MachineFunction &MF) const;
+
+ /// EnableLeafFramePointerElim - This returns true if leaf frame pointer elimination
+ /// optimization should be disabled for the given machine function.
+ bool EnableLeafFramePointerElim(const MachineFunction &MF) const;
/// FramePointerIsReserved - This returns true if the frame pointer must
/// always either point to a new frame record or be un-modified in the given
diff --git a/llvm/lib/CodeGen/TargetOptionsImpl.cpp b/llvm/lib/CodeGen/TargetOptionsImpl.cpp
index 5bf1d265092f6f..3a44de6de51ff7 100644
--- a/llvm/lib/CodeGen/TargetOptionsImpl.cpp
+++ b/llvm/lib/CodeGen/TargetOptionsImpl.cpp
@@ -40,6 +40,19 @@ bool TargetOptions::DisableFramePointerElim(const MachineFunction &MF) const {
llvm_unreachable("unknown frame pointer flag");
}
+/// EnableLeafFramePointerElim - This returns true if leaf frame pointer elimination
+/// optimization should be disabled for the given machine function.
+bool TargetOptions::EnableLeafFramePointerElim(const MachineFunction &MF) const {
+ const Function &F = MF.getFunction();
+
+ if (!F.hasFnAttribute("frame-pointer"))
+ return false;
+ StringRef FP = F.getFnAttribute("frame-pointer").getValueAsString();
+ if (FP == "all")
+ return true;
+ return false;
+}
+
bool TargetOptions::FramePointerIsReserved(const MachineFunction &MF) const {
// Check to see if the target want to forcibly keep frame pointer.
if (MF.getSubtarget().getFrameLowering()->keepFramePointer(MF))
diff --git a/llvm/lib/Target/ARM/ARMFrameLowering.cpp b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
index 40354f99559896..473490badb7d12 100644
--- a/llvm/lib/Target/ARM/ARMFrameLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
@@ -2271,6 +2271,7 @@ void ARMFrameLowering::determineCalleeSaves(MachineFunction &MF,
// is spilled in the order specified by getCalleeSavedRegs() to make it easier
// to combine multiple loads / stores.
bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF));
+ bool CanEliminateLeafFrame = !MF.getTarget().Options.EnableLeafFramePointerElim(MF);
bool CS1Spilled = false;
bool LRSpilled = false;
unsigned NumGPRSpills = 0;
@@ -2513,7 +2514,7 @@ void ARMFrameLowering::determineCalleeSaves(MachineFunction &MF,
<< "; EstimatedFPStack: " << MaxFixedOffset - MaxFPOffset
<< "; BigFrameOffsets: " << BigFrameOffsets << "\n");
if (BigFrameOffsets ||
- !CanEliminateFrame || RegInfo->cannotEliminateFrame(MF)) {
+ !CanEliminateFrame || RegInfo->cannotEliminateFrame(MF) || !CanEliminateLeafFrame) {
AFI->setHasStackFrame(true);
if (HasFP) {
diff --git a/llvm/test/CodeGen/ARM/arm-shrink-wrapping.ll b/llvm/test/CodeGen/ARM/arm-shrink-wrapping.ll
index aa79e4156dac11..b6adc995091cea 100644
--- a/llvm/test/CodeGen/ARM/arm-shrink-wrapping.ll
+++ b/llvm/test/CodeGen/ARM/arm-shrink-wrapping.ll
@@ -1732,7 +1732,7 @@ if.end:
; Another infinite loop test this time with two nested infinite loop.
; infiniteloop3
; bx lr
-define void @infiniteloop3() "frame-pointer"="all" {
+define void @infiniteloop3() "frame-pointer"="none" {
; ARM-LABEL: infiniteloop3:
; ARM: @ %bb.0: @ %entry
; ARM-NEXT: mov r0, #0
diff --git a/llvm/test/CodeGen/ARM/call-tc.ll b/llvm/test/CodeGen/ARM/call-tc.ll
index 18d83bdc03e22f..9c70bac0322fe1 100644
--- a/llvm/test/CodeGen/ARM/call-tc.ll
+++ b/llvm/test/CodeGen/ARM/call-tc.ll
@@ -17,7 +17,7 @@ define void @t1() "frame-pointer"="all" {
ret void
}
-define void @t2() "frame-pointer"="all" {
+define void @t2() "frame-pointer"="none" {
; CHECKV6-LABEL: t2:
; CHECKV6: bx r0
; CHECKT2D-LABEL: t2:
@@ -102,7 +102,7 @@ bb:
; Make sure codegenprep is duplicating ret instructions to enable tail calls.
; rdar://11140249
-define i32 @t8(i32 %x) nounwind ssp "frame-pointer"="all" {
+define i32 @t8(i32 %x) nounwind ssp "frame-pointer"="none" {
entry:
; CHECKT2D-LABEL: t8:
; CHECKT2D-NOT: push
diff --git a/llvm/test/CodeGen/ARM/debug-frame.ll b/llvm/test/CodeGen/ARM/debug-frame.ll
index faeafdf45dc392..72e7cfcab487a7 100644
--- a/llvm/test/CodeGen/ARM/debug-frame.ll
+++ b/llvm/test/CodeGen/ARM/debug-frame.ll
@@ -526,7 +526,7 @@ entry:
; Test 4
;-------------------------------------------------------------------------------
-define void @test4() nounwind {
+define void @test4() nounwind "frame-pointer"="none" {
entry:
ret void
}
diff --git a/llvm/test/CodeGen/ARM/ehabi.ll b/llvm/test/CodeGen/ARM/ehabi.ll
index fea497076030f1..d1a4e9a6bccad0 100644
--- a/llvm/test/CodeGen/ARM/ehabi.ll
+++ b/llvm/test/CodeGen/ARM/ehabi.ll
@@ -575,7 +575,7 @@ entry:
; Test 4
;-------------------------------------------------------------------------------
-define void @test4() nounwind {
+define void @test4() nounwind "frame-pointer"="none" {
entry:
ret void
}
diff --git a/llvm/test/CodeGen/ARM/frame-chain.ll b/llvm/test/CodeGen/ARM/frame-chain.ll
index e37213e4aaf8b8..7b722cd5fcef24 100644
--- a/llvm/test/CodeGen/ARM/frame-chain.ll
+++ b/llvm/test/CodeGen/ARM/frame-chain.ll
@@ -10,11 +10,14 @@
define dso_local noundef i32 @leaf(i32 noundef %0) {
; LEAF-FP-LABEL: leaf:
; LEAF-FP: @ %bb.0:
-; LEAF-FP-NEXT: .pad #4
-; LEAF-FP-NEXT: sub sp, sp, #4
-; LEAF-FP-NEXT: str r0, [sp]
+; LEAF-FP-NEXT: .save {r11, lr}
+; LEAF-FP-NEXT: push {r11, lr}
+; LEAF-FP-NEXT: .setfp r11, sp
+; LEAF-FP-NEXT: mov r11, sp
+; LEAF-FP-NEXT: push {r0}
; LEAF-FP-NEXT: add r0, r0, #4
-; LEAF-FP-NEXT: add sp, sp, #4
+; LEAF-FP-NEXT: mov sp, r11
+; LEAF-FP-NEXT: pop {r11, lr}
; LEAF-FP-NEXT: mov pc, lr
;
; LEAF-FP-AAPCS-LABEL: leaf:
diff --git a/llvm/test/CodeGen/ARM/ifcvt5.ll b/llvm/test/CodeGen/ARM/ifcvt5.ll
index dc9a3400b691ac..30a92eb34989a6 100644
--- a/llvm/test/CodeGen/ARM/ifcvt5.ll
+++ b/llvm/test/CodeGen/ARM/ifcvt5.ll
@@ -5,7 +5,7 @@
@x = external global ptr ; <ptr> [#uses=1]
-define void @foo(i32 %a) "frame-pointer"="all" {
+define void @foo(i32 %a) "frame-pointer"="none" {
; A8-LABEL: foo:
; A8: @ %bb.0: @ %entry
; A8-NEXT: movw r1, :lower16:(L_x$non_lazy_ptr-(LPC0_0+8))
diff --git a/llvm/test/CodeGen/ARM/ldrd.ll b/llvm/test/CodeGen/ARM/ldrd.ll
index cf5c2dfe5ef60b..3cf10f0e64b4d1 100644
--- a/llvm/test/CodeGen/ARM/ldrd.ll
+++ b/llvm/test/CodeGen/ARM/ldrd.ll
@@ -168,7 +168,7 @@ define void @ldrd_postupdate_inc(ptr %p0) "frame-pointer"="all" {
; NORMAL: strd r1, r2, [r0], #-8
; CONSERVATIVE-NOT: strd
; CHECK: bx lr
-define ptr @strd_postupdate_dec(ptr %p0, i32 %v0, i32 %v1) "frame-pointer"="all" {
+define ptr @strd_postupdate_dec(ptr %p0, i32 %v0, i32 %v1) "frame-pointer"="none" {
%p0.1 = getelementptr i32, ptr %p0, i32 1
store i32 %v0, ptr %p0
store i32 %v1, ptr %p0.1
@@ -180,7 +180,7 @@ define ptr @strd_postupdate_dec(ptr %p0, i32 %v0, i32 %v1) "frame-pointer"="all"
; NORMAL: strd r1, r2, [r0], #8
; CONSERVATIVE-NOT: strd
; CHECK: bx lr
-define ptr @strd_postupdate_inc(ptr %p0, i32 %v0, i32 %v1) "frame-pointer"="all" {
+define ptr @strd_postupdate_inc(ptr %p0, i32 %v0, i32 %v1) "frame-pointer"="none" {
%p0.1 = getelementptr i32, ptr %p0, i32 1
store i32 %v0, ptr %p0
store i32 %v1, ptr %p0.1
diff --git a/llvm/test/CodeGen/ARM/stack-size-section.ll b/llvm/test/CodeGen/ARM/stack-size-section.ll
index fb23e358d856ee..8272389719a691 100644
--- a/llvm/test/CodeGen/ARM/stack-size-section.ll
+++ b/llvm/test/CodeGen/ARM/stack-size-section.ll
@@ -29,4 +29,4 @@ define void @dynalloc(i32 %N) #0 {
ret void
}
-attributes #0 = { "frame-pointer"="all" }
+attributes #0 = { "frame-pointer"="none" }
diff --git a/llvm/test/CodeGen/ARM/v7k-abi-align.ll b/llvm/test/CodeGen/ARM/v7k-abi-align.ll
index 20c7aea5dcbe6b..b27c4354f432a1 100644
--- a/llvm/test/CodeGen/ARM/v7k-abi-align.ll
+++ b/llvm/test/CodeGen/ARM/v7k-abi-align.ll
@@ -117,7 +117,7 @@ define void @test_dpr_unwind_align_no_dprs() "frame-pointer"="all" {
; 128-bit vectors should use 128-bit (i.e. correctly aligned) slots on
; the stack.
-define <4 x float> @test_v128_stack_pass([8 x double], float, <4 x float> %in) "frame-pointer"="all" {
+define <4 x float> @test_v128_stack_pass([8 x double], float, <4 x float> %in) "frame-pointer"="none" {
; CHECK-LABEL: test_v128_stack_pass:
; CHECK: add r[[ADDR:[0-9]+]], sp, #16
; CHECK: vld1.64 {d0, d1}, [r[[ADDR]]:128]
@@ -140,7 +140,7 @@ define void @test_v128_stack_pass_varargs(<4 x float> %in) "frame-pointer"="all"
; To be compatible with AAPCS's va_start model (store r0-r3 at incoming SP, give
; a single pointer), 64-bit quantities must be pass
-define i64 @test_64bit_gpr_align(i32, i64 %r2_r3, i32 %sp) "frame-pointer"="all" {
+define i64 @test_64bit_gpr_align(i32, i64 %r2_r3, i32 %sp) "frame-pointer"="none" {
; CHECK-LABEL: test_64bit_gpr_align:
; CHECK: ldr [[RHS:r[0-9]+]], [sp]
; CHECK: adds r0, [[RHS]], r2
diff --git a/llvm/test/CodeGen/Thumb/frame-chain.ll b/llvm/test/CodeGen/Thumb/frame-chain.ll
index eb62ce09caf1be..e68fc626be9819 100644
--- a/llvm/test/CodeGen/Thumb/frame-chain.ll
+++ b/llvm/test/CodeGen/Thumb/frame-chain.ll
@@ -8,12 +8,16 @@
define dso_local noundef i32 @leaf(i32 noundef %0) {
; LEAF-FP-LABEL: leaf:
; LEAF-FP: @ %bb.0:
-; LEAF-FP-NEXT: .pad #4
-; LEAF-FP-NEXT: sub sp, #4
-; LEAF-FP-NEXT: str r0, [sp]
-; LEAF-FP-NEXT: adds r0, r0, #4
-; LEAF-FP-NEXT: add sp, #4
-; LEAF-FP-NEXT: bx lr
+; LEAF-FP-NEXT: .save {r7, lr}
+; LEAF-FP-NEXT: push {r7, lr}
+; LEAF-FP-NEXT: .setfp r7, sp
+; LEAF-FP-NEXT: add r7, sp, #0
+; LEAF-FP-NEXT: .pad #4
+; LEAF-FP-NEXT: sub sp, #4
+; LEAF-FP-NEXT: str r0, [sp]
+; LEAF-FP-NEXT: adds r0, r0, #4
+; LEAF-FP-NEXT: add sp, #4
+; LEAF-FP-NEXT: pop {r7, pc}
;
; LEAF-FP-AAPCS-LABEL: leaf:
; LEAF-FP-AAPCS: @ %bb.0:
diff --git a/llvm/test/CodeGen/Thumb2/frame-pointer.ll b/llvm/test/CodeGen/Thumb2/frame-pointer.ll
index ae3c1c8a50e2b4..85c919a50d88c1 100644
--- a/llvm/test/CodeGen/Thumb2/frame-pointer.ll
+++ b/llvm/test/CodeGen/Thumb2/frame-pointer.ll
@@ -14,7 +14,7 @@ define void @leaf() {
; Leaf function, frame pointer is requested but we don't need any stack frame,
; so don't create a frame pointer.
-define void @leaf_nofpelim() "frame-pointer"="all" {
+define void @leaf_nofpelim() "frame-pointer"="none" {
; CHECK-LABEL: leaf_nofpelim:
; CHECK-NOT: push
; CHECK-NOT: sp
diff --git a/llvm/test/CodeGen/Thumb2/frameless.ll b/llvm/test/CodeGen/Thumb2/frameless.ll
index 01e0414de37d93..44914136b1f839 100644
--- a/llvm/test/CodeGen/Thumb2/frameless.ll
+++ b/llvm/test/CodeGen/Thumb2/frameless.ll
@@ -1,5 +1,5 @@
-; RUN: llc < %s -mtriple=thumbv7-apple-darwin -frame-pointer=all | not grep mov
-; RUN: llc < %s -mtriple=thumbv7-linux -frame-pointer=all | not grep mov
+; RUN: llc < %s -mtriple=thumbv7-apple-darwin -frame-pointer=none | not grep mov
+; RUN: llc < %s -mtriple=thumbv7-linux -frame-pointer=none | not grep mov
define void @t() nounwind readnone {
ret void
diff --git a/llvm/test/CodeGen/Thumb2/frameless2.ll b/llvm/test/CodeGen/Thumb2/frameless2.ll
index 4750527ae555cd..4848deaf8a1e4c 100644
--- a/llvm/test/CodeGen/Thumb2/frameless2.ll
+++ b/llvm/test/CodeGen/Thumb2/frameless2.ll
@@ -1,4 +1,4 @@
-; RUN: llc < %s -mtriple=thumbv7-apple-darwin -frame-pointer=all | not grep r7
+; RUN: llc < %s -mtriple=thumbv7-apple-darwin -frame-pointer=none | not grep r7
%struct.noise3 = type { [3 x [17 x i32]] }
%struct.noiseguard = type { i32, i32, i32 }
diff --git a/llvm/test/CodeGen/Thumb2/machine-licm.ll b/llvm/test/CodeGen/Thumb2/machine-licm.ll
index 5a2ec9280de770..a2f379f7b54384 100644
--- a/llvm/test/CodeGen/Thumb2/machine-licm.ll
+++ b/llvm/test/CodeGen/Thumb2/machine-licm.ll
@@ -1,5 +1,5 @@
-; RUN: llc < %s -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -relocation-model=dynamic-no-pic -frame-pointer=all | FileCheck %s
-; RUN: llc < %s -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -relocation-model=pic -frame-pointer=all | FileCheck %s --check-prefix=PIC
+; RUN: llc < %s -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -relocation-model=dynamic-no-pic -frame-pointer=none | FileCheck %s
+; RUN: llc < %s -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -relocation-model=pic -frame-pointer=none | FileCheck %s --check-prefix=PIC
; rdar://7353541
; rdar://7354376
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll
index bae66d456f89a5..174cca4fab0982 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll
@@ -60,4 +60,4 @@ define dso_local i32 @main() #0 {
ret i32 0
}
-attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
+attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="none" }
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.generated.expected b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.generated.expected
index de5571f6436154..2dfb725f556655 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.generated.expected
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.generated.expected
@@ -61,7 +61,7 @@ define dso_local i32 @main() #0 {
ret i32 0
}
-attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
+attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="none" }
; CHECK-LABEL: check_boundaries:
; CHECK: @ %bb.0:
; CHECK-NEXT: sub sp, sp, #20
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.nogenerated.expected b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.nogenerated.expected
index 4f623384ade602..85d3389cdaaf92 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.nogenerated.expected
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.nogenerated.expected
@@ -121,4 +121,4 @@ define dso_local i32 @main() #0 {
ret i32 0
}
-attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
+attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="none" }
|
9a2f502
to
45efaee
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand this patch fully. We already have ways of specifying how frame pointer elimination works through function attributes, so if those are incorrect, then we should correct how they are applied. Can you elaborate on why we need to add an interface to TargetOptions
instead of using the existing mechanisms?
Additionally, it seems that perhaps the inconsistency is in how clang applies attribute and perhaps the logic in
static bool mustMaintainValidFrameChain(const llvm::opt::ArgList &Args, |
Lastly, I'm not sure why so many existing tests were changed from frame-pointer=all
to frame-pointer=none
. That seems incorrect, though if the tests didn't change, perhaps the attribute isn't needed in many cases. In any case, I'm hesitant to say those are appropriate w/o more rationale.
Maybe
|
@@ -40,6 +40,20 @@ bool TargetOptions::DisableFramePointerElim(const MachineFunction &MF) const { | |||
llvm_unreachable("unknown frame pointer flag"); | |||
} | |||
|
|||
/// DisableLeafFramePointerElim - This returns true if leaf frame pointer | |||
/// elimination optimization should be disabled for the given machine function. | |||
bool TargetOptions::DisableLeafFramePointerElim( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DisableFramePointerElim itself should be sufficient; it already checks whether the function is a leaf function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right, but the logic in determineCalleeSaves currently does not align with this scenario, which involves saving frame pointer (FP) registers with -mno-omit-leaf-frame-pointer and -fno-omit-frame-pointer
llvm-project/llvm/lib/Target/ARM/ARMFrameLowering.cpp
Lines 2265 to 2273 in 9b5a303
void ARMFrameLowering::determineCalleeSaves(MachineFunction &MF, | |
BitVector &SavedRegs, | |
RegScavenger *RS) const { | |
TargetFrameLowering::determineCalleeSaves(MF, SavedRegs, RS); | |
// This tells PEI to spill the FP as if it is any other callee-save register | |
// to take advantage the eliminateFrameIndex machinery. This also ensures it | |
// is spilled in the order specified by getCalleeSavedRegs() to make it easier | |
// to combine multiple loads / stores. | |
bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF)); |
I would appreciate any suggestions for improvements or modifications you might have. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, DisableFramePointerElim() should return true in all the cases where we want to disable frame-pointer elimination. So you shouldn't need a separate API specifically for leaf functions. I think if you just replace the call to DisableLeafFramePointerElim with a call to DisableFramePointerElim, you should get the same behavior. So the target-independent code shouldn't need changes.
Probably at that point, the logic could be refactored a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, thank you for your advice. I'll remove DisableFramePointerElim()
and refactor CanEliminateFrame like this:
bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF)) || !(MF.getFunction().hasFnAttribute("frame-pointer") && MF.getFunction().getFnAttribute("frame-pointer").getValueAsString() == "all");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite follow this... I'd prefer to continue using the target-independent DisableFramePointerElim() where possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did an experiment and changed it like this bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF)) && !MF.getTarget().Options.DisableFramePointerElim(MF);
However, fp is retained in the following scenarios:
llvm-project/llvm/lib/Target/ARM/ARMFrameLowering.cpp
Lines 180 to 185 in 3f37c51
bool ARMFrameLowering::keepFramePointer(const MachineFunction &MF) const { | |
// iOS always has a FP for backtracking, force other targets to keep their FP | |
// when doing FastISel. The emitted code is currently superior, and in cases | |
// like test-suite's lencod FastISel isn't quite correct when FP is eliminated. | |
return MF.getSubtarget<ARMSubtarget>().useFastISel(); | |
} |
I don't know if it makes sense. For example, the fast-isel-intrinsic.ll test case saves fp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like this,
@ %bb.0:
push {r7, lr}
mov r7, sp
movw r0, :lower16:_message1
movt r0, :upper16:_message1
add r0, r0, #5
movw r1, #64
movw r2, #10
and r1, r1, #255
bl _memset
pop {r7, pc}
@ -- End function
.globl _t2 @ -- Begin function t2
.p2align 2
.code 32 @ @t2
Because this judgment is at the front
llvm-project/llvm/lib/CodeGen/TargetOptionsImpl.cpp
Lines 24 to 31 in 22829f7
bool TargetOptions::DisableFramePointerElim(const MachineFunction &MF) const { | |
// Check to see if the target want to forcibly keep frame pointer. | |
if (MF.getSubtarget().getFrameLowering()->keepFramePointer(MF)) | |
return true; | |
const Function &F = MF.getFunction(); | |
if (!F.hasFnAttribute("frame-pointer")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to preserve the existing behavior of fast-isel, maybe ARMFrameLowering::keepFramePointer() should check MF.getFrameInfo().hasCalls()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, hmm. I guess the way this works on ARM specifically is that we have a state between "non-leaf" and "leaf": basically, we have a frame pointer if we spill any registers. (Or a few other obscure reasons which are mostly irrelevant outside of M-class targets.) And currently, the ARM target translates "leaf" to this.
I think the best thing to do would be to get rid of ARMFrameLowering::keepFramePointer(), and add the useFastISel() check to ARMFrameLowering::hasFP() instead. That should preserve the behavior we want here, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, I've made the code modifications. I hope you can help me check it.
I think the issue here is that the function attributes were requesting a frame pointer, but the CHECK lines check that there isn't one. So the tests need to be fixed one way or the other. Changing the attribute seems like the least invasive fix. |
Ah thanks. That was not obvious to me when I first looked at the tests. So, I guess in that case, updating the attribute should be fine. |
Thank you for your comments. The llvm-project/clang/lib/Driver/ToolChains/CommonArgs.cpp Lines 211 to 213 in 9b5a303
However, in the backend, the logic bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF)); is used, where both requiresAAPCSFrameRecord and hasFP are true. I noticed that to save the frame pointer (FP) with -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer , I must enable AAPCS , which I understand is not required. Therefore, I added the DisableLeafFramePointerElim interface to accommodate this scenario.llvm-project/llvm/lib/Target/ARM/ARMFrameLowering.cpp Lines 2265 to 2273 in 9b5a303
llvm-project/llvm/lib/Target/ARM/ARMFrameLowering.cpp Lines 2235 to 2239 in 9b5a303
Additionally, I'm curious about why the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -22,10 +22,6 @@ using namespace llvm; | |||
/// DisableFramePointerElim - This returns true if frame pointer elimination | |||
/// optimization should be disabled for the given machine function. | |||
bool TargetOptions::DisableFramePointerElim(const MachineFunction &MF) const { | |||
// Check to see if the target want to forcibly keep frame pointer. | |||
if (MF.getSubtarget().getFrameLowering()->keepFramePointer(MF)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keepFramePointer is implemented as a target independent method... if you're going to get rid of the call here, you should also get rid of the interface from TargetFrameLowering. (No other in-tree target implements it anyway, but we want to keep the code clean, and give an obvious error in case there is someone with an out-of-tree dependency.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the interface from TargetFrameLowering in the latest submission.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excuse me, Please check the code again, thank you. @efriedma-quic
@@ -203,6 +203,10 @@ bool ARMFrameLowering::hasFP(const MachineFunction &MF) const { | |||
const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo(); | |||
const MachineFrameInfo &MFI = MF.getFrameInfo(); | |||
|
|||
// Check to see if the target want to forcibly keep frame pointer. | |||
if (MF.getSubtarget().getFrameLowering()->keepFramePointer(MF)) | |||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine.
@@ -2270,7 +2274,8 @@ void ARMFrameLowering::determineCalleeSaves(MachineFunction &MF, | |||
// to take advantage the eliminateFrameIndex machinery. This also ensures it | |||
// is spilled in the order specified by getCalleeSavedRegs() to make it easier | |||
// to combine multiple loads / stores. | |||
bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF)); | |||
bool CanEliminateFrame = !(requiresAAPCSFrameRecord(MF) && hasFP(MF)) && | |||
!MF.getTarget().Options.DisableFramePointerElim(MF); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also seems fine.
e7c6606
to
08cca47
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a release note to llvm/docs/ReleaseNotes.rst and clang/docs/ReleaseNotes.rst.
clang, as far as I can tell, currently defaults to -mno-omit-leaf-frame-pointer
; do we want to change that, so new versions of clang are consistent with older versions and gcc by default? This is useLeafFramePointerForTargetByDefault in clang/lib/Driver/ToolChains/CommonArgs.cpp .
(I probably should have mentioned both of these earlier, but I was focused more on the code change itself, not the bigger picture...)
LLVM is |
I've added llvm/docs/ReleaseNotes.md and clang/docs/ReleaseNotes.rst. |
clang/docs/ReleaseNotes.rst
Outdated
Leaf Functions Do Not Retain FP Bug Fixed | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
- Fixed ``-fno-omit-frame-pointer`` leaf functions do not retain fp.(#GH108019) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put this under "Target-specific changes".
Maybe mention -momit-leaf-frame-pointer
here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK,Could you kindly review the changes I've made? Thank you.
llvm/docs/ReleaseNotes.md
Outdated
* The default behavior for frame pointers in leaf functions has been updated. When | ||
`-fno-omit-frame-pointer` is specified, the frame pointer (FP) will now be retained | ||
in leaf functions by default. To eliminate the frame pointer in leaf functions, the | ||
`-momit-leaf-frame-pointer` option must be explicitly provided. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe also mention "frame-pointer"="all"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK,Could you kindly review the changes I've made? Thank you.
LGTM, thx. |
@guoxin049 Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail here. If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done! |
FYI we found an issue with this when testing lldb - #113154 No clear indication yet whether this change uncovered an existing problem, or has caused a new one. |
…en issue Since #109628 landed, this test has been failing on 32-bit Arm. This is due to a codegen problem (whether added or uncovered by the change, not known) where the trap instruction is placed after the frame pointer and link register are restored. #113154 So the code was: ``` std::__1::vector<int>::operator[](unsigned int): sub sp, sp, #8 str r0, [sp, #4] str r1, [sp] add sp, sp, #8 .inst 0xe7ffdefe bx lr ``` When lldb saw the trap, the PC was inside operator[] but the frame information actually pointed to g. This bug only happens for leaf functions so adding a return type works around it: ``` std::__1::vector<int>::operator[](unsigned int): push {r11, lr} mov r11, sp sub sp, sp, #8 str r0, [sp, #4] str r1, [sp] mov sp, r11 pop {r11, lr} .inst 0xe7ffdefe bx lr ``` (and operator[] should return T& anyway) Now the PC location and frame information should match and the test passes.
…RM (llvm#109628) The -mno-omit-leaf-frame-pointer flag works on 32-bit ARM architectures and addresses the bug reported in llvm#108019
…en issue Since llvm#109628 landed, this test has been failing on 32-bit Arm. This is due to a codegen problem (whether added or uncovered by the change, not known) where the trap instruction is placed after the frame pointer and link register are restored. llvm#113154 So the code was: ``` std::__1::vector<int>::operator[](unsigned int): sub sp, sp, llvm#8 str r0, [sp, #4] str r1, [sp] add sp, sp, llvm#8 .inst 0xe7ffdefe bx lr ``` When lldb saw the trap, the PC was inside operator[] but the frame information actually pointed to g. This bug only happens for leaf functions so adding a return type works around it: ``` std::__1::vector<int>::operator[](unsigned int): push {r11, lr} mov r11, sp sub sp, sp, llvm#8 str r0, [sp, #4] str r1, [sp] mov sp, r11 pop {r11, lr} .inst 0xe7ffdefe bx lr ``` (and operator[] should return T& anyway) Now the PC location and frame information should match and the test passes.
The -mno-omit-leaf-frame-pointer flag works on 32-bit ARM architectures and addresses the bug reported in #108019