Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MC: "error: A dwo section may not contain relocations" when building with fission + RISCV64 #56642

Open
compnerd opened this issue Jul 20, 2022 · 21 comments
Assignees
Labels

Comments

@compnerd
Copy link
Member

compnerd commented Jul 20, 2022

When building with fission, the DW_AT_high_pc field in .debug_info which is redirected to .debug_info.dwo will contain a label difference. The field is meant to be DW_AT_low_pc relative distance, so the difference is unavoidable. Furthermore, with relaxation, the size of the function may change at link time, and thus needs to be resolved at link time.

It appears that gcc will emit the object file, duplicate it and then "extract the DWO sections" dropping the associated relocations resulting in the file showing no relocations.

// RUN: %clang_cc1 -triple riscv64-unknown-linux-gnu -debug-info-kind=constructor -emit-obj -o %t.o -split-dwarf-file %t.dwo -split-dwarf-output %t.dwo
int f(void) { return 32; }
@MaskRay
Copy link
Member

MaskRay commented Jul 20, 2022

https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#split-dwarf

-gsplit-dwarf produces a .dwo file which will not be processed by the linker. If .dwo files contain relocations, they will not be resolved. Therefore the practice is that .dwo files do not contain relocations. Since features like address ranges need relocations and cannot be resolved without linking, split DWARF is foundamentally incompatible with linker relaxation.

Let me send a Clang driver patch to disallow -mrelax -gsplit-dwarf for RISC-V...

@EugeneZelenko EugeneZelenko added backend:RISC-V mc Machine (object) code and removed new issue labels Jul 20, 2022
@llvmbot
Copy link
Member

llvmbot commented Jul 20, 2022

@llvm/issue-subscribers-backend-risc-v

@jrtc27
Copy link
Collaborator

jrtc27 commented Jul 20, 2022

So GCC just silently produces broken .dwo files?

@compnerd
Copy link
Member Author

@MaskRay - note that this will occur regardless of the state of -mrelax - the address difference is part of the DWARF specification.

@jrtc27 I can't say so definitely, but from a quick look, it seemed like gcc will just do objcopy --extract-dwo ....o ....dwo which will then just filter any section that has a .dwo suffix into the new object file (which excludes the relocation list). But the DW_AT_high_pc field is still left as a difference of two labels. I don't know if dwp or gdb do something special to restore that to a constant but I find that difficult to believe as the label can be adjusted during the link phase.

@MaskRay
Copy link
Member

MaskRay commented Jul 20, 2022

Yes, GCC -mrelax -gsplit-dwarf output is currently silently broken. It uses a constant in .debug_info.dwo to represent DW_AT_high_pc, which will be wrong when the code ranges decrease.

Correction to myself: -gsplit-dwarf and linker relaxation is not incompatible. It's just unsupported in GCC and LLVM's debug info generator. Using address ranges/location description forms which are based on .debug_addr indexes can describe -mrelax code. The DWARF output will be larger, and likely make split DWARF less useful. Since the implementation is non-trivial and the benefit is unclear, I suggest we have a driver error (https://reviews.llvm.org/D130190) for now.

@llvmbot
Copy link
Member

llvmbot commented Jul 20, 2022

@llvm/issue-subscribers-debuginfo

@compnerd compnerd self-assigned this Jul 20, 2022
MaskRay added a commit that referenced this issue Jul 23, 2022
-gsplit-dwarf produces a .dwo file which will not be processed by the linker. If
.dwo files contain relocations, they will not be resolved. Therefore the
practice is that .dwo files do not contain relocations.

Address ranges and location description need to use forms/entry kinds indexing
into .debug_addr (DW_FORM_addrx/DW_RLE_startx_endx/etc), which is currently not
implemented.

There is a difficult-to-read MC error with -gsplit-dwarf with RISC-V for both -mrelax and -mno-relax.
```
% clang --target=riscv64-linux-gnu -g -gsplit-dwarf -c a.c
error: A dwo section may not contain relocations
```

We expect to fix -mno-relax soon, so report a driver error for -mrelax for now.

Link: #56642

Reviewed By: compnerd, kito-cheng

Differential Revision: https://reviews.llvm.org/D130190
@dwblaikie
Copy link
Collaborator

@jrtc27 I can't say so definitely, but from a quick look, it seemed like gcc will just do objcopy --extract-dwo ....o ....dwo which will then just filter any section that has a .dwo suffix into the new object file (which excludes the relocation list). But the DW_AT_high_pc field is still left as a difference of two labels. I don't know if dwp or gdb do something special to restore that to a constant but I find that difficult to believe as the label can be adjusted during the link phase.

Yep, that sounds busted to me, which is why we have that check in LLVM to ensure no relocations are produced in .dwo sections.

It sounds like what's needed is some way to check if a label difference is computable at compile-time (in which case we can use the high_pc-relative-to-low_pc encoding - and if it's not possible, we can use another address for high_pc) - is there such a hook/query already that we could use to fix the DWARF generation?

compnerd added a commit to compnerd/llvm-project that referenced this issue Jul 25, 2022
Force eager evaluation of symbolic difference on debug_info which is
required to be resolved eagerly for fission as dwo sections may not
contain relocations.

Fixes: llvm#56642
@compnerd
Copy link
Member Author

compnerd commented Jul 25, 2022

@dwblaikie hmm, well, it is possible to compute that in the case that relaxations are disabled. Given that -gsplit-dwarf -mrelax is now a driver level error, we only need to worry about -gsplit-dwarf -mno-relax where the values should be computable. This is what https://reviews.llvm.org/D130206 attempts to do poorly. Unfortunately, this is now uncovering a similar issue in the .debug_ranges.dwo section where a constant is not being emitted and instead is forming a relocation.

@dwblaikie
Copy link
Collaborator

Yeah, if we had/have a way to detect if a range will be known at compile time (ie: not subject to linker relaxation) then we could use that in the .debug_ranges building logic too, to use a different encoding if the end of a range can't be encoded as a fixed offset from the start.

@compnerd
Copy link
Member Author

I think that we should be able to compute that from the SubtargetInfo.

@compnerd
Copy link
Member Author

Another reduced test case failure:

// RUN: %clang_cc1 -triple riscv64-unknown-linux-musl -emit-obj -debug-info-kind=constructor -dwarf-version=5 -split-dwarf-file %t.dwo -split-dwarf-output %t.dwo -mllvm -minimize-addr-in-v5=Ranges -o %t.o -x c++ %s
struct S {
  S(const char*);
};
S s("");

@dwblaikie
Copy link
Collaborator

yeah, looks like the minimize-addr-in-v5 isn't needed? & it reproduces with standalone debug, not just with constructor or limited debug info (and in v4).

Hard to dump the dwo file (if I modify LLVM to still produce the relocations, so I might see where the relocations are) - since the relocations are invalid/refer to sections not in the dwo file.

But looking at the output... it complains 4 times, though I only see 2 relocations that should be problematic. (the high_pc for __cxx_global_var_init and for _GLOBAL__sub_I_riscv.cpp) Not sure about the other two.

The issue does reproduce with a simpler file: void f1() { } (two errors about dwo sections containing relocations). I'd expect one (for f1s high_pc) not sure where the second is coming from. Maybe the same relocation is checked twice for some reason.

In any case if there's some way to test whether MCLabel - MCLabel is computable at compile-time or not, then that test should be used here to avoid using the label difference:

if (DD->getDwarfVersion() < 4)

A similar check would be needed somewhere...

(tnhis one will involve more close looking to figure out exactly how to make it work - but basically "Base" should never be set if relative differences can't be used, and we want to go all the way to the else clause that emits begin/end pairs, rather than any relative references). We'll need to add another parameter for the DWARFv5 encoding to allow it to use the DW_LLE/RLE_startx_endx encoding.

@compnerd
Copy link
Member Author

Oh, the interesting thing about that reproduction (and the use of the minimize-addr-in-v5) is because I actually was testing with a change that eagerly evaluated the other form of the label difference. This is a second variant that it doesn't cover and seems to indicate that there is at least one other site that needs to be adjusted.

@dwblaikie
Copy link
Collaborator

Oh, OK - yeah, so the reproduction that doesn't require minimize-addr-in-v5 but does hit the other codepath (the second one linked above) would require a non-contiguous scope inside a function - I usually do that by hand crafting/modifying IR, though there's probably some way to force reordering with the right instructions/transforms.

So something like this:

void f1();
inline void f2() {
  f1();
  f1();
}
void f3() {
  f1();
  f2();
}

Perform inlining, then reorder the non-inlined f1 call in between the inlined ones, then you'll get a range list in the dwo file that'll use an offset_pair that'll end up requiring relocations on RISCV, I think.

mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
-gsplit-dwarf produces a .dwo file which will not be processed by the linker. If
.dwo files contain relocations, they will not be resolved. Therefore the
practice is that .dwo files do not contain relocations.

Address ranges and location description need to use forms/entry kinds indexing
into .debug_addr (DW_FORM_addrx/DW_RLE_startx_endx/etc), which is currently not
implemented.

There is a difficult-to-read MC error with -gsplit-dwarf with RISC-V for both -mrelax and -mno-relax.
```
% clang --target=riscv64-linux-gnu -g -gsplit-dwarf -c a.c
error: A dwo section may not contain relocations
```

We expect to fix -mno-relax soon, so report a driver error for -mrelax for now.

Link: llvm/llvm-project#56642

Reviewed By: compnerd, kito-cheng

Differential Revision: https://reviews.llvm.org/D130190
@kxxt
Copy link

kxxt commented Apr 6, 2023

On riscv64gc linux, with rust > 1.64, building a crate(proc-macro2) in dev profile with option split-debuginfo="unpacked" fails with the same error.

error: A dwo section may not contain relocations
  • For rust <=1.64, cargo didn't pass -C split-debuginfo=unpacked flag to rustc for some reason that I don't know.
    • For rust 1.64, a segmentation fault happens in rustc.

@jrtc27
Copy link
Collaborator

jrtc27 commented Apr 6, 2023

Don’t do that then? Split dwarf doesn’t work with linker relaxations, and clang will stop you enabling it. If rustc lets you do it then that’s rustc’s problem.

@dwblaikie
Copy link
Collaborator

Split DWARF isn't fundamentally incapable of handling linker relaxation - it's bugs that can be fixed & would be true without split DWARF too... - that we can't use absolute values to represent the distance from one label to another in cases where that difference may be subject to relaxation.

@MaskRay
Copy link
Member

MaskRay commented May 15, 2023

-gsplit-dwarf is not fundamentally incompatible with RISC-V linker relaxation, but we need to use DW_RLE_startx_endx in *.dwo files.

I think the assembler architecture for linker relaxation needs more work to not become a bottleneck fixing debug information issues.

After .debug_line/.debug_frame fix https://reviews.llvm.org/D150004 , I will think about end a relaxable instruction with a new MCFragment in the MC layer, and then come to this issue.

@nickdesaulniers nickdesaulniers assigned MaskRay and unassigned compnerd Aug 16, 2023
arichardson pushed a commit to CTSRD-CHERI/llvm-project that referenced this issue Jan 9, 2024
-gsplit-dwarf produces a .dwo file which will not be processed by the linker. If
.dwo files contain relocations, they will not be resolved. Therefore the
practice is that .dwo files do not contain relocations.

Address ranges and location description need to use forms/entry kinds indexing
into .debug_addr (DW_FORM_addrx/DW_RLE_startx_endx/etc), which is currently not
implemented.

There is a difficult-to-read MC error with -gsplit-dwarf with RISC-V for both -mrelax and -mno-relax.
```
% clang --target=riscv64-linux-gnu -g -gsplit-dwarf -c a.c
error: A dwo section may not contain relocations
```

We expect to fix -mno-relax soon, so report a driver error for -mrelax for now.

Link: llvm/llvm-project#56642

Reviewed By: compnerd, kito-cheng

Differential Revision: https://reviews.llvm.org/D130190
kxxt added a commit to kxxt/rust that referenced this issue Jan 31, 2024
Disable packed/unpacked options for riscv linux/android.
Other riscv targets already only have the off option.

The packed/unpacked options might be supported in the future.
See upstream issue for more details:
llvm/llvm-project#56642

Fixes rust-lang#110224
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 3, 2024
…piler-errors

riscv only supports split_debuginfo=off for now

Disable packed/unpacked options for riscv linux/android. Other riscv targets already only have the off option.

The packed/unpacked options might be supported in the future. See upstream issue for more details:
llvm/llvm-project#56642

Fixes rust-lang#110224
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 4, 2024
…piler-errors

riscv only supports split_debuginfo=off for now

Disable packed/unpacked options for riscv linux/android. Other riscv targets already only have the off option.

The packed/unpacked options might be supported in the future. See upstream issue for more details:
llvm/llvm-project#56642

Fixes rust-lang#110224
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 5, 2024
…piler-errors

riscv only supports split_debuginfo=off for now

Disable packed/unpacked options for riscv linux/android. Other riscv targets already only have the off option.

The packed/unpacked options might be supported in the future. See upstream issue for more details:
llvm/llvm-project#56642

Fixes rust-lang#110224
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 5, 2024
…piler-errors

riscv only supports split_debuginfo=off for now

Disable packed/unpacked options for riscv linux/android. Other riscv targets already only have the off option.

The packed/unpacked options might be supported in the future. See upstream issue for more details:
llvm/llvm-project#56642

Fixes rust-lang#110224
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 5, 2024
Rollup merge of rust-lang#120518 - kxxt:riscv-split-debug-info, r=compiler-errors

riscv only supports split_debuginfo=off for now

Disable packed/unpacked options for riscv linux/android. Other riscv targets already only have the off option.

The packed/unpacked options might be supported in the future. See upstream issue for more details:
llvm/llvm-project#56642

Fixes rust-lang#110224
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jul 19, 2024
…sable-split-debuginfo, r=jieyouxu

Disable run-make/split-debuginfo test for RISC-V 64

Together with `@Hoverbear,` we've been improving the state of the riscv64gc-unknown-linux-gnu target.

This is in relation to rust-lang#125220 where `tests/ui/debuginfo/debuginfo-emit-llvm-ir-and-split-debuginfo.rs` was disabled for RISC-V 64 in that another test, `tests/run-make/split-debuginfo` also needs to be disabled due to llvm/llvm-project#56642 and the changes made in rust-lang#120518.

This test appears to be a host test, not a target test, so it isn't seen failing in rust-lang#126641, however, we are in the process of testing host tools for riscv64-gc-unknown-linux-gnu so this test has now been noticed to be a problem.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jul 19, 2024
Rollup merge of rust-lang#127967 - joshua-zivkovic:joshua-zivkovic/disable-split-debuginfo, r=jieyouxu

Disable run-make/split-debuginfo test for RISC-V 64

Together with `@Hoverbear,` we've been improving the state of the riscv64gc-unknown-linux-gnu target.

This is in relation to rust-lang#125220 where `tests/ui/debuginfo/debuginfo-emit-llvm-ir-and-split-debuginfo.rs` was disabled for RISC-V 64 in that another test, `tests/run-make/split-debuginfo` also needs to be disabled due to llvm/llvm-project#56642 and the changes made in rust-lang#120518.

This test appears to be a host test, not a target test, so it isn't seen failing in rust-lang#126641, however, we are in the process of testing host tools for riscv64-gc-unknown-linux-gnu so this test has now been noticed to be a problem.
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Jul 20, 2024
…it-debuginfo, r=jieyouxu

Disable run-make/split-debuginfo test for RISC-V 64

Together with `@Hoverbear,` we've been improving the state of the riscv64gc-unknown-linux-gnu target.

This is in relation to rust-lang/rust#125220 where `tests/ui/debuginfo/debuginfo-emit-llvm-ir-and-split-debuginfo.rs` was disabled for RISC-V 64 in that another test, `tests/run-make/split-debuginfo` also needs to be disabled due to llvm/llvm-project#56642 and the changes made in rust-lang/rust#120518.

This test appears to be a host test, not a target test, so it isn't seen failing in rust-lang/rust#126641, however, we are in the process of testing host tools for riscv64-gc-unknown-linux-gnu so this test has now been noticed to be a problem.
@asb
Copy link
Contributor

asb commented Oct 9, 2024

-gsplit-dwarf is not fundamentally incompatible with RISC-V linker relaxation, but we need to use DW_RLE_startx_endx in *.dwo files.

I think the assembler architecture for linker relaxation needs more work to not become a bottleneck fixing debug information issues.

After .debug_line/.debug_frame fix https://reviews.llvm.org/D150004 , I will think about end a relaxable instruction with a new MCFragment in the MC layer, and then come to this issue.

Hi @MaskRay, I Just wondered what your thoughts are on the current situation for -gsplit-dwarf and RISC-V linker relaxation. i.e. to what degree is more reworking of the MC layer needed vs this being "just" a not yet implemented feature?

@dwblaikie
Copy link
Collaborator

FWIW, my take on it would be that MC should be able to answer the question "is this label difference resolvable at assembly-time" and then the DWARF emitter (lib/CodeGen/AsmPrinter/Dwarf*) can use that query to decide which representations to use most efficiently.

This could/should/would also be used, for instance ,for PROPELLER (@tmsri FYI, we've talked about this before) if/when it's supporting relaxation.

@tmsri
Copy link
Member

tmsri commented Oct 9, 2024

FWIW, my take on it would be that MC should be able to answer the question "is this label difference resolvable at assembly-time" and then the DWARF emitter (lib/CodeGen/AsmPrinter/Dwarf*) can use that query to decide which representations to use most efficiently.

This could/should/would also be used, for instance ,for PROPELLER (@tmsri FYI, we've talked about this before) if/when it's supporting relaxation.

Correct. With basic block sections, linker relaxation could shrink branches changing the size of the block. The label differences would not be resolved at assembly-time in that case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

10 participants