Linking error with Xcode 15 invalid r_symbolnum #1456

Closed
ehuss opened this issue Feb 13, 2024 · 16 comments
Labels
C-bug (Category: This is a bug), help wanted (Extra attention is needed), O-macos (Operating system: MacOS)

Comments

@ehuss

ehuss commented Feb 13, 2024

Trying to run the cranelift testsuite on an x86_64 macOS system with Xcode 15 results in an error that looks roughly like this:

  = note: ld: warning: search path '/Users/eric/Proj/rust/rustc_codegen_cranelift/./build/rtstartup/lib/rustlib/x86_64-apple-darwin/lib' not found
          ld: warning: search path '/Users/eric/Proj/rust/rustc_codegen_cranelift/./build/rtstartup/lib/rustlib/x86_64-apple-darwin/lib' not found
          ld: multiple errors: invalid r_symbolnum=16 in '/Users/eric/Proj/rust/rustc_codegen_cranelift/build/stdlib_target/x86_64-apple-darwin/release/deps/std-6043c0e15839b1b8.37dz0diu3mok62ud.rcgu.o'; invalid r_symbolnum=20 in '/Users/eric/Proj/rust/rustc_codegen_cranelift/build/stdlib_target/x86_64-apple-darwin/release/deps/std-6043c0e15839b1b8.3cpybd90f24iafu4.rcgu.o'; invalid r_symbolnum=11 in '/Users/eric/Proj/rust/rustc_codegen_cranelift/build/stdlib_target/x86_64-apple-darwin/release/deps/std-6043c0e15839b1b8.5dobmwxxuzv3ys1m.rcgu.o'
          clang: error: linker command failed with exit code 1 (use -v to see invocation)

This is fairly easily reproducible, though it doesn't seem to happen 100% of the time. This is a blocker for rust-lang/rust updating the Xcode version.

So far I have tested Xcode 15.0 and 15.2. Notably, Xcode 15 introduced ld-prime, Apple's new linker. While investigating, it may help to set the linker flag -ld_classic to determine whether the new linker is a factor (sorry, I'm not familiar enough with cranelift's setup to help more).

Tested at commit cdae185.

@bjorn3
Member

bjorn3 commented Apr 6, 2024

The new linker indeed plays a role. With Xcode 15 it only reproduces when -ld_classic isn't passed.
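
For a Cargo package that hits the same error, one way to check whether the classic linker avoids it is a build script. The following is a minimal sketch, not part of the cranelift build: the function names are hypothetical, and it assumes the compiler driver forwards `-Wl,` arguments to ld. Linkers that predate Xcode 15 may reject `-ld_classic`, so this is only meant for narrowing down the problem.

```rust
// Sketch of a build.rs helper (hypothetical names): ask Cargo to pass
// -ld_classic to the linker when the target OS is macOS.
use std::env;

/// Returns the Cargo build-script directives to emit for the given target OS.
pub fn link_directives(target_os: &str) -> Vec<String> {
    if target_os == "macos" {
        // `-Wl,` forwards the flag from the compiler driver (cc/clang) to ld.
        vec!["cargo:rustc-link-arg=-Wl,-ld_classic".to_string()]
    } else {
        Vec::new()
    }
}

/// Call this from `main` in build.rs. Cargo sets CARGO_CFG_TARGET_OS for
/// build scripts; outside of Cargo the variable is absent and nothing is printed.
pub fn print_link_directives() {
    let target_os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    for directive in link_directives(&target_os) {
        println!("{directive}");
    }
}
```

The directive only affects targets of the package that owns the build script, so it does not change how dependencies are linked.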

@bjorn3
Member

bjorn3 commented Apr 6, 2024

If someone with a Mac can reproduce this and figure out which function(s) the relocations that ld-prime complains about apply to, that would be very much appreciated. I don't have a Mac myself, so I can't reproduce it locally.

Note: arm64 macOS is not supported yet, so if you want to reproduce it on Apple Silicon you need to run the following to use the x86_64 toolchain in Rosetta 2:

rustup toolchain install --force-non-host nightly-2024-03-30-x86_64-apple-darwin
rustup override set nightly-2024-03-30-x86_64-apple-darwin

bjorn3 added the C-bug, help wanted, and O-macos labels Apr 6, 2024
@bjorn3
Member

bjorn3 commented Apr 28, 2024

GHA has started using an Xcode version with ld-prime by default. I had to downgrade it in 05367c5.

@philipc
Contributor

philipc commented Apr 29, 2024

I get this error:
[BUILD] mini_core
error: linking with `cc` failed: exit status: 1
  |
  = note: env -u IPHONEOS_DEPLOYMENT_TARGET -u TVOS_DEPLOYMENT_TARGET -u XROS_DEPLOYMENT_TARGET LC_ALL="C" PATH="/Users/user937765/rustc_codegen_cranelift/./dist/lib/rustlib/aarch64-apple-darwin/bin:/Users/user937765/.cargo/bin:/opt/homebrew/opt/ruby/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/share/dotnet:/usr/local/munki:~/.dotnet/tools:/Library/Apple/usr/bin:/Library/Frameworks/Mono.framework/Versions/Current/Commands:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin" VSLANG="1033" ZERO_AR_DATE="1" "cc" "-Wl,-exported_symbols_list,/var/folders/08/phksnxt15gd837rxj36b7pwm0000lm/T/rustcoIiXtD/list" "-arch" "arm64" "/var/folders/08/phksnxt15gd837rxj36b7pwm0000lm/T/rustcoIiXtD/symbols.o" "/Users/user937765/rustc_codegen_cranelift/./build/example/mini_core.mini_core.44353e37ad17544c-cgu.0.rcgu.o" "/Users/user937765/rustc_codegen_cranelift/./build/example/mini_core.4n5kch4f68x303nu.rcgu.rmeta" "-L" "/Users/user937765/rustc_codegen_cranelift/./build/example" "-L" "/Users/user937765/rustc_codegen_cranelift/./dist/lib/rustlib/aarch64-apple-darwin/lib" "-lc" "-L" "/Users/user937765/rustc_codegen_cranelift/./dist/lib/rustlib/aarch64-apple-darwin/lib" "-o" "/Users/user937765/rustc_codegen_cranelift/./build/example/libmini_core.dylib" "-Wl,-dead_strip" "-dynamiclib" "-Wl,-dylib" "-nodefaultlibs" "-undefined" "dynamic_lookup"
  = note: ld: warning: search path '/Users/user937765/rustc_codegen_cranelift/./dist/lib/rustlib/aarch64-apple-darwin/lib' not found
          ld: warning: search path '/Users/user937765/rustc_codegen_cranelift/./dist/lib/rustlib/aarch64-apple-darwin/lib' not found
          0  0x10298ef2c  __assert_rtn + 72
          1  0x1028d5ec4  ld::InputFiles::SliceParser::parseObjectFile(mach_o::Header const*) const + 22976
          2  0x1028e2404  ld::InputFiles::parseAllFiles(void (ld::AtomFile const*) block_pointer)::$_7::operator()(unsigned long, ld::FileInfo const&) const + 420
          3  0x184d30440  _dispatch_client_callout2 + 20
          4  0x184d43f1c  _dispatch_apply_invoke + 224
          5  0x184d30400  _dispatch_client_callout + 20
          6  0x184d41fb8  _dispatch_root_queue_drain + 684
          7  0x184d426c0  _dispatch_worker_thread2 + 164
          8  0x184edc038  _pthread_wqthread + 228
          ld: Assertion failed: (pattern[0].addrMode == addr_other), function addFixupFromRelocations, file Relocations.cpp, line 700.
          clang: error: linker command failed with exit code 1 (use -v to see invocation)
          

error: aborting due to 1 previous error

"/Users/user937765/rustc_codegen_cranelift/./dist/rustc-clif" "-Clink-arg=-undefined" "-Clink-arg=dynamic_lookup" "-L" "crate=/Users/user937765/rustc_codegen_cranelift/./build/example" "--out-dir" "/Users/user937765/rustc_codegen_cranelift/./build/example" "-Cdebuginfo=2" "--target" "aarch64-apple-darwin" "-Cpanic=abort" "-Zunstable-options" "--check-cfg=cfg(no_unstable_features)" "--check-cfg=cfg(jit)" "example/mini_core.rs" "--crate-type" "lib,dylib" exited with status ExitStatus(unix_wait_status(256))

I'll look into it more, but do you have any tips on how to get the information you need?

@bjorn3
Member

bjorn3 commented Apr 29, 2024

I think the first step would be to try trimming down the crate it fails to link (in this case example/mini_core.rs) to try and narrow down which symbol it can't handle.

@philipc
Contributor

philipc commented Apr 29, 2024

Oh, I hadn't set the toolchain. Fixing that, I get this error:

  = note: ld: warning: search path '/Users/user937765/rustc_codegen_cranelift/./build/rtstartup/lib/rustlib/x86_64-apple-darwin/lib' not found
          ld: warning: search path '/Users/user937765/rustc_codegen_cranelift/./build/rtstartup/lib/rustlib/x86_64-apple-darwin/lib' not found
          ld: multiple errors: invalid r_symbolnum=3 in '/Users/user937765/rustc_codegen_cranelift/build/stdlib_target/x86_64-apple-darwin/release/deps/std-6bec33c9e234de93.353qo5q157z8btsm.rcgu.o'; invalid r_symbolnum=4 in '/Users/user937765/rustc_codegen_cranelift/build/stdlib_target/x86_64-apple-darwin/release/deps/std-6bec33c9e234de93.3h2cmphinuqs2z50.rcgu.o'

Looking in those object files, there is only one relocation for each reported r_symbolnum. Here are the relocations and their associated symbols and sections:

RelocationInfo {
    Address: 0x1DF
    Extern: yes
    Symbol: "_.Ldata0" (0x3)
    PcRel: yes
    Length: 2
    Type: X86_64_RELOC_GOT (0x4)
}
Nlist {
    Index: 3
    String: "_.Ldata0" (0x651)
    Type: N_SECT (0xE)
    Section: "__TEXT,__const" (0x2)
    Desc: 0x0
    Value: 0x20
}
Section {
    Index: 2
    SectionName: "__const"
    SegmentName: "__TEXT"
    Address: 0x20
    Size: 0x0
    Offset: 0x370
    Align: 0x0
    RelocationOffset: 0x0
    NumberOfRelocations: 0x0
    Flags: S_REGULAR (0x0)
}

RelocationInfo {
    Address: 0x7F
    Extern: yes
    Symbol: "_.Ldata2" (0x4)
    PcRel: yes
    Length: 2
    Type: X86_64_RELOC_GOT (0x4)
}
Nlist {
    Index: 4
    String: "_.Ldata2" (0x4BD)
    Type: N_SECT (0xE)
    Section: "__TEXT,__const" (0x1)
    Desc: 0x0
    Value: 0x0
}
Section {
    Index: 1
    SectionName: "__const"
    SegmentName: "__TEXT"
    Address: 0x0
    Size: 0x0
    Offset: 0x350
    Align: 0x0
    RelocationOffset: 0x0
    NumberOfRelocations: 0x0
    Flags: S_REGULAR (0x0)
}

Those symbols look wrong because they are in empty sections.
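
To make the observation concrete, here is a small sketch that computes how many bytes a symbol can cover inside its section, using the addresses and sizes from the dump above. The `Section` type and function names are hypothetical and only model the two fields that matter here; they are not types from the `object` crate.

```rust
/// Simplified Mach-O section header (hypothetical type for illustration only).
#[derive(Debug, Clone, Copy)]
pub struct Section {
    pub address: u64,
    pub size: u64,
}

/// Number of bytes between a symbol's address and the end of its section,
/// or `None` if the address lies outside the section.
pub fn bytes_after_symbol(symbol_value: u64, section: &Section) -> Option<u64> {
    let end = section.address.checked_add(section.size)?;
    if symbol_value < section.address || symbol_value > end {
        return None;
    }
    Some(end - symbol_value)
}

/// The two (symbol value, section) pairs from the dump above.
pub fn dumped_pairs() -> [(u64, Section); 2] {
    [
        (0x20, Section { address: 0x20, size: 0x0 }), // _.Ldata0
        (0x0, Section { address: 0x0, size: 0x0 }),   // _.Ldata2
    ]
}
```

Both dumped symbols yield `Some(0)`: each one sits at the start of a section that holds no bytes, so the GOT relocations refer to symbols that cover no data.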

@bjorn3
Member

bjorn3 commented Apr 29, 2024

> Those symbols look wrong because they are in empty sections.

If a data object is empty, that will create a symbol that points into an empty section. It looks like LLVM adds a dummy byte in that case. Is this required by Mach-O?

@philipc
Contributor

philipc commented Apr 30, 2024

> If a data object is empty, that will create a symbol to an empty section. Looks like LLVM adds a dummy byte in that case.

How did you determine that? The symbols I gave above were for uses of "". I haven't yet come up with a test case where rustc emits a symbol for "" on x86_64 using llvm, but for aarch64, rustc emits a length 1 symbol for "", both when it is the only symbol in the section, and when there is another symbol following it.

> Is this required by Mach-O?

I don't know, but I don't understand how MH_SUBSECTIONS_VIA_SYMBOLS could work with zero size symbols. If two symbols have the same address, how does it determine which one is the zero size subsection?
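
The ambiguity can be sketched with a simplified model of symbol-based splitting, in which each piece of a section runs from one symbol's address to the next symbol's address (or to the end of the section). This is an assumption made for illustration, not ld's actual algorithm, and `split_by_symbols` is a hypothetical helper.

```rust
/// Splits a section into (symbol name, size) pieces. Each piece runs from a
/// symbol's address to the address of the next symbol, or to `section_end`
/// for the last symbol. Simplified model, not the real linker logic.
pub fn split_by_symbols(symbols: &[(&str, u64)], section_end: u64) -> Vec<(String, u64)> {
    let mut sorted: Vec<(&str, u64)> = symbols.to_vec();
    // `sort_by_key` is stable, so symbols at the same address keep input order.
    sorted.sort_by_key(|&(_, addr)| addr);
    sorted
        .iter()
        .enumerate()
        .map(|(i, &(name, addr))| {
            let next = sorted.get(i + 1).map_or(section_end, |&(_, a)| a);
            (name.to_string(), next.saturating_sub(addr))
        })
        .collect()
}
```

With symbols `a` and `b` both at address 0 and `c` at address 4 in an 8-byte section, this model gives `a` a size of 0 and `b` a size of 4. Swapping the input order of `a` and `b` swaps which one ends up empty, so nothing but ordering decides which symbol is the zero-size one.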

@bjorn3
Member

bjorn3 commented Apr 30, 2024

> How did you determine that?

I created a static with a ZST as its type: https://rust.godbolt.org/z/4f1f8eEj6

> The symbols I gave above were for uses of ""

Anonymous allocations and statics are handled differently. Anonymous allocations get unnamed_addr and in some cases cg_llvm entirely skips creating a global variable for anonymous zero sized allocations.

@philipc
Contributor

philipc commented Apr 30, 2024

"" can be static too though, right? Either of these will use a zero sized _.Ldata1 and reproduce the problem:

pub static FOO: &'static () = &();
pub static FOO: &'static str = "";

Is this enough info to fix the problem? Here's a hack to object::write::Object::add_symbol_data that passes the tests:

diff --git a/src/write/mod.rs b/src/write/mod.rs
index 198ef5e..39cc988 100644
--- a/src/write/mod.rs
+++ b/src/write/mod.rs
@@ -488,9 +488,12 @@ impl<'a> Object<'a> {
         &mut self,
         symbol_id: SymbolId,
         section: SectionId,
-        data: &[u8],
+        mut data: &[u8],
         align: u64,
     ) -> u64 {
+        if data.is_empty() {
+            data = &[0];
+        }
         let offset = self.append_section_data(section, data, align);
         self.set_symbol_data(symbol_id, section, offset, data.len() as u64);
         offset

I'm not sure if this is the right place to fix this, and if it is we probably should only do this for symbols in Mach-O when using subsections. (Related to that, I notice cranelift never uses subsections for data, but for Mach-O MH_SUBSECTIONS_VIA_SYMBOLS applies to both text and data, so there might be ordering problems if we add data before text.)
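
As a rough sketch of what the narrower version could look like, the following self-contained model pads empty data only for Mach-O objects that use subsections via symbols. The `Format` enum, the `Section` struct, and this free-standing `add_symbol_data` are hypothetical stand-ins and do not match the `object` crate's real types or signatures.

```rust
/// Hypothetical stand-in for the object file format.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Elf,
    MachO,
}

/// Hypothetical stand-in for a section being written.
pub struct Section {
    pub data: Vec<u8>,
}

/// Appends `data` at the requested alignment and returns the symbol's
/// (offset, size). `align` is assumed to be a nonzero power of two.
pub fn add_symbol_data(
    section: &mut Section,
    format: Format,
    subsections_via_symbols: bool,
    data: &[u8],
    align: usize,
) -> (usize, usize) {
    // Only Mach-O objects that are split by symbol get the dummy byte.
    let needs_padding =
        data.is_empty() && format == Format::MachO && subsections_via_symbols;
    let data: &[u8] = if needs_padding { &[0] } else { data };
    // Round the current length up to the alignment, then append.
    let offset = (section.data.len() + align - 1) & !(align - 1);
    section.data.resize(offset, 0);
    section.data.extend_from_slice(data);
    (offset, data.len())
}
```

In this model, two empty Mach-O symbols land at offsets 0 and 1 with one byte each, while two empty ELF symbols still share offset 0 with size 0, so other formats keep their current layout.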

@bjorn3
Member

bjorn3 commented Apr 30, 2024

> "" can be static too though, right? Either of these will use a zero sized _.Ldata1 and reproduce the problem:

In those cases the static contains a pointer to an anonymous allocation. Only the latter is zero sized.
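
The distinction can be checked with `std::mem::size_of_val`. In this sketch the two statics from the previous comment are renamed (`FOO_UNIT` and `FOO_STR` are names chosen here) so that both fit in one file.

```rust
use std::mem::size_of_val;

// Renamed from FOO so that both statics can live in the same file.
pub static FOO_UNIT: &'static () = &();
pub static FOO_STR: &'static str = "";

/// Sizes of the statics themselves: a thin pointer and a (pointer, length) pair.
pub fn static_sizes() -> (usize, usize) {
    (size_of_val(&FOO_UNIT), size_of_val(&FOO_STR))
}

/// Sizes of the anonymous allocations that the statics point to.
pub fn pointee_sizes() -> (usize, usize) {
    (size_of_val(FOO_UNIT), size_of_val(FOO_STR))
}
```

The statics occupy one and two pointer-sized words respectively, while both pointees have size 0, so the zero-sized data object is the anonymous allocation and not the named static.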

> I'm not sure if this is the right place to fix this, and if it is we probably should only do this for symbols in Mach-O when using subsections.

Makes sense to me to do it there and in add_subsection and I agree it should only be done when using subsections.

> I notice cranelift never uses subsections for data

That is because using subsections for data is not yet implemented: bytecodealliance/wasmtime#2368

@bjorn3
Member

bjorn3 commented Jun 1, 2024

There may be another bug somewhere. In https://bytecodealliance.zulipchat.com/#narrow/stream/217117-cranelift/topic/Minimal.20macos.20executable.20compilation someone hit a crash of the new linker in some code that directly uses cranelift-object without any empty data objects.

Edit: bytecodealliance/wasmtime#8730

@bjorn3
Member

bjorn3 commented Jun 20, 2024

With object 0.36 and cranelift 0.109 I'm now getting

            0  0x100f3f648  __assert_rtn + 72
            1  0x100e8b0f4  ld::InputFiles::SliceParser::parseObjectFile(mach_o::Header const*) const + 21260
            2  0x100e96e30  ld::InputFiles::parseAllFiles(void (ld::AtomFile const*) block_pointer)::$_7::operator()(unsigned long, ld::FileInfo const&) const + 420
            3  0x198c0e428  _dispatch_client_callout2 + 20
            4  0x198c22850  _dispatch_apply_invoke3 + 336
            5  0x198c0e3e8  _dispatch_client_callout + 20
            6  0x198c0fc68  _dispatch_once_callout + 32
            7  0x198c218a4  _dispatch_apply_invoke + 252
            8  0x198c0e3e8  _dispatch_client_callout + 20
            9  0x198c20080  _dispatch_root_queue_drain + 864
            10  0x198c206b8  _dispatch_worker_thread2 + 156
            11  0x198dbafd0  _pthread_wqthread + 228
            ld: Assertion failed: (pattern[0].addrMode == addr_other), function addFixupFromRelocations, file Relocations.cpp, line 698.
            clang: error: linker command failed with exit code 1 (use -v to see invocation)

https://github.com/rust-lang/rustc_codegen_cranelift/actions/runs/9601537631/job/26480325144

Edit: This is likely the same as bytecodealliance/wasmtime#8730

@philipc
Contributor

philipc commented Jun 20, 2024

That job is for aarch64. It should be working for x86_64.

@bjorn3
Member

bjorn3 commented Jun 21, 2024

@bjorn3
Member

bjorn3 commented Jun 29, 2024

3d54358 fixes the last known incompatibility with the new linker.

bjorn3 closed this as completed Jun 29, 2024