panic: index out of bounds in link.MachO.ZldAtom.resolveRelocsArm64 #14559

Closed
winterqt opened this issue Feb 5, 2023 · 19 comments · Fixed by #14575
Labels
bug (Observed behavior contradicts documented or intended behavior), linking, os-macos

@winterqt
Contributor

winterqt commented Feb 5, 2023

Zig Version

0.10.1

Steps to Reproduce and Observed Behavior

After fixing the @alignCast issue in #14558, stage3 fails to build on aarch64-darwin with a different error:

thread 666841 panic: index out of bounds
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/link/MachO/ZldAtom.zig:793:53: 0x105a2bd27 in link.MachO.ZldAtom.resolveRelocsArm64 (zig2)
                    mem.readIntLittle(i64, atom_code[rel_offset..][0..8])
                                                    ^
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/link/MachO/ZldAtom.zig:483:39: 0x105a2719f in link.MachO.ZldAtom.resolveRelocs (zig2)
        .aarch64 => resolveRelocsArm64(zld, atom_index, atom_code, atom_relocs, reverse_lookup, ctx),
                                      ^
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/link/MachO/zld.zig:1736:43: 0x1056db723 in link.MachO.zld.Zld.writeAtoms (zig2)
                    try Atom.resolveRelocs(
                                          ^
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/link/MachO/zld.zig:4290:27: 0x10545e997 in link.MachO.zld.linkWithZld (zig2)
        try zld.writeAtoms(reverse_lookups);
                          ^
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/link/MachO.zig:427:44: 0x10519e6d7 in link.MachO.flush (zig2)
        .one_shot => return zld.linkWithZld(self, comp, prog_node),
                                           ^
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/link.zig:797:72: 0x10519b4b3 in link.File.flush (zig2)
            .macho => return @fieldParentPtr(MachO, "base", base).flush(comp, prog_node),
                                                                       ^
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/Compilation.zig:2516:24: 0x105114a87 in Compilation.flush (zig2)
    comp.bin_file.flush(comp, prog_node) catch |err| switch (err) {
                       ^
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/Compilation.zig:2480:27: 0x105107bbf in Compilation.update (zig2)
            try comp.flush(main_progress_node);
                          ^
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/main.zig:3361:20: 0x1050898af in main.updateModule (zig2)
    try comp.update();
                   ^
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/main.zig:3028:17: 0x104fc60f7 in main.buildOutputType (zig2)
    updateModule(gpa, comp, hook) catch |err| switch (err) {
                ^
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/main.zig:230:31: 0x104f4efc3 in main.mainArgs (zig2)
        return buildOutputType(gpa, arena, args, .{ .build = .Exe });
                              ^
/private/tmp/nix-build-zig-0.10.1.drv-0/source/src/stage1.zig:56:24: 0x104f4e8fb in main (zig2)
        stage2.mainArgs(gpa, arena, args) catch unreachable;
                       ^

Expected Behavior

For it to link correctly :)

@winterqt winterqt added the bug (Observed behavior contradicts documented or intended behavior) label Feb 5, 2023
@andrewrk andrewrk added this to the 0.11.0 milestone Feb 5, 2023
@andrewrk
Member

andrewrk commented Feb 5, 2023

@kubkon would you mind taking a look at this? It is reported against 0.10.1 but I believe the issue may still be present in master branch. @winterqt is working on packaging zig 0.10.1 for nixpkgs and has run into this problem. It would behoove us and our users if we could provide a workaround patch for nixpkgs to carry to resolve this issue until they can update to 0.11.0.

@winterqt

This comment was marked as resolved.

@kubkon
Member

kubkon commented Feb 5, 2023

Sure, but I need more than just a panic stack trace :-) Are there any steps I could follow to even try and repro the issue locally? Otherwise, it's not much to go on sadly.

@kubkon
Member

kubkon commented Feb 5, 2023

Here's my hunch about what is happening. Packaging for nixpkgs generates an object file with a funky-looking symtab, which is then packaged into an archive and put on the linker line as per #14558. This manifests itself as an alignment panic which, if circumvented (downgraded to align(1)), lets the linker advance further until it fails with an out-of-bounds access when pulling relocations for some subsection of that object file, either because they are not there or for some other reason. If it were possible for me to repro the issue, or at the very least to learn which archive/object file causes the original alignment panic and have it available for inspection, I should be able to fix the issue relatively quickly.

@kubkon kubkon closed this as completed Feb 5, 2023
@kubkon kubkon reopened this Feb 5, 2023
@kubkon
Member

kubkon commented Feb 5, 2023

(Oops, apologies for accidentally closing the issue!)

@winterqt
Contributor Author

winterqt commented Feb 5, 2023

@kubkon You can reproduce this by installing Nix, cloning Nixpkgs, and running NIXPKGS_ALLOW_BROKEN=1 nix-build -A zig_0_10. You can add patches by adding patches = [ ./foo.nix ] to pkgs/development/compilers/zig/0.10.nix, where foo.nix is in pkgs/development/compilers/zig.

If you'd rather not install Nix, let me know -- I have a machine I can give you access to that has it installed.

If there's anything that you need further clarification on, let me know, and thank you both for helping us out.
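
The reproduction steps above can be sketched as a small shell function. This is a minimal sketch, assuming Nix and git are already installed; the function name, the `NIXPKGS_DIR` variable, and the clone URL are illustrative assumptions and are not stated in the thread.

```shell
# Reproduction sketch for the zig_0_10 build failure.
# Assumptions (not from the thread): the function name, the NIXPKGS_DIR
# variable, and the GitHub clone URL for Nixpkgs.
repro_zig_0_10() {
  nixpkgs_dir="${NIXPKGS_DIR:-nixpkgs}"
  # Clone Nixpkgs only if the checkout does not exist yet.
  [ -d "$nixpkgs_dir" ] || git clone https://github.com/NixOS/nixpkgs.git "$nixpkgs_dir" || return 1
  # NIXPKGS_ALLOW_BROKEN=1 lets Nix evaluate packages that are marked broken.
  (cd "$nixpkgs_dir" && NIXPKGS_ALLOW_BROKEN=1 nix-build -A zig_0_10)
}
```

Calling `repro_zig_0_10` from an empty directory clones Nixpkgs into `./nixpkgs` and starts the build; setting `NIXPKGS_DIR` points it at an existing checkout instead.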

@kubkon
Member

kubkon commented Feb 5, 2023

Nice, thank you so much for the repro! I'll get right on it!

@kubkon
Member

kubkon commented Feb 5, 2023

Is there any trick to getting nix to find the includes? I am currently getting this:

/tmp/nix-build-zig-0.10.1.drv-0/source/lib/libcxx/include/stdlib.h:93:15: fatal error: 'stdlib.h' file not found

Perhaps an env var I need to set or something?

@winterqt
Contributor Author

winterqt commented Feb 5, 2023

Hmm, I haven't run into that.

What modifications did you make, if any, or is this a clean tree? Also, just to be sure, this is on aarch64, and not x86_64, right?

@kubkon
Member

kubkon commented Feb 5, 2023

A clean tree on aarch64-macos yep.

@winterqt
Contributor Author

winterqt commented Feb 5, 2023

Hm, my hunch without being able to look at anything else at the moment would be to try and enable the sandbox (though I have no idea why this would be the case, it really shouldn't do anything).

Can you add sandbox = true to /etc/nix/nix.conf, reboot (or restart org.nixos.nix-daemon via launchd), and try again?

That would be the only difference between our machines... 😕 and if that is the issue, then we have a problem on our hands.

If that doesn't fix it, I can try to repro, though I have no clue how I'd even begin doing that (since I never ran into this).
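
The sandbox change suggested above can be sketched as a small helper. This is a hedged sketch only: the function name and its optional path argument are illustrative assumptions, and it deliberately refuses to touch a file that already has a `sandbox` setting.

```shell
# Sketch: add "sandbox = true" to a nix.conf file.
# Assumptions (not from the thread): the helper name and the optional
# path argument, which defaults to /etc/nix/nix.conf.
enable_nix_sandbox() {
  conf="${1:-/etc/nix/nix.conf}"
  # Leave the file alone if it already has a sandbox setting (true or false).
  if grep -Eq '^[[:space:]]*sandbox[[:space:]]*=' "$conf" 2>/dev/null; then
    echo "sandbox is already set in $conf; edit that line by hand" >&2
    return 1
  fi
  printf 'sandbox = true\n' >> "$conf"
}
```

Writing to `/etc/nix/nix.conf` normally requires root. After the change, the daemon has to be restarted as described above; on macOS, `sudo launchctl kickstart -k system/org.nixos.nix-daemon` is one way to do that, assuming the daemon is registered under that label.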

@kubkon
Member

kubkon commented Feb 5, 2023

Thanks, that was it! Woohoo, got a repro!

@kubkon
Member

kubkon commented Feb 5, 2023

@winterqt just to keep you in the loop: I fixed it today. I will now clean it up, submit a PR, and afterwards submit a patch against 0.10.1 to nixpkgs. This should make 0.10.1 available in nix on darwin before the 0.11 release.
[Image: Screenshot 2023-02-05 at 19 21 41]

@kubkon
Member

kubkon commented Feb 5, 2023

I will also make sure that nix is capable of building master branch too.

@winterqt
Contributor Author

winterqt commented Feb 5, 2023

@kubkon Thanks, much appreciated.

I can confirm that master works fine already. I guess something about the WASI bootstrap makes this issue not occur, as it fails before the merge, but succeeds afterwards. Weird, but who knows.

@kubkon
Member

kubkon commented Feb 5, 2023

> @kubkon Thanks, much appreciated.
>
> I can confirm that master works fine already. I guess something about the WASI bootstrap makes this issue not occur, as it fails before the merge, but succeeds afterwards. Weird, but who knows.

Well, it was a genuine bug on my part: I made certain assumptions about the compiler's behaviour that I shouldn't have (namely, assumptions about the order of relocations in the object file, in case you are interested).

@kubkon
Member

kubkon commented Feb 5, 2023

Also, thanks for confirming master works. Btw, I have noticed we don't pass either -DZIG_STATIC_LLVM=ON or -DZIG_SHARED_LLVM=ON in Nix's derivation, which may cause a mixup of static archives for libLLVM and dylibs for libclang, causing Zig to refuse to start up.

@winterqt
Contributor Author

winterqt commented Feb 5, 2023

> Btw, I have noticed we don't pass either -DZIG_STATIC_LLVM=ON or -DZIG_SHARED_LLVM=ON in Nix's derivation, which may cause a mixup of static archives for libLLVM and dylibs for libclang, causing Zig to refuse to start up.

We also don't pass this in our 0.9.1 derivation (which works fine on macOS and Linux). Should we be doing that for both 0.9.1 and 0.10.1 even if it just happens to work? (Why would it work without it... luck?)

If you think this is the case, feel free to do it in a separate commit in your PR.

@kubkon
Member

kubkon commented Feb 5, 2023

Detecting this at runtime by Zig has been a recent change (0.10 onwards) and yes, if it was working so far then it was luck in the way llvm-config reported the libs as either all static or dynamic. Anyhow, my suggestion would be to force static linking of LLVM at all times.
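
To illustrate the suggestion above, here is a minimal sketch of a helper that yields exactly one LLVM linkage option for Zig's CMake configure step, defaulting to static. The helper name is an illustrative assumption; the two `-D` options are the ones named earlier in the thread.

```shell
# Sketch: choose exactly one LLVM linkage flag so static and shared
# LLVM libraries are never mixed. The helper name is hypothetical.
zig_llvm_flag() {
  case "${1:-static}" in
    static) echo "-DZIG_STATIC_LLVM=ON" ;;
    shared) echo "-DZIG_SHARED_LLVM=ON" ;;
    *) echo "usage: zig_llvm_flag static|shared" >&2; return 2 ;;
  esac
}
```

In a manual build this would be used as `cmake .. "$(zig_llvm_flag static)"`. In a Nix derivation the same option would presumably go into `cmakeFlags` rather than onto a command line.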

kubkon added a commit to kubkon/nixpkgs that referenced this issue Mar 14, 2023
Relevant upstream issue: ziglang/zig#14559

The patch is a backport of fixes that landed in zig-master and can
be removed with zig-0.11 release.

Additionally, make sure we link statically against LLVM to avoid
unpleasant runtime surprises originating from mixing static and
dynamic LLVM libraries.

Finally, unbreak Zig 0.10.1 on macOS.