Cargo test, failed to initiate panic, error 5 #88622

Open
stefAIno opened this issue Aug 31, 2021 · 12 comments
Labels
A-libtest Area: `#[test]` / the `test` library A-panic Area: Panicking machinery C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@stefAIno

Suddenly, every panic in a test produces a fatal runtime error.
Here is the error:
fatal runtime error: failed to initiate panic, error 5
Caused by:
process didn't exit successfully: (signal: 6, SIGABRT: process abort signal)

macOS Big Sur 11.5.2
cc (Homebrew GCC 11.2.0) 11.2.0
cargo 1.54.0 (5ae8d74b3 2021-06-22)

@ehuss ehuss transferred this issue from rust-lang/cargo Sep 3, 2021
@ehuss
Contributor

ehuss commented Sep 3, 2021

Thanks for the report, I have moved it to the rust-lang/rust repository where tracking of libtest issues lives.

Can you provide a reproduction or more information? Can you examine the core dump in lldb?

@maddiemort

maddiemort commented Nov 5, 2021

I'm also experiencing this issue. I'll try to get a reproduction together.

macOS Big Sur 11.4
cargo 1.58.0-nightly (94ca096af 2021-10-29)

@maddiemort

I haven't been able to nail down a minimal repro yet, but I have identified one thing: in the codebase where I originally encountered this (which I unfortunately can't share), it seems to be triggered by an assert_eq!(result, Err(something)); that fails because something does not match the actual value inside result.

@maddiemort

Okay, this is spooky. To sum up the last hour I've spent investigating this: I was able to reproduce this issue with a crate containing only a lib.rs with the following content:

#[cfg(test)]
mod tests {
    #[test]
    fn reproduce() {
        assert!(false);
    }
}

I got to that by minimising the original codebase that caused the panic, removing code bit by bit until I was left with only that.

I then copied that folder, checked that I could still reproduce the problem in the copied folder, pushed it to a new GitHub remote, cloned that repo, and I could still reproduce the problem in the fresh clone.

Then I tried creating a new fresh Cargo project without copying any files. That project (and a fresh clone in the same style as above) did not exhibit the problem. That's even though all the files in the whole repo had the same contents as the successful reproduction.

By then I was starting to suspect that the problem was related to something other than the contents of the files, even though the files all had the same permissions and every aspect of their contents and metadata that I know how to check seemed to be the same. So I copied the lib.rs file from the unsuccessful reproduction (the fresh Cargo project that did not produce the "failed to initiate panic" message) into the successful reproduction, overwriting the lib.rs file that had been producing the message.

I could no longer reproduce the problem in the (formerly) successful reproduction project.

Git detected no changes to the file.

I cloned the successful reproduction repo again (the same thing that, before, allowed me to reliably reproduce the problem).

I could not reproduce the problem in the fresh clone.


I'm now convinced that some of the files on my machine are haunted in a way that transcends our current understanding of file metadata.

I would really appreciate some help or some advice (ideally from someone who understands the topics involved here more deeply than me - such as libtest, rustc, Git, file metadata, permissions, uhhhh... anything related to a computer, apparently?) because I'm kind of lost for anything to try next.

I do still have files/projects that will reproduce the problem. Not only does the original main codebase that caused this problem still exhibit it, I have another codebase that does too.

Here are links to my successful and unsuccessful panic reproduction repos, even though the successful one now no longer exhibits the problem for me, even in fresh clones:

https://github.com/nerosnm/successful-panic-repro
https://github.com/nerosnm/unsuccessful-panic-repro

@maddiemort

One thing I should mention about my setup, that will probably be apparent from looking at my repro repos, is that I'm using the Nix package manager on this machine (even though it's a macOS machine; I'm using nix-darwin). But this has never been a problem for me before, and I've never had this "failed to initiate panic" issue on my full NixOS machine.

@maddiemort

Sorry for spamming a bunch of comments onto this issue - this problem has really been getting under my skin (and I wanted to provide as much information as possible for anyone trying to reproduce).

I've just read through #84157, and it looks likely that it is related to my issue. I had gcc listed in my flake.nix, and removing it seems to stop the "failed to initiate panic" message from occurring (after the removal, `which gcc` reports /usr/bin/gcc instead of the path to a gcc installed by Nix).

@stefAIno I'm interested to know what `which gcc` reports for you?
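
For anyone who wants to compare setups, here is a minimal sketch of the same check done from Rust, so it can run inside the exact environment that `cargo test` sees. It lists every directory on PATH that contains a file with a given name, in lookup order. The helper name `candidates` and the list of tool names are my own assumptions for illustration, and it only checks that a file exists, not that it is executable.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Returns every `<dir>/<tool>` that exists as a file, for each directory in
/// the PATH-style string `path_var`, in lookup order. The first entry is the
/// one a shell lookup such as `which` would normally report.
/// (Hypothetical helper; it does not check the executable bit.)
fn candidates(tool: &str, path_var: &OsStr) -> Vec<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(tool))
        .filter(|p| p.is_file())
        .collect()
}

#[cfg(test)]
mod diagnostics {
    use super::*;

    // Run with `cargo test -- --nocapture` to see the output.
    #[test]
    fn print_compiler_candidates() {
        let path = env::var_os("PATH").unwrap_or_default();
        // Assumed list of names worth checking; adjust as needed.
        for tool in ["cc", "gcc", "clang"] {
            println!("{tool}: {:?}", candidates(tool, &path));
        }
    }
}
```

If the first hit for cc or gcc is not under /usr/bin on macOS, that would match what has been reported in this thread so far, though it is not proof of the cause.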

@stefAIno
Author

stefAIno commented Nov 6, 2021

Hi @nerosnm, apparently the problem was related to gcc. I solved it by switching to Clang, although I then had other problems that were unrelated to this one. I suggest you try replacing every reference you have to GCC and using the standard library offered by macOS. I hope this is helpful!

@NilsIrl

NilsIrl commented Dec 19, 2021

I uninstalled the gcc I had installed through nix and that fixed the problem (on darwin).

macOS 12.0.1 (21A559) (M1)
rustc 1.59.0-nightly (7abab1e 2021-12-17)
gcc (GCC) 11.2.0 (reliably causes the issue, even after uninstalling and reinstalling; installed from nixpkgs)

@maddiemort

maddiemort commented Jan 10, 2023

Hah, I just found this issue again when searching for the problem... I'm having the same issue again, but this time I'm on NixOS and I don't have gcc installed.

@Dylan-DPC Dylan-DPC added the C-bug Category: This is a bug. label Feb 12, 2023
@muja

muja commented Jul 19, 2023

I also have this error and I don't have gcc installed. I'm running macOS 13.4.1.

I also have a very weird side effect that the test only aborts with Err::unwrap(), and doesn't abort with None::unwrap().
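
To make that comparison easy to repeat, here is a minimal sketch that exercises both panic paths through std::panic::catch_unwind. The function names are made up for illustration, and it assumes the default panic = "unwind" setting. In a healthy environment both calls should report that the panic unwound; in an affected one I would expect the process to abort with the "failed to initiate panic" message before the function returns.

```rust
use std::panic;

/// Returns true if a panic raised by `f` unwound back to this caller.
/// If unwinding itself is broken, the process aborts before this returns.
fn panic_unwinds<F: FnOnce() + panic::UnwindSafe>(f: F) -> bool {
    panic::catch_unwind(f).is_err()
}

/// Panics by unwrapping a `None`.
fn unwrap_none() {
    let v: Option<u32> = None;
    let _ = v.unwrap();
}

/// Panics by unwrapping an `Err`, which also formats the error with `Debug`.
fn unwrap_err() {
    let r: Result<u32, String> = Err("boom".to_string());
    let _ = r.unwrap();
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn both_paths_unwind() {
        assert!(panic_unwinds(unwrap_none));
        assert!(panic_unwinds(unwrap_err));
    }
}
```

The panic messages still get printed to stderr by the default panic hook; that is expected and does not mean the test failed.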

This is reproducible for all versions of Rust for me, even after a toolchain upgrade and cargo clean.

It only seemed to start occurring recently. I'm not sure what caused this, maybe a system upgrade or an installation of a crate that may have pulled in a system dependency or something? The only notable ones that I recently installed were criterion/flamegraph.

After a very long session of trying to find out what the cause of this issue is, I believe I might have narrowed it down to an infinite loop inside the libunwind library. The backtrace of where I imagine this infinite loop occurs is here:
[image: screenshot of the backtrace]

However, I'm not sure if there is some kind of infinite loop detection, since the function does, in the end, return so that panicking.rs:756 can abort the program. Anyway, here is how the function behaves:

The function forEachImageTextSegment continues until +104, where it calls dyld4::APIs::findImageMappedAt. There, about 7 instructions get executed before the function jumps from +20 to +121 and subsequently returns. forEachImageTextSegment then proceeds from +108 to +124 and jumps back to +80, until the loop reaches +104 again.

image

Obviously this might be a completely wrong assessment on my part. I only noticed this because after minutes of stepping into calls there seemed to be no progress. I didn't inspect any registers or any data; this is just a superficial analysis.

However, I uploaded the "malformed" binary and could reproduce the issue in GitHub Actions (with the binary from my machine; I haven't even attempted to build it with GitHub Actions because I'm convinced this is a local issue).

Here it is, if someone cares to dive deep: https://github.com/muja/rust_unwind_error

The failing job can be seen in the Actions tab. Unfortunately it's not possible to make the logs public, but basically it's this:
[image: screenshot of the failing job's output]

@jerel

jerel commented Aug 24, 2023

I'm hitting this as well. I have a CI pipeline that has been working fine for many months and now fails across all recent Rust versions with this same error. No changes to Cargo.lock, no changes to the pipeline, no changes to the code.

This is on Ubuntu, so a macOS-specific cause can be ruled out. I've tried Ubuntu 18.04, 20.04, and 22.04 and Rust versions from 1.62 up to nightly.

jerel added a commit to jerel/membrane that referenced this issue Aug 28, 2023
jerel added a commit to jerel/membrane that referenced this issue Aug 29, 2023
@fmease fmease added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-libtest Area: `#[test]` / the `test` library A-panic Area: Panicking machinery and removed needs-triage-legacy labels Sep 8, 2023
@Velnbur

Velnbur commented Jan 20, 2024

I've had the same error when running cargo test as a shell command from Neovim installed with nixvim. If I exit Neovim and run it in the current shell, everything works fine.
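
Since the same command behaves differently in the two places, one thing that might help is dumping the environment variables that can affect compiler, linker, and library lookup from inside a test, once in each place, and diffing the output. This is only a sketch: the helper name `interesting_env` and the list of variable names are my own guesses at what is relevant, not something confirmed in this thread.

```rust
use std::collections::BTreeMap;

/// Variable names that commonly influence which compiler, linker, and
/// runtime libraries get picked up. This list is an assumption, not a
/// confirmed set of culprits.
const INTERESTING: &[&str] = &[
    "PATH",
    "CC",
    "CXX",
    "LD_LIBRARY_PATH",
    "DYLD_LIBRARY_PATH",
    "LIBRARY_PATH",
    "RUSTFLAGS",
];

/// Keeps only the variables named in `INTERESTING` plus anything starting
/// with `NIX_`, sorted by name so two dumps can be diffed line by line.
fn interesting_env<I>(vars: I) -> BTreeMap<String, String>
where
    I: IntoIterator<Item = (String, String)>,
{
    vars.into_iter()
        .filter(|(k, _)| INTERESTING.contains(&k.as_str()) || k.starts_with("NIX_"))
        .collect()
}

#[cfg(test)]
mod diagnostics {
    use super::*;

    // Run with `cargo test -- --nocapture` in both environments and diff.
    #[test]
    fn dump_env() {
        // Lossy conversion avoids a panic on non-UTF-8 values.
        let vars = std::env::vars_os().map(|(k, v)| {
            (
                k.to_string_lossy().into_owned(),
                v.to_string_lossy().into_owned(),
            )
        });
        for (k, v) in interesting_env(vars) {
            println!("{k}={v}");
        }
    }
}
```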
