Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backtraces broken on s390x #53372

Closed
cuviper opened this issue Aug 15, 2018 · 4 comments
Closed

Backtraces broken on s390x #53372

cuviper opened this issue Aug 15, 2018 · 4 comments
Labels
C-bug Category: This is a bug. O-SystemZ Target: SystemZ processors (s390x) regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@cuviper
Copy link
Member

cuviper commented Aug 15, 2018

run-pass/backtrace.rs and run-pass/backtrace-debuginfo.rs are both broken on s390x since Rust 1.28. When std is built without debuginfo, the tests hang with their backtracing processes stuck in a loop. With std debuginfo, it fails with <unknown> output like:

thread 'main' panicked at 'bad output: thread 'main' panicked at 'explicit panic', [...]/rust/src/test/run-pass/backtrace.rs:25:9
stack backtrace:
   0: rust_metadata_std_afee429493f25d7710e1d9899ebe9736
   1: rust_metadata_std_afee429493f25d7710e1d9899ebe9736
   2: rust_metadata_std_afee429493f25d7710e1d9899ebe9736
   3: rust_metadata_std_afee429493f25d7710e1d9899ebe9736
   4: std::panicking::rust_panic_with_hook
   5: <unknown>
   6: <unknown>
   7: <unknown>
   8: <unknown>
   9: rust_metadata_std_afee429493f25d7710e1d9899ebe9736
  10: rust_metadata_std_afee429493f25d7710e1d9899ebe9736
  11: __rust_maybe_catch_panic
  12: std::rt::lang_start_internal
  13: <unknown>
  14: __libc_start_main
  15: <unknown>
', [...]/rust/src/test/run-pass/backtrace.rs:60:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.

I believe the problem is this part of #50955:
https://github.com/rust-lang/rust/pull/50955/files#diff-df8879cba69175008878e97cee6d7d91R113
where target.contains("64") doesn't identify that s390x-unknown-linux-gnu is a 64-bit platform.

See also rust-lang/backtrace-rs#122. I plan to update that here when it's published, and make a similar fix in rust/src/libstd/build.rs.

@cuviper cuviper added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. C-bug Category: This is a bug. O-SystemZ Target: SystemZ processors (s390x) labels Aug 15, 2018
@cuviper
Copy link
Member Author

cuviper commented Aug 15, 2018

#53377 fixes the backtrace when std is built with debuginfo, but I'm still getting a hang without.

In GDB, it looks like the unwinder gets to the __rust_try that catches main panics, then loops unwinding that same PC forever. This part of the regression is older -- happens on 1.27.2 as well, but 1.26.2 is fine. Binaries from the older version don't appear to have a distinct __rust_try symbol at all, but maybe that's just inlined differently.

I'm bisecting for that hang now.

@cuviper
Copy link
Member Author

cuviper commented Aug 16, 2018

Bisecting pointed me at #50188 (cc @alexcrichton), which added target-feature attributes to all functions. For s390x the base features actually include a removal, "-vector". But it doesn't look like the features are applied to the synthesized __rust_try (intrinsics::try), and as a result it doesn't get inlined anymore since it has more features (w/ vector) than the callee.

  • We probably should apply target-features to __rust_try as well.
  • I don't know why the non-inlined __rust_try can't be unwound -- maybe we need to force uwtable?

By default GDB backtrace stops at crate main, but set backtrace past-main on tries to go further:

(gdb) bt
#0  0x000002aa00034064 in std::sys::unix::backtrace::tracing::imp::trace_fn ()
#1  0x000003fffdd0a0ac in _Unwind_Backtrace () from /lib64/libgcc_s.so.1
#2  0x000002aa00033fbc in std::sys::unix::backtrace::tracing::imp::unwind_backtrace ()
#3  0x000002aa00039872 in std::sys_common::backtrace::print ()
#4  0x000002aa00029252 in std::panicking::default_hook::{{closure}} ()
#5  0x000002aa00028ede in std::panicking::default_hook ()
#6  0x000002aa0002979c in std::panicking::rust_panic_with_hook ()
#7  0x000002aa000180c6 in std::panicking::begin_panic (msg=..., file_line_col=0x2aa00063d30) at /home/nfs/jistone/rust/src/libstd/panicking.rs:391
#8  0x000002aa0001d4e4 in backtrace::foo () at ./src/test/run-pass/backtrace.rs:25
#9  0x000002aa0001ec88 in backtrace::main () at ./src/test/run-pass/backtrace.rs:111
#10 0x000002aa00017f3c in std::rt::lang_start::{{closure}} () at /home/nfs/jistone/rust/src/libstd/rt.rs:74
#11 0x000002aa0002933a in std::panicking::try::do_call ()
#12 0x000002aa00042ddc in __rust_try ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Maybe Rust's trace_fn could have a similar defensive check, at least.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 16, 2018
This shim is generated elsewhere in the compiler so this commit adds support to
ensure it goes through similar paths as the rest of the compiler to set llvm
function attributes like target features.

cc rust-lang#53372
@alexcrichton
Copy link
Member

Nice find @cuviper! Definitely makes sense to me, and this should be fixed up (at least the target-feature business) in #53437

bors added a commit that referenced this issue Aug 18, 2018
…rister

Set more llvm function attributes for __rust_try

This shim is generated elsewhere in the compiler so this commit adds support to
ensure it goes through similar paths as the rest of the compiler to set llvm
function attributes like target features.

cc #53372
bors added a commit that referenced this issue Aug 18, 2018
std: stop backtracing when the frames are full

This is a defensive measure to mitigate the infinite unwind loop seen in #53372.  That case will still repeatedly unwind `__rust_try`, but now it will at least stop when `cx.frames` is full.

r? @alexcrichton
@cuviper
Copy link
Member Author

cuviper commented Aug 30, 2018

With #53377, #53436, and #53437, I believe we have this fixed.

@cuviper cuviper closed this as completed Aug 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. O-SystemZ Target: SystemZ processors (s390x) regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

2 participants