ISSUE-1083: panic handler for router crashes #1084

Merged
kgiusti merged 3 commits into skupperproject:main from ISSUE-1083 on May 22, 2023

@kgiusti kgiusti commented May 18, 2023

Part 1: very basic stack unwind + register dump to stderr on crash

To be done:

  • unit test!
  • display mappings for linked shared libraries (e.g. proton...)
  • document how to use the output for crash debug

Adds requirement for libunwind library.
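
The unwind step described above can be sketched roughly as follows. This is not the code from the PR: it swaps libunwind for glibc's `backtrace()` and `backtrace_symbols_fd()` from `<execinfo.h>` so that it builds without extra libraries, and it prints no registers. The function name `dump_backtrace` and the banner text are made up for illustration.

```c
#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd(): glibc, not libunwind */
#include <unistd.h>

#define MAX_FRAMES 64

/* Hypothetical helper: write the caller's stack to fd and return the
 * number of frames captured. backtrace_symbols_fd() is used instead of
 * backtrace_symbols() because it does not allocate the result with malloc(). */
int dump_backtrace(int fd)
{
    static const char banner[] = "Backtrace:\n";
    void *frames[MAX_FRAMES];

    int count = backtrace(frames, MAX_FRAMES);

    ssize_t ignored = write(fd, banner, sizeof(banner) - 1);
    (void) ignored;  /* nothing useful can be done if stderr is gone */

    backtrace_symbols_fd(frames, count, fd);
    return count;
}
```

A crash handler would call `dump_backtrace(STDERR_FILENO)`. The glibc manual page notes that the first call to `backtrace()` may load libgcc dynamically, so a program that relies on it inside a signal handler may want to call it once at startup.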

@kgiusti kgiusti requested review from ssorj and ganeshmurthy May 18, 2023 18:21
codecov bot commented May 18, 2023

Codecov Report

❗ No coverage uploaded for pull request base (main@35c2d47).
The diff coverage is 34.83%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1084   +/-   ##
=======================================
  Coverage        ?   78.62%           
=======================================
  Files           ?      237           
  Lines           ?    60042           
  Branches        ?     5607           
=======================================
  Hits            ?    47206           
  Misses          ?    10220           
  Partials        ?     2616           
Flag            Coverage Δ
pysystemtests   87.31% <ø> (?)
systemtests     71.87% <34.83%> (?)

Flags with carried forward coverage won't be shown.

Components      Coverage Δ
calculator      78.62% <34.83%> (?)
systemtests     78.62% <34.83%> (?)

{
if (getenv("SKUPPER_ROUTER_DISABLE_PANIC_HANDLER") == 0) {
struct sigaction sa = {
// use SA_RESETHAND since if the stack unwind fails the default signal handler (coredump) will be invoked
The comment is incorrect, no? The default signal handler will be invoked at the end regardless of whether the stack unwind failed. That is the desired behavior: whenever a coredump is available, we get it in addition to the stack unwind.

@kgiusti kgiusti May 22, 2023

Good catch - removed.

@kgiusti kgiusti linked an issue May 22, 2023 that may be closed by this pull request
kgiusti commented May 22, 2023

Sample output:

`*** SKUPPER-ROUTER FATAL ERROR ***
Version: 2.1.0-201-gc283d48e-modified
Signal: 11 SIGSEGV
Process ID: 22265 (skrouterd)
Thread ID: 22267 (core_thread)

Backtrace:
[0] IP: 0x00007fabbc85fb20 (/lib64/libc.so.6 + 0x000000000003cb20)
Registers:
RAX: 0x00007fabac2df030 RDI: 0x00007fabac2df030 R11: 0x00007fabbe076c88
RBX: 0x00007fabac2e01d0 RBP: 0x00007fabac2e0240 R12: 0x000000000000000b
RCX: 0x0000000000000000 R8: 0x0000000000000000 R13: 0x00000ff57585c03a
RDX: 0x0000000000000000 R9: 0x00007fabac2dfccf R14: 0x0000000000000001
RSI: 0x0000000000000000 R10: 0x0000000000000001 R15: 0x00007fabac2e01d0
SP: 0x00007fabac2df4c0

[1] IP: 0x00000000006e2e6b (skrouterd + 0x00000000006e2e6b)
Registers:
RAX: 0x00000000deadbeef RDI: 0x00000000006e2e59 R11: 0x0000000000000206
RBX: 0x00007fabac2e01d0 RBP: 0x00007fabac2e0240 R12: 0x000000000000000b
RCX: 0x00007fabbca1aa7b R8: 0x0000000000000001 R13: 0x00000ff57585c03a
RDX: 0x000000009bd537dd R9: 0x00007fabac2dfccf R14: 0x0000000000000001
RSI: 0x00007fabac2e0180 R10: 0x0000000000000046 R15: 0x00007fabac2e01d0
SP: 0x00007fabac2e0190

[2] IP: 0x00000000006b6072 (skrouterd + 0x00000000006b6072)
Registers:
RAX: 0x00000000deadbeef RDI: 0x00000000006e2e59 R11: 0x0000000000000206
RBX: 0x00007fabac2e02f0 RBP: 0x00007fabac2e0320 R12: 0x00007fabac2e0290
RCX: 0x00007fabbca1aa7b R8: 0x0000000000000001 R13: 0x00000ff57585c052
RDX: 0x000000009bd537dd R9: 0x00007fabac2dfccf R14: 0x0000000000000001
RSI: 0x00007fabac2e0180 R10: 0x0000000000000046 R15: 0x00000000006e2be0
SP: 0x00007fabac2e0250

[3] IP: 0x00000000005dfb05 (skrouterd + 0x00000000005dfb05)
Registers:
RAX: 0x00000000deadbeef RDI: 0x00000000006e2e59 R11: 0x0000000000000206
RBX: 0x0000604000005210 RBP: 0x00007fabac2e0350 R12: 0x00000000006b454e
RCX: 0x00007fabbca1aa7b R8: 0x0000000000000001 R13: 0x000000000000000b
RDX: 0x000000009bd537dd R9: 0x00007fabac2dfccf R14: 0x00007ffda9545810
RSI: 0x00007fabac2e0180 R10: 0x0000000000000046 R15: 0x00007fabab952000
SP: 0x00007fabac2e0330

[4] IP: 0x00007fabbc8ae12d (/lib64/libc.so.6 + 0x000000000008b12d)
Registers:
RAX: 0x00000000deadbeef RDI: 0x00000000006e2e59 R11: 0x0000000000000206
RBX: 0x00007fabac2e16c0 RBP: 0x0000000000000000 R12: 0xfffffffffffff428
RCX: 0x00007fabbca1aa7b R8: 0x0000000000000001 R13: 0x000000000000000b
RDX: 0x000000009bd537dd R9: 0x00007fabac2dfccf R14: 0x00007ffda9545810
RSI: 0x00007fabac2e0180 R10: 0x0000000000000046 R15: 0x00007fabab952000
SP: 0x00007fabac2e0360

[5] IP: 0x00007fabbc92fbc0 (/lib64/libc.so.6 + 0x000000000010cbc0)
Registers:
RAX: 0x00000000deadbeef RDI: 0x00000000006e2e59 R11: 0x0000000000000206
RBX: 0x00007fabbc8ade60 RBP: 0x0000000000000000 R12: 0xfffffffffffff428
RCX: 0x00007fabbca1aa7b R8: 0x0000000000000001 R13: 0x000000000000000b
RDX: 0x000000009bd537dd R9: 0x00007fabac2dfccf R14: 0x00007ffda9545810
RSI: 0x00007fabac2e0180 R10: 0x0000000000000046 R15: 0x00007fabab952000
SP: 0x00007fabac2e0400

*** END ***
Segmentation fault (core dumped)
`
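
One possible way to consume this output, sketched here as an assumption since documenting the debug workflow is still listed as to-do, is to pull the module name and offset out of each `[N] IP: ...` line. The struct `frame_info` and the function `parse_frame_line` are hypothetical names and are not part of skupper-router.

```c
#include <stdio.h>

/* One parsed "[N] IP: <address> (<module> + <offset>)" line. */
struct frame_info {
    unsigned index;             /* frame number in the backtrace */
    unsigned long long ip;      /* absolute instruction pointer */
    char module[256];           /* executable or shared library name */
    unsigned long long offset;  /* offset reported next to the module */
};

/* Returns 0 on success, -1 if the line is not a frame header line. */
int parse_frame_line(const char *line, struct frame_info *out)
{
    /* %llx accepts the leading "0x"; %255s stops at the space before '+'. */
    int matched = sscanf(line, "[%u] IP: %llx (%255s + %llx)",
                         &out->index, &out->ip, out->module, &out->offset);
    return matched == 4 ? 0 : -1;
}
```

Given a matching binary with debug information, the parsed offset could then be handed to a symbolizer, for example `addr2line -e skrouterd 0x6e2e6b` for frame 1 above. Whether the offsets printed for shared libraries can be used the same way depends on how the handler computes them, which this page does not state.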

@kgiusti kgiusti merged commit fcee563 into skupperproject:main May 22, 2023
@kgiusti kgiusti deleted the ISSUE-1083 branch May 22, 2023 17:01
ganeshmurthy pushed a commit that referenced this pull request May 22, 2023
Very basic stack unwind + register dump to stderr on crash

(cherry picked from commit fcee563)
jiridanek pushed a commit to jiridanek/skupper-router that referenced this pull request Jun 3, 2023
Very basic stack unwind + register dump to stderr on crash
@@ -846,7 +846,7 @@ jobs:
dnf config-manager --set-enabled powertools
dnf install --setopt=tsflags=nodocs --setopt=install_weak_deps=False -y epel-release 'dnf-command(copr)' 'dnf-command(builddep)'
dnf copr enable -y clime/rpkg-util
-dnf install --setopt=tsflags=nodocs --setopt=install_weak_deps=False -y git rpkg
+dnf install --setopt=tsflags=nodocs --setopt=install_weak_deps=False -y git rpkg libunwind-devel
libunwind-devel here is redundant, it is installed by dnf builddep when it scans the spec file.

Successfully merging this pull request may close these issues.

New feature: panic handler