Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logrotate issues since 2.10.1-1 #6737

Closed
sschindel opened this issue Oct 29, 2018 · 59 comments · Fixed by #7004
Closed

Logrotate issues since 2.10.1-1 #6737

sschindel opened this issue Oct 29, 2018 · 59 comments · Fixed by #7004
Assignees
Labels
core/crash Shouldn't happen, requires attention ref/IP
Milestone

Comments

@sschindel
Copy link

Hey!

Since upgrading to 2.10.1-1 we face some issues with the logrotating part. The daemon sometime crashes after the logrotate run, this did not happen before. This is the crash report:

cat /var/log/icinga2/crash/report.1540790715.016844

Application version: r2.10.1-1

System information:
Platform: Debian GNU/Linux
Platform version: 9 (stretch)
Kernel: Linux
Kernel version: 4.9.0-6-amd64
Architecture: x86_64

Build information:
Compiler: GNU 6.3.0
Build host: 1880c6977367

Application information:

General paths:
Config directory: /etc/icinga2
Data directory: /var/lib/icinga2
Log directory: /var/log/icinga2
Cache directory: /var/cache/icinga2
Spool directory: /var/spool/icinga2
Run directory: /run/icinga2

Old paths (deprecated):
Installation root: /usr
Sysconf directory: /etc
Run directory (base): /run
Local state directory: /var

Internal paths:
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
Stacktrace:

(0) libc.so.6: gsignal (+0xcf) [0x7fa1362c5fff]
(1) libc.so.6: abort (+0x16a) [0x7fa1362c742a]
(2) libc.so.6: <unknown function> (+0x70c00) [0x7fa136303c00]
(3) libc.so.6: <unknown function> (+0x76fc6) [0x7fa136309fc6]
(4) libc.so.6: <unknown function> (+0x7780e) [0x7fa13630a80e]
(5) icinga2: std::_Function_base::_Base_manager<std::_Bind<std::_Mem_fn<void (icinga::Timer::*)()> (boost::intrusive_ptr<icinga::Timer>)> >::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation) (+0x38) [0x563e339d0358]
(6) icinga2: icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) (+0x4ca) [0x563e339c116a]
(7) libboost_thread.so.1.62.0: <unknown function> (+0x12116) [0x7fa138c1f116]
(8) libpthread.so.0: <unknown function> (+0x7494) [0x7fa137c41494]
(9) libc.so.6: clone (+0x3f) [0x7fa13637bacf]

  • This would indicate a runtime problem or configuration error. If you believe this is a bug in Icinga 2
  • please submit a bug report at https://github.com/Icinga/icinga2 and include this stack trace as well as any other
  • information that might be useful in order to reproduce this problem.

Failed to launch GDB: No such file or directory

@Crunsher
Copy link
Contributor

A description of the circumstances to reproduce this is required. You say it happens during log rotate, what makes you think that?

The log around this time would also be very welcome.

@Crunsher Crunsher added the needs feedback We'll only proceed once we hear from you again label Oct 29, 2018
@sschindel
Copy link
Author

We've got a stock Debian cron and logrotate setup, around 200 Hosts and randomly one or two of those hosts has this issue. Logrotate in in cron.daily, which gets executed at 06:25:

25 6 * * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )

/var/log/icinga2# ls -l
total 1624
drwxr-s--x 3 nagios adm 4096 Mar 7 2017 compat
drwxr-s--x 2 nagios adm 4096 Oct 29 06:25 crash
-rw------- 1 nagios adm 0 Mar 7 2017 icinga2.err
-rw-r--r-- 1 nagios nagios 57755 Oct 29 10:35 icinga2.log
-rw-r--r-- 1 nagios nagios 1301697 Oct 29 06:25 icinga2.log.1
-rw-r--r-- 1 nagios nagios 48305 Oct 28 06:25 icinga2.log.2.gz
-rw-r--r-- 1 nagios nagios 42199 Oct 27 06:25 icinga2.log.3.gz
-rw-r--r-- 1 nagios nagios 42196 Oct 26 06:25 icinga2.log.4.gz
-rw-r--r-- 1 nagios nagios 42458 Oct 25 06:25 icinga2.log.5.gz
-rw-r--r-- 1 nagios nagios 41915 Oct 24 06:25 icinga2.log.6.gz
-rw-r--r-- 1 nagios nagios 43024 Oct 23 06:25 icinga2.log.7.gz
-rw-r--r-- 1 root adm 6361 Mar 7 2017 startup.log

The crash report is from that time:

stat /var/log/icinga2/crash/report.1540790715.016844

File: /var/log/icinga2/crash/report.1540790715.016844
Size: 2148 Blocks: 8 IO Block: 4096 regular file
Device: 802h/2050d Inode: 1711371 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 82/ nagios) Gid: ( 4/ adm)
Access: 2018-10-29 10:03:38.490438878 +0100
Modify: 2018-10-29 06:25:15.105257350 +0100
Change: 2018-10-29 06:25:15.105257350 +0100
Birth: -

And the last entry of the log files are:

[2018-10-29 06:23:17 +0100] information/WorkQueue: #14 (JsonRpcConnection, #5) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2018-10-29 06:23:25 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (30/min 150/5min 450/15min);
[2018-10-29 06:23:26 +0100] information/WorkQueue: #15 (JsonRpcConnection, #6) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2018-10-29 06:23:35 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (18/min 90/5min 270/15min);
[2018-10-29 06:23:36 +0100] information/WorkQueue: #9 (JsonRpcConnection, #0) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2018-10-29 06:23:37 +0100] information/WorkQueue: #6 (ApiListener, SyncQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2018-10-29 06:23:45 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (12/min 60/5min 180/15min);
[2018-10-29 06:23:45 +0100] information/WorkQueue: #13 (JsonRpcConnection, #4) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2018-10-29 06:23:55 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:23:57 +0100] information/WorkQueue: #5 (ApiListener, RelayQueue) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2018-10-29 06:24:05 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (30/min 150/5min 450/15min);
[2018-10-29 06:24:06 +0100] information/WorkQueue: #11 (JsonRpcConnection, #2) items: 0, rate: 1.33333/s (80/min 400/5min 1200/15min);
[2018-10-29 06:24:07 +0100] information/WorkQueue: #12 (JsonRpcConnection, #3) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2018-10-29 06:24:15 +0100] information/WorkQueue: #16 (JsonRpcConnection, #7) items: 0, rate: 0/s (0/min 0/5min 0/15min);
[2018-10-29 06:24:15 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (36/min 180/5min 540/15min);
[2018-10-29 06:24:25 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (30/min 150/5min 450/15min);
[2018-10-29 06:24:35 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (18/min 90/5min 270/15min);
[2018-10-29 06:24:45 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (12/min 60/5min 180/15min);
[2018-10-29 06:24:55 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:25:05 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (30/min 150/5min 450/15min);

Do you need any further information?

@fladna9
Copy link

fladna9 commented Oct 30, 2018

I think i got the same problem on a machine i host.

root@log:/home/user# cat /var/log/icinga2/crash/report.1540790722.611491
  Application version: r2.10.1-1

System information:
  Platform: Debian GNU/Linux
  Platform version: 9 (stretch)
  Kernel: Linux
  Kernel version: 4.9.0-7-amd64
  Architecture: x86_64

Build information:
  Compiler: GNU 6.3.0
  Build host: 1880c6977367

Application information:

General paths:
  Config directory: /etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /run/icinga2

Old paths (deprecated):
  Installation root: /usr
  Sysconf directory: /etc
  Run directory (base): /run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid
Stacktrace:

	(0) libc.so.6: gsignal (+0xcf) [0x7f3a80164fff]
	(1) libc.so.6: abort (+0x16a) [0x7f3a8016642a]
	(2) libc.so.6: <unknown function> (+0x70c00) [0x7f3a801a2c00]
	(3) libc.so.6: <unknown function> (+0x76fc6) [0x7f3a801a8fc6]
	(4) libc.so.6: <unknown function> (+0x7780e) [0x7f3a801a980e]
	(5) icinga2: icinga::StreamLogger::BindStream(std::ostream*, bool) (+0xa6) [0x55f96871f5c6]
	(6) icinga2: icinga::FileLogger::ReopenLogFile() (+0xaa) [0x55f9686d5f0a]
	(7) icinga2: icinga::Application::RunEventLoop() (+0x7c2) [0x55f9686b33a2]
	(8) icinga2: icinga::IcingaApplication::Main() (+0x1f4) [0x55f9688e7974]
	(9) icinga2: icinga::Application::Run() (+0xc1) [0x55f9686b1351]
	(10) icinga2: icinga::DaemonCommand::Run(boost::program_options::variables_map const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) const (+0xbc9) [0x55f968860259]
	(11) icinga2: <unknown function> (+0x266ef6) [0x55f9686b6ef6]
	(12) icinga2: main (+0x49) [0x55f96865b299]
	(13) libc.so.6: __libc_start_main (+0xf1) [0x7f3a801522e1]
	(14) icinga2: _start (+0x2a) [0x55f96867881a]

***
* This would indicate a runtime problem or configuration error. If you believe this is a bug in Icinga 2
* please submit a bug report at https://github.com/Icinga/icinga2 and include this stack trace as well as any other
* information that might be useful in order to reproduce this problem.
***

Failed to launch GDB: No such file or directory
root@log:/home/user# stat /var/log/icinga2/crash/report.1540790722.611491
  File: /var/log/icinga2/crash/report.1540790722.611491
  Size: 2517      	Blocks: 8          IO Block: 4096   regular file
Device: 801h/2049d	Inode: 12323212    Links: 1
Access: (0644/-rw-r--r--)  Uid: (  112/  nagios)   Gid: (    4/     adm)
Access: 2018-10-30 07:51:47.899211999 +0100
Modify: 2018-10-29 06:25:23.771421824 +0100
Change: 2018-10-29 06:25:23.771421824 +0100
 Birth: -

Last log lines:

[2018-10-29 06:20:32 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:21:02 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:21:22 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:21:32 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:22:02 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:22:22 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:22:32 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:23:02 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:23:22 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:23:32 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:24:02 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:24:22 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:24:22 +0100] information/WorkQueue: #10 (JsonRpcConnection, #1) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-10-29 06:24:23 +0100] information/WorkQueue: #5 (ApiListener, RelayQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-10-29 06:24:32 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:24:52 +0100] information/WorkQueue: #14 (JsonRpcConnection, #5) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-10-29 06:24:52 +0100] information/WorkQueue: #9 (JsonRpcConnection, #0) items: 0, rate: 0.3/s (18/min 90/5min 270/15min);
[2018-10-29 06:24:52 +0100] information/WorkQueue: #13 (JsonRpcConnection, #4) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-10-29 06:24:53 +0100] information/WorkQueue: #15 (JsonRpcConnection, #6) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-10-29 06:25:02 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-10-29 06:25:02 +0100] information/WorkQueue: #11 (JsonRpcConnection, #2) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-10-29 06:25:12 +0100] information/WorkQueue: #12 (JsonRpcConnection, #3) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-10-29 06:25:19 +0100] information/ConfigObject: Dumping program state to file '/var/lib/icinga2/icinga2.state'
[2018-10-29 06:25:23 +0100] information/WorkQueue: #16 (JsonRpcConnection, #7) items: 0, rate:  0/s (0/min 0/5min 0/15min);

I happens randomly on other machine, just like @sschindel, but the stacktrace is not exactly the same... Can provide more info if it can help.

Oh, and about configuration, nothing has been manually changed. Only launched a # icinga2 node wizard to connect to central Icinga2 monitoring system.

Thank you very much.

Edit: High probability this is a double of #6736

@dnsmichi
Copy link
Contributor

Are you using SysVinit or Systemd here? I'm thinking about a relationship to #6356.

@dnsmichi dnsmichi added the core/crash Shouldn't happen, requires attention label Oct 30, 2018
@dnsmichi
Copy link
Contributor

If it is just Debian 9, #6571 could come into play as well.

@fladna9
Copy link

fladna9 commented Oct 30, 2018

@dnsmichi : Standard Debian Stretch install, so systemd :-)
I also have 3 or 4 Ubuntu VMs using systemd as well (compared to about 30+ Debian Stretch machines), but no crash so far.

Hope it helps.

@sschindel
Copy link
Author

Hi,

we are also using Debian Stretch and had no issues with 2.9.2-1. This issue only appeared since upgrading to 2.10.1-1. We are using the packages from packges.icinga.com:

deb http://packages.icinga.com/debian icinga-stretch main

@fladna9
Copy link

fladna9 commented Nov 6, 2018

I'm still using stable deb source.list, so still using the sometimes-crashing-on-my-vm version.

It did not happen for a week, but then it happened today again when logrotate was called again.
My Icinga2 (slave) node is out again. Strange thing: it is always the same machine.

We host about 40 VMs and hosts (35 Debian Stretch (Systemd), 1 Debian Jessie (Systemd), rest is Ubuntu (Systemd)), and only 1 is posing problems.

Exactly the same time as previously (6:25 AM CET, that is to say 5:25 AM UTC, some machines aren't on the same timezone, but i've taken care to take the correct log lines).

Here is my syslog:
image

Here is crash report:

root# cat /var/log/icinga2/crash/report.1541481925.521074
  Application version: r2.10.1-1

System information:
  Platform: Debian GNU/Linux
  Platform version: 9 (stretch)
  Kernel: Linux
  Kernel version: 4.9.0-7-amd64
  Architecture: x86_64

Build information:
  Compiler: GNU 6.3.0
  Build host: 1880c6977367

Application information:

General paths:
  Config directory: /etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /run/icinga2

Old paths (deprecated):
  Installation root: /usr
  Sysconf directory: /etc
  Run directory (base): /run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid
Stacktrace:

	(0) libc.so.6: gsignal (+0xcf) [0x7fc551ebdfff]
	(1) libc.so.6: abort (+0x16a) [0x7fc551ebf42a]
	(2) libc.so.6: <unknown function> (+0x70c00) [0x7fc551efbc00]
	(3) libc.so.6: <unknown function> (+0x76fc6) [0x7fc551f01fc6]
	(4) libc.so.6: <unknown function> (+0x7780e) [0x7fc551f0280e]
	(5) icinga2: icinga::StreamLogger::BindStream(std::ostream*, bool) (+0xa6) [0x562c3d7c25c6]
	(6) icinga2: icinga::FileLogger::ReopenLogFile() (+0xaa) [0x562c3d778f0a]
	(7) icinga2: icinga::Application::RunEventLoop() (+0x7c2) [0x562c3d7563a2]
	(8) icinga2: icinga::IcingaApplication::Main() (+0x1f4) [0x562c3d98a974]
	(9) icinga2: icinga::Application::Run() (+0xc1) [0x562c3d754351]
	(10) icinga2: icinga::DaemonCommand::Run(boost::program_options::variables_map const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) const (+0xbc9) [0x562c3d903259]
	(11) icinga2: <unknown function> (+0x266ef6) [0x562c3d759ef6]
	(12) icinga2: main (+0x49) [0x562c3d6fe299]
	(13) libc.so.6: __libc_start_main (+0xf1) [0x7fc551eab2e1]
	(14) icinga2: _start (+0x2a) [0x562c3d71b81a]

***
* This would indicate a runtime problem or configuration error. If you believe this is a bug in Icinga 2
* please submit a bug report at https://github.com/Icinga/icinga2 and include this stack trace as well as any other
* information that might be useful in order to reproduce this problem.
***

Failed to launch GDB: No such file or directory

Here are the last /var/log/icinga2/icinga2.log.1file corresponding to before the crash.

[2018-11-06 06:20:18 +0100] information/WorkQueue: #14 (JsonRpcConnection, #5) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-06 06:20:36 +0100] information/WorkQueue: #16 (JsonRpcConnection, #7) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-06 06:20:38 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:20:56 +0100] information/WorkQueue: #11 (JsonRpcConnection, #2) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-06 06:20:56 +0100] information/WorkQueue: #15 (JsonRpcConnection, #6) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-06 06:20:58 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:21:18 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:21:36 +0100] information/WorkQueue: #12 (JsonRpcConnection, #3) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-06 06:21:38 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:21:56 +0100] information/WorkQueue: #6 (ApiListener, SyncQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-06 06:21:58 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:22:06 +0100] information/WorkQueue: #10 (JsonRpcConnection, #1) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-06 06:22:16 +0100] information/WorkQueue: #9 (JsonRpcConnection, #0) items: 0, rate: 0.3/s (18/min 90/5min 270/15min);
[2018-11-06 06:22:18 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:22:22 +0100] information/ConfigObject: Dumping program state to file '/var/lib/icinga2/icinga2.state'
[2018-11-06 06:22:28 +0100] information/WorkQueue: #13 (JsonRpcConnection, #4) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-06 06:22:38 +0100] information/WorkQueue: #5 (ApiListener, RelayQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-06 06:22:38 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:22:58 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:23:18 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:23:38 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:23:58 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:24:18 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:24:38 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:24:58 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-06 06:25:18 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);

Here is the systemctl status icinga2 result

● icinga2.service - Icinga host/service/network monitoring system
   Loaded: loaded (/lib/systemd/system/icinga2.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/icinga2.service.d
           └─limits.conf
   Active: failed (Result: signal) since Tue 2018-11-06 06:25:25 CET; 2h 0min ago
  Process: 9187 ExecStart=/usr/sbin/icinga2 daemon --close-stdio -e ${ICINGA2_ERROR_LOG} (code=killed, signal=ABRT)
  Process: 9177 ExecStartPre=/usr/lib/icinga2/prepare-dirs /etc/default/icinga2 (code=exited, status=0/SUCCESS)
 Main PID: 9187 (code=killed, signal=ABRT)

Nov 06 06:25:25 log systemd[1]: icinga2.service: Main process exited, code=killed, status=6/ABRT
Nov 06 06:25:25 log systemd[1]: icinga2.service: Unit entered failed state.
Nov 06 06:25:25 log systemd[1]: icinga2.service: Failed with result 'signal'.
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

Here is the /etc/logrotate.d/icinga2 file.

root# cat /etc/logrotate.d/icinga2
/var/log/icinga2/icinga2.log /var/log/icinga2/debug.log {
	daily
	rotate 7
	su nagios nagios
	compress
	delaycompress
	missingok
	notifempty
	create 644 nagios nagios
	postrotate
		/bin/kill -USR1 $(cat /run/icinga2/icinga2.pid 2> /dev/null) 2> /dev/null || true
	endscript
}

/var/log/icinga2/error.log {
	daily
	rotate 90
	su nagios nagios
	compress
	delaycompress
	missingok
	notifempty
	create 644 nagios nagios
	# TODO: figure out how to get Icinga to re-open this log file
}

Hope it helps.

To dev staff: Here to help/test things if needed ;-)

Edit: added systemctl status icinga2 output

@dnsmichi
Copy link
Contributor

dnsmichi commented Nov 6, 2018

This looks like a wrong file descriptor handle, i.e. logrotate moves something away where Icinga still wants to write to. I suspect the close-stdio changes and flushing the streams to cause trouble here.

If anyone has time to lookup the code changes, I‘m out for OSMC this week.

@NielsH
Copy link

NielsH commented Nov 7, 2018

Hi,

I also have this issue, please see my crash log as well. This also includes the gdb output:

 Application version: r2.10.1-1

System information:
  Platform: Debian GNU/Linux
  Platform version: 9 (stretch)
  Kernel: Linux
  Kernel version: 4.9.0-8-amd64
  Architecture: x86_64

Build information:
  Compiler: GNU 6.3.0
  Build host: 1880c6977367

Application information:

General paths:
  Config directory: /etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /run/icinga2

Old paths (deprecated):
  Installation root: /usr
  Sysconf directory: /etc
  Run directory (base): /run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid
Stacktrace:

	(0) libc.so.6: gsignal (+0xcf) [0x7fbf8b938fff]
	(1) libc.so.6: abort (+0x16a) [0x7fbf8b93a42a]
	(2) libc.so.6: <unknown function> (+0x70c00) [0x7fbf8b976c00]
	(3) libc.so.6: <unknown function> (+0x76fc6) [0x7fbf8b97cfc6]
	(4) libc.so.6: <unknown function> (+0x7780e) [0x7fbf8b97d80e]
	(5) icinga2: icinga::StreamLogger::BindStream(std::ostream*, bool) (+0xa6) [0x560aed2475c6]
	(6) icinga2: icinga::FileLogger::ReopenLogFile() (+0xaa) [0x560aed1fdf0a]
	(7) icinga2: icinga::Application::RunEventLoop() (+0x7c2) [0x560aed1db3a2]
	(8) icinga2: icinga::IcingaApplication::Main() (+0x1f4) [0x560aed40f974]
	(9) icinga2: icinga::Application::Run() (+0xc1) [0x560aed1d9351]
	(10) icinga2: icinga::DaemonCommand::Run(boost::program_options::variables_map const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) const (+0xbc9) [0x560aed388259]
	(11) icinga2: <unknown function> (+0x266ef6) [0x560aed1deef6]
	(12) icinga2: main (+0x49) [0x560aed183299]
	(13) libc.so.6: __libc_start_main (+0xf1) [0x7fbf8b9262e1]
	(14) icinga2: _start (+0x2a) [0x560aed1a081a]

***
* This would indicate a runtime problem or configuration error. If you believe this is a bug in Icinga 2
* please submit a bug report at https://github.com/Icinga/icinga2 and include this stack trace as well as any other
* information that might be useful in order to reproduce this problem.
***

[New LWP 31300]
[New LWP 31301]
[New LWP 31302]
[New LWP 31303]
[New LWP 31304]
[New LWP 31316]
[New LWP 31326]
[New LWP 31327]
[New LWP 31353]
[New LWP 31354]
[New LWP 31355]
[New LWP 31356]
[New LWP 31357]
[New LWP 31358]
[New LWP 31359]
[New LWP 31360]
[New LWP 31361]
[New LWP 31362]
[New LWP 31363]
[New LWP 31364]
[New LWP 31365]
[New LWP 31366]
[New LWP 31367]
[New LWP 31368]
[New LWP 31420]
[New LWP 31485]
[New LWP 31486]
[New LWP 31487]
[New LWP 31488]
[New LWP 6596]
[New LWP 20511]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fbf8d2bdb3a in __waitpid (pid=15301, stat_loc=0x7ffca6d8b7bc, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29
29	../sysdeps/unix/sysv/linux/waitpid.c: No such file or directory.

Thread 32 (Thread 0x7fbf73910700 (LWP 20511)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x0000560aed1c866b in boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) ()
No symbol table info available.
#2  0x0000560aed243f97 in icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) ()
No symbol table info available.
#3  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#4  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73910700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73910700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460254365440, 4463560644109586558, 0, 140460254893855, 140460254365440, 140460061493968, -4427276064668987266, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 31 (Thread 0x7fbf7388e700 (LWP 6596)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x0000560aed1c866b in boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) ()
No symbol table info available.
#2  0x0000560aed243f97 in icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) ()
No symbol table info available.
#3  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#4  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf7388e700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf7388e700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460253832960, 4463560644109586558, 0, 140460254893855, 140460253832960, 140460061493184, -4427276132314722178, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 30 (Thread 0x7fbf73339700 (LWP 31488)):
#0  0x00007fbf8b9e567d in poll () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed2111ae in icinga::Process::IOThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73339700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73339700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460248241920, 4463560644109586558, 0, 140460253818223, 0, 140460707086400, -4427277410604363650, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 29 (Thread 0x7fbf7337a700 (LWP 31487)):
#0  0x00007fbf8b9e567d in poll () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed2111ae in icinga::Process::IOThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf7337a700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf7337a700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460248508160, 4463560644109586558, 0, 140460253818223, 0, 140460707086400, -4427277375707754370, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 28 (Thread 0x7fbf733bb700 (LWP 31486)):
#0  0x00007fbf8b9e567d in poll () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed2111ae in icinga::Process::IOThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf733bb700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf733bb700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460248774400, 4463560644109586558, 0, 140460253818223, 0, 140460707086400, -4427277342958628738, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 27 (Thread 0x7fbf733fc700 (LWP 31485)):
#0  0x00007fbf8b9e567d in poll () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed2111ae in icinga::Process::IOThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf733fc700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf733fc700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460249040640, 4463560644109586558, 0, 140460253818223, 0, 140460707086400, -4427277308062019458, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 26 (Thread 0x7fbf7343d700 (LWP 31420)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
No locals.
#1  0x00007fbf8d2b6c06 in __GI___pthread_mutex_lock (mutex=0x560aeed05d10) at ../nptl/pthread_mutex_lock.c:115
        id = -1926508708
        __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
        type = 1
        id = <optimized out>
#2  0x0000560aed1ee5f5 in icinga::ObjectLock::LockMutex(icinga::Object const*) ()
No symbol table info available.
#3  0x0000560aed1fb388 in icinga::Log::~Log() ()
No symbol table info available.
#4  0x0000560aed3179c8 in icinga::JsonRpcConnection::MessageHandler(icinga::String const&) ()
No symbol table info available.
#5  0x0000560aed31877b in icinga::JsonRpcConnection::MessageHandlerWrapper(icinga::String const&) ()
No symbol table info available.
#6  0x0000560aed26213e in icinga::WorkQueue::RunTaskFunction(std::function<void ()> const&) ()
No symbol table info available.
#7  0x0000560aed26ac85 in icinga::WorkQueue::WorkerThreadProc() ()
No symbol table info available.
#8  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#9  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf7343d700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf7343d700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460249306880, 4463560644109586558, 0, 140460253820031, 0, 140460707086400, -4427277825068707714, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#10 0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 25 (Thread 0x7fbf7347e700 (LWP 31368)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x0000560aed1c866b in boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) ()
No symbol table info available.
#2  0x0000560aed26aebb in icinga::WorkQueue::WorkerThreadProc() ()
No symbol table info available.
#3  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#4  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf7347e700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf7347e700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460249573120, 4463560644109586558, 0, 140460253820927, 0, 140460707086400, -4427277790172098434, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 24 (Thread 0x7fbf734bf700 (LWP 31367)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x0000560aed1c866b in boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) ()
No symbol table info available.
#2  0x0000560aed26aebb in icinga::WorkQueue::WorkerThreadProc() ()
No symbol table info available.
#3  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#4  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf734bf700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf734bf700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460249839360, 4463560644109586558, 0, 140460253820927, 0, 140460707086400, -4427277757422972802, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 23 (Thread 0x7fbf73500700 (LWP 31366)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x0000560aed1c866b in boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) ()
No symbol table info available.
#2  0x0000560aed26aebb in icinga::WorkQueue::WorkerThreadProc() ()
No symbol table info available.
#3  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#4  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73500700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73500700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460250105600, 4463560644109586558, 0, 140460253820927, 0, 140460707086400, -4427277722526363522, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 22 (Thread 0x7fbf73541700 (LWP 31365)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x0000560aed1c866b in boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) ()
No symbol table info available.
#2  0x0000560aed26aebb in icinga::WorkQueue::WorkerThreadProc() ()
No symbol table info available.
#3  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#4  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73541700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73541700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460250371840, 4463560644109586558, 0, 140460253820927, 0, 140460707086400, -4427277672597368706, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 21 (Thread 0x7fbf73582700 (LWP 31364)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
No locals.
#1  0x00007fbf8d2b6c06 in __GI___pthread_mutex_lock (mutex=0x560aeed05d10) at ../nptl/pthread_mutex_lock.c:115
        id = -1926508708
        __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
        type = 1
        id = <optimized out>
#2  0x0000560aed1ee5f5 in icinga::ObjectLock::LockMutex(icinga::Object const*) ()
No symbol table info available.
#3  0x0000560aed1fb388 in icinga::Log::~Log() ()
No symbol table info available.
#4  0x0000560aed3179c8 in icinga::JsonRpcConnection::MessageHandler(icinga::String const&) ()
No symbol table info available.
#5  0x0000560aed31877b in icinga::JsonRpcConnection::MessageHandlerWrapper(icinga::String const&) ()
No symbol table info available.
#6  0x0000560aed26213e in icinga::WorkQueue::RunTaskFunction(std::function<void ()> const&) ()
No symbol table info available.
#7  0x0000560aed26ac85 in icinga::WorkQueue::WorkerThreadProc() ()
No symbol table info available.
#8  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#9  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73582700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73582700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460250638080, 4463560644109586558, 0, 140460253820031, 0, 140460707086400, -4427277637700759426, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#10 0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 20 (Thread 0x7fbf735c3700 (LWP 31363)):
#0  0x00007fbf8b9ef0c3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed23f577 in icinga::SocketEventEngineEpoll::ThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf735c3700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf735c3700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460250904320, 4463560644109586558, 0, 140460253819983, 0, 140460707086400, -4427277604951633794, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 19 (Thread 0x7fbf73604700 (LWP 31362)):
#0  0x00007fbf8b9ef0c3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed23f577 in icinga::SocketEventEngineEpoll::ThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73604700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73604700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460251170560, 4463560644109586558, 0, 140460253819983, 0, 140460707086400, -4427278119810838402, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 18 (Thread 0x7fbf73645700 (LWP 31361)):
#0  0x00007fbf8b9ef0c3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed23f577 in icinga::SocketEventEngineEpoll::ThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73645700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73645700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460251436800, 4463560644109586558, 0, 140460253819983, 0, 140460707086400, -4427278087061712770, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 17 (Thread 0x7fbf73686700 (LWP 31360)):
#0  0x00007fbf8b9ef0c3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed23f577 in icinga::SocketEventEngineEpoll::ThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73686700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73686700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460251703040, 4463560644109586558, 0, 140460253819983, 0, 140460707086400, -4427278052165103490, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 16 (Thread 0x7fbf736c7700 (LWP 31359)):
#0  0x00007fbf8b9ef0c3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed23f577 in icinga::SocketEventEngineEpoll::ThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf736c7700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf736c7700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460251969280, 4463560644109586558, 0, 140460253819983, 0, 140460707086400, -4427278019415977858, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 15 (Thread 0x7fbf73708700 (LWP 31358)):
#0  0x00007fbf8b9ef0c3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed23f577 in icinga::SocketEventEngineEpoll::ThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73708700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73708700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460252235520, 4463560644109586558, 0, 140460253819983, 0, 140460707086400, -4427277984519368578, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 14 (Thread 0x7fbf73749700 (LWP 31357)):
#0  0x00007fbf8b9ef0c3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed23f577 in icinga::SocketEventEngineEpoll::ThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73749700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73749700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460252501760, 4463560644109586558, 0, 140460253819983, 0, 140460707086400, -4427277951770242946, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 13 (Thread 0x7fbf7378a700 (LWP 31356)):
#0  0x00007fbf8b9ef0c3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed23f577 in icinga::SocketEventEngineEpoll::ThreadProc(int) ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf7378a700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf7378a700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460252768000, 4463560644109586558, 0, 140460253819983, 0, 140460707086400, -4427277916873633666, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 12 (Thread 0x7fbf737cb700 (LWP 31355)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x0000560aed1c866b in boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) ()
No symbol table info available.
#2  0x0000560aed243f97 in icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) ()
No symbol table info available.
#3  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#4  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf737cb700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf737cb700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460253034240, 4463560644109586558, 0, 140460254893855, 0, 140460707086400, -4427277884124508034, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 11 (Thread 0x7fbf7380c700 (LWP 31354)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
No locals.
#1  0x00007fbf8d2b6c06 in __GI___pthread_mutex_lock (mutex=0x560aeed05d10) at ../nptl/pthread_mutex_lock.c:115
        id = -1926508708
        __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
        type = 1
        id = <optimized out>
#2  0x0000560aed1ee5f5 in icinga::ObjectLock::LockMutex(icinga::Object const*) ()
No symbol table info available.
#3  0x0000560aed1fb388 in icinga::Log::~Log() ()
No symbol table info available.
#4  0x0000560aed241506 in icinga::ThreadPool::ManagerThreadProc() ()
No symbol table info available.
#5  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#6  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf7380c700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf7380c700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460253300480, 4463560644109586558, 0, 140460254894383, 0, 140460707086400, -4427276199960457090, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#7  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 10 (Thread 0x7fbf7384d700 (LWP 31353)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x0000560aed1c866b in boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) ()
No symbol table info available.
#2  0x0000560aed243f97 in icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) ()
No symbol table info available.
#3  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#4  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf7384d700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf7384d700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460253566720, 4463560644109586558, 0, 140460254893743, 0, 140460707086400, -4427276167211331458, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 9 (Thread 0x7fbf73951700 (LWP 31327)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x0000560aed1c866b in boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) ()
No symbol table info available.
#2  0x0000560aed498f7b in icinga::CheckerComponent::CheckThreadProc() ()
No symbol table info available.
#3  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#4  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73951700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73951700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460254631680, 4463560644109586558, 0, 140723107711823, 140460254631680, 94604956656800, -4427276014739992450, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 8 (Thread 0x7fbf73992700 (LWP 31326)):
#0  0x00007fbf8d2bd3ed in accept () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x0000560aed23af57 in icinga::Socket::Accept() ()
No symbol table info available.
#2  0x0000560aed2b9a97 in icinga::ApiListener::ListenerThreadProc(boost::intrusive_ptr<icinga::Socket> const&) ()
No symbol table info available.
#3  0x0000560aed370d06 in ?? ()
No symbol table info available.
#4  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#5  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf73992700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf73992700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460254897920, 4463560644109586558, 0, 140723107709919, 140460254897920, 94604956605632, -4427275979843383170, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#6  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 7 (Thread 0x7fbf739d3700 (LWP 31316)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
No locals.
#1  0x0000560aed242833 in icinga::Timer::TimerThreadProc() ()
No symbol table info available.
#2  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#3  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf739d3700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf739d3700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460255164160, 4463560644109586558, 0, 140723107712367, 0, 140460707086400, -4427275947094257538, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 6 (Thread 0x7fbf8e71e700 (LWP 31304)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
No locals.
#1  0x00007fbf8d2b6c06 in __GI___pthread_mutex_lock (mutex=0x560aeed05d10) at ../nptl/pthread_mutex_lock.c:115
        id = -1926508708
        __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
        type = 1
        id = <optimized out>
#2  0x0000560aed1ee5f5 in icinga::ObjectLock::LockMutex(icinga::Object const*) ()
No symbol table info available.
#3  0x0000560aed1fb388 in icinga::Log::~Log() ()
No symbol table info available.
#4  0x0000560aed241506 in icinga::ThreadPool::ManagerThreadProc() ()
No symbol table info available.
#5  0x00007fbf8c279e6f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#6  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf8e71e700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf8e71e700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460705310464, 4463560644109586558, 0, 140723107732495, 0, 140460707086400, -4427359343011114882, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#7  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 5 (Thread 0x7fbf8e75f700 (LWP 31303)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
No locals.
#1  0x00007fbf8d2b6c06 in __GI___pthread_mutex_lock (mutex=0x560aeed05d10) at ../nptl/pthread_mutex_lock.c:115
        id = -1926508708
        __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
        type = 1
        id = <optimized out>
#2  0x0000560aed1ee5f5 in icinga::ObjectLock::LockMutex(icinga::Object const*) ()
No symbol table info available.
#3  0x0000560aed1fb388 in icinga::Log::~Log() ()
No symbol table info available.
#4  0x0000560aed2ad41d in icinga::ApiListener::ApiTimerHandler() ()
No symbol table info available.
#5  0x0000560aed25e931 in ?? ()
No symbol table info available.
#6  0x0000560aed220a14 in icinga::Timer::Call() ()
No symbol table info available.
#7  0x0000560aed2440d8 in icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) ()
No symbol table info available.
#8  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#9  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf8e75f700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf8e75f700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460705576704, 4463560644109586558, 0, 140723107731855, 0, 140460707086400, -4427359310261989250, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#10 0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 4 (Thread 0x7fbf8e822700 (LWP 31302)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
No locals.
#1  0x00007fbf8d2b6c06 in __GI___pthread_mutex_lock (mutex=0x560aeed05d10) at ../nptl/pthread_mutex_lock.c:115
        id = -1926508708
        __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
        type = 1
        id = <optimized out>
#2  0x0000560aed1ee5f5 in icinga::ObjectLock::LockMutex(icinga::Object const*) ()
No symbol table info available.
#3  0x0000560aed1fb388 in icinga::Log::~Log() ()
No symbol table info available.
#4  0x0000560aed49773e in icinga::CheckerComponent::ResultTimerHandler() ()
No symbol table info available.
#5  0x0000560aed25e931 in ?? ()
No symbol table info available.
#6  0x0000560aed220a14 in icinga::Timer::Call() ()
No symbol table info available.
#7  0x0000560aed2440d8 in icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) ()
No symbol table info available.
#8  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#9  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf8e822700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf8e822700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460706375424, 4463560644109586558, 0, 140723107731855, 140460706375424, 94604950740848, -4427357541272334210, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#10 0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 3 (Thread 0x7fbf8e7e1700 (LWP 31301)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x0000560aed1c866b in boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) ()
No symbol table info available.
#2  0x0000560aed243f97 in icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) ()
No symbol table info available.
#3  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#4  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf8e7e1700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf8e7e1700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460706109184, 4463560644109586558, 0, 140723107731855, 140460706109184, 94604950813008, -4427359225436385154, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 2 (Thread 0x7fbf8e7a0700 (LWP 31300)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
No locals.
#1  0x0000560aed1c866b in boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) ()
No symbol table info available.
#2  0x0000560aed243f97 in icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) ()
No symbol table info available.
#3  0x00007fbf8e292116 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
No symbol table info available.
#4  0x00007fbf8d2b4494 in start_thread (arg=0x7fbf8e7a0700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7fbf8e7a0700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140460705842944, 4463560644109586558, 0, 140723107731855, 140460705842944, 94604950813920, -4427359275365379970, -4427361040356159362}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fbf8b9eeacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
No locals.

Thread 1 (Thread 0x7fbf8e82e480 (LWP 31268)):
#0  0x00007fbf8d2bdb3a in __waitpid (pid=15301, stat_loc=0x7ffca6d8b7bc, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29
        resultvar = 18446744073709551104
        sc_cancel_oldtype = 0
#1  0x0000560aed1b528e in icinga::Application::AttachDebugger(icinga::String const&, bool) ()
No symbol table info available.
#2  0x0000560aed1d99a7 in icinga::Application::SigAbrtHandler(int) ()
No symbol table info available.
#3  <signal handler called>
No locals.
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
        set = {__val = {0, 2314885530818453536, 8660248813382869024, 7596496373740942904, 3419760881481315694, 7310595060593224044, 3328211180835464301, 7090415180238960433, 3472328515408705638, 7293692169334238512, 8223625903103632184, 3472328295963438455, 4192904167887482928, 7365366848950382640, 3472665881546876514, 4064043904900345904}}
        pid = <optimized out>
        tid = <optimized out>
#5  0x00007fbf8b93a42a in __GI_abort () at abort.c:89
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x756e672d78756e69, sa_sigaction = 0x756e672d78756e69}, sa_mask = {__val = {3761119431852583983, 7377571526170604334, 3472328301395600696, 4063716251160426285, 8606977229197423204, 3472328296226648109, 3475143045726351408, 7378645556122361904, 3472331625700155747, 3918522442241748272, 8223625903103818084, 3472328295963438455, 4192904167887482928, 2314885531086893104, 2314885530818453536, 140723107711424}}, sa_flags = 123, sa_restorer = 0x7ffca6d8c9c0}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#6  0x00007fbf8b976c00 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7fbf8ba6bd98 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
        ap = {{gp_offset = 40, fp_offset = 32703, overflow_arg_area = 0x7ffca6d8c9d0, reg_save_area = 0x7ffca6d8c960}}
        fd = 2
        on_2 = <optimized out>
        list = <optimized out>
        nlist = <optimized out>
        cp = <optimized out>
        written = <optimized out>
#7  0x00007fbf8b97cfc6 in malloc_printerr (action=3, str=0x7fbf8ba6bef0 "double free or corruption (fasttop)", ptr=<optimized out>, ar_ptr=<optimized out>) at malloc.c:5049
        buf = "0000560aeed709c0"
        cp = <optimized out>
        ar_ptr = <optimized out>
        ptr = <optimized out>
        str = 0x7fbf8ba6bef0 "double free or corruption (fasttop)"
        action = 3
#8  0x00007fbf8b97d80e in _int_free (av=0x7fbf8bc9fb00 <main_arena>, p=0x560aeed709b0, have_lock=0) at malloc.c:3905
        size = <optimized out>
        fb = <optimized out>
        nextchunk = <optimized out>
        nextsize = <optimized out>
        nextinuse = <optimized out>
        prevsize = <optimized out>
        bck = <optimized out>
        fwd = <optimized out>
        errstr = <optimized out>
        locked = <optimized out>
        __func__ = "_int_free"
#9  0x0000560aed2475c6 in icinga::StreamLogger::BindStream(std::ostream*, bool) ()
No symbol table info available.
#10 0x0000560aed1fdf0a in icinga::FileLogger::ReopenLogFile() ()
No symbol table info available.
#11 0x0000560aed1db3a2 in icinga::Application::RunEventLoop() ()
No symbol table info available.
#12 0x0000560aed40f974 in icinga::IcingaApplication::Main() ()
No symbol table info available.
#13 0x0000560aed1d9351 in icinga::Application::Run() ()
No symbol table info available.
#14 0x0000560aed388259 in icinga::DaemonCommand::Run(boost::program_options::variables_map const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) const ()
No symbol table info available.
#15 0x0000560aed1deef6 in ?? ()
No symbol table info available.
#16 0x0000560aed183299 in main ()
No symbol table info available.

After the crash a new icinga2.log is already created:

root@icingasql02:~# ls -lah /var/log/icinga2/
total 1.1M
drwxr-s--x  4 nagios adm    4.0K Nov  7 06:25 .
drwxr-xr-x 14 root   root   4.0K Nov  7 06:25 ..
drwxr-s--x  3 nagios adm    4.0K Oct 17  2017 compat
drwxr-s--x  2 nagios adm    4.0K Nov  7 06:25 crash
-rw-r--r--  1 nagios nagios    0 Nov  7 06:25 icinga2.log
-rw-r--r--  1 nagios nagios 838K Nov  7 06:25 icinga2.log.1
-rw-r--r--  1 nagios nagios  29K Nov  6 06:25 icinga2.log.2.gz
-rw-r--r--  1 nagios nagios  26K Nov  5 06:25 icinga2.log.3.gz
-rw-r--r--  1 nagios nagios  26K Nov  4 06:25 icinga2.log.4.gz
-rw-r--r--  1 nagios nagios  26K Nov  3 06:24 icinga2.log.5.gz
-rw-r--r--  1 nagios nagios  30K Nov  2 06:24 icinga2.log.6.gz
-rw-r--r--  1 nagios nagios  30K Nov  1 06:24 icinga2.log.7.gz

But as you can see nothing is written to it yet. When I start up the Icinga2 daemon again after the crash it starts writing to icinga2.log

@mcktr
Copy link
Member

mcktr commented Nov 7, 2018

Can you try the snapshot packages? In #6736 a bug was fixed which seems similar to this.

@fladna9
Copy link

fladna9 commented Nov 7, 2018

Hello @mcktr,
Is snapshot deb repository stable enough to go on a production server?
Because my crashing machine is an internal-but-important-on-production server...

@dnsmichi
Copy link
Contributor

dnsmichi commented Nov 9, 2018

@mcktr the changes you are referring to are solely cluster related, nothing which points to the stream logger class. I've fixed a possible double free as see in the gdb stacktrace yesterday, please test them in the snapshot packages once being built. On a related note, right now the master solely contains bugfixes and no radical changes, so it is safe to assume it works.

@dnsmichi dnsmichi added this to the 2.10.2 milestone Nov 9, 2018
@dnsmichi
Copy link
Contributor

Thus far, I am not able to reproduce this with 2.10.1 on Debian Stretch.

wget -O - https://packages.icinga.com/icinga.key | apt-key add -

DIST=$(awk -F"[)(]+" '/VERSION=/ {print $2}' /etc/os-release); \
 echo "deb http://packages.icinga.com/debian icinga-${DIST} main" > \
 /etc/apt/sources.list.d/${DIST}-icinga.list
 echo "deb-src http://packages.icinga.com/debian icinga-${DIST} main" >> \
 /etc/apt/sources.list.d/${DIST}-icinga.list

apt-get update

apt-get install -y icinga2 vim


vim /etc/logrotate.d/icinga2

/var/log/icinga2/icinga2.log /var/log/icinga2/debug.log {
	size 1K
	rotate 7
	su nagios nagios
	compress
	delaycompress
	missingok
	notifempty
	create 644 nagios nagios
	postrotate
		/bin/kill -USR1 $(cat /run/icinga2/icinga2.pid 2> /dev/null) 2> /dev/null || true
	endscript
}

/var/log/icinga2/error.log {
	size 1K
	rotate 90
	su nagios nagios
	compress
	delaycompress
	missingok
	notifempty
	create 644 nagios nagios
	# TODO: figure out how to get Icinga to re-open this log file
}

logrotate -d /etc/logrotate.d/icinga2

@dnsmichi dnsmichi modified the milestones: 2.10.2, 2.11.0 Nov 12, 2018
@sschindel
Copy link
Author

Has this already been released? There was an update some days ago, but this one still does not fix that issue.

@fladna9
Copy link

fladna9 commented Nov 26, 2018

@sschindel Same, crashed this morning as well.

@dnsmichi
Copy link
Contributor

Please add more logs from such crashes and everything else which helps reproduce the issue. I wasn't able to reproduce this yet.

@fladna9
Copy link

fladna9 commented Nov 26, 2018

Today's logs

File /var/log/daemon.log.1

Nov 26 06:25:46 log systemd[1]: Reloading The Apache HTTP Server.
Nov 26 06:25:48 log apachectl[11749]: AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message
Nov 26 06:25:48 log systemd[1]: Reloaded The Apache HTTP Server.
Nov 26 06:25:59 log fail2ban.server[30559]: INFO flush performed on SYSLOG
Nov 26 06:26:03 log systemd[1]: icinga2.service: Main process exited, code=killed, status=11/SEGV
Nov 26 06:26:03 log systemd[1]: icinga2.service: Unit entered failed state.
Nov 26 06:26:03 log systemd[1]: icinga2.service: Failed with result 'signal'.

File /var/log/icinga2/icinga2.log.1

[2018-11-26 06:24:37 +0100] information/WorkQueue: #14 (JsonRpcConnection, #5) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-26 06:24:43 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-26 06:24:44 +0100] information/WorkQueue: #6 (ApiListener, SyncQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-26 06:25:13 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-26 06:25:14 +0100] information/WorkQueue: #12 (JsonRpcConnection, #3) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-26 06:25:23 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-26 06:25:43 +0100] information/RemoteCheckQueue: items: 0, rate: 0/s (6/min 30/5min 90/15min);
[2018-11-26 06:25:43 +0100] information/WorkQueue: #5 (ApiListener, RelayQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-26 06:25:43 +0100] information/WorkQueue: #15 (JsonRpcConnection, #6) items: 0, rate:  0/s (0/min 0/5min 0/15min);

File /var/log/icinga2/crash/report.1541741165.911804

  Application version: r2.10.1-1

System information:
  Platform: Debian GNU/Linux
  Platform version: 9 (stretch)
  Kernel: Linux
  Kernel version: 4.9.0-7-amd64
  Architecture: x86_64

Build information:
  Compiler: GNU 6.3.0
  Build host: 1880c6977367

Application information:

General paths:
  Config directory: /etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /run/icinga2

Old paths (deprecated):
  Installation root: /usr
  Sysconf directory: /etc
  Run directory (base): /run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid
Stacktrace:

	(0) libc.so.6: gsignal (+0xcf) [0x7f20779d7fff]
	(1) libc.so.6: abort (+0x16a) [0x7f20779d942a]
	(2) libc.so.6: <unknown function> (+0x70c00) [0x7f2077a15c00]
	(3) libc.so.6: <unknown function> (+0x76fc6) [0x7f2077a1bfc6]
	(4) libc.so.6: <unknown function> (+0x7780e) [0x7f2077a1c80e]
	(5) icinga2: std::_Function_base::_Base_manager<std::_Bind<std::_Mem_fn<void (icinga::Timer::*)()> (boost::intrusive_ptr<icinga::Timer>)> >::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation) (+0x38) [0x55a98c8d6358]
	(6) icinga2: icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) (+0x4ca) [0x55a98c8c716a]
	(7) libboost_thread.so.1.62.0: <unknown function> (+0x12116) [0x7f207a331116]
	(8) libpthread.so.0: <unknown function> (+0x7494) [0x7f2079353494]
	(9) libc.so.6: clone (+0x3f) [0x7f2077a8dacf]

***
* This would indicate a runtime problem or configuration error. If you believe this is a bug in Icinga 2
* please submit a bug report at https://github.com/Icinga/icinga2 and include this stack trace as well as any other
* information that might be useful in order to reproduce this problem.
***

Failed to launch GDB: No such file or directory

File /var/log/messages.1

Nov 26 06:26:01 log kernel: [5674078.627855] traps: icinga2[4528] general protection ip:7f5712a4025b sp:7ffcaa2ff590 error:0
Nov 26 06:26:01 log kernel: [5674078.627860]  in libc-2.24.so[7f57129c9000+195000]
Nov 26 06:26:05 log liblogging-stdlog:  [origin software="rsyslogd" swVersion="8.24.0" x-pid="11297" x-info="http://www.rsyslog.com"] rsyslogd was HUPed

File /var/log/syslog.1

Nov 26 06:26:01 log kernel: [5674078.627855] traps: icinga2[4528] general protection ip:7f5712a4025b sp:7ffcaa2ff590 error:0
Nov 26 06:26:01 log kernel: [5674078.627860]  in libc-2.24.so[7f57129c9000+195000]
Nov 26 06:26:03 log systemd[1]: icinga2.service: Main process exited, code=killed, status=11/SEGV
Nov 26 06:26:03 log systemd[1]: icinga2.service: Unit entered failed state.
Nov 26 06:26:03 log systemd[1]: icinga2.service: Failed with result 'signal'.
Nov 26 06:26:05 log liblogging-stdlog:  [origin software="rsyslogd" swVersion="8.24.0" x-pid="11297" x-info="http://www.rsyslog.com"] rsyslogd was HUPed

File /var/log/apt/history.log

Start-Date: 2018-11-20  11:14:37
Commandline: apt full-upgrade
Requested-By: user (1000)
Upgrade: python3-josepy:amd64 (1.0.1-1~bpo9+1, 1.1.0-2~bpo9+1), icinga2-bin:amd64 (2.10.1-1.stretch, 2.10.2-1.stretch), certbot:amd64 (0.25.0-1~bpo9+1, 0.28.0-1~bpo9+1), icinga2-doc:amd64 (2.10.1-1.stretch, 2.10.2-1.stretch), python3-acme:amd64 (0.25.1-1~bpo9+1, 0.28.0-1~bpo9+1), python3-certbot-apache:amd64 (0.25.0-2~bpo9+1, 0.28.0-1~bpo9+1), icinga2-common:amd64 (2.10.1-1.stretch, 2.10.2-1.stretch), libmariadbclient18:amd64 (10.1.26-0+deb9u1, 10.1.37-0+deb9u1), python3-certbot:amd64 (0.25.0-1~bpo9+1, 0.28.0-1~bpo9+1), python-certbot-apache:amd64 (0.25.0-2~bpo9+1, 0.28.0-1~bpo9+1), icinga2:amd64 (2.10.1-1.stretch, 2.10.2-1.stretch)
End-Date: 2018-11-20  11:16:08

@dnsmichi
Copy link
Contributor

Does 2.10.2 change something here?

@fladna9
Copy link

fladna9 commented Nov 26, 2018

@dnsmichi : Doesn't seem so, because it crashed this morning.

How to reproduce: good question. Installed Debian Stretch, followed the GrayLog setup tutorial.
Nothing really special. As told many times before, i manage +30 Debian/Ubuntu machines, and this one is the only one having this problem.
All have latest Stable version (as many are production servers).

@dnsmichi
Copy link
Contributor

Application version: r2.10.1-1 is older than 2.10.2.

@dnsmichi
Copy link
Contributor

Thanks for your thoughts and input. I can only offer you to test things already available, everything else is work in progress.

I tend to disagree on "should be simple to fix", as the analysis in the issue has proven that the problem exists, but is not an easy one to reproduce. Since you're experiencing this in your environment, and obviously have some more deep knowledge about C++ and signal handling, it might be a good effort to patch this and send in a PR with analysis before and after.

I'm not sure how the UTF8 things relate here, that's just a different issue having been fixed with the JSON library replacement. The main thing is that memory corruption and leaks are hard to find, and at the time this crashes, they may have influenced this but the root cause lies in a different scope.

I don't believe that this is the true cause for logrotate either, but testing the removed YAJL problem allows us to mitigate the problem further. The stacktrace on the crash will definitely look different, and hopefully gives more insights on this. Therefore I am asking you and everyone else experiencing the problem to test the snapshot packages.

If we don't find anything here in a reasonable amount of time, there's two options: a) Block 2.11 release, meaning to say everyone else that's waiting for the network stack etc. b) Remove this from 2.11 and leave it for later. That's bad for everyone else here.

Cheers,
Michael

@Elias481
Copy link
Contributor

Regarding not wanting to test this: It's just not really testable for us because the last one, aborting in yayl, is not common in our environment and as the other ones. And if it is a separate issue, it thus had lower priority.
As the other ones are verified (including possible fix and some other ways to mitigate this) I just don't believe the yayl at logrotate time is a yayl fault. Accessing an already deleted object or it's memory while the object is still accessed (as it is the case often as verified) can possibly lead to such memory allocation errors in random other scopes, as You confirmed.
So testing the yayl replacement from #7019 is something that could make sense for miso231 who claimed in #6684 to get this error condition on logrotate frequently. Not sure as I did not verify but I think it's quite likely he then will get these errors (which directly point to the roots) or similiar error from other json parser if it does some similiar memory allocation. But this is only what I assume seeing relation logotate time which does not trigger any json parsing related stuff or influence it.

For the crashes that are reported in this issue I did a verification that it is related to unsafe refcounting. And I made one clean fix/patch changing as less as possible just fixing the issue with only a small "interface" change (OnTimerExpired callback does not return a Ptr object with reference but just a plain pointer - and thus had to be careful not to increment it's refcount or copy it byside for later use without knowing the object is still alive then - but this parameter to OnTimerExpired is nowhere used beside test case).
Another possibility would be a local fix in StreamLogger destructor but this does not fix the bug with the Timer module refcounting so I would not recommend.
The third option is now, seeing the other errors, something that I would strike-through (it's anyway a dirty solution and would not avoid such errors as can be seen at json parsing if they are related - it was just an example to verify bug).

One could prefer to do a bigger rework to keep the Timer-Scheduler keeping references on its Timers to not have the interface change described above, but that's not as simple. Sure I could do if that would be Your requirement for a fix but it would be some efford I would only spent if really needed.

Sure I can put it in a PR with analysis before and after - the one I have.

I could also try prove whether the yayl abort is reproducible if synchronizing that part to the other I already synchronized for the verification in such a way that the malloc from yayl occurs after one of the parties finished destruction completely (freed memory) but before the other one does the same and raises the double-free SEGV (so the one that is not the first destructor waits for a nitification from yayl).
Some efford and it has the side effect that even more extra code would need to be added so that it would be even more difficult to verify that it's just showing a race condition of original code and not an error introduced by these synchronization mechanisms.
So I would prefer to just implement a fix for this (the small and punctual one I already did or a bigger more complicated one that causes the timer-scheduler to be the Owner of all Timers that are in it's queue or currently executed by the scheduler) and see whether there are more - or if someone has the yayl error more often he could test with that version.

So what's Your requirement? Just a PR from what I already did or does it need to verify whether the yayl error is covered or is it required to provide OnTimerExpired callback with a Ptr object with valid refcount while at the same time keeping the auto-desctruction behaviour of the Timer objects which would make a fix much more complex?

I don't have a deep knowledge of C++ and signal-handling is also nothing I'm really deep in. But this is not really related to signal handling that far and I have quite deep understanding of programming/threading logic which is the same across all languages at the end/behind the curtins for the hardware we speak about (we don't speak about quantum-computers and such stuff where I expect some major differences).

So for the other Issues I can comment there (a bit shorter).

@Al2Klimov
Copy link
Member

Hello @Elias481 and thank you for the inspiration!

With #7129 (I think) I've found the root cause.
Please could you test #7130?

Best,
AK

@Elias481
Copy link
Contributor

Elias481 commented Apr 18, 2019

review done - #7130 does not change anything and is not addressing rootcause. the root-cause is fixed with #7129.

if you want a workarounf like that 7130 which does not address the root-cause I posted something that is working instead in comment above:
Elias481@9917e3d

As I'm not a fan to not fix the root-cause but just implement a local workaround I will not make a PR for that. But You could indeed make a merge of what's show there (it's already tested - and for the problem observiced within the issue the second one change would be the least needed - but then one more place stays unfixed....)

@Al2Klimov Al2Klimov removed the needs feedback We'll only proceed once we hear from you again label Apr 18, 2019
@dnsmichi
Copy link
Contributor

Hi,

thanks for the critical section patch to force the application crash, this really helped in analysing all the possible patches.

Here's what we've tested today.

kill -USR1 $(pidof icinga2)

Nothing

Git Master, 02db12a

Result: Segfault.

Only initialize Flush Timer once

From #7130.

Result: Icinga doesn't crash, but the log flush is running into a lock, enforced by the CS patch and different circumstances.

Timer const*

From #7129.

Result: Icinga doesn't crash and continues to log on stdout and mainlog.

Combine flush timer once with timer const*

Result: Works fine, doesn't have any other influences.

Conclusion, first part

Using a const pointer instead of an intrusive ptr fixes the problem, still the flush timer "re-newal" is totally unnecessary and needs to be fixed as well.

lambda function instead of std::bind

From #7004.

Result: Doesn't fix, crashes too.

lambda function and flush

Result: Doesn't crash.

Conclusion, second part

Somehow std::bind is involved here with intrusive_ptr, starting with the changes in 2.9.

A const* pointer or using a lambda bind is effectively keeping the pointer reference, and the call continues, no application crashes involved.

icinga2_logrotate_crash_analysis.patch.txt

@dnsmichi
Copy link
Contributor

After some research within #7006 I've come to the following conclusion:

  • lambdas capture by reference, still holding a reference on the timer object
  • lambdas are more likely to be inlined by the compiler than function objects with std::bind
  • std::bind resp. std::function require a copy or move constructor, always binding the value and using that when invoke happens later.
  • Timer::Call happens just once inside the application thread pool, there's no need to keep the returned function object around and re-use it multiple times.
  • The function object copies the pointer, and while another thread may already have freed it, it tries to access the address resulting in a crash
  • Lambdas hold a reference on the passed pointer and actually capture its value on execution (later in the TP).
  • Best practice are lambdas over std::bind with C++11 and future versions.

You can avoid the intrusive_ptr going out of scope by either making this a const raw pointer, or by passing its reference into a lambda function.

Since the Timer class explicitly uses raw pointers instead of intrusive_ptr's, I'd stay there and combine the three patches with 1) const Timer* 2) lambdas 3) avoid re-newing the timer.

In terms of moving away from std::bind to lambdas, follow my research here: #7006 (comment)

@dnsmichi dnsmichi self-assigned this Apr 24, 2019
dnsmichi pushed a commit to Elias481/icinga2 that referenced this issue Apr 24, 2019
…ashes at logrotate time

this is a direct fix of the issue revealing the problem that leads to crash

verification done with a patched icinga2 where the execution-order of the code lines of counter-parts involved in re-incrementing/decrementing Timer:Ptr is forced to be the one that leads to the obeserverd segfaults

refs Icinga#6737
dnsmichi pushed a commit that referenced this issue Apr 24, 2019
@dnsmichi
Copy link
Contributor

Snapshot packages for Debian/Ubuntu need to be fixed in our build infra, the rest is available.

@dnsmichi
Copy link
Contributor

ref/IP/13853

@dnsmichi dnsmichi modified the milestones: 2.11.0, 2.10.5 May 16, 2019
dnsmichi pushed a commit that referenced this issue May 16, 2019
…ashes at logrotate time

this is a direct fix of the issue revealing the problem that leads to crash

verification done with a patched icinga2 where the execution-order of the code lines of counter-parts involved in re-incrementing/decrementing Timer:Ptr is forced to be the one that leads to the obeserverd segfaults

refs #6737

(cherry picked from commit 52e3db2)
dnsmichi pushed a commit that referenced this issue May 16, 2019
dnsmichi pushed a commit that referenced this issue May 16, 2019
dnsmichi pushed a commit that referenced this issue May 16, 2019
refs #6737

(cherry picked from commit df25b18)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core/crash Shouldn't happen, requires attention ref/IP
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants