Logs from Telegraf
2022-05-03T11:55:04Z E! [inputs.exec] Failed to add nagios state: exec: get exit code: exit status 127
System info
Telegraf 1.22.3, Oracle Linux Server 8.5
Docker
No response
Steps to reproduce
1. Remove a dependency that a Nagios check needs in order to run (which one depends on the check).
2. Let Telegraf execute the check.
Expected behavior
Metric 1: => nagios_state__tcp,host=home1 service_output="/some/path/telegraf/commands/check_tcp: error while loading shared libraries: libssl.so.10: cannot open shared object file: No such file or directory",state=3i 1651661042000000000
Metric 2: => nagios_state__tcp,host=home1 service_output="fork/exec /some/path/telegraf/commands/check_tcp: permission denied",state=3i 1651665933000000000
Actual behavior
Metric 1: => nagios_state__tcp,host=home1 service_output="",state=127i 1651661431000000000
Metric 2: => nagios_state__tcp,host=home1 service_output="" 1651665481000000000
Additional info
1. Return of unsupported status codes:
As seen under 'Actual behavior', the status code 127 is returned. This is not a supported status code and should be converted into 3 (unknown). (https://nagios-plugins.org/doc/guidelines.html, Plugin Return Codes)
2. Missing error information:
The error message from this check execution is nowhere to be found: neither in the error.log file nor in the service_output field of the metric. The message is thrown away during parsing if the error is not of type ExitError.
3. Error can only be found in error.log
The error can only be found in the Telegraf error log when it occurs. It should be shown in the service_output field itself.
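The conversion asked for in item 1 can be sketched in a few lines of Go. This is only an illustration of the mapping described in the Nagios plugin guidelines, not Telegraf's actual code; the names normalizeState and the state constants are hypothetical.

```go
package main

import "fmt"

// Nagios plugin return codes as listed in the plugin guidelines.
// The constant names are hypothetical and chosen for this sketch.
const (
	stateOK       = 0
	stateWarning  = 1
	stateCritical = 2
	stateUnknown  = 3
)

// normalizeState returns the exit code unchanged when it is a supported
// Nagios state (0 to 3) and maps every other value to UNKNOWN (3).
func normalizeState(exitCode int) int {
	if exitCode < stateOK || exitCode > stateUnknown {
		return stateUnknown
	}
	return exitCode
}

func main() {
	// 127 is the exit code from the log line in this report.
	for _, code := range []int{0, 2, 127} {
		fmt.Printf("exit code %d -> state %d\n", code, normalizeState(code))
	}
}
```

With such a mapping, the first metric under 'Actual behavior' would carry state=3i instead of state=127i.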
Solution
The solution would be to return an 'unknown' state in case of an error and to put the error message into the service_output field. No errors need to be logged to the Telegraf error.log, because an unknown state with proper information is a valid check result.
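A minimal sketch of the proposed behavior, assuming the check is run through Go's os/exec package. The names checkResult and runCheck are hypothetical and do not come from the Telegraf code base, and the example invokes sh only to simulate failing checks. Every failure path ends in state 3 with the error text in the output, so nothing has to go to the error log.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os/exec"
	"strings"
)

const stateUnknown = 3 // Nagios UNKNOWN

// checkResult is a hypothetical container for the two metric fields.
type checkResult struct {
	State         int
	ServiceOutput string
}

// runCheck executes a check and always returns a usable result:
// supported exit codes (0 to 3) pass through, everything else becomes
// UNKNOWN with the error text placed in ServiceOutput.
func runCheck(name string, args ...string) checkResult {
	var stdout, stderr bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr

	err := cmd.Run()
	out := strings.TrimSpace(stdout.String())
	if err == nil {
		return checkResult{State: 0, ServiceOutput: out}
	}

	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		code := exitErr.ExitCode()
		if code >= 0 && code <= stateUnknown {
			return checkResult{State: code, ServiceOutput: out}
		}
		// Unsupported exit code such as 127: keep whatever the process
		// wrote to stderr, falling back to stdout or the error itself.
		msg := strings.TrimSpace(stderr.String())
		if msg == "" {
			msg = out
		}
		if msg == "" {
			msg = err.Error()
		}
		return checkResult{State: stateUnknown, ServiceOutput: msg}
	}

	// The process could not be started at all (for example permission
	// denied or a missing binary), so err is not an ExitError.
	return checkResult{State: stateUnknown, ServiceOutput: err.Error()}
}

func main() {
	// Simulates a check that fails with exit code 127 and a stderr message.
	r := runCheck("sh", "-c", "echo 'cannot open shared object file' >&2; exit 127")
	fmt.Printf("service_output=%q,state=%di\n", r.ServiceOutput, r.State)
}
```

The last branch covers the second metric in this report: when the binary cannot be started, the error is not an ExitError, and its text is the only information available for service_output.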