Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arch Linux, metalog 20220214: systemd service stops with error into failed state #30

Open
angrysquirell opened this issue Jul 26, 2023 · 5 comments

Comments

@angrysquirell
Copy link

angrysquirell commented Jul 26, 2023

My only concern is to catch events and send them as email notification.

It's standard installation from repository, pkgbuild and service unit file are available here:
https://gitlab.archlinux.org/archlinux/packaging/packages/metalog

Running metalog with -v parameter does not produce any additional information. The only logged messages after systemctl stop metalog.service are:
lip 25 22:06:58 arch systemd[1]: Stopping System Logger Daemon...
lip 25 22:06:58 arch systemd[1]: metalog.service: Main process exited, code=exited, status=1/FAILURE
lip 25 22:06:58 arch systemd[1]: metalog.service: Failed with result 'exit-code'.
lip 25 22:06:58 arch systemd[1]: Stopped System Logger Daemon.

Configuration file, as follows, nothing special, "accepted" for testing purposes:

maxsize = 1048576
maxtime = 86400
maxfiles = 5

ssh :
program = "sshd"
# regex = "preauth"
regex = "accepted"
logdir = "NONE"
command = "/usr/local/bin/metalog-email"

Logging itself works good, but how to diagnose and avoid systemd service stopping errors?

@m-kress
Copy link
Collaborator

m-kress commented Jul 26, 2023

Metalog always returns 1 when it gets killed by SIGTERM or SIGQUIT. Obviously it was presumed, that the system logger should always run after it has been started. Any stopping is considerd as an error. Eventually this could be changed.

The other thing is: What would you expect metalog to do? In your case metalog gets stopped, so who should listen to the messages metalog would produce? STDOUT? Or do you run a second syslogger?

@angrysquirell
Copy link
Author

angrysquirell commented Jul 26, 2023

Thank you very much for the response.

Metalog always returns 1 when it gets killed by SIGTERM or SIGQUIT. Obviously it was presumed, that the system logger should always run after it has been started. Any stopping is considerd as an error. Eventually this could be changed.

I undestand it, but I've just noticed that metalog.service does not exit with exit 0 / success (in contrast to syslog-ng and systemd-journald - examples below). Of course I don't need to manually stop / turn off the service - it should monitor logs from system start to poweroff. I'm just curious if it is an metalog error, my configuration error or handling metalog.service by systemd while stopping the service.

Error appears also before restart / poweroff while systemd is stopping service.

So if I understood correctly, this metalog's behavior is absolutely normal by desing and it's not to worry about?

Examples:

  1. Metalog stopping "error" before reboot / poweroff:
    journalctl -q -b -1 -u metalog
    lip 26 19:00:46 varch systemd[1]: Started System Logger Daemon.
    lip 26 19:03:54 varch systemd[1]: Stopping System Logger Daemon...
    lip 26 19:03:54 varch systemd[1]: metalog.service: Main process exited, code=exited, status=1/FAILURE
    lip 26 19:03:54 varch systemd[1]: metalog.service: Failed with result 'exit-code'.
    lip 26 19:03:54 varch systemd[1]: Stopped System Logger Daemon.

  2. Manually stopped syslog-ng service exit with 0 / success:
    systemctl status [email protected] -l --no-pager
    [email protected] - System Logger Daemon "default" instance
    Loaded: loaded (/usr/lib/systemd/system/[email protected]; enabled; preset: disabled)
    Drop-In: /etc/systemd/system/[email protected]
    └─override.conf
    Active: inactive (dead) since Wed 2023-07-26 19:11:28 CEST; 3min 30s ago
    Duration: 14min 56.433s
    Docs: man:syslog-ng(8)
    Process: 887 ExecStart=/usr/bin/syslog-ng -F $OTHER_OPTIONS --cfgfile $CONFIG_FILE --control $CONTROL_FILE --persist-file $PERSIST_FILE --pidfile $PID_FILE (code=exited, status=0/SUCCESS)
    Main PID: 887 (code=exited, status=0/SUCCESS)
    Status: "Shutting down... (Wed Jul 26 19:11:28 2023"
    CPU: 255ms

  3. Manually stopped systemd-journald service also exited with 0 / success:
    systemctl status systemd-journald
    ○ systemd-journald.service - Journal Service
    Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static)
    Active: inactive (dead) since Wed 2023-07-26 19:18:01 CEST; 6s ago
    Duration: 21min 41.315s
    TriggeredBy: ● systemd-journald.socket
    ● systemd-journald-dev-log.socket
    Docs: man:systemd-journald.service(8)
    man:journald.conf(5)
    Process: 252 ExecStart=/usr/lib/systemd/systemd-journald (code=exited, status=0/SUCCESS)
    Main PID: 252 (code=exited, status=0/SUCCESS)
    Status: "Processing requests..."
    CPU: 240ms

The other thing is: What would you expect metalog to do? In your case metalog gets stopped, so who should listen to the messages metalog would produce? STDOUT? Or do you run a second syslogger?

I'm using syslog-ng only as auxiliary monitoring tool, that can send warning by email. The main logger is Saint Systemd-Jounald Almighty - Daemon of the Century, which cannot talk to local postfix relay instalnce and send something to email :-D

Because of some recent syslog-ng maintenance issues, difficulties, etc., I've just tried to setup metalog.

So generally, this kind of exit status while service stopping may be a bit misleading. I'm rather quite experienced user, so that is the reason of my first post.

Maybe there is some way to hide / suppress exit error code to have clean logs without this kind of "false-positive" error?

Best regards!

@m-kress
Copy link
Collaborator

m-kress commented Jul 30, 2023

Metalog obviously seems to exit with 1 on purpose after being killed. As mentioned: I don't know the idea behind that, but only can assume, that the original authors of metalog didn't have your use case in mind.

It would be quite easy to change that behaviour into exit with 0. Could we have a problem with compatibility or other regressions by changing that?

@angrysquirell
Copy link
Author

angrysquirell commented Jul 30, 2023

Metalog obviously seems to exit with 1 on purpose after being killed. As mentioned: I don't know the idea behind that, but only can assume, that the original authors of metalog didn't have your use case in mind.

It's not any special use case. It's normal service "working" from system start to poweroff / reboot.

It's probably caused by security reasons, as warning when someone would stop logging daemon, nevermind.

Maybe good choice would be exit 0 / success status only triggered by system's init ( /proc/1/comm ), sysvinit, runit, systemd, etc.

The other thing is that messages catched by metalog, were not appear in systemd-journal simultaneously (ForwartToSyslog=yes, enabled in journald.conf). I suppose that may be difficult to implement it, but would be usable.

I know about KISS rules, but somehow we are doomed to using systemd-journal...

Thank you for your effort.

@orbea
Copy link
Collaborator

orbea commented Jul 31, 2023

There are different kill(1) signals, presumably metalog could pay attention to how its being killed and give the correct exit code accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants