Fix journald logging via "log stdout" #17775
Draft
I tried to configure logging to systemd's journald using the method described in the documentation: simply add "log stdout" to frr.conf. But besides log messages from frrinit.sh and watchfrr, I saw no log messages from the actual routing daemons.

Debugging this revealed that FRR correctly notices that stdout is connected to journald (sd_stdout_is_journal==true), causing log_vty_init() to call zlog_5424_init() and zlog_5424_apply_dst(). But by the time zlog_5424_apply_dst() gets called, zt_stdout_journald.prio_min is still ZLOG_DISABLED, causing it to skip the call to zlog_5424_open(). And when frr.conf gets read later on, my "log stdout" statement sets .prio_min to LOG_DEBUG and causes a call to log_stdout_apply_level() and zlog_5424_apply_meta(), but the latter doesn't reattempt the zlog_5424_open(), so logs from routing daemons end up nowhere.

This PR fixes zlog_5424_apply_dst() and causes it to retry zlog_5424_open() after setting .prio_min unless the log handler is already active.
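To make the ordering problem easier to follow, here is a minimal standalone sketch of it. It is not FRR code: every sketch_* name, the struct layout, and the priority values are hypothetical stand-ins, and the retry flag only models the idea of reattempting the open once the level is known.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not FRR's real types or values. */
enum { SKETCH_PRIO_DISABLED = -1, SKETCH_PRIO_DEBUG = 7 };

struct sketch_target {
	int prio_min;   /* stays DISABLED until the config has been read */
	bool active;    /* true once the destination has been opened */
	int open_calls; /* counts how often the destination was opened */
};

/* Stand-in for opening the log destination; must not run twice. */
static void sketch_open(struct sketch_target *t)
{
	assert(!t->active);
	t->active = true;
	t->open_calls++;
}

/* Early init step: skips the open while the target is still disabled. */
static void sketch_apply_dst(struct sketch_target *t)
{
	if (t->prio_min == SKETCH_PRIO_DISABLED)
		return;
	if (!t->active)
		sketch_open(t);
}

/*
 * Config-time level change. With retry == false this models the
 * described bug (the level changes, nothing is opened); with
 * retry == true the open is reattempted unless already active.
 */
static void sketch_set_level(struct sketch_target *t, int prio, bool retry)
{
	t->prio_min = prio;
	if (retry)
		sketch_apply_dst(t);
}
```

Calling sketch_apply_dst() on a disabled target and then sketch_set_level() with retry set to false leaves the target inactive, which corresponds to the lost daemon logs. With retry set to true the target is opened exactly once, even if the level is set again afterwards.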
This PR also fixes a cosmetic issue in zlog_5424_cycle() which I noticed while debugging this. When called with fd=-1, it potentially passes a bogus pointer &zlt->zt to zlog_target_replace() because zlt is NULL in that case. In practice, this didn't cause problems because zt is the first member of zlt, so &zlt->zt is NULL as well when zlt is NULL and zlog_target_replace() properly handles NULL pointers. But it's probably not such a good idea to rely on the offset of zt. The commit tries to make it a bit more obvious/explicit what's happening here.
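The pointer issue can be illustrated with a small standalone sketch. The struct and function names below are hypothetical and only mirror the shape described above (a target struct embedded as the first member of a larger struct); they are not the actual FRR definitions or the code in this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the embedded target and its container. */
struct sketch_zt {
	int prio_min;
};

struct sketch_zlt {
	struct sketch_zt zt; /* first member, so its offset is 0 */
	int fd;
};

/*
 * The implicit variant only yields NULL for a NULL container because
 * of this layout property; moving zt would silently break it.
 */
_Static_assert(offsetof(struct sketch_zlt, zt) == 0,
	       "zt must be the first member for the implicit variant");

/*
 * Explicit variant: never forms a member address from a NULL
 * container, so it does not depend on where zt sits in the struct.
 */
static struct sketch_zt *sketch_target_of(struct sketch_zlt *zlt)
{
	return zlt ? &zlt->zt : NULL;
}
```

With the explicit check, a NULL container always maps to a NULL target regardless of member order, so the static assertion could be dropped without changing behaviour.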
NOTE: After applying this PR, you'll probably notice that some routing daemons crash during startup (pathd, staticd). This is due to an assertion failure here:
frr/lib/privs.c, line 214 in 8ca4c3d
zprivs_change_caps() gets called indirectly by frr_with_privs (...) {...} in zlog_5424_open() and this triggers the assertion failure for routing daemons without special capabilities like here:
frr/staticd/static_main.c, line 36 in 8ca4c3d
How is this supposed to work? Are functions in lib allowed to use frr_with_privs without knowing what capabilities the routing daemon they're linked to has? Should all routing daemons have a minimum set of capabilities so that the assertion doesn't fail (and zlog_5424_open() can actually perform the desired operation)? Should the assertion failure be converted to a "do nothing return"?
@eqvinox Maybe you have an idea here - you seem to be the author of most of the 5424 code?
For testing purposes, I avoided the assertion failure by patching zcaps2sys() such that it never returns NULL, but that's probably not the proper fix.