ossec: resolve journalist notification racing with reboots #3374
Codecov Report
@@           Coverage Diff            @@
##           develop   #3374   +/-   ##
========================================
  Coverage    85.79%   85.79%
========================================
  Files           34       34
  Lines         2154     2154
  Branches       238      238
========================================
  Hits          1848     1848
  Misses         250      250
  Partials        56       56
@emkll the second commit of this PR, "ossec: never send more than one journalist notification per 24h", is good to have, but I'm not sure we want it for 0.7.0. My concern is about adding code that does not fix a bug this close to the release. In practice, the only journalists who could be inconvenienced are those whose sysadmin not only configures journalist notifications but also manually reboots for some reason.
@@ -108,7 +108,7 @@
 <localfile>
 <log_format>full_command</log_format>
 <command>head -1 /var/lib/securedrop/submissions_today.txt | grep '^[0-9]*$'</command>
-<frequency>86400</frequency>
+<frequency>90000</frequency> <!-- 25 hours -->
Question re: #3368: does this change mean that this localfile won't be considered a daily notification, and thus do_not_group will not apply?
Ah okay, never mind, I see that the corresponding rule in local_rules.xml is using rule_id=400600 ✅
Exactly. It's a little confusing but becomes clearer when you realize the mail alert preferences are only read and interpreted by ossec-maild when parsing the alert logs.
Thanks @dachary for the quick fixes! I agree that we should minimize changes as much as possible so close to release. I've tested the first commit of this PR (ed9f041); initial testing confirms that it prevents duplicate alerts by ensuring the scheduled notification never happens right before server reboot. Notifications are still sent after each reboot, if and only if Note that there is an
@redshiftzero @emkll removed the extra commit (and test does not need to run because it already did on the first commit ;-) |
Okay, after looking at this more and taking a step back, the choices are:
First commit only: Minimal change that indeed resolves #3367. This is a clever fix. The thing is, it also produces behavior worthy of another bug report, "Additional reboots send submission notifications to journalists" (I know that a note in the documentation is added in this PR, but some admins will likely miss the note, and it is odd and unexpected behavior unless one is familiar with the implementation). The other issue is that these additional notifications might actually be incorrect, indicating that submissions have or have not occurred in the last 24 hours when the opposite is true (since the
Both commits: Much larger diff in functionality and thus high risk given the time we have to test before release. It does resolve both issues, however.
Am I understanding this correctly? If I am, I think we actually have to merge both commits to resolve all issues prior to release.
@redshiftzero I'd rather stick to the simpler fix.
That being said, I'm deferring to your better judgement :-) |
Totally agree. Well, the silver lining to pushing back the release by a week due to other significant issues (#3316), is that we now have time to carefully QA both commits, and this will fix all outstanding issues (if something doesn't work as we expect, we should have enough time to resolve it). Thanks for your work on this @dachary. |
The fix enforcing a 24h delay was re-pushed |
Thanks @dachary, it seems like the first commit is working as intended: I receive emails upon every reboot, as well as at 4AM local time when my instance is scheduled to reboot. I've been testing the 2nd commit of this PR (856b180), but I still receive more than 1 notification per 24h. I suspect this is due to permissions on the
It also seems like |
Added the "make journalist notifications resilient to double ossec alerts" commit to mitigate #3368 since we're unable to reproduce it.
@emkll good catch! I moved it to the logs directory because it's already writable by the ossec user. |
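For readers following the thread, the "no more than one journalist notification per 24h" rule can be pictured as a timestamp-file check. This is only a sketch under stated assumptions, not the code from this PR: the function name `should_send`, the stamp-file path, and the use of the file's modification time are all hypothetical and chosen for illustration.

```python
import os
import time

MIN_INTERVAL = 24 * 3600  # seconds; the 24h limit discussed in this PR


def should_send(stamp_path, now=None, min_interval=MIN_INTERVAL):
    """Return True and refresh the stamp if a notification may be sent.

    stamp_path is a hypothetical file in a directory writable by the
    user running the check; its mtime records the last notification.
    """
    now = time.time() if now is None else now
    try:
        last = os.path.getmtime(stamp_path)
    except FileNotFoundError:
        last = None  # no notification has been recorded yet
    if last is not None and now - last < min_interval:
        return False  # suppressed: the previous one is too recent
    # Create the stamp if needed, then record the send time as its mtime.
    with open(stamp_path, "a"):
        pass
    os.utime(stamp_path, (now, now))
    return True
```

The check only works if the stamp file lives somewhere the calling user can write, which is the permissions concern raised above.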
docs/install.rst
Outdated
@@ -75,6 +75,10 @@ worth checking the *Journalist Interface*. For this you will need:
 the GPG private key, it is not possible to specify multiple
 GPG keys.

+.. note:: The notification is sent after the daily reboot of the
+*Application Server*. If it is manually rebooted, additional
+notifications will be sent.
Nit: notification -> journalist notification (for clarity, since this section of the documentation talks about OSSEC alerts nearby and people can get confused)
Also: We should remove "If it is manually rebooted, additional notifications will be sent." since it is resolved in 06b18b2
@redshiftzero thanks for catching the documentation inconsistency. Fixed and repushed :-) |
The app server is rebooted every 24h and will send a notification at boot time. The ossec server is also rebooted and will immediately send the email to the journalist, regardless of when the previous mail was sent (mail frequency is not a feature of ossec-maild). Always running the localfile command at boot time is an undocumented OSSEC behavior (ossec/ossec-hids#1415) in 2.8.2 as well as 2.9.3.
This guarantees exactly one mail will be sent daily.
Setting the 25 hours frequency element is a safeguard:
* against the following race: a) the command runs because the 24h period expires, b) the server reboots shortly after because it reboots every 24h, c) the command runs again after the server is rebooted, causing two notifications to be sent in a row
* in case the server does not reboot for some reason, the notification will still be sent every 25h
Fixes: #3367
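The race described in this commit message can be illustrated with a small simulation. It is a simplified model, not OSSEC code: it assumes the command runs once at every boot and then again each time `frequency` seconds elapse without a reboot. The function name `command_runs` and the example reboot times are hypothetical.

```python
def command_runs(reboot_times, frequency, horizon):
    """Return the times (in seconds) at which the command runs.

    Simplified model: the command runs at each boot, then every
    `frequency` seconds until the next boot (or until `horizon`).
    """
    runs = []
    boots = sorted(reboot_times)
    for i, boot in enumerate(boots):
        next_boot = boots[i + 1] if i + 1 < len(boots) else horizon
        t = boot
        while t < next_boot:
            runs.append(t)
            t += frequency
    return runs


# The second reboot happens 60 seconds later than exactly 24h after the first.
reboots = [0, 86460]

# 24h frequency: the timer fires at 86400, then the boot at 86460 fires again.
print(command_runs(reboots, 86400, 90000))

# 25h frequency: the timer never expires before the daily reboot.
print(command_runs(reboots, 90000, 90000))

# No reboot at all: the command still runs every 25h.
print(command_runs([0], 90000, 200000))
```

In this model the 24h setting yields two runs one minute apart, while the 25h setting yields one run per boot and still falls back to a 25h cadence when the server never reboots.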
Under some circumstances daily journalist notifications may be grouped with other ossec alerts. In all cases where this transient error was observed, a well formed journalist notification alert was also included in the payload. By changing the regular expression we make the script resilient to payloads that contain unrelated content. Mitigates #3368
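As a rough illustration of what "resilient to payloads that contain unrelated content" can mean, the sketch below searches the whole payload for a line naming the journalist notification rule instead of requiring the payload to consist of that alert alone. The rule id 400600 comes from the review discussion above; the `Rule: <id> fired` line shape and the function name are assumptions, not taken from the script changed in this PR.

```python
import re

JOURNALIST_RULE_ID = "400600"  # rule id mentioned in the review discussion

# Assumed line shape for an alert header; not copied from the actual script.
RULE_LINE = re.compile(r"^Rule: (\d+) fired", re.MULTILINE)


def contains_journalist_notification(payload):
    """Return True if any alert in the payload is the journalist notification."""
    return any(
        match.group(1) == JOURNALIST_RULE_ID
        for match in RULE_LINE.finditer(payload)
    )
```

Matching on each line rather than on the payload as a whole means a grouped payload still triggers the notification as long as one well-formed alert is present.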
Thanks @dachary for the quick fixes. I tested the latest changes and can confirm this addresses the issues we've been seeing in the 0.7.0 RCs. One behavior that I've noticed is as follows: I'd appreciate if someone else could do a quick sanity-check/review of this PR, but consider it approved from my perspective. |
LGTM. While I have not tested the resolution of the race in VMs, the changes appear well structured. Thanks very much for the verbose commit messages, @dachary—will make future changes manageable.
 shopt -s -o xtrace
 PS4='${BASH_SOURCE[0]}:$LINENO: ${FUNCNAME[0]}: '

-echo BUGOUS | main test_send_encrypted_alarm | \
+echo BUGOUS | handle_notification test_send_encrypted_alarm | \
Nit: should this be "BOGUS", as in a fake submission? (I realize this came in via #2803, just noting for clarity's sake.)
grep --count 'notification suppressed' /var/log/syslog > /tmp/after
test $(cat /tmp/before) -lt $(cat /tmp/after)
""")
Great tests! Thanks for the careful attention here!
Thanks for the quick review @conorsch, merging!
@conorsch @emkll thanks for the careful review :-)
I second your opinion @emkll |
Status
Ready for review
Description of Changes
Mitigates: #3368
The app server is rebooted every 24h and will send a notification at
boot time. The ossec server is also rebooted and will immediately send
the email to the journalist, regardless of when the previous mail was
sent (mail frequency is not a feature of ossec-maild). Always running
the localfile command at boot time is an undocumented OSSEC behavior
(ossec/ossec-hids#1415) in 2.8.2 as well as 2.9.3.
This guarantees exactly one mail will be sent daily.
Setting the 25 hours frequency element is a safeguard:
* against the following race: a) the command runs because the 24h period
  expires, b) the server reboots shortly after because it reboots
  every 24h, c) the command runs again after the server is rebooted,
  causing two notifications to be sent in a row
* in case the server does not reboot for some reason, the notification
  will still be sent every 25h
Fixes: #3367
Testing
Deployment
N/A
Checklist
If you made changes to documentation:
make docs-lint passed locally