Aggregation not working when run on past events #1555
-
Elastalert version: 2.17.0 Hello everyone! The aggregation configuration section was the following: ...
aggregation:
hours: 4
aggregation_key: ["fields.host_guid", "winlog.event_data.SubjectUserName"]
aggregate_by_match_time: True
... As soon as I restarted the service, which led to the rule reload, it started sending a lot of alerts (~2000 emails in a few hours). Analyzing the alerts and the service logs, I noticed that the rule was running on past events without applying aggregation.
I found the following section in the code that could be the cause of the problem (#L1649): def add_aggregated_alert(self, match, rule):
""" Save a match as a pending aggregate alert to Elasticsearch. """
...
# This is a fallback option in case this change to using ts_now() interferes with the behavior current
# users are accustomed to. It is not documented because it likely won't be needed. If no one reports
# a problem we can remove this fallback option in a future release.
if rule.get('aggregation_alert_time_compared_with_timestamp_field', False):
compare_dt = lookup_es_key(match, rule['timestamp_field'])
else:
compare_dt = ts_now()
if (not rule['current_aggregate_id'].get(aggregation_key_value) or
('aggregate_alert_time' in rule and aggregation_key_value in rule['aggregate_alert_time'] and rule[
'aggregate_alert_time'].get(aggregation_key_value) < ts_to_dt(compare_dt))):
# ElastAlert may have restarted while pending alerts exist
pending_alert = self.find_pending_aggregate_alert(rule, aggregation_key_value)
... From what I understand, the code is always creating a new aggregation for each event, since the condition As mentioned in the comment, the problem might be resolved using the I'm not sure whether aggregation is intended to work in such cases, since an alert waiting for aggregation would be sent anyway by the next execution of the method Finally, a little off-topic about the aggregation code. In order to identify the problem, I read all the code that handles aggregations and found something that I think might be a bug.
If you agree on the points I raised, I can open a PR to fix the code. Thanks for your attention and for the great work you are doing with Elastalert! |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 1 reply
-
Thanks for reviewing the code in depth. I personally don't use aggregation so I don't run into these problems. So it's certainly possible what you described is correct.
I am reading this to say "please do not remove the For off-topic bug #1, I don't think this is a bug since the loop "breaks" upon the execution of the For off-topic bug #2, I think I agree with you. I'd have to leave it to you to verify that the functionality works as expected after changing this to add the key to the existing map (and create it if there is no existing map). |
Beta Was this translation helpful? Give feedback.
Thanks for reviewing the code in depth. I personally don't use aggregation so I don't run into these problems. So it's certainly possible what you described is correct.
I am reading this to say "please do not remove the
aggregation_alert_time_compared_with_timestamp_field
config option from future releases." Is that what you meant? If so, that's fine, and a PR that describes this option in the docs would be welcome.For off-topic bug #1, I don't think this is a bug since the loop "breaks" upon the executi…