Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reducing the unused calls to elasticsearch_client #303

Merged
merged 2 commits into from
Jun 28, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 9 additions & 1 deletion elastalert/elastalert.py
Original file line number Diff line number Diff line change
Expand Up @@ -859,6 +859,14 @@ def enhance_filter(self, rule):
filters.append({'query': query_str_filter})
elastalert_logger.debug("Enhanced filter with {} terms: {}".format(listname, str(query_str_filter)))

def get_elasticsearch_client(self, rule):
key = rule['name']
es_client = self.es_clients.get(key)
if es_client is None:
es_client = elasticsearch_client(rule)
self.es_clients[key] = es_client
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be condensed into a single line,

self.es_clients[key] = es_client = elasticsearch_client(rule)

Not sure what your stylistic preference is.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer readability over condensed code, so what you have looks fine.

return es_client

def run_rule(self, rule, endtime, starttime=None):
""" Run a rule for a given time period, including querying and alerting on results.

Expand All @@ -868,7 +876,7 @@ def run_rule(self, rule, endtime, starttime=None):
:return: The number of matches that the rule produced.
"""
run_start = time.time()
self.thread_data.current_es = self.es_clients.setdefault(rule['name'], elasticsearch_client(rule))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setdefault results in the correct behaviour, but because it takes a concrete default value, elasticsearch_client is getting called every run and the result is disregarded. From what I see dictionary does not have a lazy option.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I see where this will help improve efficiency. I think a unit test that proves the same elasticsearch_client instance is returned from the new getter function, when called twice with the same rule, would be good unit coverage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is especially bad for us running in AWS because inside Auth we call get_credentials which is an HTTP call to the Ec2 metdata service:

refreshable_credential=session.get_credentials(),

self.thread_data.current_es = self.get_elasticsearch_client(rule)

# If there are pending aggregate matches, try processing them
for x in range(len(rule['agg_matches'])):
Expand Down
17 changes: 17 additions & 0 deletions tests/base_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -1426,3 +1426,20 @@ def test_add_aggregated_alert_error(ea, caplog):
exceptd = "[add_aggregated_alert]"
exceptd += "Error parsing aggregate send time format unsupported operand type(s) for +: 'datetime.datetime' and 'dict'"
assert exceptd in message


def test_get_elasticsearch_client_same_rule(ea):
x = ea.get_elasticsearch_client(ea.rules[0])
y = ea.get_elasticsearch_client(ea.rules[0])
assert x is y, "Should return same client for the same rule"


def test_get_elasticsearch_client_different_rule(ea):
x_rule = ea.rules[0]
x = ea.get_elasticsearch_client(x_rule)

y_rule = copy.copy(x_rule)
y_rule['name'] = 'different_rule'
y = ea.get_elasticsearch_client(y_rule)

assert x is not y, 'Should return unique client for each rule'