AWS logs. Exit fast when 3 consecutive responses are returned from AWS Cloudwatch logs #30756
# The issue with this approach is, it can take a huge amount of time (e.g. 20 seconds) to retrieve logs
# As an intermediate solution, we decided to stop fetching logs if 3 consecutive responses are empty
If that happens, what is the user expected to do, and how would the user know that this is the issue?
They would not be able to know. Before #20814, we stopped fetching logs as soon as one response from CloudWatch was empty. Since #20814, we wait until the next token is the same in 2 consecutive responses. This change aims for the middle, to satisfy a bit of both worlds. The fix is very empirical and based on user feedback: users get a frustrating experience where they sometimes need to wait 20 seconds to retrieve logs.
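The stopping rule described above can be sketched roughly as follows. This is an illustration, not the hook's actual code: `fetch_page` is a hypothetical stand-in for one `get_log_events` call, assumed to return a dict with `events` and `nextForwardToken`, and the constant name is an assumption.

```python
from typing import Callable, Iterator, Optional

# Assumed name; the PR stops after 3 consecutive empty responses.
MAX_CONSECUTIVE_EMPTY_RESPONSES = 3


def iter_log_events(
    fetch_page: Callable[[Optional[str]], dict],
    max_empty: int = MAX_CONSECUTIVE_EMPTY_RESPONSES,
) -> Iterator[dict]:
    """Yield events page by page, giving up after `max_empty` empty pages in a row."""
    token: Optional[str] = None
    empty_in_a_row = 0
    while True:
        response = fetch_page(token)
        events = response.get("events", [])
        yield from events

        # A non-empty page resets the counter; an empty one increments it.
        if events:
            empty_in_a_row = 0
        else:
            empty_in_a_row += 1

        next_token = response.get("nextForwardToken")
        # Stop when the token repeats (the rule from #20814)
        # or when too many empty pages arrived in a row (the rule from this PR).
        if next_token == token or empty_in_a_row >= max_empty:
            return
        token = next_token
```

With this shape, a log stream that keeps returning fresh tokens but no events costs at most three extra requests instead of an unbounded wait, at the price of possibly missing events that would have appeared later.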
else:
    token_arg = {}

- response = self.get_conn().get_log_events(
+ response = self.conn.get_log_events(
Thanks for including this update ❤️
I guess we could wrap it in a flag to let the user decide whether to wait or quit early, defaulting to quit early, but this seems like a decent compromise to me.
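For illustration, the flag idea might look something like the sketch below. Everything here is hypothetical: neither `should_stop` nor a `quit_early` option exists in the hook. It only shows how a switch between the two stopping rules could default to quitting early.

```python
from typing import Optional


def should_stop(
    token_sent: Optional[str],
    token_received: Optional[str],
    empty_in_a_row: int,
    quit_early: bool = True,
    max_empty: int = 3,
) -> bool:
    """Decide whether to stop paginating after a response (hypothetical helper)."""
    # A repeated token is treated as the end of the stream in both modes.
    if token_received == token_sent:
        return True
    # Only the quit-early mode gives up on a run of empty responses.
    return quit_early and empty_in_a_row >= max_empty
```

Passing `quit_early=False` would restore the wait-for-repeated-token behaviour for users who prefer completeness over latency.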
An issue was introduced by #20814: it now takes customers 20 seconds or more to load task logs from past executions. As a solution, we decided to exit when 3 consecutive empty responses are returned from AWS CloudWatch Logs.