ECSOperator realtime logging #17626
try:
    yield from self.hook.get_log_events(self.log_group, self.log_stream_name, skip=skip)
except ClientError as error:
    if error.response['Error']['Code'] != 'ResourceNotFoundException':
ResourceNotFoundException is usually the flag when ECS fails to provision tasks due to its own reasons (#15000 (comment)). If it's getting ignored here, how does Airflow react to such fail-to-start ECS tasks? 🤔
@zachliu that's a good question. Now I realise that this case is not handled at all. And it looks like the handling of fail-to-start tasks implicitly relied on the availability of the log stream.
@zachliu I can only guess what describe_tasks returns in the edge case (#15000), but I'd guess that if the ARN is present and the task has not started, then there should be either an empty list of tasks in the response, which is handled and the proper error is thrown, or a non-empty list of tasks whose status is handled by the current logic. The missing part, handling an empty list of tasks, can be added, and I think that's clearer than relying on CloudWatch logs.
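A minimal sketch of what that empty-list check could look like, written against a plain response dict rather than the operator itself. The `tasks`, `failures` and `lastStatus` keys are fields of the ECS describe_tasks response; the function name and the exception class are hypothetical and not part of this PR.

```python
from typing import Any, Dict


class EcsTaskNotFoundError(Exception):
    """Hypothetical exception, used only in this sketch."""


def check_describe_tasks_response(response: Dict[str, Any]) -> str:
    """Return the first task's lastStatus, or raise if no task was returned."""
    tasks = response.get("tasks", [])
    if not tasks:
        # Tasks that could not be described are reported under "failures".
        failures = response.get("failures", [])
        raise EcsTaskNotFoundError(
            f"describe_tasks returned no tasks; failures: {failures}"
        )
    # A non-empty list falls through to whatever status handling already exists.
    return tasks[0]["lastStatus"]
```

With something like this, a fail-to-start task would surface as an explicit error from the describe_tasks result instead of depending on whether a log stream exists.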
yeah, describe_tasks polling was also recommended by AWS support engineers. But it may get a bit tricky when a task is supposed to be short-lived.
Alright, I can prepare a small PR with a check for the existence of the task's log group, just to restore the previous behaviour with a better error message: if the task is stopped, logging is enabled and there is no log group, then we throw the error. But then we need to exclude the provider from the release @potiuk.
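Roughly, the condition described above could be sketched like this. Everything here is hypothetical: the function name, the parameters, and the `log_group_exists` callable, which stands in for whatever CloudWatch lookup the real check would use.

```python
from typing import Callable


def ensure_logs_available(
    last_status: str,
    logging_enabled: bool,
    log_group_exists: Callable[[], bool],
) -> None:
    """Raise if a stopped task with logging enabled has no log group."""
    # The lookup is only called when the first two conditions hold,
    # so no CloudWatch request is made for running tasks.
    if last_status == "STOPPED" and logging_enabled and not log_group_exists():
        raise RuntimeError(
            "ECS task stopped but its log group does not exist; "
            "the task most likely failed to start"
        )
```

Passing the lookup as a callable keeps the sketch independent of any particular hook method.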
thank you sire! this thing has been on my radar for a while but I never got time to actually do it 😛
@codenamestif I'm testing out this feature in our integration env right now. IT IS AWESOME!
A new provider release is coming for amazon - waiting for 2 more PRs :)
@potiuk out of curiosity, which 2 PRs?
Closes #16753.
The idea was to start a thread that fetches CloudWatch logs at a fixed fetch interval once the ECS task has started.
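As an illustration of that idea only, and not the code in this PR, a background fetcher thread could look something like the sketch below. The class name, the `fetch_new_events` and `emit` callables, and the `fetch_interval` parameter are assumptions made for the example; `fetch_new_events(skip)` mirrors the `skip=skip` argument seen in the diff and is expected to return the events after the first `skip` ones.

```python
import threading
from typing import Callable, Iterable


class LogFetcherThread(threading.Thread):
    """Polls a log source at a fixed interval until stop() is called."""

    def __init__(
        self,
        fetch_new_events: Callable[[int], Iterable[str]],
        emit: Callable[[str], None],
        fetch_interval: float,
    ) -> None:
        super().__init__(daemon=True)
        self._fetch_new_events = fetch_new_events
        self._emit = emit
        self._fetch_interval = fetch_interval
        self._stop_event = threading.Event()
        self._seen = 0  # number of events already emitted

    def _drain(self) -> None:
        # Ask only for events that have not been emitted yet.
        for message in self._fetch_new_events(self._seen):
            self._emit(message)
            self._seen += 1

    def run(self) -> None:
        # Event.wait returns False on timeout, True once stop() was called.
        while not self._stop_event.wait(self._fetch_interval):
            self._drain()
        # One last fetch so events written just before stop are not lost.
        self._drain()

    def stop(self) -> None:
        self._stop_event.set()
```

Usage would be along the lines of starting the thread once the task is running, then calling `stop()` followed by `join()` when the task finishes.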