ECS Operator: Realtime container execution logs in Airflow task execution log #22512

Closed
1 of 2 tasks
virendhar-aws opened this issue Mar 24, 2022 · 7 comments

@virendhar-aws

Description

Objective:

Add functionality to the Airflow ECSOperator to stream container execution logs from AWS CloudWatch into the ECSOperator's task execution log as they are produced.

Use case/motivation

Present Challenge:
When executing Fargate tasks via ECSOperator, the container logs are not available in the Airflow task log during execution. They appear only after execution finishes, when the final status of the container execution is reflected in the ECSOperator task log. As a result, one has to look up the container's log path in the Airflow task log and then open that log in AWS CloudWatch to follow it in real time.

The current behavior introduces the following challenges:

  • The ongoing execution logs are not captured in the Airflow task (ECSOperator) log. Following them typically involves a couple of additional steps, which could be avoided.
  • For long-running tasks, one has to wait until the task has finished before debugging any issues.

Solution

A mechanism to relay the logs from AWS CloudWatch to the Airflow logs during execution, in real time or near real time (under 2 minutes).

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@virendhar-aws virendhar-aws added the kind:feature Feature Requests label Mar 24, 2022
@boring-cyborg

boring-cyborg bot commented Mar 24, 2022

Thanks for opening your first issue here! Be sure to follow the issue template!

@potiuk
Member

potiuk commented Mar 27, 2022

Hopefully someone will take it - but if you want to make it happen, the fastest and most certain way is to become a contributor and submit it as a PR (which I heartily encourage you to do). That should involve streaming the logs from Fargate (in a similar way to how we do it for the K8S Pod Operator).

@potiuk potiuk added the provider:amazon-aws AWS/Amazon - related issues label Mar 27, 2022
@hankehly
Contributor

hankehly commented Mar 29, 2022

#17626 seems to address this issue.

def run(self) -> None:
    logs_to_skip = 0
    while not self.is_stopped():
        time.sleep(self.fetch_interval.total_seconds())
        log_events = self._get_log_events(logs_to_skip)
        for log_event in log_events:
            self.logger.info(self._event_to_str(log_event))
            logs_to_skip += 1
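
To see how the loop above behaves without AWS access, here is a minimal self-contained sketch of the same polling pattern. It is not the provider's code: `LogTailer`, `fetch_events`, `emit`, and `demo` are hypothetical names, and the fetch callable stands in for whatever actually reads CloudWatch. The sketch also assumes it is acceptable to wait on a `threading.Event` instead of calling `time.sleep`, so that stopping the thread does not have to wait out a full interval.

```python
import threading
import time
from datetime import timedelta
from typing import Callable, List


class LogTailer(threading.Thread):
    """Hypothetical stand-in for a log fetcher thread (not the provider's class)."""

    def __init__(
        self,
        fetch_events: Callable[[int], List[str]],  # assumed: returns events after skipping N
        emit: Callable[[str], None],               # assumed: e.g. a logger's info method
        fetch_interval: timedelta,
    ) -> None:
        super().__init__(daemon=True)
        self._fetch_events = fetch_events
        self._emit = emit
        self.fetch_interval = fetch_interval
        self._stop_event = threading.Event()

    def stop(self) -> None:
        self._stop_event.set()

    def is_stopped(self) -> bool:
        return self._stop_event.is_set()

    def run(self) -> None:
        logs_to_skip = 0
        while not self.is_stopped():
            # Sleeps for the interval, but returns early once stop() is called.
            self._stop_event.wait(self.fetch_interval.total_seconds())
            for message in self._fetch_events(logs_to_skip):
                self._emit(message)
                # Count every relayed event so it is not fetched again.
                logs_to_skip += 1


def demo() -> List[str]:
    source = ["line 1", "line 2", "line 3"]  # pretend remote log stream
    seen: List[str] = []
    tailer = LogTailer(lambda skip: source[skip:], seen.append, timedelta(milliseconds=10))
    tailer.start()
    time.sleep(0.2)  # let a few polling rounds happen
    tailer.stop()
    tailer.join()
    return seen
```

Because `logs_to_skip` only grows as events are emitted, each polling round relays only events that have not been seen yet, which is what keeps the task log free of duplicates.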

@hankehly
Contributor

@virendhar-aws Could you check that you're using apache-airflow-providers-amazon 2.3.0 or above?
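
One way to check this from inside the Airflow environment is sketched below using only the standard library. The helper names `parse_release` and `provider_at_least` are made up for illustration, and the version parsing is deliberately naive (it compares only the leading numeric components), so treat it as a rough check rather than a full version comparison.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse_release(raw: str) -> Tuple[int, ...]:
    """Naive parse: keep the leading digits of each dot-separated piece."""
    parts = []
    for piece in raw.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def provider_at_least(
    minimum: str = "2.3.0",
    package: str = "apache-airflow-providers-amazon",
) -> Optional[bool]:
    """Return True/False for the comparison, or None if the package is not installed."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return None
    return parse_release(installed) >= parse_release(minimum)


if __name__ == "__main__":
    print(provider_at_least())
```

Running `pip show apache-airflow-providers-amazon` in the same environment gives the same information without any code.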

@virendhar-aws
Author

@hankehly - I noticed this on an earlier version, Airflow 2.0.0. Let me verify it on a recent version later this week.

@shubham22

@virendhar-aws - could you confirm whether you were able to verify this? If yes, we should close this ticket.

@potiuk
Member

potiuk commented Sep 19, 2022

Closing. We can always reopen if we find it's not fixed.
