Replace usages of task context logger with the log table #40867

dstandish
Contributor

Shipping messages to task instance (TI) logs, which in many cases means uploading to S3, is inefficient and not something the scheduler should be doing.

Instead we can use an old but underutilized feature: the Log table.

I have to do some trickery to save log events from executors, namely storing them in a queue and consuming them in the scheduler loop.
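
To illustrate the queue-and-consume idea, here is a minimal sketch and not the code in this PR. It assumes the executor buffers events in an in-memory `deque` and the scheduler drains that buffer once per loop iteration. The names `LogEvent`, `SketchExecutor`, `record_event`, and `drain_events` are hypothetical, and a plain list stands in for the Log table.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass


@dataclass
class LogEvent:
    """Hypothetical stand-in for one row in the Log table."""

    event: str
    extra: str
    task_id: str


class SketchExecutor:
    """Hypothetical executor that buffers events instead of writing TI logs."""

    def __init__(self) -> None:
        self.pending_events: deque[LogEvent] = deque()

    def record_event(self, event: str, extra: str, task_id: str) -> None:
        # Cheap in-memory append; nothing is uploaded or written here.
        self.pending_events.append(LogEvent(event, extra, task_id))


def drain_events(executor: SketchExecutor, log_table: list[LogEvent]) -> int:
    """Move every buffered event into the table; meant to run once per scheduler loop."""
    drained = 0
    while executor.pending_events:
        # popleft() preserves the order in which the executor recorded events.
        log_table.append(executor.pending_events.popleft())
        drained += 1
    return drained


# Usage: the executor records an event, then the scheduler loop persists it.
executor = SketchExecutor()
executor.record_event("stuck in queued", "requeued after timeout", "example_task")
log_table: list[LogEvent] = []  # assumption: a list standing in for a database session
drain_events(executor, log_table)
```

The point of the split is that the executor only pays for an append, while the single scheduler loop decides when the rows are written.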

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:Executors-core LocalExecutor & SequentialExecutor area:providers area:Scheduler including HA (high availability) scheduler area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues provider:amazon-aws AWS/Amazon - related issues provider:celery labels Jul 18, 2024
@dstandish dstandish requested a review from vincbeck July 18, 2024 14:04
Contributor

@vincbeck vincbeck left a comment

Nice!

@vincbeck
Contributor

Some tests need to be updated though

@dstandish dstandish force-pushed the replace-usages-of-task-context-logger-with-the-log-table branch from 167cf37 to 9225fbf Compare July 18, 2024 21:32
@dstandish dstandish added this to the Airflow 2.10.0 milestone Jul 18, 2024
@dstandish dstandish force-pushed the replace-usages-of-task-context-logger-with-the-log-table branch from a069dac to 764930e Compare July 18, 2024 22:43
@dstandish dstandish force-pushed the replace-usages-of-task-context-logger-with-the-log-table branch 2 times, most recently from c098177 to 26d5331 Compare July 18, 2024 23:45
@potiuk potiuk closed this Jul 19, 2024
@potiuk potiuk reopened this Jul 19, 2024
@dstandish dstandish merged commit f684a58 into apache:main Jul 19, 2024
53 checks passed
@dstandish dstandish deleted the replace-usages-of-task-context-logger-with-the-log-table branch July 19, 2024 19:41
@ephraimbuddy ephraimbuddy added the type:improvement Changelog: Improvements label Jul 22, 2024
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Jul 26, 2024
Use Log table instead of task context logger

The task context logger is inefficient; Log is better for this reason 

---------

Co-authored-by: Vincent <[email protected]>