Making scheduled tasks run on the start_date #384

Closed
kimtore opened this issue Sep 11, 2015 · 2 comments

@kimtore

kimtore commented Sep 11, 2015

It took me a while to figure out why my DAGs weren't run on schedule, but then I found this in the docs:

Note that if you run a DAG on a schedule_interval of one day, the run stamped 2016-01-01 will be triggered soon after 2016-01-01T23:59. In other words, the job instance is started once the period it covers has ended.
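To make the quoted rule concrete, here is a minimal sketch using only the standard library. The helper name `earliest_trigger` is hypothetical and not part of Airflow; it only restates the rule that a run starts once the period it covers has ended.

```python
from datetime import datetime, timedelta


def earliest_trigger(execution_date, schedule_interval):
    """Earliest moment a run stamped `execution_date` can start.

    Hypothetical helper: it restates the documented convention that a run
    starts once the period it covers has ended. It is not Airflow code.
    """
    return execution_date + schedule_interval


run_stamp = datetime(2016, 1, 1)
daily = timedelta(days=1)

# The run stamped 2016-01-01 covers that whole day, so it starts the next day.
print(earliest_trigger(run_stamp, daily))  # 2016-01-02 00:00:00
```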

I find this counter-intuitive, and in my opinion it would be better to have the task run exactly on the start_date, and not delayed by the schedule_interval time delta.

Is it possible to configure Airflow so that this functionality is achieved?

@cgelman

cgelman commented Sep 14, 2015

+1 this. If you look in models.py: is_queueable():
https://github.com/airbnb/airflow/blob/master/airflow/models.py#L642
There's this line:

    if self.execution_date > datetime.now() - self.task.schedule_interval:
        return False

This makes sense for hourly or daily jobs - you need to wait for the day to finish before starting the run. But if you want a weekly job that runs at the end of the week, stamping it with the end of the week means it will wait until 7 days after the end of the week.

I guess you can get around this by making your schedule based on a start date of the first day of the week...
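A rough sketch of how that check plays out for a weekly interval, and why stamping the run with the first day of the week works around it. The function `is_blocked` is a hypothetical stand-in that copies only the quoted comparison, with `now` passed in as a parameter so the example is deterministic. The dates are illustrative assumptions.

```python
from datetime import datetime, timedelta


def is_blocked(execution_date, schedule_interval, now):
    # Hypothetical stand-in for the quoted condition in is_queueable():
    # a run is held back while its period has not fully elapsed.
    return execution_date > now - schedule_interval


weekly = timedelta(weeks=1)
week_start = datetime(2015, 9, 7)   # a Monday (assumed example date)
week_end = datetime(2015, 9, 14)    # the following Monday

# Stamped with the END of the week: still blocked when the week ends,
# and only released 7 days later.
print(is_blocked(week_end, weekly, now=week_end))               # True
print(is_blocked(week_end, weekly, now=week_end + weekly))      # False

# Stamped with the START of the week: released as soon as the week ends.
print(is_blocked(week_start, weekly, now=week_end))             # False
```

Under this convention the stamp marks the beginning of the period the run covers, so a weekly schedule anchored on the first day of the week fires when that week is over.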

@mistercrunch
Member

Yes, that's the convention. Maybe it needs to be more prominent in the docs.
