Report Celery Job is always scheduled 12 hours ahead #27952

Open
3 tasks done
nathan-gilbert opened this issue Apr 9, 2024 · 5 comments
Labels
alert-reports Namespace | Anything related to the Alert & Reports feature

Comments

@nathan-gilbert

Bug description

The problem is as stated in the title. No matter which timezone I select, the report job gets scheduled 12 hours ahead.

In the Superset UI, the timezone displayed on the created report job is always different from the one I selected (typically GMT -6):
image

In Flower, I can see the jobs and their ETA looks correct, but the job is not executed at that time. It runs exactly 12 hours later:
image

I'm running Superset/Celery on Ubuntu 22.04 with Python 3.11, directly on the host rather than in Docker.

Here's my celery config:

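from celery.schedules import crontab

# REDIS_HOST and REDIS_PORT are assumed to be defined earlier in the config file.
class CeleryConfig: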
    """
    Celery worker configuration
    """

    broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/0"
    imports = (
        "superset.sql_lab",
        "superset.tasks.scheduler",
    )
    result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/0"
    worker_prefetch_multiplier = 10
    task_protocol = 2
    task_acks_late = True
    task_annotations = {
        "sql_lab.get_sql_results": {
            "rate_limit": "100/s",
        },
        "tasks.add": {
            "rate_limit": "10/s",
        },
        "email_reports.send": {
            "rate_limit": "1/s",
            "time_limit": 120,
            "soft_time_limit": 150,
            "ignore_result": True,
        },
    }
    beat_schedule = {
        "reports.scheduler": {
            "task": "reports.scheduler",
            "schedule": crontab(minute="*", hour="*"),
        },
        "reports.prune_log": {
            "task": "reports.prune_log",
            "schedule": crontab(minute="0", hour="0"),
        },
    }
    timezone = "America/Denver"
    enable_utc = False

How to reproduce the bug

Turn on Alerts & Reports.

Add a new report, select GMT -6 as the timezone, and create the job. The worker logs show that the job won't execute for another 12 hours.

Screenshots/recordings

Celery worker logs:
image

Superset version

3.1.2

Python version

3.11

Node version

Not applicable

Browser

Chrome

Additional context

No response

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
@nathan-gilbert nathan-gilbert changed the title Report is always scheduled 12 hours ahead Report Celery Job is always scheduled 12 hours ahead Apr 9, 2024
@tridoxx

tridoxx commented May 6, 2024

Hi @nathan-gilbert, did you find a solution to this problem? I have the same issue. Thanks for your comment.

@nathan-gilbert
Author

@tridoxx No, I have found no solution yet.

dosubot bot commented May 6, 2024

It looks like the issue you're experiencing with Celery jobs in Superset being scheduled 12 hours ahead could be due to a timezone misalignment, particularly in how timezones are handled and converted within Superset's scheduling logic. This misalignment might stem from the timezone setting in the Superset UI not being correctly recognized or applied when calculating the next execution times for jobs.

To address this, ensure that the timezone you set in the Superset UI is correctly formatted and recognized by the pytz library, as any incorrect or unrecognized timezone will default to UTC. Additionally, you might need to review and possibly customize the scheduling logic in Superset, especially if the default behavior doesn't align with your expectations. This could involve adjusting how datetime values are localized to the specified timezone and ensuring accurate conversion back to UTC if necessary.
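As a rough illustration of the localize-then-convert step described above, the sketch below uses only the Python standard library rather than pytz, and a fixed GMT -6 offset as a stand-in for the timezone selected in the UI. The helper name `to_utc` and the example date are assumptions made for illustration; this is not Superset's scheduling code.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset standing in for the "GMT -6" choice in the UI (assumption:
# a real deployment would use a named zone with DST rules).
GMT_MINUS_6 = timezone(timedelta(hours=-6))


def to_utc(naive_local: datetime, tz: timezone) -> datetime:
    """Attach tz to a naive wall-clock time, then convert it to UTC."""
    if naive_local.tzinfo is not None:
        raise ValueError("expected a naive datetime")
    return naive_local.replace(tzinfo=tz).astimezone(timezone.utc)


# A report meant to run at 04:00 local time on a hypothetical date.
scheduled_utc = to_utc(datetime(2024, 4, 9, 4, 0), GMT_MINUS_6)
print(scheduled_utc.isoformat())  # 2024-04-09T10:00:00+00:00
```

If the UTC value computed this way does not match the ETA shown in Flower, the mismatch is likely introduced somewhere between the UI and the worker.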

There's also a related issue and suggested solution that involves replacing datetime.utcnow() with datetime.now(timezone.utc) in the scheduler code, which has been discussed in the Superset GitHub repository. This change aims to correct the timezone handling and might be a good starting point for resolving the scheduling discrepancies you're facing.
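The difference between the two calls mentioned above can be seen with the standard library alone. This is a minimal sketch, not the Superset patch: `datetime.utcnow()` returns a naive value (no tzinfo), while `datetime.now(timezone.utc)` returns an aware one. The GMT -6 offset below is only an example of how a naive value can be misread if later code assumes it is local time.

```python
import warnings
from datetime import datetime, timedelta, timezone

with warnings.catch_warnings():
    # utcnow() is deprecated since Python 3.12; silence the warning for the demo.
    warnings.simplefilter("ignore", DeprecationWarning)
    naive_now = datetime.utcnow()

aware_now = datetime.now(timezone.utc)

print(naive_now.tzinfo)  # None: nothing records that this value is UTC
print(aware_now.tzinfo)  # UTC

# If the naive value is later treated as GMT -6 wall-clock time, it lands
# six hours away from the instant it was meant to represent.
misread = naive_now.replace(tzinfo=timezone(timedelta(hours=-6)))
intended = naive_now.replace(tzinfo=timezone.utc)
shift = misread - intended
print(shift)  # 6:00:00
```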

If after these steps you're still encountering issues, expanding logging to include key variables like triggered_at, start_at, stop_at, and the final scheduled times can help identify where the mismatches occur, providing further insight into the problem and aiding in troubleshooting.
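A minimal sketch of the extra logging suggested above. The function name `log_schedule_offset`, the logger name, and the example timestamps are hypothetical and do not correspond to Superset's actual scheduler code; the idea is only to log the trigger time, the computed ETA, and the gap between them so that a constant 12-hour offset stands out.

```python
import logging
from datetime import datetime, timedelta, timezone

logger = logging.getLogger("report_schedule_debug")  # hypothetical logger name


def log_schedule_offset(triggered_at: datetime, eta: datetime) -> timedelta:
    """Log both timestamps and return how far the ETA is from the trigger."""
    if triggered_at.tzinfo is None or eta.tzinfo is None:
        raise ValueError("pass timezone-aware datetimes so the comparison is unambiguous")
    gap = eta.astimezone(timezone.utc) - triggered_at.astimezone(timezone.utc)
    logger.info(
        "triggered_at=%s eta=%s gap=%s",
        triggered_at.isoformat(),
        eta.isoformat(),
        gap,
    )
    return gap


# Example values (made up): an ETA 12 hours after the trigger.
trigger = datetime(2024, 4, 9, 4, 0, tzinfo=timezone.utc)
gap = log_schedule_offset(trigger, trigger + timedelta(hours=12))
print(gap)  # 12:00:00
```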

@tridoxx

tridoxx commented May 7, 2024

Hi @nathan-gilbert, I made the following correction. First, you need to modify the file "superset/tasks/scheduler.py". In my case I am using Docker, and for some strange reason I had to modify this file inside the "superset_worker" container, not the "superset_app" container. I edited it as follows.

Add:

    import pytz
    utc_now = datetime.utcnow()

and change the line

    # async_options = {"eta": schedule}

to

    async_options = {"eta": utc_now}

Something like this:

image

Then reboot the machine. This works for me: Superset now uses the actual UTC time, and the offset for the timezone defined on the alert in the Superset app is applied correctly.

It is working for me now without problems. To check that it is correct, set the alert's crontab schedule in Superset to run every minute, and if you are using the Docker "superset_worker" container, check the logs with the following command: docker logs superset_worker --since 1h

@nicholaslimck

nicholaslimck commented Dec 2, 2024

Can verify that I have the same issue; tested with a task scheduled to run at 4AM and the task ran at 4PM instead despite configuring all time zones correctly (GMT+08) on Superset 4.1.1.

Seems like the same issue was raised in #29797 and fixed by #29798.

@rusackas rusackas added the alert-reports Namespace | Anything related to the Alert & Reports feature label Dec 5, 2024

4 participants