
Allow failed ingestion tasks to retry where they left off #1653

Closed · 1 task
AetherUnbound opened this issue Feb 18, 2022 · 0 comments · Fixed by WordPress/openverse-catalog#650

Assignees: stacimc
Labels: 💻 aspect: code (Concerns the software code in the repository) · ✨ goal: improvement (Improvement to an existing user-facing feature) · 🟨 priority: medium (Not blocking but should be addressed soon)

Comments

AetherUnbound (Collaborator) commented Feb 18, 2022

Problem

See WordPress/openverse-catalog#357

If an ingestion task fails, a retry can be attempted. The retry has no context from the previous run, so it begins reprocessing data from the API at the very beginning rather than picking up where the failed run left off.

Description

We should investigate the feasibility of allowing a task retry to pick up from the last set of query parameters.

We'll need to be careful how we implement this, as the ramifications could cause problems for us. If we retry within the same DAG run, different task attempts will actually have different outcomes. This goes against the Airflow paradigm, particularly since Airflow displays the most recent attempt's log first, and that log may not contain the full run information (e.g. if earlier attempts captured some amount of data before failing). Additionally, the downstream loader steps would also require multiple attempts, each of which may result in a separate successful data load.

I think one alternative to this may be:

  1. Scheduled runs will ignore any preserved parameter information and start from the beginning
  2. Manual DAG runs will attempt to use previous parameter information

This would let us manually start another run after a failure, which would pick up the query parameters and continue from where the previous run left off. It would also encapsulate the work in two (or more) distinct DAG runs, so we don't have to expect an Airflow user to dig through multiple task retry logs to get the full picture.
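
A minimal sketch of how that branching might look, assuming the ingester checkpoints its query parameters to an Airflow Variable. The variable naming scheme, `INITIAL_PARAMS`, and the `fetch_batch`/`process_batch` helpers are hypothetical stand-ins, not the actual catalog code:

```python
import json

from airflow.models import Variable

# Hypothetical starting parameters for a provider API; real ingesters
# would build these per provider.
INITIAL_PARAMS = {"page": 1, "per_page": 100}


def _resume_key(dag_id: str) -> str:
    return f"{dag_id}_resume_params"


def get_starting_params(dag_id: str, run_type: str) -> dict:
    """Manual runs resume from saved params; scheduled runs start fresh."""
    if run_type == "manual":
        saved = Variable.get(_resume_key(dag_id), default_var=None)
        if saved:
            return json.loads(saved)
    return INITIAL_PARAMS


def ingest(**context):
    dag_id = context["dag"].dag_id
    params = get_starting_params(dag_id, context["dag_run"].run_type)
    while params is not None:
        # Checkpoint before each request so a failure preserves the last
        # attempted parameters for a later manual run.
        Variable.set(_resume_key(dag_id), json.dumps(params))
        batch, params = fetch_batch(params)   # hypothetical API call
        process_batch(batch)                  # hypothetical record handling
    # Clear the checkpoint on success so the next manual run starts fresh.
    Variable.delete(_resume_key(dag_id))
```

The scheduled/manual distinction here comes straight from `dag_run.run_type`, so after a failure a maintainer can trigger a manual run that resumes, while the next scheduled run starts from the beginning as usual.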

Alternatives

Additional context

Implementation

  • 🙋 I would be interested in implementing this feature.
AetherUnbound added the ✨ goal: improvement, 💻 aspect: code, and 🟩 priority: low labels on Feb 18, 2022
AetherUnbound added the 🟨 priority: medium label and removed the 🟩 priority: low label on Jul 26, 2022
stacimc self-assigned this on Aug 2, 2022
obulat transferred this issue from WordPress/openverse-catalog on Apr 17, 2023
github-project-automation bot moved this to 📋 Backlog in Openverse Backlog on Apr 17, 2023
obulat moved this from 📋 Backlog to ✅ Done in Openverse Backlog on Apr 24, 2023