Move DAG status information into Airflow Variable #1368
Labels
💻 aspect: code
Concerns the software code in the repository
✨ goal: improvement
Improvement to an existing user-facing feature
🟩 priority: low
Low priority and doesn't need to be rushed
🧱 stack: catalog
Related to the catalog and Airflow DAGs
🔧 tech: airflow
Involves Apache Airflow
🐍 tech: python
Involves Python
Description
DAG status information currently lives on a public handbook page: https://make.wordpress.org/openverse/handbook/openverse-catalog/dag-status-information/. The page is edited manually, and we have frequently forgotten to update it after a DAG has been re-enabled.
It would be fantastic if we could capture this information in an Airflow Variable tied to one or more GitHub issues, and use a mechanism similar to the one defined in WordPress/openverse-catalog#644 to regularly check whether those issues are still open and whether the DAG should still be enabled or disabled. The structure of the variable could be similar:
This could then also serve as a means of generating the DAG status page automatically, so we would still have an easy external reference for which DAGs are paused at any given time! We could add some additional checks, such as ensuring that a DAG is not paused without an associated record in this Variable, or that the Variable has no record for an unpaused DAG.
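As a rough illustration of the checks described above, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the record shape (`issues`, `reason`), the function and class names, and the example DAG IDs and URLs are hypothetical and are not taken from the catalog code or from WordPress/openverse-catalog#644. In a real DAG the records could plausibly be loaded with `Variable.get(<name>, deserialize_json=True)`, and the set of open issues would come from the GitHub API; both lookups are left out so the logic can run on its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    """One inconsistency between the status records and the actual DAG state."""

    dag_id: str
    reason: str


def check_dag_status(status_records, paused_dag_ids, open_issue_urls):
    """Compare hypothetical status records against paused DAGs and open issues.

    status_records: dict mapping dag_id -> {"issues": [url, ...], "reason": str}
    paused_dag_ids: iterable of DAG IDs that are currently paused
    open_issue_urls: set of issue URLs known to still be open
    """
    paused = set(paused_dag_ids)
    violations = []

    # A paused DAG must have an associated record.
    for dag_id in sorted(paused - set(status_records)):
        violations.append(Violation(dag_id, "paused without a status record"))

    for dag_id, record in sorted(status_records.items()):
        if dag_id not in paused:
            # A record must not exist for an unpaused DAG.
            violations.append(Violation(dag_id, "has a status record but is not paused"))
        elif not any(url in open_issue_urls for url in record["issues"]):
            # Every linked issue is closed, so the DAG may be ready to re-enable.
            violations.append(Violation(dag_id, "all linked issues are closed"))

    return violations


# Example data (entirely made up).
example_records = {
    "a_workflow": {"issues": ["https://example.org/issues/1"], "reason": "API outage"},
    "b_workflow": {"issues": ["https://example.org/issues/2"], "reason": "Bad data"},
    "c_workflow": {"issues": ["https://example.org/issues/3"], "reason": "Rate limit"},
}
example_paused = {"a_workflow", "c_workflow", "d_workflow"}
example_open_issues = {"https://example.org/issues/1"}

example_violations = check_dag_status(example_records, example_paused, example_open_issues)
for violation in example_violations:
    print(f"{violation.dag_id}: {violation.reason}")
```

The same records could also feed an automatically generated status page, since each entry already pairs a DAG with its reason and linked issues.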
Additional context
Implementation