Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add configuration to skip specific ingestion errors #1447

Closed
1 task
stacimc opened this issue Aug 26, 2022 · 2 comments · Fixed by WordPress/openverse-catalog#1011
Closed
1 task

Add configuration to skip specific ingestion errors #1447

stacimc opened this issue Aug 26, 2022 · 2 comments · Fixed by WordPress/openverse-catalog#1011
Assignees
Labels
💻 aspect: code Concerns the software code in the repository ✨ goal: improvement Improvement to an existing user-facing feature 🟩 priority: low Low priority and doesn't need to be rushed 🧱 stack: catalog Related to the catalog and Airflow DAGs 🔧 tech: airflow Involves Apache Airflow 🐍 tech: python Involves Python

Comments

@stacimc
Copy link
Collaborator

stacimc commented Aug 26, 2022

Problem

WordPress/openverse-catalog#650 added dagrun conf options to make a provider script skip ingestion errors. This is currently all-or-nothing; when skip_ingestion_errors is true, all errors (except for AirflowExceptions) are caught, allowing ingestion to continue, and then re-raised at the end of ingestion.

In reality, it's likely that we'll want to skip specific errors, but we may still want ingestion to be halted if an unexpected error is thrown.

Description

We can update the conf option to be similar to the silenced_slack_alerts configuration (#654), which allows matching an error predicate. Instead of a boolean value, skip_ingestion_errors could be a list of strings to match on.

Like silenced_slack_alerts, we should make sure this allows us to match specific error message text and error types (ie adding 'KeyError' to the list should allow us to skip all KeyErrors). It may be nice to make sure there's still a way to easily skip all errors if that's truly what we want.

Implementation

  • 🙋 I would be interested in implementing this feature.
@stacimc stacimc added 🟩 priority: low Low priority and doesn't need to be rushed ✨ goal: improvement Improvement to an existing user-facing feature 💻 aspect: code Concerns the software code in the repository labels Aug 26, 2022
@stacimc
Copy link
Collaborator Author

stacimc commented Sep 14, 2022

Resolved in WordPress/openverse-catalog#654

@stacimc stacimc closed this as completed Sep 14, 2022
@stacimc stacimc reopened this Sep 23, 2022
@stacimc stacimc removed their assignment Sep 23, 2022
@stacimc
Copy link
Collaborator Author

stacimc commented Sep 23, 2022

I mistakenly closed this thinking it was referring to skipping Slack notifications for specific errors. This issue is for skipping the errors themselves during ingestion.

@AetherUnbound AetherUnbound added 🐍 tech: python Involves Python 🔧 tech: airflow Involves Apache Airflow labels Oct 19, 2022
@stacimc stacimc self-assigned this Feb 22, 2023
@obulat obulat added the 🧱 stack: catalog Related to the catalog and Airflow DAGs label Feb 24, 2023
@obulat obulat transferred this issue from WordPress/openverse-catalog Apr 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
💻 aspect: code Concerns the software code in the repository ✨ goal: improvement Improvement to an existing user-facing feature 🟩 priority: low Low priority and doesn't need to be rushed 🧱 stack: catalog Related to the catalog and Airflow DAGs 🔧 tech: airflow Involves Apache Airflow 🐍 tech: python Involves Python
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants