Europeana data ingester can return None in URL
#2784
Comments
The plan was to update this issue to make it a good first issue. However, I looked into the Europeana code, and I couldn't find how the [...] I think we should refactor the Europeana script to use the pattern used in other scripts, which will make it easier to debug.
I'm able to reproduce this locally by using the following
I added a log for the URL processing step and came across this:
Two things to note:
So to me it seems like there might be two separate approaches here:
Description
Europeana ingestion workflow failed because a None was saved in the URL field, and then the MediaStore class tried to extract the filetype from it.
Reproduction
See the error in Airflow logs:
Additional context
Europeana script uses a raise_if_empty decorator to ensure that only items with valid required data are saved.
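For readers unfamiliar with the pattern, here is a minimal sketch of how a raise_if_empty-style decorator could work. This is not the actual code from the Europeana script: the exception class, the get_image_url getter, the "image_url" key, and the collect_valid_urls helper are all hypothetical names used for illustration only.

```python
import functools
from typing import Optional


class EmptyRequiredFieldError(ValueError):
    """Hypothetical exception signalling that a required field is missing."""


def raise_if_empty(fn):
    """Raise if the wrapped getter returns a falsy value (None, "", [], ...)."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        value = fn(*args, **kwargs)
        if not value:
            raise EmptyRequiredFieldError(f"{fn.__name__} returned an empty value")
        return value

    return wrapper


@raise_if_empty
def get_image_url(item: dict) -> Optional[str]:
    # Hypothetical record layout: the URL lives under the "image_url" key.
    return item.get("image_url")


def collect_valid_urls(items):
    """Keep only the URLs of items whose required field is present."""
    urls = []
    for item in items:
        try:
            urls.append(get_image_url(item))
        except EmptyRequiredFieldError:
            # Skip the item instead of passing None on to later steps.
            continue
    return urls
```

With a guard like this, a None URL can only reach the storage step if the value is read through a code path the decorator does not wrap, which may be a useful place to start looking.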