Use HEAD request in Freesound ingester for file size request #1578

Closed
1 task
AetherUnbound opened this issue Sep 30, 2022 · 5 comments · Fixed by WordPress/openverse-catalog#973
Labels
💻 aspect: code Concerns the software code in the repository 🧰 goal: internal improvement Improvement that benefits maintainers, not users good first issue New-contributor friendly help wanted Open to participation from the community 🟩 priority: low Low priority and doesn't need to be rushed 🐍 tech: python Involves Python

Comments

@AetherUnbound
Collaborator

Description

Dependent on #1579 and WordPress/openverse-catalog#746

We should replace the requests.head call with the DelayedRequester::head call described in #1579:

https://github.com/WordPress/openverse-catalog/blob/c332465db5dfdff41f9f89cd6792c01732a570be/openverse_catalog/dags/providers/provider_api_scripts/freesound.py#L172
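For illustration only, a rate-limited HEAD wrapper of the kind referred to here could look roughly like the sketch below. The class name `SketchDelayedRequester`, its `delay` parameter, and the injectable `http_head` callable are all hypothetical and are not taken from the catalog's `DelayedRequester` or from #1579; the sketch only assumes that the real method would wait between requests and then delegate to `requests.head`.

```python
import time
from typing import Any, Callable, Optional


class SketchDelayedRequester:
    """Hypothetical stand-in; NOT the Openverse catalog's DelayedRequester."""

    def __init__(
        self,
        delay: float = 1.0,
        http_head: Optional[Callable[..., Any]] = None,
    ) -> None:
        self._delay = delay
        # Callable that performs the HEAD request. Injectable so the sketch
        # can be exercised without network access (an assumption of this sketch).
        self._http_head = http_head
        # Monotonic timestamp of the previous request, or None before the first.
        self._last_request: Optional[float] = None

    def _wait(self) -> None:
        """Sleep until at least `delay` seconds have passed since the last request."""
        if self._last_request is not None:
            remaining = self._delay - (time.monotonic() - self._last_request)
            if remaining > 0:
                time.sleep(remaining)
        self._last_request = time.monotonic()

    def head(self, url: str, **kwargs: Any) -> Any:
        """Issue a HEAD request for `url`, honouring the configured delay."""
        if self._http_head is None:
            # Imported lazily so the class can be loaded without `requests`.
            import requests

            self._http_head = requests.head
        self._wait()
        return self._http_head(url, **kwargs)
```

With something like this in place, the Freesound script would call `requester.head(url)` wherever it currently calls `requests.head(url)`, so the file size lookups would share the same pacing as the other requests.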

Implementation

  • 🙋 I would be interested in implementing this feature.
@AetherUnbound AetherUnbound added 🐍 tech: python Involves Python 💻 aspect: code Concerns the software code in the repository 🟨 priority: medium Not blocking but should be addressed soon 🧰 goal: internal improvement Improvement that benefits maintainers, not users labels Sep 30, 2022
@AetherUnbound AetherUnbound added 🟩 priority: low Low priority and doesn't need to be rushed and removed 🟨 priority: medium Not blocking but should be addressed soon labels Oct 19, 2022
@AetherUnbound AetherUnbound added 🟧 priority: high Stalls work on the project or its dependents and removed 🟩 priority: low Low priority and doesn't need to be rushed labels Dec 1, 2022
@AetherUnbound
Collaborator Author

I've bumped this to high priority because the DAG is now failing at this request. Specifically, if the URL returns a 404, reading content-length from the response headers raises a KeyError (example Airflow logs). Using the machinery of the DelayedRequester::head method might help us address this.
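As a rough sketch of the failure mode and one way to guard against it: the helper below returns `None` instead of raising when the HEAD response is not a 200 or carries no `content-length` header. The function name `get_file_size`, the injected `http_head` callable, and the fake responses are hypothetical and exist only for this example; they are not the catalog's actual code.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


def get_file_size(url: str, http_head: Callable[[str], Any]) -> Optional[int]:
    """Return the size in bytes reported by a HEAD request, or None if unknown.

    `http_head` is any callable that takes a URL and returns an object with
    `status_code` and `headers` attributes (for example `requests.head`).
    """
    response = http_head(url)
    # A 404 (or any other non-200 status) gives us no usable size.
    if response is None or response.status_code != 200:
        return None
    # `.get` avoids the KeyError that `headers["content-length"]` would raise
    # when the header is absent.
    raw_size = response.headers.get("content-length")
    if raw_size is None:
        return None
    try:
        return int(raw_size)
    except ValueError:
        # A malformed header value is treated the same as a missing one.
        return None


# Fake responses standing in for real HTTP calls (hypothetical test data).
def fake_head_ok(url: str) -> SimpleNamespace:
    return SimpleNamespace(status_code=200, headers={"content-length": "1234"})


def fake_head_404(url: str) -> SimpleNamespace:
    return SimpleNamespace(status_code=404, headers={})


size_ok = get_file_size("https://example.invalid/preview.mp3", fake_head_ok)
size_404 = get_file_size("https://example.invalid/missing.mp3", fake_head_404)
```

Whether a `None` size should mean "store the record without a file size" or "skip the record" is a separate decision that this sketch does not make.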

@AetherUnbound
Collaborator Author

AetherUnbound commented Dec 1, 2022

Here's the failing sound in question: https://freesound.org/people/dl-jones/sounds/653078/

It looks like a missing preview for a sound is indeed an issue with the Freesound data that we'll just need to account for. In these cases would we just want to skip the record entirely, since we can't preview it on our end? If that's the case, I should probably make a new issue. CC @WordPress/openverse-catalog

@zackkrida
Member

In these cases would we just want to skip the record entirely, since we can't preview it on our end?

This is an interesting question, and I think it's okay to skip these records. The audio player is broken in Freesound's frontend as well, which is a terrible user experience.

@AetherUnbound
Collaborator Author

AetherUnbound commented Dec 2, 2022

It's odd though: I just logged in and you can download the audio file itself 🤔 I'm going to make a new issue for this, since it seems separate from the HEAD request issue. We can continue the discussion there.

@AetherUnbound
Collaborator Author

I'm setting this priority to low since #1321 is the issue that's actually affecting the DAG's ability to run successfully.

@AetherUnbound AetherUnbound added 🟩 priority: low Low priority and doesn't need to be rushed good first issue New-contributor friendly help wanted Open to participation from the community and removed 🟧 priority: high Stalls work on the project or its dependents labels Dec 13, 2022
@github-project-automation github-project-automation bot moved this to 📋 Backlog in Openverse Backlog Apr 17, 2023
@obulat obulat transferred this issue from WordPress/openverse-catalog Apr 17, 2023
@obulat obulat moved this from 📋 Backlog to ✅ Done in Openverse Backlog Apr 24, 2023