WordPressDataIngester::_get_filesize should not use get_response_json #1373

Closed
1 task
AetherUnbound opened this issue Nov 1, 2022 · 0 comments · Fixed by WordPress/openverse-catalog#865
Labels
💻 aspect: code Concerns the software code in the repository 🛠 goal: fix Bug fix 🟧 priority: high Stalls work on the project or its dependents 🐍 tech: python Involves Python

Description

The WordPress Photo Directory data ingester's _get_filesize method currently calls ProviderDataIngester::get_response_json on an image URL, which does not return JSON:

https://github.com/WordPress/openverse-catalog/blob/33c29f28672e0ae75cfd410b38b5d0a6d6dfaf5a/openverse_catalog/dags/providers/provider_api_scripts/wordpress.py#L132

This produces the following warnings, after which the request is retried:

[2022-11-01T16:18:30.514+0000] {requester.py:115} WARNING - Could not get response_json.
Expecting value: line 1 column 1 (char 0)
[2022-11-01T16:18:30.514+0000] {requester.py:121} WARNING - Bad response_json:  None
[2022-11-01T16:18:30.514+0000] {requester.py:122} WARNING - Retrying:
_get_response_json(
    https://pd.w.org/2021/12/69761bd8fea55cd53.11044544-2048x1423.jpg,
    {},
    retries=-1)
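
For illustration, here is a minimal sketch of one way to get the file size without parsing the response body as JSON: send a HEAD request and read the Content-Length header. It uses only the Python standard library, not the project's requester class. The names filesize_from_headers, get_filesize and parses_as_json are hypothetical; this is not necessarily the fix that was merged for this issue.

```python
import json
import urllib.request
from typing import Mapping, Optional


def filesize_from_headers(headers: Mapping[str, str]) -> Optional[int]:
    """Return Content-Length as an int, or None if it is missing or malformed."""
    value = headers.get("Content-Length")
    try:
        return int(value) if value is not None else None
    except ValueError:
        return None


def get_filesize(url: str, timeout: float = 10.0) -> Optional[int]:
    """Hypothetical helper: HEAD the image URL and read its size from the headers."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return filesize_from_headers(response.headers)


def parses_as_json(body: str) -> bool:
    """Show why the JSON path fails: an image body is not a JSON document."""
    try:
        json.loads(body)
    except json.JSONDecodeError:
        return False
    return True
```

A server may omit Content-Length, so the sketch returns None in that case instead of raising; whether the ingester should then skip the file size or fall back to another method is a separate design decision.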

Reproduction

  1. Trigger the DAG with the config: {"initial_query_params": {"format": "json", "page": 28, "per_page": 100, "_embed": "true"}}
  2. Observe the failure noted above

Additional context

Related to but separate from #1377

Resolution

  • 🙋 I would be interested in resolving this bug.
@AetherUnbound AetherUnbound added 🐍 tech: python Involves Python 💻 aspect: code Concerns the software code in the repository 🛠 goal: fix Bug fix 🟧 priority: high Stalls work on the project or its dependents labels Nov 1, 2022
@github-project-automation github-project-automation bot moved this to 📋 Backlog in Openverse Backlog Apr 17, 2023
@obulat obulat transferred this issue from WordPress/openverse-catalog Apr 17, 2023
@obulat obulat moved this from 📋 Backlog to ✅ Done in Openverse Backlog Apr 24, 2023