[User Feedback] Searches page depth #668

Open
2 tasks
clementoriol opened this issue Sep 26, 2022 · 0 comments
Labels
💻 aspect: code Concerns the software code in the repository ✨ goal: improvement Improvement to an existing user-facing feature 🟨 priority: medium Not blocking but should be addressed soon 🧱 stack: api Related to the Django API 🧹 status: ticket work required Needs more details before it can be worked on 💬 talk: discussion Open for discussions and feedback

Comments

clementoriol commented Sep 26, 2022

Due date: yyyy-mm-dd

Assigned reviewers

  • TBD
  • TBD

Description

Context

Hello! I'm a consumer of the Openverse API.
I've started building a free, open-source app that lets artists and art students do timed drawing sessions for practice, using only Creative Commons images.

I'm using the Openverse API (with a registered key) to search CC images and fetch their metadata, licence info and creator information so I can credit them.

I wanted to provide some feedback on the /image_search endpoint, which is the one I'm currently working with.

The issue

The docs make it clear that the API is not intended for in-depth search and will only return the first 10,000 results:

Although there may be millions of relevant records, only the most relevant several thousand records can be viewed. This is by design: the search endpoint should be used to find the top 10,000 most relevant results, not for exhaustive search or bulk download of every barely relevant result. As such, the caller should not try to access pages beyond page_count, or else the server will reject the query.

While more results would be more convenient for my usage, 10,000 still seems like a fair trade-off. Not too few, not too many.

However, reading this, I expected to receive at most 10,000 results regardless of the page_size I request:

  • page_size=20 -> 500 pages
  • page_size=100 -> 100 pages
  • page_size=500 -> 20 pages

But I quickly realized that page_count is capped at 20, apparently by design (see WordPress/openverse-api#859).
So if I ask for a page_size of 20, I only get 400 browsable results.
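
To make the arithmetic concrete, here is a minimal Python sketch comparing the expected page counts with what the page cap actually allows. The constant and function names are illustrative only and are not part of the API; the values 10,000 and 20 are simply the limits quoted in this issue.

```python
import math

RESULT_LIMIT = 10_000  # documented limit on viewable results (from the docs quote)
PAGE_CAP = 20          # page_count cap observed in this issue (assumed fixed)


def expected_pages(page_size: int) -> int:
    """Pages needed to browse all RESULT_LIMIT results if only that limit applied."""
    return math.ceil(RESULT_LIMIT / page_size)


def reachable_results(page_size: int, page_cap: int = PAGE_CAP) -> int:
    """Results actually browsable once page_count is capped at page_cap."""
    return min(RESULT_LIMIT, page_size * page_cap)


for size in (20, 100, 500):
    print(
        f"page_size={size}: expected {expected_pages(size)} pages, "
        f"reachable {reachable_results(size)} results"
    )
```

Under these assumptions, only page_size=500 reaches the full 10,000 results; smaller page sizes are cut off by the page cap well before the result limit applies.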

This means that to access all 10,000 results, I have to set page_size=500, which is not optimal:

  • Heavy response payload, so slow requests
  • Lots of thumbnails to download, resulting in slow page loading times

Discussion

I'm having trouble understanding why the page_count cap was necessary, and why consumers are not simply allowed to browse freely through the first 10,000 records, setting page_size as they want.

Is this something you would consider improving, or is the Openverse API not a good fit for what I'm trying to do?

Thank you for your time,
looking forward to discussing this with you.

@clementoriol clementoriol added the 💬 talk: discussion Open for discussions and feedback label Sep 26, 2022
@obulat obulat transferred this issue from WordPress/openverse-api Feb 22, 2023
@github-project-automation github-project-automation bot moved this to 📋 Backlog in Openverse Backlog Feb 23, 2023
@AetherUnbound AetherUnbound added 🧱 stack: api Related to the Django API and removed 🧱 stack: backend labels May 15, 2023
@krysal krysal added the 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work label May 25, 2023
@dhruvkb dhruvkb added 🟨 priority: medium Not blocking but should be addressed soon ✨ goal: improvement Improvement to an existing user-facing feature 💻 aspect: code Concerns the software code in the repository 🧹 status: ticket work required Needs more details before it can be worked on and removed 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work labels May 30, 2023
Projects
Status: 📋 Backlog
Development

No branches or pull requests

5 participants