Issues with results when page count gets high #565
Comments
I tried to reproduce the same error for the first query (trees, same sources). Page 170 sometimes failed with 502 Bad Gateway and sometimes worked in Postman. It also sometimes fails at smaller page numbers, such as page=100 or page=150. |
It looks like the cause of this is inefficient link rot validation. If you jump straight to page 170 before any other similar requests have been made and cached, the result is that the server walks through every image in the prior pages and sends a ...

It is absolutely necessary to validate that all of the prior images exist in order to prevent inconsistent result pagination. I would classify this as a minor bug, as the overwhelming majority of users do not paginate deeply or make cold jumps to the end of the search results. Fixing it would require overhauling image validation to be more efficient, which is feasible but will take some time due to the complexity in this area.

Here are our options:
Preferred
Less preferred
|
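To make the failure mode above concrete, here is a minimal sketch of why a cold jump to a deep page is expensive and how caching validation results helps. It is not the API's actual implementation; `paginate_validated`, `is_alive`, and `cache` are hypothetical names, and the in-memory dict stands in for whatever shared cache the server really uses.

```python
from typing import Callable, Dict, List


def paginate_validated(
    results: List[str],
    page: int,
    page_size: int,
    is_alive: Callable[[str], bool],
    cache: Dict[str, bool],
) -> List[str]:
    """Return one page of results, skipping links that fail validation.

    Every result before the requested page must be validated too;
    otherwise dead links would shift page boundaries between requests.
    """
    needed = page * page_size  # number of live results up to the end of this page
    live: List[str] = []
    for url in results:
        # Probe the link only if no earlier request has validated it already.
        if url not in cache:
            cache[url] = is_alive(url)
        if cache[url]:
            live.append(url)
            if len(live) == needed:
                break
    start = (page - 1) * page_size
    return live[start:needed]
```

With an empty cache, a request for page 170 has to probe every result on pages 1 through 170. Once those probes are cached, the same request (or any shallower page) needs no network calls at all, which would explain why the error shows up intermittently.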
This was a bug that X5gon ran into; they are using our API to power their image search for OER. So someone cares. :) |
Yes, one exception! But we should have an alternative option for people who are looking to bulk scrape our whole catalog, such as a data dump or a bulk load endpoint. The search endpoint is optimized for finding the best results for your search query, not bulk downloads. I would also expect that the deeper you go, the worse the results are. You'll find that other search products often limit the result set as well. |
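As a rough illustration of limiting the result set, the sketch below caps how deep a search can paginate and reports a page count that matches the cap, so a consumer never sees a `page_count` it cannot actually reach. The function name and the cap of 5000 results are assumptions chosen for the example, not values taken from the API.

```python
import math

# Assumed cap on how many results a single search may page through.
MAX_RESULT_DEPTH = 5000


def reachable_page_count(
    total_hits: int,
    page: int,
    page_size: int,
    max_depth: int = MAX_RESULT_DEPTH,
) -> int:
    """Return the number of pages a client may request for this search.

    Raises ValueError if the requested page lies beyond the cap, so the
    caller can answer with a clear client error instead of timing out.
    """
    reachable = min(total_hits, max_depth)
    page_count = math.ceil(reachable / page_size)
    # An empty result set still allows page 1, which is simply empty.
    if page < 1 or page > max(page_count, 1):
        raise ValueError(f"page must be between 1 and {max(page_count, 1)}")
    return page_count
```

For example, a query with 100,000 hits and a page size of 20 would advertise 250 pages under this cap, and a request for page 251 would be rejected up front.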
Email from an API consumer:
I found that we get error messages from the CC Search API when the page parameter is high, e.g. 170 or above. For page=169 we still get results, but for larger numbers it returns an Internal system error 500 (or 502), even though the page_count is not exceeded.
The example I was going with had the following parameters:
When page=169 I get the following metadata:
Another example is:
When page=200 I get the following metadata: