Image source: justtakeitfree.com #1264

Aventurier · 2023-03-30T19:33:56Z

Source Site

https://justtakeitfree.com/

Value Provided

It's an independent project from a Ukrainian family. We host only our photos.

Licenses Provided

CC BY 4.0

Implementation

🙋 I would be interested in implementing this feature.

obulat · 2023-03-31T13:54:57Z

Thank you for the source suggestion, @Aventurier! Do you have an API that Openverse could use to get the images?

dhruvkb · 2023-04-06T14:14:02Z

Based on a little digging, the site does not have an API, but it does have clean markup that can be scraped. It provides quite a bit of info (like tags), and all images credit "Justtakeitfree Free Photos" as the author. One thing that's missing is a title: none of the images are titled, they use only a numeric ID as the identifier, and in places like the HTML <title> a concatenation of all the tags is used.

I'm not aware of the catalog's scraping policy or whether a REST API is a requirement, but this site has a small collection of very high-quality images that might make a nice addition to our content.

sarayourfriend · 2023-06-27T02:47:45Z

Without a response from @Aventurier regarding the API, I think we should plan to scrape. There are currently 6 pages of results and around 178 results (based on https://justtakeitfree.com/photo/178/ existing and anything beyond that like https://justtakeitfree.com/photo/179/ and https://justtakeitfree.com/photo/180/ returning a 404, though https://justtakeitfree.com/photo/1/ also 404s). If the DAG requested one page every two seconds it would only take around 6 minutes to ingest the entire provider. We could do that monthly to reduce impact from the scraping.
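As a quick check of the estimate above, here is a minimal sketch of the arithmetic. The function name is hypothetical; the 178 results and the two-second delay come from the comment itself.

```python
def estimated_minutes(request_count: int, delay_seconds: float = 2.0) -> float:
    """Approximate wall-clock time for a crawl that waits between requests."""
    return request_count * delay_seconds / 60


# 178 requests at one every two seconds is 356 seconds, just under 6 minutes.
print(round(estimated_minutes(178), 1))
```

This ignores the time each request itself takes, so the real run would be somewhat longer.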

Seems doable and I think the assumption that we can scrape is safe considering the volume and lack of ToS.

As for the title to use, the site itself appears to use the first tag in the list of tags for the filename when you click the download link. We can do the same.

Regarding the attribution, the author should be, as Dhruv mentioned, "Justtakeitfree Free Photos". Based on the text of the issue ("We host only our photos"), it sounds like that's an appropriate attribution to credit the creators. I dumped EXIF on one of the images and there is nothing to suggest otherwise.

So to clarify the DAG implementation:

Make iterative requests to https://justtakeitfree.com/photo/<i>/, incrementing i once every two seconds. Go until 10 404s are returned in a row.
DAG should run monthly
The license is CC-BY 4.0 for all works
Creator is "Justtakeitfree Free Photos" for all works.
Provider/source is "justtakeitfree.com"
Foreign landing URL is https://justtakeitfree.com/photo/<i>/ for the work
URL is https://justtakeitfree.com/photos/<i>_800.jpg for the work (full sized is available on the landing page)
No special thumbnail URL, the _800.jpg version is used as the thumbnail on the site
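
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the DAG that was eventually merged: the record field names, `build_record`, `crawl`, and the injected `fetch_tags` callable (which would return a photo's tags, or `None` on a 404) are all hypothetical, and the HTML parsing itself is left out because the markup is not described here.

```python
import time
from typing import Callable, Dict, Iterator, List, Optional

BASE = "https://justtakeitfree.com"
MAX_CONSECUTIVE_404S = 10  # stop condition from the list above


def build_record(photo_id: int, tags: List[str]) -> Dict[str, object]:
    """Map one photo to the fields listed above (field names are assumptions)."""
    return {
        "foreign_identifier": str(photo_id),
        "foreign_landing_url": f"{BASE}/photo/{photo_id}/",
        "url": f"{BASE}/photos/{photo_id}_800.jpg",
        "license": "by",
        "license_version": "4.0",
        "creator": "Justtakeitfree Free Photos",
        "source": "justtakeitfree.com",
        # The site names downloads after the first tag, so reuse it as the title.
        "title": tags[0] if tags else None,
        "tags": tags,
    }


def crawl(
    fetch_tags: Callable[[int], Optional[List[str]]],
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    start_id: int = 1,
) -> Iterator[Dict[str, object]]:
    """Yield records for increasing IDs until 10 consecutive 404s are seen."""
    misses = 0
    photo_id = start_id
    while misses < MAX_CONSECUTIVE_404S:
        tags = fetch_tags(photo_id)  # hypothetical: None means the page 404ed
        if tags is None:
            misses += 1
        else:
            misses = 0
            yield build_record(photo_id, tags)
        photo_id += 1
        sleep(delay_seconds)  # one request every two seconds by default
```

Counting consecutive misses rather than stopping at the first one matters here, since https://justtakeitfree.com/photo/1/ already returns a 404 while later IDs exist.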

Aventurier · 2023-06-27T18:09:17Z

I'm sorry for the long pause. I actually built a small API that can search for images by tag and retrieve information about an image.
https://justtakeitfree.com/api/api.php?key=vj45mub435v6bsdf90&query=search&tag=grass
This is an example of how to find images by tag (you can use the key in the example).
If you have any questions or want me to extend the API, please write.
Thanks.

sarayourfriend · 2023-06-28T23:15:16Z

No worries, Aventurier! Thanks for letting us know. Can you share whether there is a way to paginate through the API? For Openverse's catalogue to be able to get all the images, we'd need to be able to use the API to paginate through all the images rather than for just particular tags. Something like:

https://justtakeitfree.com/api/api.php?page=1
https://justtakeitfree.com/api/api.php?page=2
https://justtakeitfree.com/api/api.php?page=3
https://justtakeitfree.com/api/api.php?page=4

etc., without any query terms.

Is the email on the privacy policy page the best location to get in touch regarding a key specifically for Openverse (to avoid the secret leaking publicly)?

Aventurier · 2023-07-13T17:58:35Z

Done
https://justtakeitfree.com/api/api.php?key=vj45mub435v6bsdf90&page=1
https://justtakeitfree.com/api/api.php?key=vj45mub435v6bsdf90&page=2

Please leave me your email address; I will send a new key and then delete this one.
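
For illustration, walking the paginated endpoint might look like the sketch below. The thread does not describe the response format, so this assumes each page returns a JSON array and that a page past the end returns an empty one; `fetch_page`, `iter_all_images`, `max_pages`, and the `JUSTTAKEITFREE_API_KEY` environment variable are hypothetical names. The key is read from the environment rather than hard-coded, in line with the plan to replace the public key.

```python
import json
import os
import urllib.parse
import urllib.request
from typing import Callable, Iterator, List

API_URL = "https://justtakeitfree.com/api/api.php"


def fetch_page(page: int, key: str) -> List[dict]:
    """Request one page of results (assumes the body is a JSON array)."""
    query = urllib.parse.urlencode({"key": key, "page": page})
    with urllib.request.urlopen(f"{API_URL}?{query}", timeout=30) as response:
        return json.load(response)


def iter_all_images(
    fetch: Callable[[int], List[dict]], max_pages: int = 1000
) -> Iterator[dict]:
    """Yield items page by page, stopping at the first empty page (assumed end)."""
    for page in range(1, max_pages + 1):
        items = fetch(page)
        if not items:
            return
        yield from items


# Example use (performs real network requests):
#   key = os.environ["JUSTTAKEITFREE_API_KEY"]
#   for image in iter_all_images(lambda page: fetch_page(page, key)):
#       print(image)
```

Passing the fetcher in as a callable keeps the pagination loop testable without touching the network, and `max_pages` guards against looping forever if the empty-page assumption turns out to be wrong.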

zackkrida · 2023-07-13T18:03:06Z

@Aventurier amazing! Thank you so much. You can email us at [email protected] with a new key.

krysal · 2023-07-17T20:55:54Z

API key received. Thank you, @Aventurier!

Aventurier added the 🚦 status: awaiting triage and 🧹 status: ticket work required labels Mar 30, 2023

zackkrida changed the title ~~<Source name here>~~ Image source: justtakeitfree.com Mar 30, 2023

obulat added the 🟩 priority: low label Mar 31, 2023

dhruvkb added the 🌟 goal: addition and 🧱 stack: catalog labels and removed the 🚦 status: awaiting triage label Apr 6, 2023

github-project-automation bot added this to Openverse Backlog Apr 17, 2023

github-project-automation bot moved this to 📋 Backlog in Openverse Backlog Apr 17, 2023

obulat transferred this issue from WordPress/openverse-catalog Apr 17, 2023

obulat added the ☁️ provider: images and 💻 aspect: code labels Jun 19, 2023

sarayourfriend removed the 🧹 status: ticket work required label Jun 27, 2023

obulat mentioned this issue Aug 7, 2023

Provider: justtakeitfree.com #2793

Merged

8 tasks

AetherUnbound moved this from 📋 Backlog to 🏗 In progress in Openverse Backlog Aug 17, 2023

sarayourfriend assigned obulat Aug 17, 2023

obulat closed this as completed in #2793 Sep 12, 2023

github-project-automation bot moved this from 🏗 In progress to ✅ Done in Openverse Backlog Sep 12, 2023
