iNaturalist #1608

Closed
6 of 7 tasks
sarayourfriend opened this issue Mar 24, 2022 · 8 comments · Fixed by WordPress/openverse-catalog#549
Labels
💻 aspect: code Concerns the software code in the repository ✨ goal: improvement Improvement to an existing user-facing feature help wanted Open to participation from the community 🟩 priority: low Low priority and doesn't need to be rushed 🏁 status: ready for work Ready for work

Comments

@sarayourfriend
Collaborator

Provider API Endpoint / Documentation

https://www.inaturalist.org/pages/api+reference#get-observations

Provider description

iNaturalist is a joint initiative of the California Academy of Sciences and the National Geographic Society.

People upload their own images of plants, insects, and other parts of the natural world. Users can choose what license they want their image to be shared under. The userbase includes professional photographers as well as amateurs.

Licenses Provided

The API accepts "none" and "any" (both have special meanings for iNaturalist), as well as any of the CC licenses.

Provider API Technical info

They have a list of best practices that includes a bulk requests section that we should probably make use of.

It would probably take a new kind of DAG to use the bulk export options like the GBIF archive: https://www.gbif.org/dataset/50c9509d-22c7-4a22-a47d-8c48425ef4a7

Photos are returned with a link to the square.jpg version, but replacing square.jpg with original.jpg in the S3 object key retrieves the original photo:

https://inaturalist-open-data.s3.amazonaws.com/photos/184323039/square.jpg
https://inaturalist-open-data.s3.amazonaws.com/photos/184323039/original.jpg

You can see this in iNaturalist's code here: https://github.com/inaturalist/inaturalist/blob/main/app/models/photo.rb

It is also the photo they use for the observation landing page.
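The square-to-original swap described above could be sketched roughly like this. The function name `original_photo_url` is hypothetical, and the sketch assumes the size name is always the last path segment before the file extension, as in the two example URLs; it is not taken from iNaturalist's or Openverse's code.

```python
from urllib.parse import urlsplit, urlunsplit


def original_photo_url(square_url: str) -> str:
    """Return the URL of the original photo for a given square.jpg URL.

    Assumption: the object key ends in "square.<extension>", as in the
    example URLs in this issue. Anything else raises ValueError.
    """
    parts = urlsplit(square_url)
    directory, _, filename = parts.path.rpartition("/")
    size, dot, extension = filename.rpartition(".")
    if size != "square":
        raise ValueError(f"Not a square photo URL: {square_url}")
    # Keep the extension as-is and only swap the size name.
    new_path = f"{directory}/original{dot}{extension}"
    return urlunsplit(parts._replace(path=new_path))


if __name__ == "__main__":
    print(
        original_photo_url(
            "https://inaturalist-open-data.s3.amazonaws.com/photos/184323039/square.jpg"
        )
    )
```

Parsing the URL instead of doing a plain string replace avoids accidentally touching a "square" that appears elsewhere in the URL.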

Example list response: https://api.inaturalist.org/v1/observations?license=CC-BY,CC-BY-NC,CC-BY-SA,CC-BY-ND,CC-BY-NC-SA,CC-BY-NC-ND

You have to dig into the photos list of the individual observation.
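As a rough illustration of digging into the photos list, here is a minimal sketch that flattens a list response into one record per photo. The field names (`results`, `photos`, `id`, `license_code`, `url`) are assumptions about the response shape and should be checked against a real response; the sample data below is made up for the example.

```python
from typing import Iterator


def iter_photos(response: dict) -> Iterator[dict]:
    """Yield one flat record per photo in an observations list response.

    Assumption: observations live under "results" and each observation
    carries its photos in a "photos" list.
    """
    for observation in response.get("results", []):
        for photo in observation.get("photos", []):
            yield {
                "observation_id": observation.get("id"),
                "photo_id": photo.get("id"),
                "license_code": photo.get("license_code"),
                "url": photo.get("url"),
            }


if __name__ == "__main__":
    # Hypothetical, trimmed-down response used only for illustration.
    sample = {
        "results": [
            {
                "id": 1,
                "photos": [
                    {"id": 10, "license_code": "cc-by", "url": "https://example.org/photos/10/square.jpg"},
                    {"id": 11, "license_code": "cc-by-nc", "url": "https://example.org/photos/11/square.jpg"},
                ],
            },
            {"id": 2, "photos": []},
        ]
    }
    for record in iter_photos(sample):
        print(record)
```

Note that the license may differ from photo to photo within one observation, which is why the sketch reads it per photo rather than per observation.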

Example landing page: https://www.inaturalist.org/observations/109369089

Checklist to complete before beginning development

  • Verify there is a way to retrieve the entire relevant portion of the provider's collection in a systematic way via their API.
  • Verify the API provides license info (license type and version; license URL provides both, and is preferred)
  • Verify the API provides stable direct links to individual works.
  • Verify the API provides a stable landing page URL to individual works.
  • Note other info the API provides, such as thumbnails, dimensions, attribution info (required if non-CC0 licenses will be kept), title, description, other meta data, tags, etc.
  • Attach example responses to API queries that have the relevant info.
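To illustrate why a license URL is preferred in the checklist above, here is a small sketch that recovers both the license type and the version from a Creative Commons URL. The helper name `parse_cc_license_url` is hypothetical and this is not the catalog's own license handling; it only relies on the standard `creativecommons.org` path layout.

```python
from urllib.parse import urlsplit


def parse_cc_license_url(license_url: str) -> tuple[str, str]:
    """Split a Creative Commons license URL into (license type, version).

    Handles the /licenses/<type>/<version>/ layout and the CC0
    /publicdomain/zero/<version>/ layout; anything else raises ValueError.
    """
    path_parts = [part for part in urlsplit(license_url).path.split("/") if part]
    if len(path_parts) < 3:
        raise ValueError(f"Unrecognized license URL: {license_url}")
    section, slug, version = path_parts[:3]
    if section == "licenses":
        return slug, version
    if section == "publicdomain" and slug == "zero":
        return "cc0", version
    raise ValueError(f"Unrecognized license URL: {license_url}")


if __name__ == "__main__":
    print(parse_cc_license_url("https://creativecommons.org/licenses/by-nc/4.0/"))
    print(parse_cc_license_url("https://creativecommons.org/publicdomain/zero/1.0/"))
```

If the provider only returns a short license code without a version, the version would have to come from somewhere else, which is the gap this checklist item is meant to catch.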

Implementation

  • 🙋 I would be interested in implementing this feature.
@sarayourfriend sarayourfriend added 🧹 status: ticket work required Needs more details before it can be worked on 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work ✨ goal: improvement Improvement to an existing user-facing feature labels Mar 24, 2022
@AetherUnbound AetherUnbound added help wanted Open to participation from the community 🟩 priority: low Low priority and doesn't need to be rushed 🏁 status: ready for work Ready for work 💻 aspect: code Concerns the software code in the repository and removed 🧹 status: ticket work required Needs more details before it can be worked on 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work labels May 9, 2022
@rwidom
Collaborator

rwidom commented May 10, 2022

🙋

@sarayourfriend
Collaborator Author

Assigned @rwidom! Thanks as always 🎉 I'm super excited for this one personally 😁 I want more bird pictures!

@AetherUnbound
Collaborator

Thanks @rwidom! Neither @stacimc nor I have written a provider API script yet, but there is some documentation for how to do it here: https://github.com/WordPress/openverse-catalog/tree/main/openverse_catalog/templates

We also have plenty of other maintainers who have written provider API scripts, in case you end up running into trouble 😄

@rwidom
Collaborator

rwidom commented May 10, 2022

Thanks @AetherUnbound! The templates look super helpful, and I'm also psyched about bird pictures and the opportunity to dig into a meatier project. Now I feel a little self-conscious about word choice, but regardless, thank you and I'm psyched. More soon!

@rwidom
Collaborator

rwidom commented May 11, 2022

It looks like iNaturalist has both images and audio. Just clarifying that this is to pull images to start, and we can circle back for audio, right? (@AetherUnbound, @sarayourfriend, @stacimc) Asking as I'm thinking about provider/DAG naming conventions.

@sarayourfriend
Collaborator Author

I didn't realize they had audio when I added this issue, that's amazing!

As far as I know, Wikimedia is our only provider that offers both image and audio, so that would probably be the place to look for an established pattern for handling both.

@stacimc
Collaborator

stacimc commented May 11, 2022

That is exciting 🎉 It's certainly reasonable to focus on getting image working first, for example, but we'll want to make sure we can support both. As Sara said, you can look to Wikimedia for an example -- specifically that's the wikimedia_commons_workflow DAG, not the ingestion_workflow.

@rwidom
Collaborator

rwidom commented May 11, 2022

Awesome. Thanks all!
