The Kiwix catalog should be automatically populated and updated #20

Open
letompouce opened this issue Apr 23, 2020 · 6 comments

@letompouce
Member

We've been struggling with naming exceptions in the Kiwix library, which prevented us from automatically updating our catalog's kiwix.yml.

Time has passed since then: Kiwix has finally industrialized its processes, and there is now an API that provides nearly all of the information we need.

Example with a new ZIM, nota-bene, for which the recipe has been published.

We can build nearly everything from it, from the filename to its download link:

$ curl --silent https://api.farm.openzim.org/v1/schedules/nota_bene \
    | jq -r '.config.flags.name + "_" + ( .most_recent_task.updated_at | split("-")[0:2] | join("-"))'
nota-bene_fr_all_2020-04
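
For a script that updates kiwix.yml, the same derivation could be sketched in Python. This is only an illustration of the jq expression above: the function name `zim_basename` is made up, and the sample document is a hypothetical, trimmed-down response that contains only the two fields the jq filter reads (the timestamp value is invented).

```python
def zim_basename(schedule: dict) -> str:
    """Build '<name>_<YYYY-MM>' from a ZimFarm schedule document."""
    name = schedule["config"]["flags"]["name"]
    updated_at = schedule["most_recent_task"]["updated_at"]
    # Same as jq's `split("-")[0:2] | join("-")`: keep year and month only.
    year_month = "-".join(updated_at.split("-")[0:2])
    return f"{name}_{year_month}"


# Hypothetical sample, NOT a real API response: only the fields used above.
sample = {
    "config": {"flags": {"name": "nota-bene_fr_all"}},
    "most_recent_task": {"updated_at": "2020-04-20T12:00:00"},
}

print(zim_basename(sample))  # nota-bene_fr_all_2020-04
```

In a real run, `sample` would be replaced by the decoded JSON body returned by the schedules endpoint queried above.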

We could use this information to automatically update our kiwix.yml catalog.

We could even convert the whole Kiwix library to our catalog format, which would make anything published by Kiwix available to our devices. Some questions remain about what to cache (we could introduce a weight field based on deploy frequency) and about the large number of entries, which would probably overwhelm our content curators. Still, we can technically do it.

@letompouce letompouce self-assigned this Apr 23, 2020
@mgautierfr
Member

Hello @letompouce :)

Note that we also have an OPDS stream to get the list of ZIM files:
https://library.kiwix.org/catalog/root.xml
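
A rough sketch of how such a stream could be read with Python's standard library, assuming it is an Atom-based OPDS feed. The inline sample feed, its URLs, and the `application/x-zim` link type are assumptions made for illustration; the real feed's entries should be inspected before relying on any of these fields. The helper name `list_zims` is hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace used by OPDS feeds

# Hypothetical sample feed, NOT copied from library.kiwix.org.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example catalog</title>
  <entry>
    <title>Nota Bene</title>
    <link type="application/x-zim" href="https://example.org/nota-bene_fr_all_2020-04.zim"/>
  </entry>
  <entry>
    <title>Entry without a ZIM link</title>
    <link type="text/html" href="https://example.org/about"/>
  </entry>
</feed>"""


def list_zims(feed_xml, zim_type="application/x-zim"):
    """Return (title, href) pairs for entries that carry a ZIM link."""
    root = ET.fromstring(feed_xml)
    found = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        for link in entry.findall(ATOM + "link"):
            # Assumed convention: the download link is flagged by its MIME type.
            if link.get("type") == zim_type:
                found.append((title, link.get("href")))
    return found


for title, href in list_zims(SAMPLE_FEED):
    print(title, href)
```

Printing the pairs as plain lines (or writing them to a CSV) might also be a starting point for a list that non-technical readers can browse.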

@barbayellow

Do you have something that would be human-readable? It's for our content team, which is not very familiar with API manipulation or XML parsing :)

@mgautierfr
Member

There is this page: https://wiki.kiwix.org/wiki/Content_in_all_languages

I think it is up to date but I don't know how accurate it is. @kelson42 may give you some insight here.

@kelson42

This is updated once a day, but the goal is to get rid of it and use something based on the OPDS stream.

@letompouce
Member Author

Follow-up: we have made some progress on this issue since then.

kitom is a script that can add a new ZIM to our catalog: it parses the ZimFarm API and several other endpoints to fill the required fields in Omeka. The script is also able to update the ZIM files registered in Omeka.

zimeka2descriptor builds a new catalog from the ZIM files registered in Omeka.

(note: these two scripts are in private repositories for now, mostly for aesthetic reasons; Good Friends have no secrets from each other, feel free to ask :p )

We found some corner cases that are worth discussing with the Kiwix people. Stay tuned!

@kelson42

kelson42 commented Feb 8, 2021

@letompouce Like I said a few times already, it is a bad idea to treat the zimfarm as a publishing/distribution platform. It is not. Because things happen automatically most of the time today, a few assumptions can be made... but these assumptions might become wrong really quickly.
