Add download Wikidata dump command to CLI #517
Labels
-priority-
High priority
feature
New feature or request
help wanted
Extra attention is needed
Outreachy
Available for Outreachy participants
Comments
This was referenced Dec 8, 2024
andrewtavis added the help wanted, -priority-, and Outreachy labels on Dec 8, 2024
Apologies in advance if those features are not planned. 😞
No need to apologize, @axif0! You're a part of planning the features :) :) This seems good to me! We'll just take the next most recent one? If there's no most recent one we'll take the last most recent one?
Yes.
Description
Scribe-Data will be expanding its functionality to work from Wikidata dumps. The first step in this is to add the ability for the CLI to download Wikidata Lexeme dumps. The following command should be added in this issue:
The above will download the dumps from dumps.wikimedia.org/wikidatawiki/entities/. In the first set of queries the latest .json.bz2 file will be downloaded, and in the second the URL for the given YYYYMMDD stamp will be checked and a .json.bz2 dump will be downloaded to the PWD. The third would add in an output directory path as is done on the get command, but let's not change the file name. We'll just allow the user to put it in a directory 😊
The functionality should be added in a file src/scribe_data/cli/download.py, with the option being added into src/scribe_data/cli/main.py :)
Contribution
Being worked on by @axif0 as a part of Outreachy! 📶🚀
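For anyone picking this up, here is a rough sketch of the three behaviors from the Description (latest dump, dump for a given YYYYMMDD stamp, optional output directory with the file name unchanged). It uses only the Python standard library. The function names (build_dump_url, resolve_output_path, download_dump) and the dump file names (latest-lexemes.json.bz2 and wikidata-YYYYMMDD-lexemes.json.bz2 inside a dated directory) are assumptions for illustration, not the actual Scribe-Data implementation, so they should be checked against the dumps index before use.

```python
import shutil
import urllib.request
from datetime import datetime
from pathlib import Path
from typing import Optional

# Base URL taken from the issue description.
DUMP_BASE_URL = "https://dumps.wikimedia.org/wikidatawiki/entities/"


def build_dump_url(date_stamp: Optional[str] = None) -> str:
    """Return the URL of the latest dump, or of the dump for a YYYYMMDD stamp."""
    if date_stamp is None:
        # Assumption: the latest lexeme dump is published under this name.
        return DUMP_BASE_URL + "latest-lexemes.json.bz2"

    # Require exactly eight ASCII digits before parsing the date.
    if len(date_stamp) != 8 or not (date_stamp.isascii() and date_stamp.isdigit()):
        raise ValueError(f"Expected a YYYYMMDD stamp, got {date_stamp!r}")
    # strptime raises ValueError for impossible dates such as 20241301.
    datetime.strptime(date_stamp, "%Y%m%d")

    # Assumption: dated dumps live in a directory named after the stamp.
    return f"{DUMP_BASE_URL}{date_stamp}/wikidata-{date_stamp}-lexemes.json.bz2"


def resolve_output_path(url: str, output_dir: Optional[str] = None) -> Path:
    """Keep the dump's own file name; only the directory is user-controlled."""
    file_name = url.rsplit("/", 1)[-1]
    target_dir = Path(output_dir) if output_dir else Path.cwd()
    return target_dir / file_name


def download_dump(
    date_stamp: Optional[str] = None, output_dir: Optional[str] = None
) -> Path:
    """Stream the dump to disk and return the path it was written to."""
    url = build_dump_url(date_stamp)
    destination = resolve_output_path(url, output_dir)
    destination.parent.mkdir(parents=True, exist_ok=True)

    # Copy in chunks so the large .json.bz2 file is never held in memory.
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return destination
```

Calling download_dump() would fetch the latest dump into the PWD, download_dump("20241127") would request the dump for that stamp, and download_dump("20241127", "dumps") would place the same file in a dumps directory. The fallback discussed in the comments (taking another recent dump when the requested date is missing) is not covered here.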