An attempt at unifying the various tools for batch uploads under one repo.
Heavily based on the LSH repo.
You can install BatchUploadTools via pip using:
pip install git+https://github.com/lokal-profil/BatchUploadTools.git
If it is your first time running pywikibot you will also have to set up a
user-config.py file.
To run as a different user to your standard pywikibot simply place a
modified user-config.py file in the top directory.
To use a different user for a particular batch upload, place the user-config.py
in the subdirectory and run the script with -dir:<sub-directory>.
Extend make_info to create your own methods for reading and processing the indata.
Any method marked as abstract must be implemented locally. You can make use
of the various helper functions in the other classes.
If you are making use of mapping lists on Wikimedia Commons then create a
MappingList instance for each such list to manage the creation of the
mapping tables, the harvest of the tables when mapped and the preservation of
old mappings when new lists are needed for later uploads.
Alternatively you can make use of only the prep-uploader/uploader tools by creating your own indata file. This must then be a json file where each media file is represented by a dictionary entry with the original filename (without the file extension) as the key and the following values:
- info: the wikitext to be used on the description page (e.g. an information template)
- filename: the filename to be used on Commons (without the file extension)
- cats: a list of content categories (without "Category" prefix)
- meta_cats: a list of meta categories (without "Category" prefix)
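
As an illustration of the structure described above, the following is a minimal sketch of building such an indata dictionary in Python and serialising it to json. The entry contents and the validate_indata helper are hypothetical and not part of BatchUploadTools. Treating all four keys as required is an assumption based on the list above.

```python
import json

# Keys are the original filenames without the file extension.
# All values below are made-up example data.
indata = {
    "IMG_0001": {
        "info": "{{Information\n|description={{en|1=Example description}}\n}}",
        "filename": "Example collection - Example object - IMG 0001",
        "cats": ["Example content category"],
        "meta_cats": ["Example maintenance category"],
    },
}

REQUIRED_KEYS = {"info", "filename", "cats", "meta_cats"}


def validate_indata(data):
    """Return a list of problems found in an indata dictionary.

    Hypothetical helper: checks that every entry has the four keys
    and that no category carries a "Category:" prefix.
    """
    problems = []
    for original, entry in data.items():
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"{original}: missing {sorted(missing)}")
        for key in ("cats", "meta_cats"):
            for cat in entry.get(key, []):
                if cat.startswith("Category:"):
                    problems.append(f"{original}: '{cat}' has a Category prefix")
    return problems


# Serialise to a json string, keeping non-ASCII characters readable.
serialised = json.dumps(indata, ensure_ascii=False, indent=2)
```

The resulting string can then be written to a UTF-8 encoded file and handed to the prep-uploader/uploader tools.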
- Load indata to a dictionary
- Process the indata to generate mapping lists
- Load the indata and the mappings to produce a list of original filenames
  (of media files) and their final filenames as well as json holding the
  following for each file:
  - Maintenance categories
  - Content categories
  - File description
  - Output filename
  - Input filename or url to file (as key)
- Run the prep-uploader to rename the media files and create the text file for the associated file description page. *
- Run the uploader to upload it all
* This step is not needed for upload by url.
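
To make the prep step in the list above more concrete, here is a rough sketch of what renaming the media files and writing the description text files could look like. This is not the prep-uploader's actual implementation: prep_files is a hypothetical function, and appending the categories to the wikitext as category links is an assumption.

```python
import tempfile
from pathlib import Path


def prep_files(media_dir, indata):
    """Rename media files and write a .info text file next to each one.

    Illustrative only, the real prep-uploader may behave differently.
    Returns a dict mapping each original filename to its final filename.
    """
    renamed = {}
    # sorted() takes a snapshot, so files created below are not revisited.
    for path in sorted(Path(media_dir).iterdir()):
        entry = indata.get(path.stem)
        if entry is None or not path.is_file():
            continue  # no indata for this file, leave it untouched
        new_path = path.with_name(entry["filename"] + path.suffix)
        path.rename(new_path)
        # Assumption: categories are appended to the wikitext as links.
        cats = entry.get("cats", []) + entry.get("meta_cats", [])
        text = entry["info"] + "".join(f"\n[[Category:{c}]]" for c in cats)
        new_path.with_suffix(".info").write_text(text, encoding="utf-8")
        renamed[path.name] = new_path.name
    return renamed


if __name__ == "__main__":
    # Small demonstration in a throwaway directory with made-up data.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "IMG_0001.jpg").write_bytes(b"not a real image")
        example = {
            "IMG_0001": {
                "info": "Example wikitext",
                "filename": "Example final name",
                "cats": ["Example content category"],
                "meta_cats": ["Example maintenance category"],
            }
        }
        print(prep_files(tmp, example))
```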
To generate new tables:
- collect the data you wish mapped
- create a MappingList instance
- use mappings_merger() or multi_table_mappings_merger() to combine the collected data with pre-existing data (set update=False if there is no pre-existing data)
- pass the result to save_as_wikitext()
To make use of the mapping list data:
- create a MappingList instance
- load the existing data using load_old_mappings()
- pass the result to consume_entries()
To make use of the post-upload processing tools use import batchupload.postUpload.
For usage examples see lokal-profil/upload_batches. In particular SMM-images.
In most cases it is worth doing a second pass over any files which trigger an error, since it is either a temporary hiccup or the file was actually uploaded. Below follows a list of common errors and what to do about them (when known).
- stashedfilenotfound: Could not find the file in the stash.
  Seems to primarily be due to larger files. Solution: Manually upload this using Upload Wizard.
- stashfailed: This file contains HTML or script code that may be erroneously interpreted by a web browser.
  Either you really have html tags in your exif data or you have triggered T143610. Smaller files can often be uploaded unchunked (slow).
- stashfailed: Cannot upload this file because Internet Explorer would detect it as "$1", which is a disallowed and potentially dangerous file type
  No clue yet. See T147720
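
The second pass over failed files recommended above could be organised along the following lines. This is only a sketch: upload_with_second_pass and the upload callable it receives are hypothetical stand-ins, not functions provided by BatchUploadTools or pywikibot.

```python
def upload_with_second_pass(files, upload):
    """Try every file once, then retry the failures once.

    `upload` is a hypothetical callable that takes a filename and
    raises an exception on error. Returns a dict mapping each file
    that failed twice to the message of its last error.
    """
    failed = []
    for name in files:
        try:
            upload(name)
        except Exception:
            failed.append(name)

    still_failed = {}
    for name in failed:
        try:
            upload(name)
        except Exception as error:
            still_failed[name] = str(error)
    return still_failed
```

Files that remain in the returned dictionary are worth checking by hand, since they may in fact already have been uploaded.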
Basic support for Structured data on Commons is offered by passing expect_sdc
to the uploader and providing the data as either a <basename>.sdc file
(where <basename> is shared with the .info text file holding the associated
file description page) or under the sdc key if the data is provided as a
make_info json file.
The expected format of the data is described at pywikibot-sdc.