This script scrapes all photos and their attached data for use by the Beit Hatfutsot Open Databases. Images are uploaded to an AWS S3 bucket.
To enable scraping, valid AWS access keys must be set in the "bphotos/settings.py" file.
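The exact setting names depend on how the project's S3 storage is wired up; as a hypothetical sketch, assuming the spider uses Scrapy's standard S3-backed image storage, the relevant entries in bphotos/settings.py might look like this (the key values and bucket name below are placeholders):

```python
# bphotos/settings.py -- hypothetical sketch; the actual setting names may differ.
# Scrapy's built-in S3 storage reads AWS credentials from these settings:
AWS_ACCESS_KEY_ID = "your-access-key-id"          # placeholder, not a real key
AWS_SECRET_ACCESS_KEY = "your-secret-access-key"  # placeholder, not a real key

# Bucket and prefix where the images pipeline stores downloaded files:
IMAGES_STORE = "s3://your-bucket-name/photos/"    # placeholder bucket name
```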
Using the command line, run:
scrapy crawl bphotos -o bphotos.json
After the process completes, a .json file will be added to the folder, containing all the data for each photo, including URLs for the stored original-size images and thumbnails.
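The exact field names in bphotos.json come from the spider's item definition, so the keys below ("image_urls", "thumbnail") are assumptions for illustration only. A minimal sketch of sanity-checking the crawler output could look like this:

```python
import json

# Hypothetical field names -- the real keys come from the spider's Item class.
REQUIRED_KEYS = {"image_urls", "thumbnail"}

def check_records(records):
    """Return indices of records missing any of the expected keys."""
    return [i for i, rec in enumerate(records) if not REQUIRED_KEYS <= rec.keys()]

# Example usage with an in-memory sample instead of the real bphotos.json
# (for the real file, use: records = json.load(open("bphotos.json", encoding="utf-8"))):
sample = [
    {"image_urls": ["https://example.com/img1.jpg"], "thumbnail": "https://example.com/t1.jpg"},
    {"image_urls": ["https://example.com/img2.jpg"]},  # missing "thumbnail"
]
print(check_records(sample))  # → [1]
```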
Next, run parsing.py
(don't forget to change the name of the input file to match the one produced by the crawler, and make sure the output is valid).
Using the output file from the previous step, run merge_galleries.py
(and again make sure that the output is valid).
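Both post-processing steps ask you to verify that the output is valid. A small helper (a sketch, not part of the repository) to confirm that a file at least parses as JSON could be:

```python
import json

def is_valid_json_file(path):
    """Return True if the file at `path` parses as JSON, False otherwise."""
    try:
        with open(path, encoding="utf-8") as f:
            json.load(f)
        return True
    except (OSError, json.JSONDecodeError):
        return False

# Example usage after each step:
#   is_valid_json_file("bphotos.json")  -> True if the crawl output parses
```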