- python 3.6.9
- selenium
- `scraping` contains code to scrape the complete dataset in the form of CSV files
- `preprocessing` contains code to convert the CSV files into a single JSON file after relevant preprocessing
- `data` contains all the CSV files and the preprocessed JSON file
- `library` contains a JSON file storing the coordinates of states, used to plot data on the states of India
- All the analyses are independent of each other, and their code is contained in their respective directories:
- To scrape the data, either do:
  - `cd scraping`
  - Download the latest Mozilla geckodriver from here
  - Place the downloaded driver inside the `drivers` directory with the name `geckodriver`, so that the final structure is `drivers/geckodriver`.
  - Run `chmod 700 *.sh`, then `bash scraping.sh`. Note that this can take more than 24 hours depending on your internet connection.
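Since the scrape can run for a day or more, it may be worth confirming that the driver is in place before starting. The following is a minimal sketch, not part of the repository: the function name `geckodriver_ready` is hypothetical, and it assumes the `drivers` directory sits directly inside `scraping`.

```python
import os
from pathlib import Path


def geckodriver_ready(scraping_dir="scraping"):
    """Return True if <scraping_dir>/drivers/geckodriver is an executable file.

    Assumption: the drivers directory lives directly inside the scraping
    directory, as the steps above suggest.
    """
    driver = Path(scraping_dir) / "drivers" / "geckodriver"
    return driver.is_file() and os.access(str(driver), os.X_OK)


if __name__ == "__main__":
    # Run from the repository root; pass "." when already inside scraping/.
    if geckodriver_ready():
        print("geckodriver found; ready to run scraping.sh")
    else:
        print("geckodriver missing or not executable in scraping/drivers/")
```

If the check fails only because the file is not executable, `chmod +x drivers/geckodriver` is the usual fix.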
OR
  - We have uploaded the scraped data to Google Drive, so you can download the zip of the dataset from here
  - `mkdir -p data`
  - Extract the zip inside `data/`
  - Finally, `data/dataset/` should contain only the CSV files.
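After extracting, the requirement that `data/dataset/` holds only CSV files can be checked with a short script. This is an illustrative sketch; the helper name `non_csv_entries` is hypothetical and not part of the repository.

```python
from pathlib import Path


def non_csv_entries(dataset_dir="data/dataset"):
    """Return the sorted names of entries in dataset_dir that are not .csv files.

    Subdirectories count as non-CSV entries, since the dataset directory
    is expected to hold the CSV files directly.
    """
    root = Path(dataset_dir)
    return sorted(
        p.name
        for p in root.iterdir()
        if not (p.is_file() and p.suffix.lower() == ".csv")
    )


if __name__ == "__main__":
    # Run from the repository root after extracting the zip.
    if not Path("data/dataset").is_dir():
        print("data/dataset/ not found; extract the zip inside data/ first")
    else:
        extras = non_csv_entries()
        if extras:
            print("Unexpected entries:", ", ".join(extras))
        else:
            print("data/dataset/ contains only CSV files")
```

An empty list means the directory matches the layout described above.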
- To preprocess the data, either do:
  - Run `cd preprocessing`, then `chmod 700 *.sh`, then `bash preprocess.sh`. This can take around 10 minutes depending on your system.
OR
  - We have uploaded the preprocessed data to Google Drive, so you can download the preprocessed zip from here
  - `mkdir -p data`
  - Extract the zip inside `data/`
  - Finally, `data.json` should be present inside the `data/` directory.
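Whichever route is taken, a quick way to confirm that `data/data.json` is present and parses as valid JSON is sketched below. The helper name `load_preprocessed` is hypothetical, UTF-8 encoding is an assumption, and nothing is assumed about the structure of the file beyond it being JSON.

```python
import json
from pathlib import Path


def load_preprocessed(path="data/data.json"):
    """Load and return the preprocessed JSON file.

    Assumption: the file is UTF-8 encoded. Raises FileNotFoundError if the
    file is absent and json.JSONDecodeError if it is not valid JSON.
    """
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    # Run from the repository root.
    if not Path("data/data.json").is_file():
        print("data/data.json not found; preprocess or download it first")
    else:
        data = load_preprocessed()
        print("Loaded data.json; top-level type:", type(data).__name__)
```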
- After scraping and preprocessing, to get results from each analysis, refer to the READMEs of the corresponding analysis directories: