This is a repository for sharing the code and reports for the DA231 MTech (Online) class project.
Our goal is to find the correlation between literacy, rainfall, groundwater quality and crop production across India.
It is intuitive that good rainfall is conducive to good production. How do the regions perform during poor rainfall in the rainy season?
We also want to cross-match these results with the literacy rate of each region to see whether higher literacy affects crop production.
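As a rough illustration of the kind of correlation check described above, here is a minimal sketch that computes a Pearson coefficient using only the standard library. The function name `pearson` and the sample values are assumptions made for this example. The numbers are synthetic placeholders and do not come from the datasets or from the project notebooks.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Synthetic placeholder values (NOT taken from the datasets):
# yearly rainfall and crop production for one hypothetical region.
rainfall_mm = [800.0, 950.0, 700.0, 1100.0, 900.0]
production_t = [1600.0, 1900.0, 1400.0, 2200.0, 1800.0]

r = pearson(rainfall_mm, production_t)
print(f"rainfall vs production: r = {r:.3f}")
```

The same function could be applied to literacy rate against production once both series are aligned on the same regions.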
The project structure is as follows:
root
|- code   : Contains the iPython files exported from Google Colab
|- data   : Contains all the data which is used for running the code
|- report : Contains PDF report for the project
|- slides : Contains PDF file for presentation
Source: https://www.kaggle.com/anjali21/agricultural-production-india
Size: 2,46,091 records x 7 columns (~14 MB)
Format: CSV file
Granularity: States and Districts in India. Years - 2000-2014
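Because the rainfall and literacy data are at state level while this dataset goes down to districts, the crop records would likely need to be rolled up before comparison. The following is a minimal sketch of that step, assuming the CSV has columns named `State_Name`, `Crop_Year` and `Production`. Those column names, the function name, and the inline sample rows are assumptions for illustration and should be checked against the actual file.

```python
import csv
import io
from collections import defaultdict


def production_by_state_year(rows):
    """Sum Production per (state, year), skipping rows with no reported value."""
    totals = defaultdict(float)
    for row in rows:
        value = (row.get("Production") or "").strip()
        if not value:
            continue  # record has no reported production
        key = (row["State_Name"].strip(), int(row["Crop_Year"]))
        totals[key] += float(value)
    return dict(totals)


# Synthetic placeholder rows (NOT real records); column names are assumed.
sample_csv = """State_Name,District_Name,Crop_Year,Season,Crop,Area,Production
StateA,District1,2000,Kharif,Rice,100,250
StateA,District2,2000,Kharif,Rice,50,100
StateA,District1,2001,Kharif,Rice,100,
StateB,District3,2000,Rabi,Wheat,80,160
"""

totals = production_by_state_year(csv.DictReader(io.StringIO(sample_csv)))
for (state, year), amount in sorted(totals.items()):
    print(state, year, amount)
```

Summing across different crops mixes unlike quantities, so filtering to a single crop before aggregating may be preferable.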
Quality of Groundwater, Lake/Tank Water, River Water: https://cpcb.nic.in/nwmp-data/
Size: ~1500 records x 18 columns * 8 years
Format: Individual PDF for each year
Granularity: Detailed up to individual water sources for Years 2012-2019
Challenges:
- PDF to JSON/CSV conversion
- Converting location code/location information to district for correlation with other data sets
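The second challenge above could be approached in several ways. Here is a minimal sketch of one option: normalize the free-text location and look for a known district name inside it. The function names, the district list and the location strings are hypothetical and only serve to illustrate the idea. The project's actual approach may differ.

```python
import re


def normalize(name):
    """Lower-case, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def find_district(location, districts):
    """Return the known district whose name appears as whole words in location, else None."""
    padded = f" {normalize(location)} "
    # Try longer names first so that multi-word districts are not shadowed
    # by a shorter district name they happen to contain.
    for district in sorted(districts, key=len, reverse=True):
        if f" {normalize(district)} " in padded:
            return district
    return None


# Hypothetical district names and location strings, for illustration only.
known_districts = ["DistrictA", "North DistrictA"]

print(find_district("Well at Village X, North DistrictA", known_districts))
print(find_district("River Y at Town Z", known_districts))
```

Locations that return `None` would still need manual review or a lookup by location code.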
Source: https://www.kaggle.com/rajanand/rainfall-in-india
Size: 4116 records x 19 columns
Format: CSV file
Granularity: States in India. Years - 1901-2015
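To study how regions perform during poor rainfall in the rainy season, the years with a weak monsoon first have to be identified. The sketch below flags a year as poor when its rainy-season total falls below a fraction of that region's own mean. The 0.8 threshold, the function name, and the `(region, year, rainfall)` record layout are assumptions chosen for illustration, not definitions taken from the project. The sample values are synthetic.

```python
from collections import defaultdict


def poor_monsoon_years(records, threshold=0.8):
    """Map each region to the sorted years whose rainy-season rainfall is
    below `threshold` times that region's mean rainy-season rainfall."""
    by_region = defaultdict(list)
    for region, year, monsoon_mm in records:
        by_region[region].append((year, monsoon_mm))

    poor = {}
    for region, series in by_region.items():
        mean_mm = sum(mm for _, mm in series) / len(series)
        poor[region] = sorted(
            year for year, mm in series if mm < threshold * mean_mm
        )
    return poor


# Synthetic placeholder values (NOT taken from the dataset).
records = [
    ("RegionA", 2000, 1000.0),
    ("RegionA", 2001, 600.0),
    ("RegionA", 2002, 1100.0),
    ("RegionA", 2003, 900.0),
]

print(poor_monsoon_years(records))
```

The flagged years can then be used to select the matching crop production records for comparison against normal years.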
Source: https://www.kaggle.com/doncorleone92/govt-of-india-literacy-rate
Size: 36 records x 8 columns
Format: CSV file
Granularity: State level from 2001 and 2011 census
The main code is available in RAW Crop.ipynb
The evaluation code is available in RAW Crop Evaluation.ipynb