I wanted to understand what affects the literacy rate in Indian states and which factors states could work on to improve it. I found census data on Kaggle covering the literacy rate in different Indian states, along with education indicators (e.g., number of teachers, number of schools) for Indian states and union territories for 2015-16. Using that data, I tried to answer these 3 questions in depth:
- Which states have the highest and the lowest literacy rates?
- How are the top 3 states different from the bottom 3 states and what factors can the bottom 3 states work on to increase their literacy rate?
- In which class is the dropout rate highest?
I have also written a Medium post on this analysis, which can be found here: https://medium.com/@ishannangia/a-statistical-look-at-the-indian-literacy-rate-80a0e541ba7d
- data : This folder contains the datasets.
- Education in India.ipynb : This is the Jupyter notebook where all the analysis was done.
Note: You can find more information about the dataset at https://www.kaggle.com/rajanand/education-in-india
I followed the CRISP-DM process when working on the analysis:
I wanted to answer the following questions:
- Which states have the highest and the lowest literacy rates?
- How are the top 3 states different from the bottom 3 states and what factors can the bottom 3 states work on to increase their literacy rate?
- In which class is the dropout rate highest?
I collected the relevant data from here: https://www.kaggle.com/rajanand/education-in-india
This data has the literacy rate of different Indian states and the education indicators of different states.
The data was already clean, but a lot of new features had to be created in order to answer the questions, so there was a lot of data wrangling involved.
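As a rough illustration of the kind of derived feature this step involves, the sketch below computes a male-female literacy gap and a rural population share from raw columns. The field names (`male_literacy`, `rural_population`, and so on), the state labels and all the numbers are hypothetical placeholders, not the actual columns or values from the Kaggle dataset or the notebook.

```python
# Hypothetical rows: field names and numbers are illustrative placeholders,
# not values taken from the Kaggle dataset.
rows = [
    {"state": "State A", "male_literacy": 90.0, "female_literacy": 85.0,
     "rural_population": 400, "total_population": 1000},
    {"state": "State B", "male_literacy": 70.0, "female_literacy": 50.0,
     "rural_population": 900, "total_population": 1000},
]


def add_derived_features(row):
    """Return a copy of the row with two derived features added."""
    enriched = dict(row)
    # Gap between male and female literacy, in percentage points.
    enriched["literacy_gap"] = row["male_literacy"] - row["female_literacy"]
    # Share of the population living in rural areas (0 to 1).
    enriched["rural_share"] = row["rural_population"] / row["total_population"]
    return enriched


features = [add_derived_features(r) for r in rows]
for f in features:
    print(f["state"], f["literacy_gap"], round(f["rural_share"], 2))
```

The same idea carries over to a pandas DataFrame, where each derived feature would simply be a new column.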
No predictive model was needed for this analysis, so the answers to my questions came directly from exploring the data:
- The top 3 states were Kerala, Lakshadweep and Mizoram, and the bottom 3 were Bihar, Telangana and Arunachal Pradesh.
- The difference between male and female literacy rates, the rural population proportion and the dropout rate from 8th to 9th class played a huge role in separating the top 3 and bottom 3 states.
- The dropout rates in different classes were explored. The dropout rate was highest for 6th class, while in class 4 more students had enrolled than had dropped out.
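The ranking and the dropout comparison behind these answers can be sketched in a few lines. This is a minimal example under stated assumptions, not the notebook's code: the helper `top_and_bottom`, the state labels, the class labels and every number are hypothetical, and a negative dropout rate is assumed here to mean that more students enrolled than dropped out.

```python
def top_and_bottom(rates, n=3):
    """Return the n highest and n lowest keys of a {name: rate} mapping."""
    ranked = sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
    top = [name for name, _ in ranked[:n]]
    bottom = [name for name, _ in ranked[-n:]]
    return top, bottom


# Hypothetical literacy rates (percent); placeholders, not dataset values.
literacy = {
    "State A": 93.0, "State B": 91.5, "State C": 90.0, "State D": 75.0,
    "State E": 66.0, "State F": 64.0, "State G": 62.0,
}
top, bottom = top_and_bottom(literacy)
print("Top 3:", top)
print("Bottom 3:", bottom)

# Hypothetical dropout rates (percent) per class; a negative value is
# assumed to mean net enrolment rather than net dropout.
dropout_by_class = {"Class 4": -1.2, "Class 6": 5.1, "Class 8": 4.0}
worst_class = max(dropout_by_class, key=dropout_by_class.get)
print("Highest dropout rate:", worst_class)
```

Sorting once and slicing both ends keeps the top and bottom groups consistent with each other. With tied rates, the order among the tied states follows their order in the input.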
A blog post has been published on Medium here: https://medium.com/@ishannangia/a-statistical-look-at-the-indian-literacy-rate-80a0e541ba7d