Completed Data Mining project using PLACES data from the CDC
The project report is a PDF summary. The Data_Mining_Final_Project file is the actual notebook.
Data Background

The CDC and the Robert Wood Johnson Foundation have been partnering for the last few years to provide a large repository of data and metrics on public health. This data is known as the PLACES dataset. Its purpose is to help the CDC stay ahead of emerging disease trends. Hundreds of cities and many disease metrics are tracked, so a vast amount of information can be mined from the repository.

My project aims to do two things. The first is to navigate such a large dataset successfully. The second is to answer the question: for the states with more PLACES data locations, are disease metrics generally different, and if so, how do they differ?
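One way the second question could be approached is sketched below. This is only an illustration, not the method used in the notebook. It assumes each record carries a state, a location, a measure name, and a numeric value. The key names `state`, `location`, `measure`, and `value` are hypothetical and are not the actual PLACES column names. Splitting states at the median location count is also an assumption. The rows are made up for demonstration.

```python
from collections import defaultdict
from statistics import mean, median


def compare_by_location_count(rows):
    """Compare mean metric values between states with many and few locations.

    rows: iterable of dicts with the hypothetical keys
    'state', 'location', 'measure', 'value'.
    """
    rows = list(rows)

    # Count distinct locations per state.
    locations = defaultdict(set)
    for r in rows:
        locations[r["state"]].add(r["location"])
    counts = {state: len(locs) for state, locs in locations.items()}

    # States strictly above the median count form the "many locations" group.
    cutoff = median(counts.values())
    many = {s for s, n in counts.items() if n > cutoff}

    # Collect values per measure for each group.
    values = defaultdict(lambda: {"many": [], "few": []})
    for r in rows:
        group = "many" if r["state"] in many else "few"
        values[r["measure"]][group].append(r["value"])

    # Report the group means and their difference for each measure.
    result = {}
    for measure, groups in values.items():
        if groups["many"] and groups["few"]:
            m, f = mean(groups["many"]), mean(groups["few"])
            result[measure] = {"many": m, "few": f, "difference": m - f}
    return result


# Made-up rows for illustration only; these are not real PLACES values.
toy_rows = [
    {"state": "AA", "location": "City 1", "measure": "Metric X", "value": 10.0},
    {"state": "AA", "location": "City 2", "measure": "Metric X", "value": 12.0},
    {"state": "AA", "location": "City 3", "measure": "Metric X", "value": 14.0},
    {"state": "BB", "location": "City 4", "measure": "Metric X", "value": 8.0},
    {"state": "CC", "location": "City 5", "measure": "Metric X", "value": 6.0},
]
summary = compare_by_location_count(toy_rows)
print(summary)
```

Note that averaging location-level values gives more weight to states with more locations. Averaging within each state first and then across states would be a reasonable alternative.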