Cardiovascular disease, also known as heart disease, refers to conditions affecting the heart or blood vessels. It includes four entities: coronary artery disease (CAD), cerebrovascular disease, peripheral artery disease (PAD), and aortic atherosclerosis. CAD is caused by decreased myocardial perfusion, resulting in angina due to ischemia, and can lead to myocardial infarction and/or heart failure. It accounts for one-third to one-half of all cases of cardiovascular disease. Other problems that can arise within the cardiovascular system include endocarditis, rheumatic heart disease, and conduction system abnormalities.
This notebook analyzes a cardiovascular dataset and then uses several machine-learning models to predict heart disease.
The dataset was taken from Kaggle and includes the following columns:
- General_Health - Would you say that in general, your health is?
- Checkup - About how long has it been since you last visited a doctor for a routine checkup?
- Exercise - During the past month, other than your regular job, did you participate in any physical activities or exercises such as running?
- Skin_Cancer - Respondents that reported having skin cancer.
- Other_Cancer - Respondents that reported having any other types of cancer.
- Depression - Respondents that reported having a depressive disorder (including depression, major depression).
- Diabetes - Respondents that reported having diabetes. If yes, what type of diabetes is/was it?
- Arthritis - Respondents that reported having arthritis.
- Sex - Respondent's Gender.
- Heart_Disease - Respondents that reported having coronary heart disease or myocardial infarction.
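As a rough illustration of how columns like these might be prepared for modelling, the sketch below builds a tiny in-memory table and maps Yes/No answers to 1/0 with pandas. The rows are made up for the example, and the answer values ("Yes"/"No", "Female"/"Male") and the helper name `encode_yes_no` are assumptions, not taken from the real Kaggle file or from the notebook.

```python
import pandas as pd

# Made-up toy rows that mimic a few of the columns listed above.
# The answer values are assumed; the real file may code them differently.
sample = pd.DataFrame({
    "Exercise": ["Yes", "No", "Yes", "No"],
    "Depression": ["No", "No", "Yes", "No"],
    "Sex": ["Female", "Male", "Male", "Female"],
    "Heart_Disease": ["No", "Yes", "No", "No"],
})


def encode_yes_no(df, columns):
    """Return a copy of df with Yes/No answers in `columns` mapped to 1/0."""
    out = df.copy()
    for col in columns:
        out[col] = out[col].map({"Yes": 1, "No": 0})
    return out


# Encode only the binary columns; "Sex" is left untouched here.
encoded = encode_yes_no(sample, ["Exercise", "Depression", "Heart_Disease"])
print(encoded)
```

With the real dataset, the same helper could be applied to the frame returned by `pd.read_csv` once the actual answer values have been checked.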
This notebook contains a complete analysis of the data in the form of multiple charts.
The notebook trains five different machine-learning models and compares them to find the best result. The models are:
- Logistic Regression
- K Nearest Neighbors (KNN)
- Random Forest
- Decision Tree
- Extra Decision Tree
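A comparison of this kind could be sketched with scikit-learn roughly as follows. This is not the notebook's actual code: it uses synthetic data from `make_classification` in place of the Kaggle dataset, default-like hyperparameters, and plain accuracy as the score. It also assumes that "Extra Decision Tree" refers to scikit-learn's `ExtraTreesClassifier`, which the text above does not confirm.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption for illustration only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# One estimator per model named in the list above.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    # Assumed mapping for "Extra Decision Tree".
    "Extra Trees": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Print the models from highest to lowest test accuracy.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Keeping the estimators in a dictionary makes it easy to add or swap a model without touching the training loop.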
To run the notebook, first set up the environment:
- Install Python version 3.
- Install Anaconda.
Install the following libraries to complete the setup:
- Pandas: To read the dataset.
    conda install -c anaconda pandas
- Matplotlib: To visualize data.
    conda install -c conda-forge matplotlib
- Seaborn: To visualize data.
    conda install -c anaconda seaborn
- Scikit-learn: For machine learning.
    conda install -c anaconda scikit-learn
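Once the commands above have run, a quick check like the following can confirm that each library imports in the active environment. It is only a convenience sketch; the helper name `check_environment` and the `REQUIRED` mapping are made up for this example.

```python
import importlib

# Package name as installed -> module name as imported.
REQUIRED = {
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def check_environment(packages=REQUIRED):
    """Return {package: version string, or None if it cannot be imported}."""
    report = {}
    for name, module_name in packages.items():
        try:
            module = importlib.import_module(module_name)
            report[name] = getattr(module, "__version__", "unknown")
        except Exception:
            # Any failure to import is treated as "not usable".
            report[name] = None
    return report


for name, version in check_environment().items():
    status = version if version is not None else "NOT INSTALLED"
    print(f"{name}: {status}")
```

Any package reported as not installed can be added by re-running the matching `conda install` command.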