This is an end-to-end machine learning project. It predicts whether a person has a particular disease based on the data they enter. Predictions are made with machine learning (ML) classification algorithms. The web app currently covers three diseases: diabetes, Parkinson's disease, and heart disease.
The datasets used to train the ML models are:
- The diabetes dataset consists of 768 data points, each with 8 features. It is the Pima Indians Diabetes Database, available on Kaggle.
Features

1. Pregnancies: Number of times pregnant
2. Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
3. BloodPressure: Diastolic blood pressure (mm Hg)
4. SkinThickness: Triceps skin fold thickness (mm)
5. Insulin: 2-hour serum insulin (mu U/ml)
6. BMI: Body mass index (weight in kg/(height in m)^2)
7. DiabetesPedigreeFunction: Diabetes pedigree function
8. Age: Age (years)

Target Variable

9. Outcome: Class variable (0 or 1); 268 of the 768 records are 1, the rest are 0.

- The heart dataset consists of 1025 data points, each with 13 features. It is the Heart Disease Dataset, available on Kaggle.
Features

1. age: Age in years
2. sex: 1 = male; 0 = female
3. cp: Chest pain type
4. trestbps: Resting blood pressure (in mm Hg on admission to the hospital)
5. chol: Serum cholesterol in mg/dl
6. fbs: Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
7. restecg: Resting electrocardiographic results
8. thalach: Maximum heart rate achieved
9. exang: Exercise-induced angina (1 = yes; 0 = no)
10. oldpeak: ST depression induced by exercise relative to rest
11. slope: The slope of the peak exercise ST segment
12. ca: Number of major vessels (0-3) colored by fluoroscopy
13. thal: 0 = normal; 1 = fixed defect; 2 = reversible defect

Target Variable

14. target: Class variable (0 or 1); 526 of the 1025 records are 1, the rest are 0. 0 = no heart disease, 1 = heart disease.

- The Parkinson's disease dataset consists of 195 data points, each with 22 features. It is the Parkinsons Disease Dataset, available on Kaggle.
Features

1. MDVP:Fo(Hz): Average vocal fundamental frequency
2. MDVP:Fhi(Hz): Maximum vocal fundamental frequency
3. MDVP:Flo(Hz): Minimum vocal fundamental frequency
4. MDVP:Jitter(%)
5. MDVP:Jitter(Abs)
6. MDVP:RAP
7. MDVP:PPQ
8. Jitter:DDP: Features 4 to 8 are several measures of variation in fundamental frequency
9. MDVP:Shimmer
10. MDVP:Shimmer(dB)
11. Shimmer:APQ3
12. Shimmer:APQ5
13. MDVP:APQ
14. Shimmer:DDA: Features 9 to 14 are several measures of variation in amplitude
15. NHR
16. HNR: Features 15 and 16 are two measures of the ratio of noise to tonal components in the voice
17. RPDE
18. DFA: Signal fractal scaling exponent
19. spread1
20. spread2
21. PPE: Features 19 to 21 are three nonlinear measures of fundamental frequency variation
22. D2: Together with RPDE (feature 17), one of two nonlinear dynamical complexity measures

Target Variable

23. status: Class variable (0 or 1); 147 of the 195 records are 1, the rest are 0. 1 = Parkinson's, 0 = healthy.
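
The class counts listed for the three datasets differ quite a bit in how balanced they are, which matters when judging a classifier's accuracy. The following is a minimal sketch, not part of the project's code, that computes the share of positive records and the accuracy of a trivial "always predict the more common class" baseline. The `DatasetSummary` class and `SUMMARIES` list are hypothetical names introduced for illustration; only the counts come from the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetSummary:
    """Hypothetical helper holding the counts quoted in this README."""

    name: str
    n_samples: int
    n_features: int
    n_positive: int  # records whose target variable equals 1

    @property
    def positive_rate(self) -> float:
        """Fraction of records labelled 1."""
        return self.n_positive / self.n_samples

    @property
    def majority_baseline(self) -> float:
        """Accuracy obtained by always predicting the more common class."""
        return max(self.positive_rate, 1.0 - self.positive_rate)


# Counts copied from the dataset descriptions above.
SUMMARIES = [
    DatasetSummary("Diabetes", n_samples=768, n_features=8, n_positive=268),
    DatasetSummary("Heart Disease", n_samples=1025, n_features=13, n_positive=526),
    DatasetSummary("Parkinson's", n_samples=195, n_features=22, n_positive=147),
]

if __name__ == "__main__":
    for s in SUMMARIES:
        print(
            f"{s.name}: {s.positive_rate:.1%} positive, "
            f"majority-class baseline accuracy {s.majority_baseline:.1%}"
        )
```

With these counts the baseline is roughly 65% for diabetes, 51% for heart disease, and 75% for Parkinson's, so a model's accuracy is only informative to the extent that it exceeds the corresponding figure.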