Select the predictors based on the importance rankings.
Data - Kaggle Titanic Dataset
Takeaway:
- Optimal number of features identified based on accuracy using a Random Forest Estimator
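A minimal sketch of how importance-based selection might look, assuming scikit-learn and a synthetic dataset from `make_classification` as a stand-in for the Titanic features (the real columns and preprocessing are not shown here). The names `ranking`, `scores`, and `best_k` are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Assumption: synthetic data standing in for the prepared Titanic features.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, n_redundant=0, random_state=0
)

# Rank features by impurity-based importance from a forest fit on all features.
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
ranking = np.argsort(forest.feature_importances_)[::-1]  # most important first

# Score the top-k features for each k with 5-fold cross-validated accuracy.
scores = {}
for k in range(1, X.shape[1] + 1):
    top_k = ranking[:k]
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    scores[k] = cross_val_score(model, X[:, top_k], y, cv=5, scoring="accuracy").mean()

# Keep the k with the highest mean accuracy.
best_k = max(scores, key=scores.get)
selected = ranking[:best_k]
print(f"best k = {best_k}, CV accuracy = {scores[best_k]:.3f}")
```

Because the ranking here is computed on the full dataset before cross-validation, the reported accuracy is slightly optimistic. scikit-learn's `RFECV` offers a related, built-in approach that re-ranks features inside each fold.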
Statistically, 59% of customers don’t return after a bad customer service experience.
Data - Kaggle Telco Dataset
Churn: Whether the customer churned or not (Yes or No)
Takeaway:
- Random forests require less preprocessing, and the training process is much simpler.
- Hyperparameter tuning is easier with random forests than with neural networks.
Predictive model to help the telemarketing team concentrate resources on the most promising clients first.
Model Comparison:
- Linear Regression
- KNeighbors
- SVM: Support Vector Machines
- Decision Trees
- Random Forest Classifier
Data Source - 41,118 bank records collected between 2008 and 2013, containing the results of a telemarketing campaign, including each customer’s response to the bank’s offer of a deposit contract.
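A comparison like the one listed above could be sketched as follows, assuming scikit-learn and an imbalanced synthetic dataset in place of the bank data. Two choices here are assumptions rather than the project's confirmed setup: `LogisticRegression` stands in for the linear model because the target is a yes/no response, and ROC AUC is used as the metric because most clients decline the offer.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic, imbalanced data standing in for the campaign records.
X, y = make_classification(
    n_samples=500, n_features=12, n_informative=5, weights=[0.85, 0.15], random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with StandardScaler.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "KNeighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold ROC AUC per model; higher is better.
results = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, score in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} ROC AUC = {score:.3f}")
```

Ranking clients by predicted probability from the best model is one way the telemarketing team could decide whom to call first.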
Classification model for breast cancer using Random Forest, PCA, and hyperparameter tuning (RandomizedSearchCV, GridSearchCV)
Model Comparison:
- Baseline RF
- Baseline RF with PCA
- Baseline RF with PCA and Hyperparameter Tuning
Data Source - Scikit-learn “breast cancer” dataset: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html
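The three variants in the comparison can be sketched roughly as below, using the scikit-learn dataset named above. The number of PCA components, the search space, and the train/test split are illustrative assumptions, not the project's actual settings, and only `RandomizedSearchCV` is shown (`GridSearchCV` accepts the same pipeline with an exhaustive `param_grid`).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 1) Baseline random forest on the raw features.
baseline = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# 2) Random forest on PCA components (features are standardized before PCA).
rf_pca = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=10)),  # assumption: 10 components
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
]).fit(X_train, y_train)

# 3) Same pipeline, with a randomized search over PCA and forest settings.
param_distributions = {
    "pca__n_components": [5, 10, 15],
    "rf__n_estimators": [50, 100, 200],
    "rf__max_depth": [None, 5, 10],
    "rf__min_samples_leaf": [1, 2, 4],
}
search = RandomizedSearchCV(
    rf_pca, param_distributions, n_iter=8, cv=3, scoring="accuracy", random_state=0
)
search.fit(X_train, y_train)

# Held-out accuracy for each variant.
accuracies = {
    "Baseline RF": baseline.score(X_test, y_test),
    "Baseline RF with PCA": rf_pca.score(X_test, y_test),
    "Baseline RF with PCA and Hyperparameter Tuning": search.score(X_test, y_test),
}
for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
print("best params:", search.best_params_)
```

Keeping scaling and PCA inside the pipeline means they are refit on each cross-validation fold during the search, which avoids leaking information from the validation folds.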