Adding Customer Churn Prediction Model #1576

Merged: 9 commits, Oct 31, 2024
@@ -0,0 +1,90 @@
# Customer Churn Prediction

## Overview
This project predicts customer churn with three machine learning algorithms: Random Forest, XGBoost, and Logistic Regression. The models help identify customers who are likely to discontinue services, enabling proactive retention strategies.

## Table of Contents
- [Features](#features)
- [Requirements](#requirements)
- [Project Structure](#project-structure)
- [Installation](#installation)
- [Model Comparison](#model-comparison)

## Features
- Data preprocessing and feature engineering
- Implementation of three machine learning algorithms:
  - Random Forest Classifier
  - XGBoost Classifier
  - Logistic Regression
- Model performance comparison and evaluation
- Feature importance analysis
- Cross-validation for robust model validation
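
The cross-validated comparison listed above could look roughly like the sketch below. It is not the notebook's actual code: the data is synthetic (generated with `make_classification`, not `Churn_Modelling.csv`), the helper name `compare_models` and all hyperparameters are assumptions, and XGBoost is left out to keep the example dependent on scikit-learn only. An `xgboost.XGBClassifier` could be added to the `models` dictionary in the same way, since it follows the scikit-learn estimator interface.

```python
# Sketch: compare classifiers with 5-fold cross-validation on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(X, y, cv=5):
    """Return the mean cross-validated accuracy for each model name."""
    models = {
        # Hyperparameters here are illustrative assumptions.
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
        # Scaling lives inside the pipeline so each fold fits its own scaler.
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in models.items()
    }


# Synthetic, imbalanced stand-in for the churn data (about 20% positives).
X, y = make_classification(
    n_samples=300, n_features=8, weights=[0.8, 0.2], random_state=0
)
scores = compare_models(X, y)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Putting the scaler inside the pipeline avoids leaking statistics from the validation fold into training, which is one reason to prefer a pipeline over scaling the full dataset up front.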

## Requirements
```
python>=3.8
pandas
numpy
scikit-learn
xgboost
matplotlib
seaborn
```

## Project Structure
```
customer-churn-prediction/
│
├── data/
│   └── Churn_Modelling.csv
│
├── saved models/
│   ├── Gradient_Boosting_Classifier.joblib
│   └── scaler.joblib
│
├── notebooks/
│   └── Model.ipynb
│
└── README.md
```

## Installation
1. Clone the repository:
```bash
git clone https://github.com/yourusername/customer-churn-prediction.git
cd customer-churn-prediction
```

2. Create a virtual environment and activate it:
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```

3. Install required packages:
```bash
pip install -r requirements.txt
```

## Model Comparison

### Performance Metrics

| Model               | Accuracy | Precision |
|---------------------|----------|-----------|
| Random Forest       | 0.87     | 0.83      |
| XGBoost             | 0.88     | 0.86      |
| Logistic Regression | 0.82     | 0.77      |
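
For reference, accuracy and precision can be computed from confusion-matrix counts as in the sketch below. The counts are hypothetical and chosen only for illustration; they are not taken from this project's results. Recall is included as well (an addition, not a metric from the table) because on imbalanced churn data a model can score well on accuracy and precision while still missing most churners.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Fraction of predicted churners who actually churned."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of actual churners the model caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts for 2000 customers, 400 of whom churned.
tp, fp, tn, fn = 120, 20, 1580, 280

print(
    f"accuracy={accuracy(tp, fp, tn, fn):.3f} "
    f"precision={precision(tp, fp):.3f} "
    f"recall={recall(tp, fn):.3f}"
)
# prints: accuracy=0.850 precision=0.857 recall=0.300
```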

### Key Findings
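- XGBoost performed best overall, with the highest accuracy (0.88), precision (0.86), and AUC-ROC score (AUC values are not listed in the table above)
- Random Forest was close behind, with slightly lower accuracy (0.87) and precision (0.83)
- Logistic Regression provided a good baseline (accuracy 0.82, precision 0.77) but was outperformed by both ensemble methods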

Threshold vs Recall and Threshold vs Precision graph (XGBoost)
![Threshold vs recall and threshold vs precision (XGBoost)](https://github.com/user-attachments/assets/42be4ba5-052d-4e7c-8c16-be57bc929d80)
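
A threshold plot like the one above is built by recomputing precision and recall at each candidate decision threshold. The sketch below shows the idea in plain Python, assuming a customer is flagged as a churner when the predicted probability is at or above the threshold. The labels and probabilities are made up, and the function name `precision_recall_at` is hypothetical.

```python
def precision_recall_at(y_true, y_prob, threshold):
    """Precision and recall when predicting churn for y_prob >= threshold."""
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for t, p in pairs if p >= threshold and t == 1)
    fp = sum(1 for t, p in pairs if p >= threshold and t == 0)
    fn = sum(1 for t, p in pairs if p < threshold and t == 1)
    # With no positive predictions, precision is set to 1.0 by convention.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels (1 = churned) and predicted churn probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]

for threshold in (0.2, 0.5, 0.75):
    p, r = precision_recall_at(y_true, y_prob, threshold)
    print(f"threshold={threshold:.2f} precision={p:.3f} recall={r:.3f}")
# prints:
# threshold=0.20 precision=0.571 recall=1.000
# threshold=0.50 precision=0.750 recall=0.750
# threshold=0.75 precision=0.500 recall=0.250
```

Lowering the threshold catches more churners (higher recall) at the cost of more false alarms, so the right operating point depends on how expensive a missed churner is relative to an unnecessary retention offer.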

ROC Curve
![ROC curve](https://github.com/user-attachments/assets/3a3cacb5-15e2-4876-bf49-94d3d3515866)
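
The area under a ROC curve equals the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner, with ties counted as one half. The sketch below computes AUC that way on made-up data. It is an illustration of the metric, not the code used to produce the plot above, and the pairwise loop is only practical for small samples.

```python
def roc_auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # A correctly ordered pair scores 1, a tie scores 0.5, otherwise 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = churned) and predicted churn probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]

print(f"AUC = {roc_auc(y_true, y_prob):.2f}")
# prints: AUC = 0.75
```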
