Link to dataset: here
The problem centers on the notion of a “churner”: a customer who ends their agreement with a company. In this context, it is important to detect in advance whether a customer is at high risk of breaking their contract, so that the company can make a commercial offer to retain them.
It is also important to be selective about which customers receive offers. On the one hand, offers should not go to customers who are not future churners (this is less profitable, can be poorly received, and wastes time), so action should be prioritized on likely churners. On the other hand, detection should not happen at the last moment, so that there is still time to regain the customer's trust. This is why the data provided covers 4 consecutive months, dividing the customer life cycle into three phases (good / action / termination).
In short, the goal is to predict, using machine learning tools, which customers are highly likely to leave the company, prioritizing the prediction of churners over that of non-churners and evaluating the models with the specificity metric.
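As a reminder of what the metric measures, the following minimal sketch computes specificity and sensitivity from the four confusion-matrix counts in plain Python. The label encoding (1 = churner, 0 = non-churner), the toy labels, and the function names are assumptions made for illustration. Which class is treated as "positive" decides which of the two rates rewards catching churners.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def specificity(y_true, y_pred, positive=1):
    """True negative rate: TN / (TN + FP)."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred, positive)
    return tn / (tn + fp) if (tn + fp) else 0.0


def sensitivity(y_true, y_pred, positive=1):
    """True positive rate (recall): TP / (TP + FN)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy labels invented for illustration (assumed encoding: 1 = churner).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

spec = specificity(y_true, y_pred)  # share of non-churners correctly left alone
sens = sensitivity(y_true, y_pred)  # share of churners correctly flagged

# Swapping the positive class swaps the two rates.
spec_churner_negative = specificity(y_true, y_pred, positive=0)
```

With churners encoded as the negative class, specificity is the share of churners correctly identified. With churners as the positive class, the same quantity is called sensitivity.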
CSV dataset of 99,999 observations and 226 variables (numeric and non-numeric).
- Numerical variables,
- Removal of duplicates,
- Transformation of missing values,
- Outlier management,
- Creation of the variable "churner",
- Management of irregular distribution,
- Train/test split.
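The preprocessing steps listed above could be sketched roughly as follows with pandas and scikit-learn. This is not the project's actual code. The toy data, the column names (`usage_m1` to `usage_m4`, `plan`), the median imputation, the percentile clipping, and the rule "no usage in the last month means churner" are all assumptions chosen for illustration. The handling of irregular distributions is left out because its exact treatment is not described here.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy data standing in for the real CSV; every column name is hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "usage_m1": rng.gamma(2.0, 50.0, n),
    "usage_m2": rng.gamma(2.0, 50.0, n),
    "usage_m3": rng.gamma(2.0, 50.0, n),
    "usage_m4": rng.gamma(2.0, 50.0, n),
    "plan": rng.choice(["a", "b"], n),
})
df.loc[rng.choice(n, 30, replace=False), "usage_m4"] = 0.0     # simulated churners
df.loc[rng.choice(n, 10, replace=False), "usage_m2"] = np.nan  # simulated gaps
df = pd.concat([df, df.iloc[:5]], ignore_index=True)           # simulated duplicates

# 1. Keep numerical variables only.
num = df.select_dtypes(include="number")

# 2. Remove duplicate rows.
num = num.drop_duplicates().reset_index(drop=True)

# 3. Fill missing values (median imputation is an assumption).
num = num.fillna(num.median())

# 4. Limit outliers by clipping each column to its 1st-99th percentile range.
low, high = num.quantile(0.01), num.quantile(0.99)
num = num.clip(lower=low, upper=high, axis=1)

# 5. Create the "churner" variable (assumed rule: no usage in the last month).
num["churner"] = (num["usage_m4"] == 0).astype(int)

# 6. Train/test split, stratified so both sets keep the churner ratio.
#    The last month is dropped from the features to avoid leaking the label.
X = num.drop(columns=["churner", "usage_m4"])
y = num["churner"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
```

Stratifying the split matters when churners are a small minority, since a purely random split can leave the test set with too few of them to evaluate on.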
- Dimensionality reduction of the dataset with PCA,
- SVM classifier,
- SVM tuning,
- Perceptron,
- Analyses.
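A minimal sketch of how these modelling steps might fit together with scikit-learn is shown below. It runs on synthetic data from `make_classification`, not on the churn dataset. The 95% explained-variance threshold for PCA, the hyperparameter grid, `class_weight="balanced"`, and the encoding 1 = churner are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import Perceptron
from sklearn.metrics import confusion_matrix, make_scorer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def specificity(y_true, y_pred):
    """True negative rate: TN / (TN + FP), with class 1 as the positive class."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return tn / (tn + fp)


# Synthetic, imbalanced stand-in for the preprocessed data.
X, y = make_classification(
    n_samples=400, n_features=30, n_informative=8,
    weights=[0.85, 0.15], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling -> PCA -> SVM, so PCA is fitted on training folds only.
svm_pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=0.95)),  # keep 95% of the variance (assumed)
    ("svm", SVC(class_weight="balanced")),
])

# SVM tuning: small grid search scored on specificity.
grid = GridSearchCV(
    svm_pipe,
    param_grid={"svm__C": [0.1, 1, 10], "svm__gamma": ["scale", 0.01]},
    scoring=make_scorer(specificity),
    cv=3,
)
grid.fit(X_train, y_train)
svm_spec = specificity(y_test, grid.predict(X_test))

# Perceptron baseline on the same scaled and reduced features.
perc_pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=0.95)),
    ("perceptron", Perceptron(random_state=0)),
])
perc_pipe.fit(X_train, y_train)
perc_spec = specificity(y_test, perc_pipe.predict(X_test))

n_components_kept = grid.best_estimator_.named_steps["pca"].n_components_
```

Wrapping the scaler and PCA inside the pipeline keeps the cross-validation honest: each fold fits the projection on its own training part instead of seeing the held-out rows.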
- Colab,
- Introduction to Pytorch,
- Creation of models,
- Train & test & evaluation,
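For the PyTorch part, a training and evaluation loop might look like the minimal sketch below. The synthetic data, the network architecture, the learning rate, the number of epochs, and the 0.5 decision threshold are assumptions for illustration and are not the models built in the project. On Colab, the `device` line selects the GPU when one is attached to the runtime.

```python
import torch
from torch import nn

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"

# Synthetic binary classification data (assumed encoding: 1 = churner).
n, d = 512, 20
X = torch.randn(n, d)
w_true = torch.randn(d)
y = ((X @ w_true + 0.5 * torch.randn(n)) > 1.0).float()

# Simple 80/20 train/test split.
split = int(0.8 * n)
X_train, y_train = X[:split].to(device), y[:split].to(device)
X_test, y_test = X[split:].to(device), y[split:].to(device)

# Creation of the model: a small fully connected network with one logit output.
model = nn.Sequential(
    nn.Linear(d, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
).to(device)
loss_fn = nn.BCEWithLogitsLoss()  # applies the sigmoid internally
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Training: full-batch gradient steps.
losses = []
for epoch in range(50):
    model.train()
    optimizer.zero_grad()
    logits = model(X_train).squeeze(1)
    loss = loss_fn(logits, y_train)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Test and evaluation: threshold the predicted probability at 0.5.
model.eval()
with torch.no_grad():
    pred = (torch.sigmoid(model(X_test).squeeze(1)) > 0.5).float()

tn = ((pred == 0) & (y_test == 0)).sum().item()
fp = ((pred == 1) & (y_test == 0)).sum().item()
specificity = tn / (tn + fp) if (tn + fp) else 0.0
accuracy = (pred == y_test).float().mean().item()
```

For a dataset the size of the real one, the full-batch loop would normally be replaced by mini-batches through `torch.utils.data.DataLoader`.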