This repo contains the notebook and presentation of the approach we used and presented in Cassandra, a competition organised by Udyam'21.
In the competition, the task was to identify defaulters based on demographic data and payment_history.
For more info about competition visit - Here
Presentation of Our Approach - Here
A brief description of our approach
We applied feature aggregation to the payment history, combined the aggregated features with the original demographic data, and trained an LGBM model with Optuna hyperparameter tuning.
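The aggregation step can be sketched as follows. This is a minimal, dependency-free illustration and not the code from our notebooks: the column names (`customer_id`, `amount_due`, `amount_paid`, `age`, `income`), the toy rows, and the chosen statistics are all assumptions made for the example. With pandas, the same idea is usually written with `groupby(...).agg(...)` followed by a merge.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: the column names are assumptions, not the competition schema.
demographics = {
    1: {"age": 34, "income": 52000},
    2: {"age": 45, "income": 31000},
}
payment_history = [
    {"customer_id": 1, "amount_due": 100.0, "amount_paid": 100.0},
    {"customer_id": 1, "amount_due": 120.0, "amount_paid": 60.0},
    {"customer_id": 2, "amount_due": 80.0, "amount_paid": 0.0},
]


def aggregate_payments(rows):
    """Collapse many payment rows per customer into one row of summary features."""
    grouped = defaultdict(list)
    for row in rows:
        # Fraction of the due amount that was actually paid in this record.
        ratio = row["amount_paid"] / row["amount_due"] if row["amount_due"] else 0.0
        grouped[row["customer_id"]].append(ratio)
    return {
        cid: {
            "n_payments": len(ratios),
            "paid_ratio_mean": mean(ratios),
            "paid_ratio_min": min(ratios),
        }
        for cid, ratios in grouped.items()
    }


def build_features(demo, rows):
    """Join the aggregates onto the demographic table, one row per customer."""
    aggs = aggregate_payments(rows)
    # Customers with no payment records get neutral placeholder values.
    empty = {"n_payments": 0, "paid_ratio_mean": 0.0, "paid_ratio_min": 0.0}
    return {cid: {**feats, **aggs.get(cid, empty)} for cid, feats in demo.items()}


features = build_features(demographics, payment_history)
```

Each customer ends up with a single feature row that a tabular model such as LGBM can consume directly, regardless of how many payment records the customer has.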
The method can also serve as a useful technique for handling class imbalance in defaulter prediction.
Using the concept of covariate shift, we showed that the train and test distributions differ substantially, which makes cross-validation scores unreliable. This analysis of the data stood out among the other entries.
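One common way to check for covariate shift is to ask how well a model can tell train rows from test rows: an AUC near 0.5 means the two sets look alike, while an AUC near 1.0 means they are easy to separate. The sketch below is a simplified, per-feature version of that idea using only the standard library. It is not the exact analysis from our notebook; the helper names (`auc`, `shift_score`) and the synthetic data are assumptions for illustration.

```python
import random


def auc(scores, labels):
    """ROC AUC via the rank-sum (Mann-Whitney U) formula, with average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def shift_score(train_values, test_values):
    """AUC of 'is this row from test?' using the raw feature value as the score."""
    scores = list(train_values) + list(test_values)
    labels = [0] * len(train_values) + [1] * len(test_values)
    a = auc(scores, labels)
    return max(a, 1 - a)  # the direction of the shift does not matter


# Synthetic example: one feature drawn from the same distribution, one shifted.
rng = random.Random(0)
train_feature = [rng.gauss(50, 10) for _ in range(500)]
test_same = [rng.gauss(50, 10) for _ in range(500)]
test_shifted = [rng.gauss(65, 10) for _ in range(500)]

same_score = shift_score(train_feature, test_same)
shifted_score = shift_score(train_feature, test_shifted)
```

Features with a high score are the ones whose distribution differs most between train and test. When many features score high, a random cross-validation split of the training data is unlikely to reflect performance on the test set.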
- Best notebook to start with
- Train/Test Similarity Analysis using covariate shift
- Optuna Tuning
- LGBM
- Aggregation and Magic Features (Credits - Chris Deotte)
We secured 1st position in the event using this approach.
Beta (left) and Beta with labels (right)
Alpha with labels (left) and Delta with labels (right)
Mainak Samanta
Akshat Sood
Aman Mishra